Why is the HDD so slow on the “4K” speed tests?

benchmarking, hard drive, hardware-failure, speed, storage

What is wrong with my speed at 4K? Why is it so slow? Or is it supposed to be like that?

Screenshot of benchmark

Is that speed okay? Why do I have such low speed at 4K?

Best Answer

What you are running into is typical of mechanical HDDs, and one of the major benefits of SSDs: HDDs have terrible random access performance.

In CrystalDiskMark, "Seq" means sequential access while "4K" means random access (in chunks of 4 kB at a time, because single bytes would be far too slow and unrealistic[1]).


Definitions

There are, broadly, two different ways you might access a file.

Sequential access

Sequential access means you read or write the file more or less one byte after another. For example, if you're watching a video, you would load the video from beginning to end. If you're downloading a file, it gets downloaded and written to disk from beginning to end.

From the disk's perspective, it's seeing commands like "read block #1, read block #2, read block #3, read block #4"[1].

Random access

Random access means there's no obvious pattern to the reads or writes. This doesn't have to mean truly random; it really means "not sequential". For example, if you're starting lots of programs at once they'll need to read lots of files scattered around your drive.

From the drive's perspective, it's seeing commands like "read block #56, read block #5463, read block #14, read block #5"
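To make the two patterns concrete, here is a minimal sketch that reads the same 4 KiB blocks of a file once in order and once shuffled. The helper names (`read_blocks`, `make_test_file`) and the 8 MiB file size are my own choices for illustration. Note that a freshly written test file will almost certainly be served from the operating system's cache, so the timings here will not reproduce the HDD gap; a real benchmark such as CrystalDiskMark works hard to bypass those caches.

```python
import os
import random
import tempfile
import time

BLOCK = 4096  # 4 KiB, matching the "4K" test size


def read_blocks(path, block_numbers):
    """Read the given 4 KiB blocks in the given order; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # no Python-level buffering
        for n in block_numbers:
            f.seek(n * BLOCK)
            total += len(f.read(BLOCK))
    return total, time.perf_counter() - start


def make_test_file(num_blocks):
    """Create a temporary file of num_blocks * 4 KiB and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(num_blocks * BLOCK))
    return path


if __name__ == "__main__":
    num_blocks = 2048  # 8 MiB test file (arbitrary size for the sketch)
    path = make_test_file(num_blocks)
    try:
        sequential = list(range(num_blocks))              # 0, 1, 2, 3, ...
        shuffled = random.sample(sequential, num_blocks)  # same blocks, no pattern
        for name, order in (("sequential", sequential), ("random", shuffled)):
            nbytes, secs = read_blocks(path, order)
            print(f"{name}: {nbytes} bytes in {secs:.4f} s")
    finally:
        os.remove(path)
```

Both runs read exactly the same data; only the order of the requests differs, which is all that "sequential" versus "random" means here.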

Blocks

I've mentioned blocks a couple of times. Because computers deal with such large sizes (1 MB = 1,000,000 B), even sequential access is inefficient if you have to ask the drive for each individual byte: there's too much chatter. In practice, the operating system requests whole blocks of data from the disk at a time.

A block is just a range of bytes; for example, block #1 might be bytes #1-#512, block #2 might be bytes #513-#1024, etc. Blocks are either 512 bytes or 4096 bytes in size, depending on the drive. But even when dealing with blocks rather than individual bytes, sequential block access is faster than random block access.
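The byte-to-block mapping above is simple integer arithmetic. Here is a small sketch using the same 1-based numbering as the example; the function names are hypothetical, and real drives number their logical blocks from 0 rather than 1.

```python
def block_of(byte_number, block_size=512):
    """Return the 1-based block number that contains a 1-based byte number."""
    if byte_number < 1:
        raise ValueError("byte numbers start at 1 in this sketch")
    return (byte_number - 1) // block_size + 1


def byte_range(block_number, block_size=512):
    """Return the (first, last) 1-based byte numbers covered by a block."""
    first = (block_number - 1) * block_size + 1
    return first, first + block_size - 1


print(block_of(1), block_of(512), block_of(513))  # 1 1 2
print(byte_range(2))                              # (513, 1024)
```

Passing `block_size=4096` gives the same mapping for drives that use 4096-byte blocks.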


Performance

Sequential

Sequential access is generally faster than random access. This is because sequential access lets the operating system and the drive predict what will be needed next, and load up a large chunk in advance. If you've requested blocks "1, 2, 3, 4", the OS can guess you'll want "5, 6, 7, 8" next, so it tells the drive to read "1, 2, 3, 4, 5, 6, 7, 8" in one go. Similarly, the drive can read off the physical storage in one go, rather than "seek to 1, read 1,2,3,4, seek to 5, read 5,6,7,8".
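One way to picture this is to count how many separate "go somewhere, then read" operations a request sequence needs. The following is only a toy model with a made-up helper name (`coalesce`); it merges requests for consecutive blocks into a single run and says nothing about how a real OS or drive firmware schedules reads.

```python
def coalesce(blocks):
    """Group a request sequence into runs of consecutive block numbers.

    Each run is a (first_block, count) pair and stands for one positioning
    step followed by one continuous read.
    """
    runs = []
    for b in blocks:
        if runs and b == runs[-1][0] + runs[-1][1]:
            # b directly follows the previous run, so extend that run
            first, count = runs[-1]
            runs[-1] = (first, count + 1)
        else:
            # b is somewhere else, so a new run (and a new seek) starts
            runs.append((b, 1))
    return runs


print(coalesce([1, 2, 3, 4, 5, 6, 7, 8]))  # [(1, 8)]
print(coalesce([56, 5463, 14, 5]))         # [(56, 1), (5463, 1), (14, 1), (5, 1)]
```

The sequential request collapses into one run, while every block of the scattered request needs its own run.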

Oh, I mentioned seeking to something. Mechanical HDDs have a very slow seek time because of how they're physically laid out: they consist of a number of heavy metalised disks spinning around, with physical arms moving back and forth to read the disk. Here is a video of an open HDD where you can see the spinning disks and moving arms.

Diagram of HDD internals
Image from http://www.realtechs.net/data%20recovery/process2.html

This means that at any one time, only the bit of data under the head at the end of the arm can be read. The drive needs to wait for two things: it needs to wait for the arm to move to the right ring ("track") of the disk, and it also needs to wait for the disk to spin around so the needed data is under the reading head. This is known as seeking[2]. Both the spinning and the moving arms take physical time to move, and they can't be sped up by much without risking damage.
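The "wait for the disk to spin around" part can be estimated from the spindle speed alone: on average the needed data is half a revolution away. A quick sketch of that arithmetic follows; the two RPM values are common consumer drive speeds that I am assuming for illustration, not figures from the question.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the data to come around: half of one revolution."""
    ms_per_revolution = 60_000 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2


for rpm in (5400, 7200):
    print(f"{rpm} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
# 5400 rpm: 5.56 ms
# 7200 rpm: 4.17 ms
```

The arm movement comes on top of this, so several milliseconds per random request is hard to avoid on a spinning disk.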

This typically takes a very long time, far longer than the actual reading. We're talking >5 ms just to get to where the requested byte lives, while the actual reading averages out to about 0.0000061 ms per sequential byte read (or 0.003125 ms per 512 B block).

Random

Random access, on the other hand, doesn't have that benefit of predictability. So if you want to read 8 random blocks, maybe blocks "8, 34, 76, 996, 112, 644, 888, 341", the drive needs to go "seek to 8, read 8, seek to 34, read 34, seek to 76, read 76, ...". Notice how it needs to seek again for every single block? Instead of an average of 0.003125 ms per sequential 512 B block, it's now an average of (5 ms seek + 0.003125 ms read) = 5.003125 ms per block. That's many, many times slower. About 1,600 times slower, in fact.
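The same arithmetic can be turned into throughput figures. This sketch uses only the 5 ms seek and 0.003125 ms per 512 B block figures from above; the function and constant names are my own, and the model ignores caching, command queueing and everything else a real drive does.

```python
SEEK_MS = 5.0       # positioning cost per random block (figure from the text)
READ_MS = 0.003125  # transfer time per 512 B block (figure from the text)
BLOCK_BYTES = 512


def throughput_mb_s(ms_per_block, block_bytes=BLOCK_BYTES):
    """Convert a per-block time in milliseconds into MB/s (1 MB = 1,000,000 B)."""
    return block_bytes / (ms_per_block / 1000) / 1_000_000


sequential = throughput_mb_s(READ_MS)
random_512 = throughput_mb_s(SEEK_MS + READ_MS)
slowdown = (SEEK_MS + READ_MS) / READ_MS

print(f"sequential: {sequential:.1f} MB/s")  # sequential: 163.8 MB/s
print(f"random:     {random_512:.3f} MB/s")  # random:     0.102 MB/s
print(f"slowdown:   {slowdown:.0f}x")        # slowdown:   1601x
```

For 4 kB requests like the ones in the "4K" test, `throughput_mb_s(SEEK_MS + 8 * READ_MS, 4096)` comes out at roughly 0.8 MB/s under the same assumptions, which is the order of magnitude a mechanical drive typically shows on that test.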

SSDs

Fortunately, we have a solution now: SSDs.

An SSD, a solid state drive, is, as its name implies, solid state. That means it has no moving parts. Moreover, the way an SSD is laid out means there is (effectively[3]) no need to look up the location of a byte; it already knows. That's why an SSD has much less of a performance gap between sequential and random access.

There still is a gap, but that can be largely attributed to not being able to predict what comes next and preloading that data before it's asked for.


[1] More accurately, with LBA, drives are addressed in blocks of 512 bytes (512n/512e) or 4 kB (4Kn) for efficiency reasons. Also, real programs almost never need just a single byte at a time.

[2] Technically, seek only refers to the arm travel. The waiting for the data to rotate under the head is rotational latency on top of the seek time.

[3] Technically, they do have lookup tables and remap for other reasons, e.g. wear levelling, but these are completely negligible compared to an HDD...