“Weak bits” test for flash drive

hardware-failure usb-flash-drive

Flash memory has a limited number of write cycles. In a recent question, @Rsya Studios discussed problems with reads affecting neighboring bits, which is correctable up to a point. Neither of these problems appears abruptly, like a switch being flipped; there is some period during which performance is marginal.

Back in the days of floppy drives, there was a method of copy protection called "weak bits". Marginal bits were purposely written to the disk, which required special equipment. The bits could not be duplicated by copying the disk on your home computer. These were tested by doing multiple reads. If the results did not come back the same every time, the disk was recognized as original.

Does anyone know if a similar technique has been applied to testing flash drives for imminent failure, that is, looking for marginal bits through multiple reads? (I'm not talking about writing marginal bits or writing bits and seeing whether they are marginal; just reading existing bits to see if any are marginal.)

Edit: This question is about a testing method and its efficacy for flash drives. Please focus on that and refrain from discussion of whether it is worth testing flash drives or whether flash drives should be used at all for one purpose or another.

Best Answer

Floppy disks and modern flash memory are two totally different things.

A USB flash drive has a flash controller chip with complex logic to handle things like wear leveling and error correction. The underlying complexity is hidden from the computer, so it only sees logical blocks and not physical ones. Accessing a floppy disk on old systems was much lower level: you could read the physical blocks directly, and any error would be obvious and repeatable.

The built-in error correction detects errors and moves the data to another block. So, as long as the drive is still usable, the errors that comparing repeated reads might catch have already been corrected by the controller before the data ever reaches the computer.
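To make concrete what a repeated-read comparison from the host side would look like, here is a minimal Python sketch. The function names `read_pass_digests` and `unstable_chunks` are made up for illustration, and the path you pass in (for example a raw device node such as `/dev/sdX`, which is only a placeholder) is an assumption about your system. It reads the same data several times, hashes it chunk by chunk, and reports chunks whose hash changed between passes. For the reasons above, I would expect it to report nothing on a drive that still works, because the host only ever sees data after the controller's error correction.

```python
import hashlib
import os


def read_pass_digests(path, passes=3, chunk_size=1 << 20):
    """Read `path` several times; return one list of per-chunk SHA-256 digests per pass."""
    all_passes = []
    for _ in range(passes):
        digests = []
        with open(path, "rb") as f:
            # Ask the OS to drop its cached pages so later passes are less likely
            # to be served from RAM. This is only a hint, and it is not available
            # on every platform or for every kind of file.
            if hasattr(os, "posix_fadvise"):
                try:
                    os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
                except OSError:
                    pass
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                digests.append(hashlib.sha256(chunk).hexdigest())
        all_passes.append(digests)
    return all_passes


def unstable_chunks(all_passes):
    """Return indices of chunks whose digest is not identical across all passes."""
    if not all_passes:
        return []
    longest = max(len(p) for p in all_passes)
    unstable = []
    for i in range(longest):
        # A chunk missing from a shorter pass counts as a mismatch.
        seen = {p[i] if i < len(p) else None for p in all_passes}
        if len(seen) > 1:
            unstable.append(i)
    return unstable
```

Something like `unstable_chunks(read_pass_digests("/dev/sdX"))` would be the whole test; reading a raw device usually needs administrator rights. Note that even with the cache hint, a pass may still be answered from the operating system's cache or the drive's own buffers, so an empty result says little about the state of the underlying flash cells.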

As mentioned in the answer you linked to, the flash memory controller may periodically move data around from one cell to another to prevent data corruption, so repeatedly reading the same block to test it is just going to help wear it out.

So there is a limit to what you can do with a pen drive, but it might make sense to run something like chkdsk, or whatever your OS's equivalent is, once in a while to check it for errors. Even better would be to take regular backups in case it does actually fail.

A normal pen drive should never be trusted for critical data. A proper SSD or HDD is better because it will normally support the S.M.A.R.T. error reporting system, which can give you some idea about the physical status of the device and whether a failure is likely. Also, some cheap pen drives use low-quality flash chips which probably won't last long.
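As a rough illustration of querying that status on an SSD or HDD, the sketch below shells out to `smartctl` from the smartmontools package. It assumes smartmontools 7.0 or later is installed (for the `--json` option) and that the JSON report carries the overall result under `smart_status` / `passed`, which is how I understand its output; treat that layout as an assumption. The function names and the device path are placeholders.

```python
import json
import shutil
import subprocess


def parse_smart_health(output):
    """Extract the overall health flag from smartctl JSON text; None if absent."""
    try:
        report = json.loads(output)
    except json.JSONDecodeError:
        return None
    if not isinstance(report, dict):
        return None
    # Assumed layout: {"smart_status": {"passed": true}, ...}
    status = report.get("smart_status")
    if not isinstance(status, dict):
        return None
    return status.get("passed")


def smart_health(device):
    """Return True/False for the overall SMART assessment, or None if unknown."""
    if shutil.which("smartctl") is None:
        return None  # smartmontools is not installed
    # -H asks for the overall health assessment; --json gives machine-readable output.
    # smartctl's exit status is a bit mask, so it is deliberately not checked here.
    result = subprocess.run(
        ["smartctl", "-H", "--json", device],
        capture_output=True,
        text=True,
    )
    return parse_smart_health(result.stdout)
```

Calling `smart_health("/dev/sdX")` typically needs administrator rights. A result of `None` is what I would expect for most pen drives, since they generally do not expose S.M.A.R.T. data at all.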

People who really worry about data integrity use something on the opposite end of the scale like a ZFS RAID array on a PC with ECC memory where there is plenty of room to detect and repair most errors.