Backup / Disaster Recovery, should I store RAR-compressed files?


I'm in the process of recovering files from an accidentally formatted Ext4 partition using PhotoRec. It had about 300 GB of data, of which I've already got hold of about 30 GB.

So far, it seems to me that the recovery of RAR-compressed files has been much more successful than the recovery of individual uncompressed files and ZIP compressed files – in the sense that a lot of recovered files/zips were unreadable, and pretty much all of the RAR files were intact.

Is there such a relation? Are RAR-compressed files really less prone to corruption and thus easier to recover?

Best Answer

Are RAR-compressed files really less prone to corruption and thus easier to recover?

To some degree, yes.

When you are trying to recover data, you are more likely to get the start of a file than the end. This is because the file system always stores a pointer to the beginning of the file in the directory entry. After that, there is some sort of chain (FAT, MFT, inodes, etc.) that keeps track of each subsequent piece of the file. If you have unfragmented files (files that are stored in one block of contiguous clusters), then 100% recovery of the file is almost assured so long as you have the starting cluster and the size (depending on the recovery program, you may end up with some junk padding the end of the file; PhotoRec falls into this category). However, if the file is fragmented (part of it is stored here, part there, etc.) and that chain is lost, then full recovery is unlikely, and you will often end up with only the first fragment of the file, sometimes just its first cluster (typically 512 or 4096 bytes).
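
To make the contiguous-versus-fragmented point concrete, here is a minimal toy sketch in Python. It is not how PhotoRec or any real tool works internally; the 8-byte `CLUSTER_SIZE`, the `carve` helper, and the disk layout are all made up for illustration. It only shows that reading "start cluster plus size" recovers a contiguous file exactly, while a fragmented file comes back with foreign data after its first fragment.

```python
CLUSTER_SIZE = 8  # toy value for readability; real clusters are far larger


def carve(disk: bytes, start_cluster: int, size: int) -> bytes:
    """Read `size` bytes from `start_cluster`, assuming the file is contiguous."""
    offset = start_cluster * CLUSTER_SIZE
    return disk[offset:offset + size]


file_a = b"AAAAAAAAaaaaaaaa"  # two clusters, stored contiguously
file_b = b"BBBBBBBBbbbbbbbb"  # two clusters, stored in two fragments
other = b"????????"           # one cluster belonging to some other file

# Disk layout by cluster: A0 A1 B0 other B1
disk = file_a + file_b[:CLUSTER_SIZE] + other + file_b[CLUSTER_SIZE:]

carved_a = carve(disk, 0, len(file_a))
carved_b = carve(disk, 2, len(file_b))

print("contiguous file recovered intact:", carved_a == file_a)
print("fragmented file recovered intact:", carved_b == file_b)
print("fragmented file, first cluster ok:", carved_b[:CLUSTER_SIZE] == file_b[:CLUSTER_SIZE])
```

Without the chain that says where the second fragment lives, the carver has no way to know that cluster 3 belongs to a different file.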

The RAR files seem to be okay because they store most of their metadata (e.g., the details of the contained files) at the beginning of the file. As such, any recovered RAR file that really is a RAR file (that is, it actually starts with RAR data rather than data from something else) will look like it works. ZIP files, on the other hand, store their metadata (the central directory) at the end of the file, so if you do not have the whole file, a normal ZIP reader cannot open it at all.
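
As a rough illustration of that difference, the sketch below checks a recovered blob for the RAR signature at the start and for the ZIP end-of-central-directory signature near the end. The helper names `starts_as_rar` and `has_zip_central_directory` are hypothetical, and the checks are deliberately naive (a signature match does not prove the archive is intact). The demo builds a small ZIP in memory and cuts it in half to mimic a partial recovery.

```python
import io
import zipfile

RAR4_MAGIC = b"Rar!\x1a\x07\x00"      # RAR 1.5 to 4.x signature
RAR5_MAGIC = b"Rar!\x1a\x07\x01\x00"  # RAR 5.0 signature
ZIP_EOCD_MAGIC = b"PK\x05\x06"        # ZIP end-of-central-directory record


def starts_as_rar(data: bytes) -> bool:
    """True if the blob begins with a RAR signature."""
    return data.startswith(RAR4_MAGIC) or data.startswith(RAR5_MAGIC)


def has_zip_central_directory(data: bytes) -> bool:
    """True if an end-of-central-directory signature appears near the end.

    The record is 22 bytes plus an optional comment of up to 65535 bytes,
    so only that tail of the blob is searched.
    """
    tail = data[-(22 + 65535):]
    return ZIP_EOCD_MAGIC in tail


# Build a small ZIP in memory, then keep only its first half.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("notes.txt", "hello " * 1000)
whole = buf.getvalue()
truncated = whole[: len(whole) // 2]

print("whole ZIP has central directory:", has_zip_central_directory(whole))
print("truncated ZIP has central directory:", has_zip_central_directory(truncated))
print("truncated ZIP opens as a ZIP:", zipfile.is_zipfile(io.BytesIO(truncated)))
```

The truncated blob still begins with a valid local file header, but a standard reader looks for the directory at the end first and gives up when it is missing.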

However, if you are recovering files from a volume in which any given file may or may not be corrupt, then a RAR file is just as likely to be corrupt as any other file type. It may open and list its contents, but if you run an archive test on it, the test will probably report that at least some of the contained files are corrupt.
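
Python's standard library has no RAR support, so the sketch below uses `zipfile` as a stand-in to show what an archive test does; for RAR files the equivalent is the test command of the `rar`/`unrar` tools (`unrar t`). The file names and payloads are invented for the demo. It builds an uncompressed ZIP in memory, flips one byte inside the second member's data, and then asks `ZipFile.testzip()` for the first member whose CRC no longer matches.

```python
import io
import zipfile

payload_a = b"first file payload " * 100
payload_b = b"second file payload " * 100

# Build a two-member archive in memory (stored, not compressed).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("a.txt", payload_a)
    zf.writestr("b.txt", payload_b)
whole = buf.getvalue()

# Flip one byte inside b.txt's data; the directory at the end stays intact.
damaged = bytearray(whole)
damaged[whole.index(payload_b) + 10] ^= 0xFF
damaged = bytes(damaged)

with zipfile.ZipFile(io.BytesIO(damaged)) as zf:
    listing = zf.namelist()    # listing still works: metadata is untouched
    first_bad = zf.testzip()   # reads every member and checks its CRC

print("members listed:", listing)
print("first corrupt member:", first_bad)
```

The archive opens and lists both members, which is why a recovered archive can look fine until it is actually tested or extracted.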

Should I store RAR-compressed files?

Like pretty much everything else in the universe, it has its pros and cons. Here are a few; they are not really specific to RARs but apply to archive files in general:

  • Pros
    • The actual data is compressed, so you can recover more data per cluster
    • Many archive formats include the innate ability (when selected) to recover from a certain level of corruption since they were designed for archiving to unreliable media
  • Cons
    • Binary data is pretty much always more difficult to recover than plain-text
    • Because archives are compressed, even slight corruption can ruin the whole file, whereas a plain-text or graphics file, for example, can still be usable with a bit of corruption
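
The last point can be demonstrated with a small experiment. This is only a sketch: it uses Python's `zlib` (DEFLATE) as a stand-in for an archive's compressor, which is an assumption, since RAR uses its own algorithm, and the helper names `flip_byte` and `try_decompress` are made up. One byte is inverted in the middle of a plain-text buffer and in the middle of its compressed form.

```python
import zlib


def flip_byte(data: bytes, index: int) -> bytes:
    """Return a copy of `data` with one byte inverted, simulating damage."""
    damaged = bytearray(data)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def try_decompress(blob: bytes):
    """Return the decompressed bytes, or None if zlib rejects the stream."""
    try:
        return zlib.decompress(blob)
    except zlib.error:
        return None


original = b"The quick brown fox jumps over the lazy dog. " * 200
packed = zlib.compress(original)

# Damage one byte in the middle of each representation.
damaged_plain = flip_byte(original, len(original) // 2)
damaged_packed = flip_byte(packed, len(packed) // 2)

plain_bytes_lost = sum(a != b for a, b in zip(original, damaged_plain))
recovered = try_decompress(damaged_packed)

print("plain text: differing bytes =", plain_bytes_lost)
print("compressed: intact after damage =", recovered == original)
```

The plain text loses exactly the one damaged byte and stays readable, while the compressed stream is either rejected outright or no longer decodes to the original data.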

If you are wondering if you should store your files as RARs in order to increase recovery success, then I would say no. If you are planning for the future, then you are much better off using more traditional and proven techniques like making backups. If you do use RARs for archiving important data, add a recovery record.
