How to do an efficient incremental backup to cloud


I have a folder that contains files from current and previous projects, which I plan to back up using versioned rsync. For a more robust backup strategy I also want to store a snapshot offsite (e.g. Amazon Glacier) at regular intervals, say monthly.

To save space and bandwidth I want to compress the backup before sending it offsite. However, since only a small fraction of the files change from month to month, sending the whole compressed archive on every backup would be a huge waste of bandwidth.

Ideally, I would compress the backup into volumes of 500 MB (or some other fixed size) and upload them to my offsite storage. The next time I back up, most of these volumes should be identical to those of the previous backup, except for the ones containing files that have changed since then. In that scenario I only need to upload the changed volumes, saving bandwidth (and file write requests).

Is it possible to do what I describe using a combination of tar and gzip (split, maybe?), or other command line tools?

One issue I can imagine: if a file contained in some volume changes, the content of all subsequent volumes may be offset, requiring a re-upload of the changed volume and everything after it. Perhaps it's better to segment the volumes by folder somehow?
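To make the idea concrete, here is roughly what I have in mind. The paths and volume size are placeholders (1M rather than 500M, just to keep the sketch small):

```shell
set -e
# $SRC and $OUT stand in for the project folder and the volume directory.
SRC=$(mktemp -d); OUT=$(mktemp -d)
echo "project data" > "$SRC/file1.txt"

# Archive, compress, and split into fixed-size volumes.
tar -C "$SRC" -cf - . | gzip | split -b 1M - "$OUT/backup.tar.gz.part-"

# Restoring is just concatenating the volumes back in order:
# cat "$OUT"/backup.tar.gz.part-* | tar -xzf - -C /restore/target
ls "$OUT"
```

The weakness is exactly the offset problem above: a change early in the stream shifts every byte after it, so all later volumes differ even if the files inside them did not change.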

I would love to hear any input or suggestions you have.
Best regards

Best Answer

tar can do this with the --listed-incremental flag, so for the setup you describe that is probably what I would use. You can use whatever compressors tar supports (or just pipe the output through an arbitrary compressor).
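A minimal sketch of how that works (paths are made up; this assumes GNU tar). The snapshot file records file metadata, so the second run archives only what changed since the first:

```shell
set -e
# Hypothetical source and backup directories for illustration.
SRC=$(mktemp -d); BK=$(mktemp -d)
echo "v1" > "$SRC/a.txt"

# Level-0 (full) backup; the .snar snapshot file tracks what was dumped.
tar --listed-incremental="$BK/snapshot.snar" -czf "$BK/full.tar.gz" -C "$SRC" .

# Later: add a file, then take an incremental backup.
# Only new/changed files end up in the archive.
echo "v2" > "$SRC/b.txt"
tar --listed-incremental="$BK/snapshot.snar" -czf "$BK/incr1.tar.gz" -C "$SRC" .
```

Note that tar updates the snapshot file in place on every run, so if you want to be able to take further incrementals against the same full backup (rather than against the latest incremental), keep a copy of the snapshot file from the run you want to diff against.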

I'm not sure what sort of projects these are, but if it's code or some other text-based format I'd probably look into using git or some other source control system.

I should also point out that this is GNU tar specific. If you are on a BSD or another Unix, you might need to install GNU tar, since I don't think bsdtar supports this.
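A quick way to check which tar you have:

```shell
# GNU tar identifies itself on the first line of --version;
# bsdtar reports "bsdtar" / "libarchive" instead.
tar --version | head -n 1
```

On BSD systems GNU tar is typically available as a separate package (often installed as `gtar`; the exact package name depends on your platform, so check your package manager).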
