Richard Bucker

Giving Privacy a Pass

Posted at — Feb 28, 2012

As I’ve written recently, I am in the process of moving my backups to TimeMachine, SuperDuper and BackBlaze. One of the things I was thinking about was how long the initial upload to BackBlaze was going to take. When I was looking at other solutions, the initial upload was going to be on the order of months. BackBlaze, on the other hand, is said to auto-throttle.

I’m assuming that auto-throttle means the client watches the system load and limits the bandwidth it uses on that machine accordingly. I’m backing up 2 systems with individual accounts (under one master account) to BackBlaze, and both machines seem to be humming along nicely.

Back to my thought… in what order are the files backed up during the initial upload and thereafter? I’m certain that there have been studies on the average user’s average file size and the number of edits a file sees over its lifetime. That information would be key.

During the initial upload I’d probably sort the files by size and not date, doing all of the smallest files first. There are two reasons: (1) it shows that the backup is making progress, and the user is less likely to abandon the upload if they see quick progress; (2) in the event of a crash during the initial backup, you might have a better chance of recovering more of the system in terms of individual files. The same cannot be said for the incremental backups; there, the largest files might get starved.

Anyway, BackBlaze does not appear to be taking months to achieve the online backup I was hoping for. Let’s hope I never have to perform a restore… but there is something to be said for purchasing the occasional snapshot.

But now for privacy. Let’s say for the sake of argument that I have elected not to encrypt the data that is being backed up.
Now, if immediately before the backup BackBlaze generated some sort of signature of each target file and compared it against the entire dataset on its servers, it could reduce the backup time and the cost of duplicate storage by consolidating duplicates. This would work well for movies, videos and music that many users have in common, but not for files that are unique to one user. However, at this exact moment BackBlaze is backing up my iTunes library. I know I have 8,000+ files, and BackBlaze is reporting that it has 8,000 files to back up. I do not care if anyone knows what’s in my music library. It’s all paid for. Seems to me that there is no reason to upload them (same for iTunes Match).
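
To make the idea concrete, here is a minimal sketch in Python of what I have in mind. It is not how BackBlaze actually works (I have no idea what their client does); the function names and the `known_signatures` set are made up for illustration, with the set standing in for whatever index of content hashes a service might keep on its servers. It uses SHA-256 as the "signature", skips any file whose signature is already known, and orders what is left smallest first, as I suggested for the initial upload.

```python
import hashlib
import os


def file_signature(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_upload(paths, known_signatures):
    """Split paths into (to_upload, already_stored).

    known_signatures is a hypothetical stand-in for the set of content
    hashes the backup service already holds on its servers.
    """
    seen = set(known_signatures)
    to_upload, already_stored = [], []
    for path in paths:
        signature = file_signature(path)
        if signature in seen:
            # Identical content is already stored: nothing to upload.
            already_stored.append(path)
        else:
            # Remember it so later local copies count as duplicates too.
            seen.add(signature)
            to_upload.append(path)
    # Smallest files first, so the initial upload shows quick progress.
    to_upload.sort(key=os.path.getsize)
    return to_upload, already_stored
```

Calling `plan_upload(list_of_paths, signatures_from_server)` returns the files that still need to go over the wire and the ones that can simply be linked to an existing copy. Note that this only works when the data is not encrypted with a per-user key before hashing, which is exactly the privacy trade-off I am talking about.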