These plans already have been foiled: Amazon Drive failure to launch
Having recently reformatted my desktop to document my setup in my dotfiles and hackintosh Git repositories, I needed to reinitiate my backup process. It became clear Backblaze was not going to work for me any longer due to the fragility of their architecture. After admittedly minimal research (I have pursued far too many tangents lately!), I am ditching Backblaze and trying Arq+Amazon Cloud Drive instead. Backblaze was at least kind enough to give a refund for my unused time.
Arq only has one job - to back up - and v6 broke that. Been using Arq v5 for a while and it is okay. It does what it is supposed to, and recovering files is pretty easy. Version 6 is just garbage: it lost my old backups and will not back up my PC anymore. Arq works with Amazon cloud, S3, Glacier, Backblaze B2, Google Cloud storage, Dropbox, basically anything. Arq also lets you hold encryption keys, so you have control of your data.
Backblaze Inherit Backup State / Backblaze failed to make this computer inherit your backup state. / ERR_error_unknown
Since I have many terabytes of data I plan to back up, it remains to be seen if I will really be able to upload it all to my new unlimited storage Amazon Drive. If it works, I will be paying about $60 per year to store 5 to 10 terabytes of data, plus the one-time purchase of Arq for $50. If there is an issue with Amazon, I will at least be able to easily configure Arq to upload all or part of my data to another service. This is about the same cost as Backblaze, but is far more configurable and should hopefully also help me avoid needing to upload all my data repeatedly due to system changes. I also hope Arq is not designed such that it requires keeping more than a gigabyte of information in memory at all times due to my having selected several million files for backup.
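The arithmetic behind that estimate is easy to sketch. The figures are the ones quoted above ($60 per year for Amazon Drive, $50 once for Arq); the function names are only for illustration, and the sketch ignores anything not mentioned here, such as bandwidth overage fees.

```python
def cumulative_cost(years, annual_fee=60.0, one_time_fee=50.0):
    """Total spend after `years` years: one-time purchase plus yearly storage fee."""
    return one_time_fee + annual_fee * years


def average_annual_cost(years, annual_fee=60.0, one_time_fee=50.0):
    """Average cost per year once the one-time purchase is spread over `years`."""
    return cumulative_cost(years, annual_fee, one_time_fee) / years


for years in (1, 2, 5):
    total = cumulative_cost(years)
    per_year = average_annual_cost(years)
    print(f"{years} yr: ${total:.2f} total, ${per_year:.2f} per year on average")
```

The one-time purchase matters most in the first year; the longer the setup stays in place, the closer the average gets to the bare storage fee.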
For the record, below is my correspondence with Backblaze. The first truly helpful reply, I have to say, also made my mind up about quitting, as it made it sound like they were actively working on the problem, but I found that response posted verbatim in a forum years ago!
Failed inherit (Request #268142)
Sent April 25, 2017 18:06:
Is there anything you can tell me about my account or that I can try before I start from scratch again uploading 5 terabytes of data?
Screen Shot 2017-04-25 at 18.01.34.png (30 KB)
Screen Shot 2017-04-25 at 18.05.52.png (20 KB)
Received April 26, 2017 10:03:
Hello,
Thanks for writing in.
Sometimes Backblaze’s background processes can make an Inherit Backup State fail.
Here’s what I’d like you to do:
- Open the Backblaze system preference / control panel
- Click Settings -> Schedule
- Change the schedule to Backup on Click
- Disconnect all external drives
- Reboot your computer
- Attempt the Inherit Backup State again
- If it’s successful, connect your external drives, open the Backblaze settings again and select the externals you’d like backed up
Regards,
Ryan
Sent April 26, 2017 14:17:
Thanks, but unfortunately that made no difference. I also tried uninstalling and reinstalling. There isn’t a log file somewhere I can look at?
I am disappointed I have somehow had to start from scratch about annually due to the transfer process never working. I excused it when you guys were new, but at this point I have spent a huge amount of time exposed to data loss because I chose to stick with Backblaze for all these years. If I am going to have to upload 5TB and growing every time I make an upgrade or configuration change to my computer (and pay Comcast hundreds of dollars for data overage fees), the very least Backblaze could do is not delete my previous backup until however many months it takes me to upload all the data again! It is also funny I actually have about 12TB of data I would like to back up, but I have never been able to get through my more critical files!
Is there any hope this madness will end? It seems I would be far better off simply running rsync to Amazon Glacier if I have to start over. Then I would never need to “transfer state” again…
Received April 28, 2017 00:19:
Hello,
I think I know what is going on, but you’ll have to help fill in some details. Something has gone slightly haywire on your system that we are seeing in very few customers backups related to the bzfileids.dat file.
There is a very specific file on your disk that is part of Backblaze which has bloated up to be too large. It is called bzfileids.dat and is found here:
Mac: /Library/Backblaze.bzpkg/bzdata/bzbackup/bzfileids.dat
Windows Vista/7: C:\ProgramData\Backblaze\bzdata\bzbackup\bzfileids.dat
Windows XP: C:\Documents and Settings\All users.windows\Application data\backblaze\bzdata\bzbackup\bzfileids.dat
This is a very simple file, it is a mapping from your filenames to a totally unique integer ID that is anonymous that we use to identify your files in the Backblaze datacenter. This means we never know any of your file names, or file contents.
For some reason, your computer wants to backup a few million files and your bzfileids has grown very large (yours is over 1 GB). When bztransmit (the process that runs once per hour) starts up, it reads this bzfileids.dat file into RAM. On a normal machine, this is about 20 MB, but on your machine something has gone haywire and bzfileids has grown far too large.
Now, there are several things that contribute to this file being large, so you can think about how this happened and let us know. We’re trying to understand this situation better:
A) If you have ever renamed a Time Machine folder at the top of your hard drive, Backblaze will bloat up trying to back it up. It is absolutely not supported to “back up a back up” and Backblaze can only function properly backing up the originals.
B) Lots of files. If you knew of a folder with hundreds of thousands of small files that didn’t change much you could back them up differently or exclude them from Backblaze backups.
C) Renaming top level folders with a lot of files. For example, if your top level folder name is /my_music and it contains 100,000 file names in it, then when you rename it /my_great_music Backblaze needs to add all of those filenames to that bzfileids.dat file which bloats it up. So the best thing you can do is keep your enormous folders the same over a long period.
D) Shorter Path Names. It would be best if your hundreds of thousands of files are on a disk called d instead of disk_that_contains_files and the top level folder is called f instead of folder_for_lots_of_files. Etc. The shorter the paths, the smaller the bzfileids.dat file is.
It’s possible to shrink the bzfileids.dat in case they’ve been temporarily bloated by one of the above situations, however it requires reuploading all data to the Backblaze servers. You can follow these steps to do that:
- Visit https://secure.backblaze.com/user_signin.htm and sign in to your Backblaze account with your email address and password.
- Click on the “Preferences” link in the upper left hand corner
- Select your “old” computer from the list of computers.
- Click the “Delete Computer” link next to it. This will delete the backed up data, the bloated bzfileids.dat and free up the paid license.
- Click on “Overview”
- Click the download link for your operating system in the bottom right corner.
- Install Backblaze.
Here’s the problem. If you cannot reduce the number of files or path names significantly, then you absolutely are going to encounter this issue again. We are working on a fix for it, but it is proving to be very difficult and currently our engineering team has no ETA on a fix. If you uninstall and reinstall without changing anything, then your backup might start working again, but you will reencounter this problem a little ways down the road.
Regards,
Ryan
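To get a feel for why points B through D in the reply above matter, here is a rough back-of-the-envelope sketch. The reply only says that bzfileids.dat maps filenames to integer IDs, so everything else is my own guess: the per-entry overhead, the idea that a rename leaves the old entries in place and adds new ones, and the function name itself are all hypothetical, not Backblaze's documented format.

```python
def estimate_index_bytes(num_files, avg_path_len, renames=0, per_entry_overhead=20):
    """Rough size in bytes of a file that maps full paths to integer IDs.

    Assumptions (guesses, not Backblaze's actual format):
    - one entry per path ever seen, so each top-level rename adds a fresh
      entry for every file under the renamed folder
    - each entry costs the path length plus a fixed overhead for the ID
      and separators
    """
    entries = num_files * (1 + renames)
    return entries * (avg_path_len + per_entry_overhead)


scenarios = [
    ("100k files, short paths", 100_000, 40, 0),
    ("3M files, long paths", 3_000_000, 150, 0),
    ("3M files, long paths, renamed twice", 3_000_000, 150, 2),
]
for label, files, path_len, renames in scenarios:
    size_mb = estimate_index_bytes(files, path_len, renames) / 1_000_000
    print(f"{label}: ~{size_mb:,.0f} MB")
```

Under these guessed numbers, a few million files with long paths already lands in the hundreds of megabytes, and a couple of top-level renames pushes the total past a gigabyte, which is at least consistent with what the support reply describes.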
Failed inherit (Request #270170)
Sent May 04, 2017 15:31:
This is a follow-up to your previous request #268142 “Failed inherit”
Thank you for the more detailed information. It was very helpful. Unfortunately I have determined that the number of files I need to back up, along with other factors, precludes me from continuing to use Backblaze. I wish I had known about this before I paid for 2 years a few months ago. Is there a way I can get a refund?
Thank you
Received May 05, 2017 23:28:
Hi Charlie,
Normally we do not issue refunds outside of our 30 day policy. However, I’m prepared to offer a prorated refund in this case. In order to do so, the backup and license will need to be deleted using these steps:
- Visit https://secure.backblaze.com/user_signin.htm and sign in to your Backblaze account with your email address and password.
- Click on the “Preferences” link on the left hand navigation.
- Locate the computer in the list of computers (there may only be one).
- Click the “Delete Computer” link next to it and proceed with the deletion. It will remove that computer’s backup.
- Click on “Overview” link in the upper left hand navigation.
- In the Unused License area, click the Delete link and proceed with the deletion.
Please respond to this email when you’ve deleted that computer and license so we can process the prorated refund.
Regards,
Ryan
Sent May 10, 2017 18:11:
Hello. I have completed the steps. I really appreciate the refund. If you guys need any more information or make changes that could help use cases like mine, feel free to reach out. Thank you!
Received May 11, 2017 16:05:
Hi Charlie,
The prorated refund of $72.72 has been processed, and should reflect on your bank statement within 3-4 business days. Let us know if you have any further questions.
Regards,
Ryan
In this article I share a bit of what I’ve learned in putting together a backup and data synchronization system for myself and my family. My goal is simple enough to state generally: I want to make sure all of my notes, documents, photos and videos are backed up and available from anywhere. Diving into the details of this goal is where things get complex. Happily, I think the end result is simple enough for others to emulate.
Gathering the requirements
Teasing apart my goal a bit more, I refined it to the following requirements:
- Every document from every device should be backed up to redundant cloud storage. Losing a device should not mean losing data, nor should losing all of my devices at the same time.
- I’d like immediate access to some documents from all devices with transparent synchronization. Some documents, like in-progress notes, drafts, and to-do lists, need to be up to date at all times on all devices.
- Photos are a bit special. I’d like the ability to view and search through every photo and video in my entire library from desktop and mobile devices. In addition, many photos need to be shared with others.
- I’d like for my personal data to be encrypted at rest and in transport.
- I’m lazy and forgetful: Whatever system I put together should be easy to maintain and mostly invisible.
- I need to store about 100GB of documents and 500GB of media.
Media is special
The special case of photos caught my attention, and I thought a bit more about how I take, organize, access, and share pictures. This led to a few more requirements:
- An easy, consistent workflow is important to me. Everything captured from either my phone or DSLR should end up in the same places.
- Edits to photos and metadata on any platform should be synchronized and backed up.
- As mentioned above, easily sharing photos with friends and family is important to me, but in general my pictures are private.
- I favor searching over ongoing organization. Searching for places, dates, people, and tags should be easy to do.
- I usually take pictures in RAW format, and edit to produce variants of the original. Having a photo system that understands this is important to me.
Finally, I am our family’s IT administrator. Some of this project will support their needs, so the solutions need to be relatively simple and inexpensive.
Considering the options
Now that I understand the problem I’m trying to solve, it’s time to consider the tools available to construct a solution.
Assets
I have some assets already available that could potentially form part of a solution:
- A Mac mini running at my home that is always on.
- An older but perfectly good Drobo NAS at home.
- An Amazon Prime account.
- An offsite, small virtual private server at buyvm.net.
(spoiler: my end solution uses only the first of these).
Services
For keeping a set of folders of data in sync across all of my devices, I looked at Google Drive, Dropbox, and Syncthing. For backup software and services, I evaluated Backblaze, Carbonite, and Arq Backup using Backblaze B2 cloud storage. Finally, I looked at various photo services, including Flickr, Google Photos, Smugmug, and Amazon Prime Photos.
Data Synchronization
Both Google Drive and Dropbox are easy to set up and use. I already use Dropbox for work, so adding it for my personal documents is trivial. The same can generally be said for Google Drive. Both of these services also have excellent mobile applications for accessing the files. However, both fail my encrypted-at-rest requirement. There are tools available to build encrypted filesystems on top of these services, but they complicate the final product. In the end, I chose to use Syncthing and not store my documents in a cloud (though they are backed up to the cloud). Syncthing lets me control the storage – I use my Mac mini server as the “cloud,” and its contents are backed up continuously to a cloud service. Syncthing can be a bit tricky to set up and manage – certainly more so than Dropbox or Google Drive. If it proves too complex, and encrypted storage is still important, I would look to SpiderOak.
Backups
Backblaze’s backup solution seemed great at first analysis, but the costs become prohibitive when scaled to my entire family. However, Backblaze does offer a service called “B2” which is similar to but cheaper than S3. I selected Arq Backup with Backblaze B2 as the offsite, geo-redundant storage. B2 is cheap, fast, and easy to use, and Arq is cheap enough, lightweight, and easy to configure and manage on all my systems. Most importantly, it is easy to manage on all of my family computers, and can be configured to be bandwidth-friendly.
My annual backup bill for offsite backup of all of my data went from $120/yr with CrashPlan to less than $50 with Backblaze B2.
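As a sanity check on that figure, here is a minimal sketch of the storage arithmetic for the roughly 100GB of documents and 500GB of media from my requirements. The per-gigabyte rate is an assumption on my part (I am using $0.005 per GB per month as an example rate), so check the current B2 price list before relying on it. The sketch also ignores download and transaction fees and the Arq license.

```python
# Assumed example rate in USD per GB per month; not taken from this article.
ASSUMED_PRICE_PER_GB_MONTH = 0.005


def annual_storage_cost(stored_gb, price_per_gb_month):
    """Yearly cost of keeping `stored_gb` gigabytes stored at a flat monthly rate."""
    return stored_gb * price_per_gb_month * 12


documents_gb = 100
media_gb = 500
total_gb = documents_gb + media_gb

cost = annual_storage_cost(total_gb, ASSUMED_PRICE_PER_GB_MONTH)
print(f"{total_gb} GB at ${ASSUMED_PRICE_PER_GB_MONTH}/GB/month: ${cost:.2f} per year")
```

At that assumed rate the storage alone comes in comfortably under the $50 mark, which leaves some headroom for growth in the photo library.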
Media
All of my media is stored on my Drobo NAS at home, managed by Mylio, which I like, but don’t love, as a photo management application. The photos and metadata are completely backed up to Backblaze B2 as described above. Mylio is installed on all of my devices (including my phone), and keeps the various systems in sync – all photos end up in original quality on my server, and are accessible from my laptop and phone as needed.
Mylio is adequate for face detection and geolocation cataloging, but Google Photos is just too impressive to ignore. I’m using the Google Photos Backup tool from my Mac mini server to upload all of my photos to a private Google account. This violates my encrypted-at-rest requirement, but I can live with that for now. Using Google Photos, I get amazing search, and good limited sharing.
Putting it all together
My daughter was kind enough to draw a diagram of the whole system:
Mylio takes care of photo syncing, and Syncthing handles all other documents. Arq+Backblaze B2 handle offsite backups, while Google Photos enables media sharing and searching. I’m not sure what role the duck plays, but it seems to be important.