Despite the importance and complexity of the enterprise cloud data archival space, there is a dearth of simple, clear and like-for-like comparisons of the major offerings (Amazon Glacier vs Google Nearline vs Azure Storage), particularly where costs are concerned.
Before exploring any of these solutions as a candidate technology for your own scenario, it is paramount to have an accurate understanding of the potential costs of working with each provider.
Before we get into the detail (and, as ever, I will keep this as succinct as possible and add nothing to this comparison that isn’t essential), it is important to understand what need these solutions can effectively and efficiently meet, and thus what they are not designed to do!
Amazon Glacier (pricing), Google Nearline (pricing) and Azure Storage (pricing) are primarily designed to provide cloud-based enterprise data archival (hence the imaginative title of this post!). All three are specifically designed to keep the cost of reliable long-term data storage as low as possible. They offer great value for money when storing information in the cloud, but this comes at a cost. The storage model they offer is asymmetric: putting information into these “stores” is cheap, but retrieving it is expensive. In fact it can get VERY expensive, VERY quickly.
For most organisations these services will be most effective as a “backup tape replacement” system. They are certainly not designed to be alternatives to file servers or NAS. They must be approached as a “store everything, retrieve nothing” (or at least “retrieve very little”!) service. This is obviously ideal when we consider the role that tape backups serve.
So now we’ve cleared that up, let’s look at costs. I’ll come back to differentiating these services, as there are differences between them, but suffice to say that when used as the final / near-final tier of a backup, one can assume they provide an equivalent service.
Amazon Glacier vs Google Nearline vs Azure Storage – Cost Comparison
So we can see that, at the time of writing (February 2016), Amazon Glacier certainly represents the best value for money as a cloud based data archival / “tape alternative” solution.
There are many nuances to the performance specifics of each solution, for instance the fact that Glacier has a 3-hour minimum wait before you can start downloading a file, or that Google Nearline’s bandwidth provision during download is limited to 4 MB/s per TB you store. Each of these is potentially a show stopper, but only in certain specific scenarios. If your main purpose is to back up systems / data in a secure way, and you can accept both a large cost and a significant delay when you need to retrieve this data, then these technical / architectural “details” should not pose too great a problem for you.
As alluded to earlier it is VERY important to consider download / transfer costs. All the services provide FREE upload into their cloud storage, but they all charge large sums to download data back out again. Downloading individual files / small archives is free / inexpensive, but, as can be seen in the table above, once you get into TB (Terabyte) territory costs start climbing quickly. Be very careful with your planning in this regard!
The bandwidth constraints of each system must also, as mentioned above, be factored in. Basically Amazon works very well for smaller files, whilst Google excels for huge files, provided you store a lot with them. Downloading a 100GB file from Nearline will take you around 7 hours if you store a Terabyte with them (and longer still if you store less), whereas it could be done in less than 6 minutes with Amazon! But if you’re storing a Petabyte, then getting that same 100GB file would take just 25 seconds with Google, against the same 6 minutes with Amazon!
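For anyone who wants to run the Nearline numbers for their own storage footprint, here is a minimal Python sketch of the arithmetic. It assumes the 4 MB/s per TB stored figure quoted in this post (February 2016, so check current terms), decimal units (1 GB = 1000 MB, 1 PB = 1000 TB), and that throughput scales linearly with the amount stored. The function name is my own and not part of any Google tooling.

```python
# Rough Nearline download-time estimate.
# Assumption: throughput is 4 MB/s for every TB stored, as quoted in this post.
NEARLINE_MB_PER_SEC_PER_TB = 4.0


def nearline_download_seconds(file_gb, stored_tb):
    """Estimate seconds needed to download file_gb gigabytes
    when stored_tb terabytes are held in Nearline (decimal units)."""
    if file_gb < 0 or stored_tb <= 0:
        raise ValueError("file_gb must be >= 0 and stored_tb must be > 0")
    throughput_mb_per_sec = NEARLINE_MB_PER_SEC_PER_TB * stored_tb
    return (file_gb * 1000) / throughput_mb_per_sec


if __name__ == "__main__":
    # The two scenarios discussed above: 1 TB stored and 1 PB (1000 TB) stored.
    for stored_tb in (1, 1000):
        seconds = nearline_download_seconds(100, stored_tb)
        print(f"{stored_tb} TB stored: {seconds:.0f} s ({seconds / 3600:.2f} h)")
```

With 1 TB stored this gives 25,000 seconds (just under 7 hours) for a 100GB file, and with a Petabyte stored it gives 25 seconds, which matches the figures above. The Amazon figure is not modelled here, as it does not depend on how much you store.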
I hope to have more time shortly to add a much more in-depth comparison of functionality, performance, bandwidth and small print. Until then I’ll have to leave due diligence to you!