Data backup strategy: Putting a copy of your company’s backup data in the cloud
Protecting your company’s data has always been of the utmost importance. Data loss or unavailability can put you out of business, especially as volumes grow, regulations change, or disaster strikes. That’s why for almost as long as there’s been data, there have been data backups.
Many elite organizations use a three-copy strategy to protect their data. The first copy is the primary data itself. The second copy is a backup of the primary data (normally on site in the same data center as the primary data so it can be restored at the highest possible speed). The third copy is where things get interesting.
It used to be that the third copy – a copy of the data backup – would be made to another device in another location. Now companies are putting this copy in the cloud. This makes sense. However, the reality is, you’ve likely been going about this step inefficiently. I’m here to show you why that is and what you should be doing to improve the process.
Stop treating primary and backup data the same
Your company’s primary data continues to grow. Ordinarily that means more time to back it up and higher operating expenses (opex) and capital expenses (capex). However, backup data is fundamentally different from primary data, even if it’s not always treated that way.
Most of the corporate data in daily use has not changed in the last 24 hours, and much of it has not changed in months. Only a small fraction of Wednesday’s data is new compared with Tuesday’s, or even Monday’s.
Instead of backing up everything, the secret to successful disk backup in a world of ever-growing primary data is to just copy what’s changed since the previous backup.
If your company insists on a full backup every day, the combination of deduplication, compression, and infrequently changing primary data will come to your rescue. Taken together, these can reduce the size of a full backup by a factor of 20 to 100. The higher the ratio, the less space it takes to hold a full backup, and the quicker the backup can be completed.
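To put those ratios in concrete terms, here is a minimal arithmetic sketch. The 100TB full backup size and the function name `stored_size_tb` are assumptions chosen for illustration; only the 20:1 to 100:1 range comes from the text above.

```python
def stored_size_tb(full_backup_tb, reduction_ratio):
    """Space needed after deduplication and compression.

    reduction_ratio is the N in an N:1 reduction (for example 20 for 20:1).
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return full_backup_tb / reduction_ratio


FULL_BACKUP_TB = 100.0  # assumed size of one full backup, for illustration only

for ratio in (20, 50, 100):
    size = stored_size_tb(FULL_BACKUP_TB, ratio)
    print(f"{ratio}:1 reduction -> {size:.1f} TB stored")
```

Under these assumptions, the same full backup occupies somewhere between 5TB and 1TB, which is why the ratio matters so much for both cost and backup time.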
Saving space, reducing costs
Consider this: If only 1 percent of your 100TB of primary data changes each day, you only need to back up 1TB. The other 99TB, which were previously backed up, have not changed and are handled almost instantly through software pointers to the previously backed-up data.
The remaining 1TB likely contains common data that you already backed up.
Take this blog post for example: I’ve used the word “the” almost a hundred times. In theory, the word could be low-level deduped at a ratio of 100:1. Once I finish writing this post, the word “the” will never change. Therefore, it makes no sense to back it up every day for the next few years, when a simple pointer to the original will suffice.
While the data is not all the same, this goes to show that 1TB of new data could be deduped and compressed to a fraction of its original size and affordably stored in the cloud.
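The pointer idea can be sketched in a few lines. This is a toy illustration only: it deduplicates whole words, whereas real backup products typically work on blocks of bytes, and the names `dedupe` and `restore` are hypothetical.

```python
def dedupe(tokens):
    """Store each distinct token once and describe the stream as pointers."""
    store = []      # unique tokens, in first-seen order
    index = {}      # token -> position in store
    pointers = []   # one pointer per original token
    for tok in tokens:
        if tok not in index:
            index[tok] = len(store)
            store.append(tok)
        pointers.append(index[tok])
    return store, pointers


def restore(store, pointers):
    """Rebuild the original token stream by following the pointers."""
    return [store[p] for p in pointers]


text = "the cat sat on the mat and the dog sat by the door"
tokens = text.split()
store, pointers = dedupe(tokens)

print(f"{len(tokens)} tokens in, {len(store)} stored once")
print("round trip ok:", restore(store, pointers) == tokens)
```

Every repeat of “the” costs one small pointer instead of another copy of the word, and the original can still be rebuilt exactly.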
Solving the ‘backup window’ problem
Normally, a company’s backup window – the amount of time to back up hundreds of terabytes of primary data – would be challenging since all the data would need to be read, sent over a wire, and then written somewhere else (and probably a second time just in case). But there is a solution to this issue: backup software.
Backup software solves the backup window problem by determining what data has changed since the last backup. You can also use primary storage snapshots, which capture the changes since the last backup. At that point, it’s easy to back up the changed data to another medium.
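As a rough sketch of what “determining what changed” means, the example below compares today’s files against a manifest of fingerprints recorded at the previous backup. The file names, contents, and the function `changed_since_last_backup` are all hypothetical. Note that hashing every file still means reading every file, so this shows the logic rather than the speed-up; production tools usually avoid the full read by relying on timestamps, change journals, or snapshot differences.

```python
import hashlib


def fingerprint(data):
    """Content fingerprint used to tell whether a file has changed."""
    return hashlib.sha256(data).hexdigest()


def changed_since_last_backup(current, manifest):
    """Return paths that are new, or whose content differs from the manifest.

    current:  dict of path -> bytes (today's primary data)
    manifest: dict of path -> fingerprint recorded at the previous backup
    """
    return sorted(
        path for path, data in current.items()
        if manifest.get(path) != fingerprint(data)
    )


# Tuesday's data and the manifest written when it was backed up (made-up files).
tuesday = {"a.txt": b"alpha", "b.txt": b"bravo", "c.txt": b"charlie"}
manifest = {path: fingerprint(data) for path, data in tuesday.items()}

# Wednesday: one file edited, one file added, two untouched.
wednesday = {
    "a.txt": b"alpha",
    "b.txt": b"bravo v2",
    "c.txt": b"charlie",
    "d.txt": b"delta",
}

to_back_up = changed_since_last_backup(wednesday, manifest)
print(to_back_up)
```

Only the edited and the new file need to travel over the wire; the two unchanged files are skipped.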
What about restoring data?
It turns out that a high percentage of data restoration involves data that was recently backed up. That’s why a “hybrid” approach to backups is so popular.
You keep a recent portion of your company’s backup data on premises in a secondary disk system and retain a full backup copy in the cloud. This approach means you can restore something that was backed up yesterday or pull a full image from the cloud in the wake of a disaster. It also addresses the long-term costs of trying to keep everything on premises.
Why you should put the third copy in the cloud
There are obvious benefits to secondary cloud backup. First and foremost, it can lower your company’s costs. The cloud doesn’t use your hardware or your location, so it takes little or no development on your part and spares you the installation, infrastructure, and maintenance costs of running another site.
Furthermore, secondary cloud backup can grow as large as your business needs and as quickly as it needs it.
Considering this approach?
The cloud can help smooth out processing peaks caused by busy seasonal demand, lower capex and opex, and reduce the risk that a storm could knock out your operations.
This simple strategy directly addresses the backup window business challenge, the frequency of restores, and the long-term needs of the business:
- Use daily incremental backups to capture primary data changes, or leverage deduplication if full backups of primary data are required.
- Keep only the most recent backups on premises for rapid recovery – perhaps a month’s worth.
- In parallel, place a full set of backups in the cloud.
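The placement rule in the list above can be sketched as follows. The 30-day on-premises window is an assumption standing in for “a month’s worth”, and the dates, the 90-day history, and the function `place_backups` are hypothetical.

```python
from datetime import date, timedelta

ON_PREM_DAYS = 30  # assumption: "a month's worth" of recent backups


def place_backups(backup_dates, today, on_prem_days=ON_PREM_DAYS):
    """Decide which backups stay on premises and which go to the cloud.

    Recent backups are kept on premises for fast restores; the cloud
    holds the full set, including the recent ones.
    """
    cutoff = today - timedelta(days=on_prem_days)
    on_prem = [d for d in backup_dates if d > cutoff]
    cloud = list(backup_dates)
    return on_prem, cloud


today = date(2024, 3, 31)                                   # made-up date
backups = [today - timedelta(days=n) for n in range(90)]    # 90 daily backups

on_prem, cloud = place_backups(backups, today)
print(f"on premises: {len(on_prem)} backups, oldest {min(on_prem)}")
print(f"in cloud:    {len(cloud)} backups, oldest {min(cloud)}")
```

A restore of yesterday’s data is served from the on-premises set, while anything older, or a full recovery after a disaster, comes from the cloud copy.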
If you’re in the process of rethinking your data backup strategy, we can help. Contact your SHI account executive to talk through your requirements.