Do you have a DR Plan?
The concept of “DR” (Disaster Recovery) means different things to different people. Solutions currently in place range from “blissfully unaware of the risks” to a fully redundant infrastructure and Business Continuity Plan (BCP), including back-office systems and office space. The advent of Cloud services has more people thinking about DR. Which solution is best for you? Only you can answer that question. Choosing the best DR strategy for your company means finding the right balance between cost and risk. The following are some key considerations:
Can you afford for your office or website to be offline for more than a few days? A few hours? A few minutes?
How much income will you lose if your customers can’t reach your website, or can’t reach you? Is your business highly transactional? Does the prospect of being cut off from your network or systems for extended periods of time raise red flags?
Be realistic. Ultimately, how much your DR plan costs depends on how much downtime you can tolerate: less downtime costs more because it is more complex to achieve. A good rule of thumb is that a solution limiting downtime to a few hours costs a lot more to deploy than one that allows a few days, and a solution limiting downtime to a few minutes (or none at all) costs a lot more again.
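To put rough numbers on the questions above, the lost income for each outage length can be estimated with simple arithmetic. The following is a minimal sketch only: the hourly revenue figure and the outage durations are hypothetical placeholders, and the calculation ignores reputational damage, contractual penalties, and recovery effort.

```python
def downtime_cost(revenue_per_hour: float, outage_hours: float) -> float:
    """Estimate revenue lost during an outage of the given length."""
    return revenue_per_hour * outage_hours


# Hypothetical assumption: the business earns 2,000 per hour while online.
REVENUE_PER_HOUR = 2_000.0

# Hypothetical outage lengths, in hours, for the three cases in the text.
scenarios = {"a few minutes": 0.25, "a few hours": 4.0, "a few days": 72.0}

losses = {
    label: downtime_cost(REVENUE_PER_HOUR, hours)
    for label, hours in scenarios.items()
}

for label, loss in losses.items():
    print(f"{label}: about {loss:,.0f} in lost revenue")
```

Comparing these figures with the quoted cost of each DR option gives a first, rough view of where the balance between cost and risk sits for your business.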
What happens to your data if the systems holding that data are completely (irretrievably) lost?
Do you have a backup of that data? How current is the backup? A few minutes, a week, or a month old? What are the implications? Have you just lost records of multiple payments (or payments due)?
If you do have a usable copy of your data, what happens now? Is it in a format you can use? Do you have access to systems with the necessary network connectivity to resume operations?
If you are archiving data to an offsite location, you may have preserved the bulk of your data, but it is only valuable if you can actually use it. How quickly can you recreate the necessary systems using this data?
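One part of that question can be estimated up front: how long it takes simply to copy the archived data back. The sketch below assumes a sustained transfer rate and decimal units (1 GB = 8,000 megabits). The data volume and link speed are hypothetical, and the result is a lower bound, since rebuilding servers and validating the restored data take additional time.

```python
def restore_hours(data_gb: float, throughput_mbps: float) -> float:
    """Hours needed to transfer data_gb gigabytes at throughput_mbps megabits/second."""
    if throughput_mbps <= 0:
        raise ValueError("throughput must be positive")
    megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    seconds = megabits / throughput_mbps
    return seconds / 3600


# Hypothetical example: 450 GB of archived data over a sustained 100 Mbps link.
hours = restore_hours(450, 100)
print(f"Transfer alone takes roughly {hours:.1f} hours")
```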
Do your systems or applications support HA (High Availability), or distribution across multiple geographies?
Are you using legacy systems? Will your data be compromised, or can it be accessed if it is loaded in an alternative location? Do you have the correct licensing model to support your DR strategy?
What’s the best technology or methodology to support your DR strategy?
This includes what applications you will use at your production site, the most efficient hardware platforms to run these applications, and the technology used to replicate your applications to an alternative facility on an ongoing basis. Some of these replication methods are native to your applications and some are not.
Once you get past the above questions, you then move into more specific territory. For example:
- What is your RPO (Recovery Point Objective)?
- What is your RTO (Recovery Time Objective)?
Your RPO is the answer to the question “how much data can you afford to lose?” Your RTO is the answer to the question “how long can you afford to be offline?” Again, be realistic to avoid multiple returns to the drawing board.
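Once both objectives are written down as durations, checking them becomes a simple comparison. The sketch below is illustrative only: the function names, timestamps, and targets are hypothetical, and in practice the last-backup time and the estimated recovery time would come from your own backup tooling and DR tests.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup: datetime, now: datetime, rpo: timedelta) -> bool:
    """True if the data lost by failing right now would be within the RPO."""
    return now - last_backup <= rpo


def meets_rto(estimated_recovery: timedelta, rto: timedelta) -> bool:
    """True if the estimated time to resume operations is within the RTO."""
    return estimated_recovery <= rto


# Hypothetical targets: lose at most 1 hour of data, be back within 4 hours.
rpo = timedelta(hours=1)
rto = timedelta(hours=4)

# Hypothetical observations: last backup 90 minutes ago, recovery takes 3 hours.
now = datetime(2024, 1, 15, 12, 0)
last_backup = datetime(2024, 1, 15, 10, 30)
estimated_recovery = timedelta(hours=3)

print("RPO met:", meets_rpo(last_backup, now, rpo))
print("RTO met:", meets_rto(estimated_recovery, rto))
```

In this example the backup is older than the one-hour RPO, so either the backup frequency or the objective would need to change.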
How will you replicate your data?
Once your RPO and RTO are established, there are multiple approaches to replicating your data between two locations, each with its own pros and cons. You may decide that you want to utilise both sets of infrastructure in an active-active configuration, with both sites serving production traffic at the same time; however, this is usually a very complex and expensive undertaking. Active-passive, where the second site stands by until it is needed, tends to be the least complex approach, but it still requires careful consideration.
I’ve got a DR site. What now?
The technology required is only part of the solution. You must also devise the processes that will be initiated when you need to use your DR solution. Every organisation should have its own BCP consisting of unique recovery procedures, critical resource information, and procedures for staff to follow during an outage. A BCP identifies vulnerabilities and recommends the measures necessary to prevent extended service outages, right down to where (or if) your staff should show up for work the next day.
Some example objectives of your BCP:
- Identify key contacts and recovery team members
- Identify the location of critical data and systems
- Identify suppliers and customers that must be notified in the event of a disaster
- Identify alternate work locations and sources for supplies
- Document storage, safeguarding, and retrieval procedures for vital records
- Document recovery procedures for specific events
To avoid confusion during a crisis, you should periodically test and review your BCP.
The above is an overview. If you are in doubt and are looking for advice on how to implement DR or BCP, consult the Enterprise Services team at NTT ICT. Ask NTT ICT these questions before your customers ask you.