Disaster Recovery for FMs
Do you have a disaster recovery and business continuity plan for your workplace? To help you understand the importance of planning for emergencies and how to ensure your workplace is prepared for the worst, we sat down with an expert in disaster recovery and business continuity.
Paul Sullivan, SVP Product and Operations at Agility Recovery, has over 25 years of experience in the disaster recovery and information technology industries. Prior to joining Agility, Paul was Senior VP of Product Development, Marketing, and Sales with Comdisco Continuity Services and General Manager for Business Resilience and Continuity Services with IBM. We asked Paul to tell us more about his career trajectory and how he ended up in his current role.
PS: I’ve been in this business for just over 25 years. I started with a company called Comdisco, which was really at the forefront of looking at both disaster recovery initially (understanding how to help customers recover their data center and their data) and addressing people's needs: how do you actually recover the workplace? I worked for Comdisco up until 2002. I actually went through 9/11 with this company and we supported all of our customers through that event, obviously the worst event that we had seen at that particular time, and very heart-wrenching as well. Unfortunately, the owner of Comdisco passed away early from cancer. Ultimately the company was facing bankruptcy, which was the precipitating event that led to my decision to move on.
I moved on to IBM’s Business Continuity Services in Canada – ran that for three years. I had some fun understanding how the bigger companies’ processes work and how you don’t get to move very quickly. From there I went to Agility Recovery, and I’ve been here for the last 12 years. The reason I moved to Agility was that they were addressing the needs of the businesses, but also the key there was bringing recovery home. So, you bring a modular facility into the parking lot of where your business is, where you may have had a fire, a flood, anything that has affected that building, and all you’re asking the people to do is to keep the same routine. People don’t want to travel in a time of disaster, especially in a regional event, and so they can then come to the same location, but just go into a different facility in order to operate that business.
PS: There are people who talk about disaster recovery and there are people who talk about business continuity, and when we look at disaster recovery, we’re typically talking about the IT infrastructure: how you’re backing your data up, how you restore that data. When we get into business continuity, we’re starting to look at people. People are the core of every business; you need them to run your business. It's important to have IT backed up, and most businesses have been doing IT backups and ensuring the data’s protected since the early ages, I mean 30-40 years. The area that nobody seems to address when you get into business continuity is understanding what are your critical processes, what are your critical staff needs, where are they going to operate from? And that’s how you bring the two of them together – how do they connect back to the data? And on top of that, how are they going to communicate?
There are a couple of different things that you want to look at for communication. How are you going to communicate to your employees? You’ve got to be able to tell them what’s occurred, what you have to do, and then on the other side of it, your customers need to be able to get a hold of you. How are they going to get a hold of you, how are you going to make them aware of what the situation is?
We asked Paul to describe a typical day in his role, for someone who doesn’t know the ins and outs of disaster recovery and business continuity.
PS: Well, there are different days and different roles, but a typical day is supporting our customers’ testing activity. That’s probably the major thing that occurs day in and day out. We have people coming into our facilities to test their program and it’s very important that people do test their program, or exercise their program (I hate to use the word “test” because it connotes pass or fail). If they exercise the program, they’re going to learn from that and then they’re going to be able to continually exercise and update their program. That exercise, or testing, of the program equates to your success at Time of Disaster.
We did a case study on two insurance agencies, about the same size, that were affected by a regional event. One customer was able to get up in just over two days. They had an effective program in place, exercised it every year, kept it up to date. The other insurance agency, while they had our services, had never tested the program, had never exercised it, and it took them five days to actually get back up and operational. The interesting part was that the insurance firm that was up in two days actually gained market share – I think they indicated about a 30% market share and were able to retain about 20% of that after the event. So, certainly for them, it was a valuable program to have in place because it ensured that they could serve their customers, but they got a bonus out of it and were actually able to gain business.
So, our typical day is supporting customers’ testing. The exception, of course, is when there is a disaster, and that’s basically where my teams will shine. It’s “all hands on deck” to be able to support those customers in a disaster situation. We support, on average, between 80 and 100 disasters each year. The biggest one that occurred was Superstorm Sandy, and in that event alone we supported 109 customers. That year was an exceptional year; we had well over 200 events that occurred that year.
Paul has been instrumental in the development of Agility's ReadySuite solution – a disaster recovery solution that, for a small monthly fee, helps small-to-medium-sized organizations plan and prepare for any disaster.
PS: ReadySuite really came about in around late 2004, early 2005. What we were doing at the time, being a former GE company, was servicing a lot of enterprise, Fortune 500 companies and really customizing solutions. But there was a market out there in the SMB arena that was not being served. Typically, if a small-to-medium business wanted to look at any kind of a business continuity capability, they were getting some sort of customized solution that was really designed for the enterprise, and also priced that way. So, we looked at addressing that market – and that market is fairly large, you’re talking about a million businesses out there. We took a look at it and said, why don't we create a solution based on our capabilities that provides all of the elements of recovery for somebody to have a place to go, that is self-contained, using satellite so they are not dependent on any land-based communication services, and then let’s provide them with a tool, which we call myAgility (a templated continuity plan and an alert notification system that are key to the customer and building their capabilities), so that they also have some tools to be able to build a plan and be able to alert their customers – and let’s price it effectively for that SMB marketplace.
Further to that, while we packaged the solution and were able to get some economies of scale and pricing for the customers, we also made it very flexible at Time of Disaster. You can look at it as a very à la carte solution. If all you needed from us was technology because you had a server failure, then we’ll ship you a server. If you had a network outage, then we will send you a network capability and get your internet back up and operational – and by the way, that seems to be more and more of a trend these days, we’re seeing a lot more communication failures. Then, let’s say you’re a small insurance agency and you’ve got a little stand-alone building, we can ship you a generator in a power outage in order to get your building up and operational. Power is the biggest, number one reason that customers declare with us. Of the disasters we support, depending on the year, somewhere between 40 and 60 percent are based on some sort of a power event that’s occurred.
We’ve now expanded that solution from ReadySuite to something called Ready Complete. One of the things that our customers were indicating to us is that they need to be able to get up and operational very quickly and they’ve got a small set of critical staff that they need. So we’ve created a Ready Complete program that provides the customer not only seats that they can get in 24 hours or less, but also a mobile capability where they can transition into the mobile long term at their location, and then, of course, the planning tools and everything that goes with that as well. That was one of the keys that was missing there: customers were looking for a quicker time to get up and operational. That comes from many different reasons, from demands of the business and SLAs with their customers to the need to be able to respond to different business issues within 24 hours. Typically it’s for a small, critical number of staff.
We asked Paul to share the most challenging aspect of Agility’s work.
PS: The most challenging part that we will always have is during any type of regional disaster event, and how we’re going to be able to support the customer, because we don’t know what we’re going into. If we’re supporting a tornado or a hurricane, are roads open? Can we get there? The other challenge, my staff’s challenge, is that they need to be the calming force at Time of Disaster. Organizations and people are not thinking rationally because their business has been affected, their families may have been affected, and we’re there to make sure that they understand that we’re going to take care of getting that business back up. They should be focused on the other things that are important to them, but we’re going to make sure that the business is back up and operational. That’s our biggest challenge, to make sure that we get there in a timely manner and are able to get them up and operational as quickly as possible.
On the flip side, Paul tells us his favorite part is how rewarding their work at Agility is.
PS: My favorite part is talking to the customers after an event and they’ve been able to restore their business. To just see their eyes light up, the smile that they’re back up and operational. That’s their livelihood and they want to feel comfortable, they want to know that the company they’re working for has a business continuity plan in place so that the business is going to survive any kind of an event. It’s just great to be able to see that after any kind of a major event. It’s amazing the smiles that you see on people's faces.
Given Paul’s close to three decades of experience in disaster recovery, we asked him what major changes he has observed in the industry over the years.
PS: I guess a couple of things. From an IT perspective, the major change we’ve seen, although some people will tell you that it’s always been there, just never in the form that people talk about, is the advent of “the cloud”. The challenge most IT organizations had was being able to recover data and restore it in a timely manner. If you were recovering it by traditional methods, from tape backup or from disk, it took a long time. Then if you wanted to have it even more readily available, you had to put a lot of cost into the program to replicate that data. So, the advent of the cloud, I think, has been the biggest change for protecting data, in that it could be your program that not only protects your data on a day-to-day basis, and that’s your production environment, but it’s so easily connected to from a network perspective at Time of Disaster, and there is no such thing as having to restore data because it’s always active. That’s one of the changes in the IT environment.
I think the other side is the realization, more and more, that PEOPLE are the most critical asset (although the number of organizations that still don't have business continuity plans and capabilities in place today is pretty surprising). Stop thinking about just data. People are the other element that is key to restoring your business. The change there is that more and more people are understanding that their staff are not as portable as they think they are. People will not go to another city, or several hundred miles away, to restore your business. So, it’s important that stakeholders, or more businesses, are understanding that they need to put a continuity plan in place that ensures the recovery of that business as close to home as possible, or at the same location. There are all sorts of different options for that.
Also, the work-at-home strategy is a big change because there’s more availability in internet speeds, capabilities, and applications that allow that. While it’s a great complementary-type service, as I said, the biggest issue that everybody is going to have is major power outages. It doesn’t matter what strategy you have, if you don’t have the ability to either have a generator come in and re-power your building or an alternative space, like a modular capability, people are not going to be able to work. I think it was pretty prevalent when you saw pictures of Sandy, and people trying to plug in so that they could communicate. The piece that everybody has to understand is that they need to have a very sound communications capability plan because cellular networks can go down - we saw that happen in Katrina, we saw that happen with Sandy, even the Virginia earthquake that happened a couple of years ago. The issue with the Virginia earthquake was that the telephone companies do not design their network so that everybody that has a cell phone or a desk phone can get on the phone at the same time. They base their networks on average usage and that means that there are ratios and not everybody can get on at the same time. During that earthquake in Virginia, everybody tried to get on a cell phone and you got either a fast busy signal, or you couldn’t get through, and all of a sudden you couldn’t communicate. So, it’s important to have some capabilities in place, whether it’s satellite solutions like we provide or other alternatives, but they need to look at that as well.
With flexible working on the rise, we were curious to ask if remote working has changed Agility’s approach to disaster recovery and business continuity plans.
PS: Absolutely. I think that more and more organizations are seeing that a larger percentage of their work staff may work from home or on a flexible basis. As I said, that works great in your day-to-day environment. The area that you need to think about is what do you do in an event where there is a regional outage, and there is no power. Nobody can work without power. Your phone is dependent on it, and your laptop and communications are also dependent on it. So while that may be your prime vehicle for recovery, you do need a backup to that strategy because there are some holes in it that can occur. It may be that you’re putting a strategy in place that will at least get your critical staff up and operational and in the same environment. The interesting part is, a lot of organizations say, well I can work from home, and I often question them. That might work for a short period of time, but you have all these people in an office for a reason. What are the reasons that your staff are all in a building? If they didn’t need to be there, you wouldn’t need your space, you could all work from home. There are reasons why they’re there. Is it because it’s a call center? Is it because of the collaboration? There’s got to be different reasons and you’ve got to ask yourself why do I have all of these people in an office facility and can I survive long term without that?
Paul tells us the top 3-5 trends that facilities managers should know about in business continuity.
PS: The next 3-5 years are going to be interesting, I think. The whole world of a virtualized environment will change how people look at recovery capabilities. We’re not going to be so dependent on the technology at our desk anymore because virtualization will allow you to connect to your environment from any device at any time. Another trend is the enhancements to the world of emergency notification and locations. Ways to be able to track people to make sure everybody’s okay, giving them apps on their phone that will make it very easy for them to hit a button and say here’s my location and give everybody the GPS coordinates. I think that’s going to be important because you want to be able to track your staff. The only other piece in there is the work-at-home solution, which is great, but then power is the only issue, so how do you address that? Certainly, there are some issues with brick and mortar and that's where you need to look at something that’s more flexible to be able to provide recovery. With Sandy there was the issue of fuel shortages; nobody wanted to drive. Well, having some sort of a mobile workspace that you can bring closer to where the people live might be a nice strategy in the long term as well.