Bare Metal Restore for Virtual Machines in Hyper-V (Part 1)

Published on 13 Jan. 2011 / Last Updated on 13 Jan. 2011

This series discusses a real-life catastrophe that impacted the author’s network.

Introduction

The first article in the series explains the disaster recovery initiatives that were used to recover from the catastrophe. Later in the series, the disaster recovery plan’s shortcomings will be discussed along with some techniques for working around those shortcomings.

A couple of months ago, my network was hit by lightning. Although I have invested heavily in anti-lightning equipment, it was simply no match for a direct hit from a bolt of lightning. The end result was that nearly every computer on my network was destroyed.

As you can imagine, the lightning strike turned out to be one very expensive disaster. Not only did I have to purchase a lot of new network hardware, but I also lost a week of productivity and missed out on several potential sales while I was offline.

As bad as this disaster turned out to be, however, it could have been much worse. Because I have a multi-tiered disaster recovery plan in place, I did not lose any data. Even so, while working through the recovery process, I discovered that some aspects of my disaster recovery plan could have been better. Therefore, my goal is to take a “notes from the field” approach to this article. I want to share my disaster recovery plan with you, and I want to talk about the parts of that plan that needed to be improved. Finally, I will talk about how my disaster recovery plan has changed based on what I know now.

Some Background Information

Before I begin discussing my disaster recovery initiatives, I need to give you a bit of background information on my network. As some of you know, I work as an IT freelancer. The bulk of my income comes from writing IT books and articles like this one, but I also speak at technology conferences, write whitepapers and marketing material for product vendors, and provide consulting services.

Because of the nature of my business, I work out of my home. When I moved in, I converted the entire second floor of my house into a makeshift datacenter, filled with numerous servers. Most of my servers are connected to a lab network that I use when I write about various technologies. However, I also have a production network, which contains some Exchange Servers, a file server, a couple of domain controllers, and a few other infrastructure and development servers.

I am telling you this to illustrate that I have a very unusual setup. Although there are only two users on my network (my wife and myself), the network itself is more like something that you would find in a much larger organization. Because I write and speak about enterprise networking, I decided to use enterprise-class solutions on my own network, even though doing so is probably overkill. After all, how many “home users” do you know who have clustered Exchange mailbox servers on their network?

Even though I am using hardware and software that is intended for a large enterprise, the fact remains that there are only two users on my network. This fact plays into my disaster recovery strategy. As I mentioned earlier, I use a multi-tier disaster recovery strategy. Some of the methods that I use are the same as what you would expect within a large organization. Some of my other methods only work because I have a relatively small network.

My Disaster Recovery Strategy

Now that I have given you some background on my network, I want to talk about the disaster recovery plan that I had in place prior to the lightning strike. My primary mechanism for backing up and restoring the data on my network is a server that is running System Center Data Protection Manager 2007 (DPM 2007). In case you aren’t familiar with DPM 2007, it is Microsoft’s enterprise backup solution. DPM 2007 offers continuous data protection through disk-to-disk-to-tape backups.

In my case, I have DPM 2007 configured to check the protected resources on my network for changes once every fifteen minutes. If changes are detected, then DPM 2007 copies the blocks that have changed to a dedicated disk array, where they are retained for three weeks. My backups are also periodically written to tape.
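The idea of copying only the blocks that have changed, and then discarding copies once they age out of the retention window, can be illustrated with a minimal Python sketch. This is not how DPM 2007 is implemented internally; the hash comparison, the 4 KB `BLOCK_SIZE`, and the function names are all assumptions chosen for illustration. Only the three-week retention period comes from the configuration described above.

```python
import hashlib
from datetime import datetime, timedelta

BLOCK_SIZE = 4096               # hypothetical block size, for illustration only
RETENTION = timedelta(weeks=3)  # retention period described in the article


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: list[str], new: list[str]) -> list[int]:
    """Return indexes of blocks that are new or differ from the previous pass."""
    return [
        i for i, digest in enumerate(new)
        if i >= len(old) or old[i] != digest
    ]


def prune(recovery_points: list[datetime], now: datetime) -> list[datetime]:
    """Keep only the recovery points still inside the retention window."""
    return [p for p in recovery_points if now - p <= RETENTION]


if __name__ == "__main__":
    before = b"A" * 4096 + b"B" * 4096
    after = b"A" * 4096 + b"C" * 4096 + b"D" * 10
    # Block 0 is unchanged, block 1 was modified, block 2 is new.
    print(changed_blocks(block_hashes(before), block_hashes(after)))
```

In this sketch, only the blocks whose indexes are returned by `changed_blocks` would need to be copied on a given pass, which is why a fifteen-minute check interval can be practical even for large volumes.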

Although my DPM 2007 server is not my only line of defense against data loss, I have to say that I got really lucky: my DPM 2007 server and the external storage array that is attached to it were among the few components on my network that were not destroyed by the lightning strike.

My second line of defense against data loss involves protecting my data, but not the network infrastructure. I have to admit that I use some really unorthodox techniques as a part of my second line of defense. These techniques have worked extremely well for me, but would be completely impractical in a larger organization.

I developed my second line of defense because I travel a lot, and I wanted a way to take most of my data with me when I am on the go. Connecting to my network over a VPN was not an option for me until a few months ago, because of where I live.

Rather than copying all of my data to an external hard drive, I found an HP laptop that had a second drive bay and installed a 500 GB hard drive in it. I then used a registry hack to force Windows to use the second drive for offline storage. The end result is that every time I connect my laptop to my network, Windows synchronizes my offline cache with my file server. In doing so, I have created an always up-to-date copy of my data that I can use when I am on the go. More importantly though, this copy can act as an offline replica that I can use for disaster recovery purposes should the need ever arise.

The other part of the second tier of my disaster recovery plan is to protect the data found in my Exchange mailbox. To do so, I have configured Outlook to use offline caching. That way, if my Exchange Server ever fails and I am unable to restore my backup, I can copy the contents of my offline cache to a PST file. When the Exchange Server is eventually rebuilt, the PST file’s contents can be copied back to the Exchange mailboxes. As I said, this approach wouldn’t be practical for use on a large network, but it works well for me.

There are two parts to the third tier of my disaster recovery plan. First, I use cloud storage to protect my file server. Doing so provides me with an offsite replica of my data. That way, if my house is ever wiped out by a hurricane or a fire, I will not have lost all of my data.

The other part of the third tier of my disaster recovery plan is that once every six weeks, I shut down all of my virtual servers and use Hyper-V’s export feature to export the virtual machines to an external hard disk, which I keep locked in a vault. Believe it or not, this seemingly insignificant part of my disaster recovery plan played a huge part in my ability to rebuild my network after the lightning strike. I will tell you why that was the case in Part 2.
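Because this export is a manual, periodic task, it helps to know when the next one is due and how stale the vaulted copy is at any given moment. The following Python sketch covers only that date arithmetic. It does not perform the export itself, and the function names are hypothetical; the six-week interval is the only value taken from the plan described above.

```python
from datetime import date, timedelta

EXPORT_INTERVAL = timedelta(weeks=6)  # interval described in the article


def next_export_due(last_export: date) -> date:
    """Date on which the next export should be performed."""
    return last_export + EXPORT_INTERVAL


def export_is_due(last_export: date, today: date) -> bool:
    """True once the interval since the last export has elapsed."""
    return today >= next_export_due(last_export)


def export_age_days(last_export: date, today: date) -> int:
    """Age of the vaulted copy, in days."""
    return (today - last_export).days


if __name__ == "__main__":
    last = date(2010, 12, 1)   # hypothetical date of the last export
    today = date(2011, 1, 13)
    print(next_export_due(last), export_is_due(last, today), export_age_days(last, today))
```

The age figure matters because an exported virtual machine reflects the server only as of the export date: with a six-week interval, a restored copy can be up to six weeks behind, and anything newer has to come from the other tiers.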

Conclusion

Now that I have explained how I back up my network, I plan to talk about how I was able to rebuild my network after the lightning strike. Later in the series, I will talk about the unexpected shortcomings of my disaster recovery plan, and how those shortcomings have been addressed.
