Solutions for Virtualizing Domain Controllers (Part 5)

by [Published on 23 Nov. 2010 / Last Updated on 23 Nov. 2010]

This article discusses steps that you can take to safeguard your domain controllers in a virtualized environment.


Introduction

So far, my discussion of domain controller placement within a virtual datacenter has revolved primarily around preventing them from being a single point of failure within your domain infrastructure. There are, however, some additional considerations that I have yet to touch on. In this article, I want to talk about some of the steps that are necessary for preserving the integrity of the Active Directory in a virtualized environment.

Right about now, you might be wondering how the simple act of virtualizing a domain controller could threaten the integrity of the Active Directory. There are actually a few different things that could potentially lead to Active Directory corruption if your virtual domain controllers are implemented improperly.

Disk Write Caching

One of the biggest threats to the integrity of the Active Directory is disk write caching. As I'm sure you know, the Active Directory is really nothing more than a database that resides on your domain controllers. If this database becomes corrupted, then the Active Directory as a whole may also become corrupted. Of course, this depends on the nature of the corruption, and on the placement and role of the corrupt domain controller.

Because database corruption has the potential to adversely affect the entire Active Directory, Microsoft has taken some precautions to prevent the Active Directory database from becoming corrupted. One of these steps involves disabling disk write caching on the volume that contains the Active Directory database. This is done automatically when you promote a server to domain controller status.

The reason why Microsoft disables disk write caching on the Active Directory database volume is to prevent potential data loss due to power failures or delayed write failures. Eliminating the disk write cache goes a long way toward preventing the loss of Active Directory database transactions in the event that something should go wrong.

When a domain controller is virtualized, Windows still automatically disables disk write caching for the volume containing the Active Directory database. The problem is that Windows has no idea that it is running within a virtual machine. Although disk write caching might be disabled for the virtual hard disk file, the physical volume that the file resides on is under the control of the parent operating system, and therefore disk write caching may still be enabled.

Whether or not disk write caching is automatically disabled for the physical volume depends on the virtualization software that you are running and on the server's hardware configuration. Things can become particularly interesting if a virtual domain controller is configured to use SCSI emulation mode. If the host server offers full support for SCSI Forced Unit Access (FUA) then you do not have to worry about buffered write operations at the host level. Only unbuffered data will be sent to the physical disk. If FUA is not fully supported though, then you may have to manually disable write caching at the host level.
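To make the idea of an unbuffered write more concrete, here is a minimal Python sketch of an application appending a record to a log and then asking the operating system to flush it to the device. It is only an illustration: the function name durable_append and the file name are made up for this example, and this is not how the Active Directory database engine itself writes data. Note that the flush is a request to the guest operating system; whether the bytes actually reach the physical disk still depends on the host honoring it, which is exactly the concern described above.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> int:
    """Append a record and ask the OS to flush it to stable storage.

    Hypothetical helper for illustration only.
    """
    # O_APPEND makes every write land at the end of the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        written = os.write(fd, record)
        # fsync asks the OS to push its buffers for this file to the device.
        # Inside a virtual machine the "device" is virtual, so the host's
        # own write cache may still be holding the data after this returns.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "transactions.log")
    durable_append(log_path, b"txn-1\n")
    durable_append(log_path, b"txn-2\n")
    with open(log_path, "rb") as handle:
        print(handle.read())
```

The point of the sketch is that the guest can only ask for durability. If the host keeps the data in a write cache and then loses power, the guest has no way of knowing that the flush it requested never completed.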

An Added Precaution

Even if your host server is configured to avoid using disk write caching, there is still the possibility that some Active Directory database transactions could be lost in the event of a power failure. The loss probably wouldn’t be as significant as it would be if disk write caching were enabled, but data loss is still something that you should try to avoid. As such, you should always provision each of your host servers with an uninterruptible power supply (UPS).

Other Affected Services

Before I move on, I want to quickly mention that just as delayed write failures and power failures have the potential to wreak havoc on the Active Directory database, they also have the potential to cause problems for other types of databases. While you probably take steps to protect database-driven applications such as Exchange Server, it is easy to forget that there are a number of internal Windows services that use an Extensible Storage Engine database and should therefore be protected against disk write caching and power failures. Besides the Active Directory database, some of these services include the File Replication Service (FRS) and DHCP.

Virtualization Software Misuse

Probably the single biggest threat to a virtualized domain controller comes from human error. Most of the major virtualization vendors include snapshotting features in their wares. In case you are not familiar with snapshotting, creating a snapshot is similar to creating a recovery point. A snapshot allows you to roll back a virtual machine to a particular point in time.

As I’m sure you can imagine, snapshots can be very handy. As a best practice, many administrators will create a snapshot prior to making any configuration changes to a virtual server. That way, if something goes wrong, they can painlessly undo the change. The problem is that you should not use snapshots (or differencing disks, as they are sometimes called) on domain controllers.

The reason has to do with the distributed nature of the Active Directory. To see why this is such a problem, imagine that you decide to create a snapshot of a virtualized domain controller prior to applying a new patch. Now, imagine that a week later you decide that the patch is buggy and needs to be removed. Rather than uninstalling the patch, you roll the server back to its previous state using the snapshot that you created.

The problem with the situation that I just described is that Active Directory information is replicated to every domain controller in the entire domain. During the week after the patch was applied, there have almost certainly been changes to the Active Directory. User accounts may have been created, computers might have been added to the domain, or user attributes might have been updated. The type of Active Directory update is irrelevant for our purposes.

As you may know, every Active Directory update is numbered with an Update Sequence Number (USN). Windows uses the USN to determine which Active Directory updates have been replicated to a domain controller. When you roll a domain controller back to a previous state by restoring a snapshot, any Active Directory updates that have occurred since the snapshot was made are removed from the domain controller, and its USN counter reverts to the older, lower value. None of the other domain controllers are aware of the rollback, so they still believe that they have received every update from that domain controller up to the higher, pre-rollback USN. New changes made on the rolled-back domain controller reuse USNs that its replication partners think they have already seen, and those changes may never be replicated. In this type of situation, the domain controllers fall into an inconsistent state, and there is no easy way of making them consistent.

You might be wondering how a rollback is any different from restoring a system state backup of a domain controller. When you perform a non-authoritative Active Directory restoration, Windows is aware that the Active Directory database has been reverted to an earlier state. As such, any missing transactions are replicated from other domain controllers until the newly restored domain controller is brought up to date.
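The difference between a snapshot rollback and a proper restore can be sketched with a toy model. This is a deliberately simplified illustration, not how Active Directory replication is actually implemented: the class, the field names, and the identifiers such as "dc1-v1" are invented for the example. It relies on one detail that I have not covered above, namely that each copy of the directory database carries an identifier (the invocation ID), and that a supported restore assigns a new one while a snapshot rollback does not. Replication partners track the highest USN they have seen for each identifier.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class DomainController:
    """Toy model of a domain controller (hypothetical, heavily simplified)."""
    invocation_id: str            # identity of this copy of the database
    highest_usn: int = 0          # last USN assigned locally
    store: dict = field(default_factory=dict)  # (origin id, USN) -> change
    utd: dict = field(default_factory=dict)    # origin id -> highest USN seen

    def originate(self, change: str) -> None:
        # A local update gets the next USN under the current invocation ID.
        self.highest_usn += 1
        self.store[(self.invocation_id, self.highest_usn)] = change
        self.utd[self.invocation_id] = self.highest_usn

    def pull_from(self, source: "DomainController") -> None:
        # Accept only changes with a USN higher than what we believe we
        # have already seen from that origin.
        for (origin, usn), change in sorted(source.store.items()):
            if usn > self.utd.get(origin, 0):
                self.store[(origin, usn)] = change
                self.utd[origin] = usn


def simulate(reset_invocation_id: bool):
    dc1 = DomainController("dc1-v1")
    dc2 = DomainController("dc2-v1")
    dc1.originate("create user A")
    saved = copy.deepcopy(dc1)          # snapshot or backup taken here
    dc1.originate("create user B")
    dc1.originate("create user C")
    dc2.pull_from(dc1)                  # dc2 has now seen dc1-v1 up to USN 3

    dc1 = copy.deepcopy(saved)          # revert dc1 to the saved state
    if reset_invocation_id:
        dc1.invocation_id = "dc1-v2"    # what a supported restore does
    dc1.originate("create user D")      # assigned USN 2 in this model
    dc2.pull_from(dc1)
    dc1.pull_from(dc2)
    return sorted(dc1.store.values()), sorted(dc2.store.values())


if __name__ == "__main__":
    print("snapshot rollback:", simulate(reset_invocation_id=False))
    print("proper restore:   ", simulate(reset_invocation_id=True))
```

In the snapshot case the two domain controllers end up holding different sets of changes: user D never reaches the second domain controller, because it arrives with a USN that the partner believes it has already processed. In the restore case the new identifier makes the partner treat the restored database as a fresh source, so both sides converge on the same four changes.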

Conclusion

Any time that you virtualize a domain controller, it is important that you take the necessary steps to prevent domain controller corruption. Some of these steps include avoiding disk write caching, educating administrators on the fact that they should not use snapshots on domain controllers, and equipping each host server with a UPS.

In the next article in the series, I will begin discussing some of the aspects of virtualizing multi-domain forests.

