Setting Up Failover Clustering for Hyper-V (Part 7)

Published on 14 Sept. 2011 / Last Updated on 14 Sept. 2011

This article continues the discussion of failover clustering for Hyper-V by exploring the concept of Majority Node Sets and the role played by the File Share Witness.

Introduction

Now that I have shown you how to connect your cluster nodes to a shared storage pool, it is time to begin creating our failover cluster. Before we can install the Failover Cluster feature, I need to introduce you to the concepts of a Majority Node Set and a file share witness.

In order for a cluster to continue to function after a failover, the cluster must maintain quorum. This is a fancy way of saying that the majority of the nodes in the cluster must continue to be functional (hence the term Majority Node Set). This is a bit of a problem since we are building a two-node cluster, because if either of the two nodes fails then only a single node remains. One single cluster node does not constitute a majority when there are only two nodes in the entire cluster.
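The majority rule can be written down as a small piece of arithmetic. The following is only a sketch of the idea, not the cluster service's actual logic; the function name `has_quorum` and its parameters are my own, chosen for illustration.

```python
def has_quorum(total_votes: int, reachable_votes: int) -> bool:
    """Return True when the reachable votes form a strict majority."""
    if not 0 <= reachable_votes <= total_votes:
        raise ValueError("reachable_votes must be between 0 and total_votes")
    # A strict majority means more than half, so exactly half is not enough.
    return 2 * reachable_votes > total_votes


# Two-node cluster, one node lost: 1 of 2 votes is not a majority.
print(has_quorum(total_votes=2, reachable_votes=1))  # False
# Three-node cluster, one node lost: 2 of 3 votes is a majority.
print(has_quorum(total_votes=3, reachable_votes=2))  # True
```

Note that exactly half of the votes never counts as a majority, which is why a cluster with only two voters cannot survive the loss of either one.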

Before I show you how to solve this little problem, I want to talk for a moment about why it is important to maintain quorum. For the sake of illustration, let's pretend that we have two cluster nodes and that those nodes communicate with each other through a dedicated network link using a dedicated network switch. Now suppose that the switch that links the two nodes together fails. The two cluster nodes are now unable to talk to one another. Because communications between the nodes have been disrupted, each cluster node assumes that the other has failed. This is what Microsoft refers to as split brain syndrome. It is a situation in which both cluster nodes remain online, each thinking that the other has failed.

If a split brain situation were to happen, it could wreak havoc on the cluster and on the virtual machines that depend on it. Remember that both cluster nodes are connected to the same storage pool through an unaffected network link. As such, both cluster nodes are still able to interact with the storage pool, and because each node thinks that it is the only remaining node, both nodes try to act authoritatively, for example by taking ownership of the same virtual machines and writing to the same virtual hard disks.
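To make the danger concrete, here is a deliberately naive takeover rule with no quorum check at all. It is a toy illustration under my own assumptions; `naive_takeover`, `NodeA`, and `NodeB` are hypothetical names, and no real cluster behaves this way.

```python
def naive_takeover(node: str, peer_reachable: bool) -> bool:
    """Hypothetical rule: claim the storage whenever the peer cannot be reached."""
    return not peer_reachable


# The switch between the nodes has failed, so neither node can see the other.
owners = [n for n in ("NodeA", "NodeB") if naive_takeover(n, peer_reachable=False)]
print(owners)  # ['NodeA', 'NodeB']
```

Both nodes end up in the list of owners, which is exactly the split brain situation described above.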

As you can see, split brain syndrome is something that needs to be avoided at all costs. That is why it is so important for cluster nodes to maintain quorum. To see what I mean, imagine that a failover cluster had three separate nodes and that one of those nodes were to fail. In this type of situation, all of the nodes know that a total of three nodes make up the cluster. The two remaining nodes detect that a single cluster node has failed, but they can also detect that both of them are still online. Because the two remaining nodes constitute a majority of the cluster nodes, the cluster can continue to function without the risk of split brain syndrome. Even if the failed node hasn't actually dropped offline, that node knows that it alone does not make up the majority of the nodes in the cluster and therefore cannot take control of the cluster.
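The same reasoning can be viewed from the side of each network partition. The sketch below assumes one vote per node and uses made-up node names; `partitions_with_quorum` is a hypothetical helper, not part of any Microsoft tool.

```python
def partitions_with_quorum(cluster_size, partitions):
    """Return the partitions (lists of node names) that hold a strict majority."""
    return [p for p in partitions if 2 * len(p) > cluster_size]


# Three-node cluster where Node3 is cut off from the other two.
print(partitions_with_quorum(3, [["Node1", "Node2"], ["Node3"]]))
# [['Node1', 'Node2']]

# Two-node cluster split down the middle: neither side holds a majority.
print(partitions_with_quorum(2, [["Node1"], ["Node2"]]))
# []
```

Because two disjoint groups cannot both hold more than half of the votes, at most one partition can ever pass this test, and that is what rules out split brain.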

So now that I have explained why it is so important for cluster nodes to be able to maintain quorum during a failure situation, you might be wondering how it is possible for either cluster node to maintain quorum when we are only building a two-node cluster.

Well, if the two cluster nodes were the only cluster participants then yes, it would be impossible for either node to claim a majority in a failure situation. Rather than bringing a third node into the cluster though, we are going to create a pseudo-node (that's my term, not Microsoft's).

Obviously, there are significant software licensing and hardware costs involved in adding an extra node to the cluster. However, we can configure one of our existing network servers so that it participates in the cluster whenever necessary without incurring any additional costs. We can accomplish this through the use of a file share witness. In case you're wondering, the impact on the server that will be used for this purpose is negligible.

The basic idea behind a file share witness is that we can create a dedicated file share on a network server. If a failure were to occur within our cluster, then any remaining nodes check to see if they can communicate with the designated file share. The file share counts as one additional vote, so in a two-node cluster there are three votes in total. If a cluster node is healthy enough to communicate with the file share, then that node holds two of the three votes and therefore establishes quorum.
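Here is a toy model of the witness acting as a tie-breaking vote. It is a sketch under simplifying assumptions: the `FileShareWitness` class, the `node_has_quorum` function, and the node names are all hypothetical, and the real cluster service handles the locking of the share in its own way.

```python
class FileShareWitness:
    """Toy stand-in for the witness share: one extra vote, held by one node at a time."""

    def __init__(self):
        self.holder = None

    def try_claim(self, node: str) -> bool:
        # The first node to claim the witness keeps it; later claims by others fail.
        if self.holder is None:
            self.holder = node
        return self.holder == node


def node_has_quorum(node, reachable_peers, total_nodes, witness, can_reach_witness):
    """Count this node's votes: itself, the peers it can see, and possibly the witness."""
    total_votes = total_nodes + 1          # every node plus the witness
    votes = 1 + len(reachable_peers)       # own vote plus visible peers
    if can_reach_witness and witness.try_claim(node):
        votes += 1
    return 2 * votes > total_votes


# Two-node cluster, link between the nodes is down, both can still reach the share.
witness = FileShareWitness()
print(node_has_quorum("NodeA", [], 2, witness, can_reach_witness=True))  # True
print(node_has_quorum("NodeB", [], 2, witness, can_reach_witness=True))  # False
```

Only the node that claims the witness first reaches two of three votes. The other node is left with a single vote and has to stand down, so the two nodes can no longer both act authoritatively.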

File Share Witness Location

So now that you know about the file share witness, you may be wondering about the best place to put it. Well, opinions really differ as to the best location for a file share witness. I usually like to place the file share witness on a domain controller. My only reason for doing so is that every domain network has domain controllers, so using a domain controller for the file share witness is a practice that can be done consistently. I have known other people to place the file share witness on a Windows file server, and that's okay too.

I haven't seen any firm guidelines from Microsoft as to the optimal placement of the file share witness. Therefore, I would simply recommend placing it on a server that is known to be reliable. If you have a server that is configured for high availability, you might even consider using that server to store the file share witness (depending upon the server's configuration of course).

As I mentioned at the very beginning of this article series, when you create a failover cluster, each cluster node has its own individual computer name, but there is also another computer name that is assigned to the cluster as a whole. This name is backed by a computer object in Active Directory. Microsoft's documentation calls it the Cluster Name Object (CNO), although you will also see it loosely described as a virtual computer object. Normally when you create a shared folder, you grant access permissions to users. When you create the file share witness folder, however, you will be granting permission to your cluster's computer object instead. As such, we can’t create the file share witness just yet because that computer object has not yet been created. The reason why I went ahead and discussed the file share witness at this point in the series is so that you can begin thinking about where you want to place it when the time comes.

Conclusion

In the next article in the series, we will begin actually building the failover cluster. In the meantime, it is a good idea to verify that both of your cluster nodes and the server that will act as the file share witness are domain members. You should also make sure that both of your cluster nodes are able to access the shared storage pool that will be used to store virtual hard drive files.

If you have decided to use Windows Storage Server to create a storage pool as I have, then it is not necessary for your Windows Storage Server to be a domain member. The cluster will work fine regardless of whether or not the storage server participates in the domain. Having said that, you may be able to achieve better security by joining the storage server to the domain because doing so allows you to enforce group policy settings for the storage server.
