Exchange 2013 DAG with Dynamic Quorum (Part 1)

If you would like to read the next part in this article series please go to Exchange 2013 DAG with Dynamic Quorum (Part 2).

Introduction

When an administrator creates a Database Availability Group [DAG], it is initially created as an empty object in Active Directory [AD]. This object is used to store relevant information about the DAG, such as server membership information. When the first server is added to the DAG, a failover cluster is automatically created for the DAG and used exclusively by the DAG. DAGs make limited use of Windows failover clustering technology, such as the cluster heartbeat, cluster networks and the cluster database (for storing data like database state changes from active to passive or vice versa, or from mounted to dismounted and vice versa). As each subsequent server is added to the DAG, it is joined to the underlying cluster, the cluster’s quorum model is automatically adjusted by Exchange, and the server is added to the DAG object in AD.

Failover clusters use the concept of quorum, which uses a consensus of voters to ensure that only one subset of the cluster members (which could be all members or a majority of members) is functioning at one time. Highly available Mailbox servers in previous versions of Exchange also used failover clustering and its concept of quorum, so this is not a new concept. Quorum represents a shared view of members and resources, and the term quorum is also used to describe the physical data that represents the configuration within the cluster that is shared between all cluster members. As a result, all DAGs require their underlying failover cluster to have quorum. If the cluster loses quorum, all DAG operations terminate and all mounted databases hosted in the DAG are dismounted.

Quorum is important for two reasons: 1) it ensures consistency, so that each member always has a view of the cluster that is consistent with the other members; and 2) it acts as a tie-breaker to avoid partitioning (such as split brain syndrome scenarios) and to make sure that only one collection of the members in the DAG is considered official.

Majority Node Set Clustering

Majority Node Set [MNS] is a Windows Clustering model used since early versions of Exchange. This model requires a majority (more than 50%) of the voters (servers and/or one file share witness) to be up and running.

DAGs with an even number of members use the failover cluster’s Node and File Share Majority quorum mode, which uses an external witness server that acts as a tie-breaker. In this quorum mode, each DAG member gets a vote. In addition, the witness server is used to provide one DAG member with a weighted vote. The cluster quorum data is stored by default on the system disk of each member of the DAG and is kept consistent across those disks. A file on the witness server (thus the name File Share) is used to keep track of which member has the most up-to-date copy of the data; the witness server itself does not hold a copy of the cluster quorum data.

In this mode, a majority of the voters must be operational and able to communicate with each other to maintain quorum. If a majority of the voters cannot communicate with each other, the DAG’s underlying cluster loses quorum and the DAG will require administrator intervention to become operational again. When the witness server is needed for quorum, any member of the DAG that can communicate with the witness server can place a Server Message Block [SMB] lock on the witness server’s witness.log file. The DAG member that locks the witness server (the locking node) retains an additional vote for quorum purposes. The DAG members in contact with the locking node are in the majority and maintain quorum. Any DAG members that cannot contact the locking node are in the minority and therefore lose quorum.
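As a rough illustration of the tie-break described above, the following Python sketch models a network partition in a DAG that uses Node and File Share Majority. The function name, the member names and the idea of passing in which member holds the witness lock are assumptions made for this example. It is a simplified model of the vote counting, not a description of how the cluster service is implemented.

```python
def partition_with_quorum(partitions, locking_node):
    """Return the partition that keeps quorum, or None if none does.

    partitions:   list of sets of DAG member names that can still reach each
                  other; every member appears in exactly one set (an offline
                  member is a set of its own)
    locking_node: member holding the SMB lock on the witness, or None
    """
    total_votes = sum(len(p) for p in partitions) + 1  # one vote per member plus the witness vote
    needed = total_votes // 2 + 1                      # strict majority of all voters
    for members in partitions:
        votes = len(members)
        if locking_node in members:
            votes += 1                                 # the locking node carries the extra vote
        if votes >= needed:
            return members
    return None


# Four-member DAG split evenly in two; EX1 obtained the lock on the witness.
split = [{"EX1", "EX2"}, {"EX3", "EX4"}]
winner = partition_with_quorum(split, locking_node="EX1")
print(sorted(winner))  # ['EX1', 'EX2']
```

In this model, if neither half can reach the witness (locking_node=None), no partition reaches three votes and the function returns None, which corresponds to the cluster losing quorum.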

Consider a DAG with four members. Because this DAG has an even number of members, an external witness server is used to provide one of the cluster members with a fifth, tie-breaking vote. To maintain a majority of voters (and therefore quorum), at least three voters must be able to communicate with each other. At any time, a maximum of two voters can be offline without disrupting service and data access. If three or more voters are offline, the DAG loses quorum and all databases are dismounted.

Image
Figure 1.1: Database Availability Group with an Even Number of Members

The following formula helps administrators calculate how many voters must remain available for the cluster to keep quorum: (n / 2) + 1, where n is the total number of voters (the DAG members plus the witness server, if one is used) and n / 2 is always rounded down. So, in this example, with four members and a witness, we have: (5 / 2) + 1 = 2 + 1 = 3.
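The arithmetic can be checked with a few lines of Python. This is only a sketch of the formula; the function names are made up for the example, and n here counts voters (members plus the witness, when one is used).

```python
def votes_needed(voters: int) -> int:
    """Minimum number of voters that must be online: floor(n / 2) + 1."""
    return voters // 2 + 1


def max_offline(voters: int) -> int:
    """Largest number of voters that can be offline without losing quorum."""
    return voters - votes_needed(voters)


# Four DAG members plus the witness give five voters.
print(votes_needed(5), max_offline(5))  # 3 2
```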

DAGs with an odd number of members use the failover cluster’s Node Majority quorum mode. In this mode, each member gets a vote and each member’s local system disk is used to store the cluster quorum data. If the configuration of the DAG changes, that change is reflected across the different disks. The change is only considered to have been committed and made persistent if that change is made to the disks on half the members (rounding down) plus one. For example, in a three-member DAG, the change must be made on one plus one members, or two members in total. In this scenario, and using the formula above, only one server can be down at one time. If a second server is also offline, the entire cluster will be brought offline.

Image
Figure 1.2: Database Availability Group with an Odd Number of Members
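To tie the even and odd cases together, the sketch below derives the quorum mode and the vote arithmetic from the number of DAG members. The function name and the dictionary keys are assumptions made for the example; the rule it encodes (a witness for even-sized DAGs, none for odd-sized ones) is the one described above.

```python
def quorum_plan(members: int) -> dict:
    """Describe the quorum mode and vote arithmetic for a DAG of the given size."""
    if members < 1:
        raise ValueError("a DAG needs at least one member")
    uses_witness = members % 2 == 0          # even-sized DAGs add the witness as a tie-breaker
    voters = members + (1 if uses_witness else 0)
    needed = voters // 2 + 1                 # half the voters, rounded down, plus one
    return {
        "members": members,
        "mode": "Node and File Share Majority" if uses_witness else "Node Majority",
        "voters": voters,
        "needed": needed,
        "can_lose": voters - needed,
    }


for size in (2, 3, 4, 5):
    print(quorum_plan(size))
```

For a three-member DAG this reports Node Majority with two voters needed and one that can be lost; for a four-member DAG it reports Node and File Share Majority with five voters, three needed and two that can be lost.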

Windows Server 2012

Windows Server 2012 introduced a new model called Failover Clustering Dynamic Quorum, which we can use with Exchange. When using Dynamic Quorum, the cluster dynamically manages the vote assignment to nodes based on the state of each node. When a node shuts down or crashes, it loses its quorum vote. When a node successfully re-joins the cluster, it regains its quorum vote. By dynamically adjusting the assignment of quorum votes, the cluster can increase or decrease the number of quorum votes that are required to keep it running. This enables the cluster to maintain availability during sequential node failures or shutdowns.

With a dynamic quorum, the cluster quorum majority is determined by the set of nodes that are active members of the cluster at any time. This is an important distinction from the cluster quorum in Windows Server 2008 R2, where the quorum majority is fixed, based on the initial cluster configuration.

Important:
The advantage this brings is that a cluster can now keep running even when fewer than 50% of its original nodes remain. By dynamically adjusting the quorum majority requirement, the cluster can sustain sequential node shutdowns down to a single node and still keep running. It does not, however, allow the cluster to sustain a simultaneous failure of a majority of voting members. To continue running, the cluster must always have a quorum majority at the time of a node shutdown or failure.
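The difference between a fixed and a dynamic majority can be sketched with a small Python simulation. This is a deliberately simplified model under stated assumptions: every node starts with one vote, there is no witness, and the function name and parameters are invented for the example. It only illustrates the rule that a majority must exist at the moment of each failure.

```python
def run_failures(nodes: int, failures: list, dynamic: bool) -> list:
    """Apply failure events in order and return the online node count after each.

    failures: number of nodes lost in each event (1 = one node, 3 = three at once)
    dynamic:  True  = votes are recounted over the surviving nodes after every event
              False = the majority stays fixed at the original node count
    The run stops at the first event that costs the cluster its quorum (recorded as 0).
    """
    online = nodes
    total_votes = nodes
    history = []
    for lost in failures:
        survivors = online - lost
        if survivors < total_votes // 2 + 1:   # no majority at the moment of failure
            history.append(0)                  # cluster goes offline
            break
        online = survivors
        if dynamic:
            total_votes = online               # failed nodes lose their votes
        history.append(online)
    return history


print(run_failures(5, [1, 1, 1], dynamic=True))   # [4, 3, 2]
print(run_failures(5, [1, 1, 1], dynamic=False))  # [4, 3, 0]
print(run_failures(5, [3], dynamic=True))         # [0]
```

With dynamic vote management, three sequential failures leave two of the original five nodes running, whereas the fixed majority of three is lost at the third failure. Losing three nodes at once takes the cluster offline in both cases. The sketch stops at two nodes: with only two votes left, neither node alone is a majority, and the way the cluster handles that last step is not modelled here.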

The cluster-assigned dynamic vote of a node can be verified with the DynamicWeight property of the cluster node by using the Get-ClusterNode cmdlet. A value of 0 indicates that the node does not have a quorum vote, while a value of 1 indicates that the node has a quorum vote:

Image
Figure 1.3: Dynamic Weight Property of a Dynamic Quorum

To change the quorum configuration in a failover cluster by using the Failover Cluster Manager, follow these steps:

  1. In Failover Cluster Manager, select the cluster that you want to change;
  2. With the cluster selected, under Actions, click More Actions, and then click Configure Cluster Quorum Settings:

Image
Figure 1.4: Configure Cluster Quorum Settings Option

  3. The Configure Cluster Quorum Wizard appears. Click Next:

Image
Figure 1.5: Configure Cluster Quorum Wizard

  4. On the Select Quorum Configuration Option page, the default is to allow the cluster to automatically configure the quorum settings that are optimal for our current cluster configuration (Use typical settings). To configure quorum management settings and to add or change the quorum witness, click Advanced quorum configuration and witness selection and then click Next:

Image
Figure 1.6: Select Quorum Configuration Option

  5. On the Select Voting Configuration page, select All Nodes and click Next. For certain scenarios, you might want to assign votes only to a subset of the nodes or even to No Nodes. The latter is generally not recommended, because it does not allow nodes to participate in quorum voting and it requires configuring a disk witness, which becomes the single point of failure for the cluster.

Image
Figure 1.7: Select Voting Configuration

  6. On the Configure Quorum Management page, you can enable or disable the Allow cluster to dynamically manage the assignment of node votes option. Selecting this option enables dynamic quorum, which increases the availability of the cluster by allowing it to continue running in failure scenarios that it could not survive with this option disabled. This option is enabled by default and it is strongly recommended not to disable it:

Image
Figure 1.8: Configure Quorum Management

  7. On the Select Quorum Witness page, select an option to configure a disk witness or a file share witness. The wizard indicates the witness selection options that are recommended for our cluster. In this case, because the current DAG has an odd number of members, no witness is required:

Image
Figure 1.9: Select Quorum Witness

  8. Click Next. Confirm your selections on the confirmation page that appears and then click Next:

Image
Figure 1.10: Confirmation

After the wizard runs and the Summary page appears, click View Report if you want to view a report of the tasks that the wizard performed. The most recent report will remain in the %systemroot%\Cluster\Reports folder with the name QuorumConfiguration.mht.

You can also use the Shell to check if dynamic quorum is being used by running the following cmdlet:

Image
Figure 1.11: Checking Dynamic Quorum Configuration

To enable or disable dynamic quorum through the Shell, simply set the DynamicQuorum property to 1 (enabled) or to 0 (disabled) by running:

(Get-Cluster "cluster_name").DynamicQuorum = 0

Conclusion

In this first part of the article series, we took a high-level look at the importance of quorum in a Windows cluster and how it affects Database Availability Groups. We also looked at the advantages of the new Dynamic Quorum in Windows Server 2012.

In the second and final part, we will look at how this greatly improves DAGs by demonstrating dynamic quorum in action.

8 thoughts on “Exchange 2013 DAG with Dynamic Quorum (Part 1)”

  1. Hello Nuno,
    I have a 3-node DAG with the configuration below:
    EX-MBX-1 – Mailbox Role
    EX-MBX-2 – Mailbox Role
    EX-MBX-3 – CAS & Mailbox Role
    EX-CAS-1 – CAS Role.

    EX-MBX-1, EX-MBX-2 and EX-CAS-1 are in HQ and EX-MBX-3 is in DR, but all servers are in the same subnet. When two nodes fail, the clients do not connect to the third node. Kindly suggest.

    1. Hi Senthil,

      Do both servers fail at the same time? Is Dynamic Quorum still enabled after the failure? And what are the weights of each node?
      Also, what is the namespace that your clients use to connect to Exchange? Where is it pointing to during that failure?

      Regards,
      Nuno

  2. Hi Nuno Mota,

    This is very nice article.

    I am facing an issue with Exchange DAG witness configuration.

    In our scenario we have four Exchange 2016 servers (three in the primary DC and one in the DR DC), forming a four-node DAG. We had only one witness, which someone had configured in DR. Three days ago our WAN link went down, the file share witness went offline, and we were not able to ping the DAG or bring it back online. So we decided to move the witness to the primary DC and tried to bring the FSW and DAG online. We first tried this with the Exchange shell without success, and then made the change through FCM. As soon as we changed the FSW, both the existing and the new file share witness came online, and the Exchange DAG status shows the new FSW server. Now, when we check WitnessInUse, it shows an invalid configuration. When we tried Set-DatabaseAvailabilityGroup -Identity with the DAG name, it gave the following error:

    [PS] C:\Windows\system32>Set-DatabaseAvailabilityGroup -Identity DAG

    There was a problem changing the quorum model for database availability group DAG. Error: An error occurred while
    attempting a cluster operation. Error: Cluster API failed: “DeleteClusterResource() failed with 0x139f. Error: The
    group or resource is not in the correct state to perform the requested operation”
    + CategoryInfo : InvalidArgument: (:) [Set-DatabaseAvailabilityGroup], DagTaskProblemChangingQuorumExcept
    ion
    + FullyQualifiedErrorId : [Server=ADEXGAP1,RequestId=0303ae8a-6df1-47f9-8fd2-1390763b2eb2,TimeStamp=12/19/2017 4:4
    4:02 AM] [FailureCategory=Cmdlet-DagTaskProblemChangingQuorumException] 56C85867,Microsoft.Exchange.Management.Sys
    temConfigurationTasks.SetDatabaseAvailabilityGroup
    + PSComputerName : EX01.domain.local

    Sorry for the long story but I want to explain everything.

    Kindly help me resolve the Issue.

  3. Hi,

    A quick question: if the Exchange 2013 RTM version is installed, with no CUs or SPs applied so far,
    can we update to CU 22 directly, or do we need to install any other CUs or SPs first?

    Regards
    Paramesh
