Exchange Cluster Checks With ClusPrep and ClusDiag (Part 2)

If you missed the first part of this article series, please go to Exchange Cluster Checks With ClusPrep and ClusDiag (Part 1).

Introduction

In part one of this article, we looked at running ClusPrep before cluster formation in order to ensure that our physical cluster nodes and storage were configured correctly. What happens once the cluster has been formed? What tests can we then run before we put the cluster into production and install Exchange onto it? Historically, there haven’t been that many tools to assist us in the analysis of a cluster but that changed not long ago. Microsoft has a tool available to verify a cluster by running performance tests against it, as well as providing a graphical view of various components such as the network and disk layouts. In addition, this tool helps with the cluster troubleshooting process as it can examine the cluster log files. What is this tool? It’s ClusDiag, and here in the second part of this two-part article we’ll take a look at it.

Obtaining and Installing ClusDiag

The full name for ClusDiag is the Cluster Diagnostics and Verification Tool, but I’m going to refer to it as ClusDiag for simplicity. In part one you’ll remember that I ran ClusPrep against a proposed Exchange 2007 three-node active/passive single copy cluster and therefore I’ll be running ClusDiag against this same cluster once it has been installed and configured as a cluster. Make sure that you download the latest version of ClusDiag from the Microsoft downloads site. The current version at the time of writing this article can be found at the following link:

Cluster Diagnostics and Verification Tool (ClusDiag.exe)

What you actually download is a single file called clusdiag.msi. This can be installed onto a machine running Windows XP or later. Install ClusDiag by running the downloaded file, which will present you with an installation wizard that consists of a license agreement screen followed by an installation location screen. The default installation folder is C:\Program Files\ClusDiag and a shortcut called Cluster Diagnostics Tool is placed into the Start / Programs location.

Running ClusDiag

To run ClusDiag, choose the Cluster Diagnostics Tool menu option or run ClusDiag.exe from the installation folder mentioned previously. When the tool starts, you’ll be presented with the main opening screen shown in Figure 1.

Figure 1: ClusDiag Opening Screen

In the Open what field you need to enter the name of the cluster that you wish to test. Notice from Figure 1 that there are two modes of operation, namely Online and Offline. Broadly speaking, the online mode is used to perform the tests and to capture configuration information about the cluster. Offline mode can be used in situations such as when you wish to examine cluster log files from a previous online session, or perhaps after another administrator has emailed you the log files from a remote cluster. You may therefore deduce that, with online mode, the cluster logs are stored for later review.

ClusDiag Tests

As we saw in part one of this article, ClusPrep is useful in ensuring that our cluster components are configured correctly prior to construction of the cluster. One of the first things you should test after cluster construction is its ability to fail over correctly. To do this, first ensure that the correct cluster name has been entered and the Online option chosen in the Open window as shown previously in Figure 1. Once you’ve done this, click OK and you will be presented with a view of your cluster that’s similar to the one presented by the Cluster Administrator program. A sample of this is shown in Figure 2.

Figure 2: Initial View After Connection to a Cluster

To initiate our failover test, select the Tools menu and then choose Run Test from the available options. This will present the Run Test window as shown in Figure 3.

Figure 3: Run Test Window

You will note from Figure 3 that the test name is set to SPFAIL-BASIC. The other test option is SPFAIL-REGULAR. The SPFail tests are designed to test the setup of a cluster. The basic test checks whether the resource groups can be moved across nodes within the cluster, whilst the regular test is more of a stress test, since it continuously moves the resource groups between the nodes for a period of time. You will therefore want to run these tests before you make the cluster a production cluster with users connecting to it. As you might be aware, you can build a one-node cluster, but you won’t be able to run these tests in such a scenario since there are no other nodes to move the resource groups to. There is also an Enable Sniffing option, which is useful if you’re looking to analyze the data using Network Monitor. Clicking the Launch button begins the testing.
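To make the idea of moving resource groups across nodes concrete, here is a minimal Python sketch that builds a list of moves taking each group around every node and back to its original owner. It is only an illustration of the concept. It is not how ClusDiag implements SPFAIL-BASIC, and the function name, the group name and the node names are all hypothetical.

```python
def basic_move_plan(group_owners, nodes):
    """Return (group, source, target) moves that take each group around all nodes.

    group_owners maps a resource group name to the node that currently owns it.
    Returning to the original owner at the end is a choice made for this sketch.
    """
    plan = []
    if len(nodes) < 2:
        # A one-node cluster has nowhere to move a group to.
        return plan
    for group, owner in group_owners.items():
        start = nodes.index(owner)
        # Rotate the node list so that it begins with the current owner.
        ring = nodes[start:] + nodes[:start]
        # Pair each node with the next one, wrapping around to the owner.
        for source, target in zip(ring, ring[1:] + ring[:1]):
            plan.append((group, source, target))
    return plan
```

Calling basic_move_plan({"CMS": "NODE1"}, ["NODE1", "NODE2", "NODE3"]) gives three moves, whereas passing a single node gives an empty plan, which mirrors why the tests cannot be run against a one-node cluster.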

If this is the first time you’ve run ClusDiag on this machine, you will likely be prompted with a statement that the ‘Logs’ directory cannot be found, and asked whether you would like to create a new ‘Logs’ directory. If you answer ‘yes’, a folder named ‘Logs’ will be created under the ClusDiag folder created during program installation. Under the Logs folder, additional subfolders are created for the logs as and when they are collected. Once the tests have started, you’ll see a screen similar to the one shown in Figure 4.

Figure 4: SPFAIL-BASIC Test in Progress

Eventually, you’ll receive a prompt that the tests have finished and that the results will be displayed once you click OK. Figure 5 shows an example of the test that I ran, which informs me that the cluster has passed the SPFAIL-BASIC test. I can therefore be satisfied that this cluster can perform a basic resource group move operation between cluster nodes. This report and the associated log files are held in the \ClusDiag\Logs folder; in my case, the full folder path is C:\Program Files\ClusDiag\Logs\04-25-2007\nh-clu10.

Figure 5: SPFAIL-BASIC Test Result
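If you collect logs over several days or from several clusters, it can help to list what is in the Logs folder. The Python sketch below assumes that the folder is laid out as Logs, then a date folder named MM-DD-YYYY, then a folder named after the cluster. That layout is inferred only from the single example path above, so treat it as an assumption. The function name is hypothetical.

```python
from datetime import datetime
from pathlib import Path


def list_log_sessions(logs_root):
    """Return (date, cluster_name, path) tuples found under a Logs folder."""
    root = Path(logs_root)
    if not root.is_dir():
        return []
    sessions = []
    for date_dir in root.iterdir():
        if not date_dir.is_dir():
            continue
        try:
            # Assumed folder naming: month-day-year, for example 04-25-2007.
            session_date = datetime.strptime(date_dir.name, "%m-%d-%Y").date()
        except ValueError:
            continue  # not a date-named folder, so skip it
        for cluster_dir in date_dir.iterdir():
            if cluster_dir.is_dir():
                sessions.append((session_date, cluster_dir.name, cluster_dir))
    # Oldest session first, then by cluster name.
    return sorted(sessions, key=lambda s: (s[0], s[1]))
```

For the installation described in this article, you would pass the path C:\Program Files\ClusDiag\Logs as a raw string. A folder that does not exist simply produces an empty list.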

Analyzing Logs

Let’s turn our attention now to analyzing cluster logs using ClusDiag. In fact, let’s look at analyzing log files in offline mode. To do this, follow these steps:

Run ClusDiag and in the Open window make sure that Offline mode is selected. Once this has been done, browse to the relevant log file location. This is shown in Figure 6.

Figure 6: Selecting Log Files

Once the relevant log folder has been selected and the OK button clicked, you should be back to the normal ClusDiag window as shown in Figures 2 and 7.

Right-click the Log Files folder from the left-hand pane and choose Failure Window from the context menu. This is shown in Figure 7.

Figure 7: Failure Window Selection

All the relevant cluster node log files are now shown in a cascaded view as seen in Figure 8. This is useful because each node’s log file can now be easily compared.

Figure 8: Cluster Log View

You can now select the particular cluster node log file that you want to work with and step through the log entries. Note how the log categories are highlighted in different colors. For example, ERR entries are highlighted in red, whilst WARN entries are highlighted in purple. It would therefore make sense here to step through the log file and examine all the ERR and WARN entries first.
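If you only have the raw log text to hand, the same first pass can be approximated outside ClusDiag. The following Python sketch pulls out lines that contain ERR or WARN as a standalone token. It assumes that the severity appears in the log line in exactly that form, and the function names are hypothetical. It is a rough triage aid rather than a replacement for the ClusDiag view.

```python
import re
from collections import Counter

# Assumption: the severity appears as a standalone upper-case token on the line.
SEVERITY = re.compile(r"\b(ERR|WARN)\b")


def find_problem_entries(lines):
    """Yield (line_number, severity, text) for lines containing ERR or WARN."""
    for number, line in enumerate(lines, start=1):
        match = SEVERITY.search(line)
        if match:
            yield number, match.group(1), line.rstrip("\n")


def summarize(lines):
    """Count how many ERR and WARN entries appear in the given lines."""
    return Counter(severity for _, severity, _ in find_problem_entries(lines))
```

Because the pattern matches whole words only, a line containing a longer word such as ERROR is not counted. Adjust the pattern if your log files use different wording.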

To keep track of any important log entries, you can either bookmark the event or make a comment on the event. To bookmark the event, simply highlight the relevant event and then click the left-most blue flag (toggle bookmark) icon from the toolbar, or click in the grey area to the left of the log entry. To make a comment, right-click the event and choose Comment / Edit from the context menu; here you can type in your desired text. Commented events change color to yellow so that you can quickly locate them in the future.

Also note that a default filter is applied to the log files. To access the templates and therefore the filters, choose the Tools menu followed by Options. This brings you to the Clusdiag Options… window. From here, go to the Templates tab where you can edit the relevant template.

Graphical Views

Finally, let’s look at the ability to generate graphical views of network, resource and disk layouts. One thing to watch for is that, in order to generate these views, you will need to have either run a test or captured some log files; otherwise, the required XML files will not be present in the log folder discussed earlier in this article. Let’s start with the disk layout, which can be accessed by choosing the View menu followed by the Disk View option. You will then be presented with the Disk View window as shown in Figure 9. This gives you a view of the cluster nodes and their disk layout. Note the tool tip displayed as a result of hovering the mouse over the disk in the centre of the top row, which gives you many more details about the selected disk. You can also get this information from within the main window by choosing the Reports menu followed by the Disk Statistics option. Also note that the quorum disk is automatically highlighted in red by ClusDiag; in this case, it’s Disk Q.

Figure 9: Disk View

Next is the network layout. Clusters typically have a more complex network card layout than standalone servers and so it can be extremely useful to see the network configuration in a graphical format. To see the network layout, choose the View menu followed by Network View. This view gives you a graphical layout of how the public and private networks connect to the cluster nodes. Again, tool tips give you additional network card information as you can see from the sample network view in Figure 10. You can also get this information from within the main window by choosing the Reports menu followed by the Network Statistics option.

Figure 10: Network View

Another very useful view is the one given of the resources. The beauty of the resource layout is that you also get to see the resource dependencies. I talked at the start of this article about running tests after the cluster has been formed, but before it has been put into production and Exchange installed onto it. Of course, to see the Exchange resource dependencies, at least one of the cluster nodes will need to have had Exchange installed onto it.

From the View menu, highlight the D.A.G. option (which stands for Directed Acyclic Graph). This reveals a submenu that allows you to choose which resource dependencies you’d like to view. In the example shown below in Figure 11, I’ve chosen the Cluster Resource Dependency(CMS) menu option, which shows the resources and their dependencies for the resource group that I called ‘CMS’. Again, tool tip information is shown, and you’ll see from Figure 11 that I’ve elected to show the tool tip seen when hovering the mouse over the Information Store resource.

Figure 11: Resource Dependency View
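A directed acyclic graph is useful here because it gives an order in which resources can be brought online: a resource comes after everything it depends on. The short Python sketch below shows the idea using the standard library’s graphlib module (Python 3.9 or later). The resource names and dependencies are hypothetical examples chosen for illustration; they are not read from ClusDiag or from the cluster shown in Figure 11.

```python
from graphlib import TopologicalSorter

# Hypothetical example: each resource maps to the resources it depends on.
dependencies = {
    "IP Address": set(),
    "Physical Disk": set(),
    "Network Name": {"IP Address"},
    "System Attendant": {"Network Name", "Physical Disk"},
    "Information Store": {"System Attendant"},
}


def online_order(deps):
    """Return the resources ordered so that dependencies always come first."""
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(deps).static_order())
```

With the sample data above, Information Store always appears after System Attendant, which in turn appears after both Network Name and Physical Disk. If two resources depended on each other, the graph would no longer be acyclic and the function would raise an error instead of returning an order.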

Summary

Over this two-part article, we’ve looked at a tool (ClusPrep) that can help check the cluster nodes before the cluster is created, as well as a tool (ClusDiag) that can help check the cluster after it has been created. Both of these tools are freely available from the Microsoft downloads site, so it makes sense to include them in your cluster build and test processes.

About The Author

Neil Hobson
