Testing SCR in a Production Environment (Part 2)

Introduction

This is the second part of a four-part article that takes a look at the steps required to test the Exchange 2007 SCR recovery process without destroying the existing CCR environment serving as the SCR source. In part one I mainly set the scene for the failover test, describing the lab servers and the roles operating on each server. In part two I am going to cover shutting down the CCR environment, restoring the storage groups to the SCR standby cluster and making a necessary change to DNS. The last article ended with step 2 in the overall process, so I will be commencing with step 3 here.

Step 3 – Stop The CMS

The next thing to do is to stop the CMS, which in turn dismounts all the databases on the CMS. This is also done via the Exchange Management Shell, using the Stop-ClusteredMailboxServer cmdlet. To use this cmdlet successfully you need to provide the name of the CMS that you wish to stop, and you must also provide a reason for the stoppage via the –StopReason parameter. The text that you supply to the –StopReason parameter is then written to the event log with event ID 105, as you can see from Figure 2.

Figure 2: CMS Event ID 105

The full cmdlet that you can use to stop the CMS might look like this example where the CMS name is E2K7:

Stop-ClusteredMailboxServer E2K7 -StopReason "SCR failover test" -Confirm:$false

The running of this cmdlet gives you output to show you that the state of the CMS is now offline, as you can see from Figure 3.

Figure 3: Stopping the CMS

Also notice that I have used the –Confirm parameter and set it to a value of false, which suppresses the ‘Are you sure?’ prompt that would otherwise require an additional response from you before the cmdlet proceeds. You can then check, via either the Exchange Management Console or the Exchange Management Shell, that the databases are dismounted as a result of running the Stop-ClusteredMailboxServer cmdlet. I personally find it useful to run the Cluster Administrator program (if you are running Windows 2003) or the Failover Cluster Management program (if you are running Windows 2008) and confirm that all CMS cluster resources are offline after the cmdlet has run. You can see this in Figure 4, which shows the Failover Cluster Management program in use on Windows 2008.

Figure 4: CMS Resources Offline
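The same checks can also be sketched from the Exchange Management Shell. This is only a rough outline: E2K7 is the CMS name from my example, and I am assuming that event ID 105 is written to the Application log and that the 500 most recent entries are enough to find it. Depending on the state of the services, the database query may return warnings rather than a clean result.

```powershell
# Show the overall state of the clustered mailbox server.
# After the CMS has been stopped, the state should be reported as offline.
Get-ClusteredMailboxServerStatus -Identity E2K7

# List the mailbox databases on the CMS along with their mount state.
# The -Status switch is what populates the Mounted property.
Get-MailboxDatabase -Server E2K7 -Status | Format-Table Name, Mounted -AutoSize

# Look for the stop reason in the event log.
# Assumption: the event is in the Application log, within the newest 500 entries.
Get-EventLog -LogName Application -Newest 500 |
    Where-Object { $_.EventID -eq 105 } |
    Format-List TimeGenerated, Source, Message
```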

Step 4 – Shut Down the CCR Environment

There is not really much to say about step four. Once the CMS has been stopped and the databases have been dismounted cleanly, it is time to shut down both nodes of the CCR environment. First shut down the passive node and, once this has completed, shut down the active node. The result is a CCR environment that has been shut down cleanly and preserved safely, as you will obviously need this later.
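If you prefer to shut the nodes down remotely rather than from each console, the order can be sketched with the standard shutdown.exe utility. The node names below are hypothetical placeholders, not the names of my lab servers, and the pause is only a simple way of making sure the passive node is down before the active node follows.

```powershell
# Hypothetical node names; substitute the real CCR node names from your environment.
$passiveNode = "CCRNODE2"
$activeNode  = "CCRNODE1"

# Shut down the passive node first.
# shutdown.exe switches: /s = shut down, /m = target computer, /t 0 = no delay.
shutdown.exe /s /m "\\$passiveNode" /t 0

# Wait for manual confirmation that the passive node has fully powered off.
Read-Host "Press Enter once $passiveNode has shut down"

# Now shut down the active node.
shutdown.exe /s /m "\\$activeNode" /t 0
```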

Step 5 – Restore the Storage Groups

You are now at the point where you have shut down the CCR environment and are ready to move the CMS so that it runs from the SCR standby cluster. As previously mentioned within this article series, I have not covered the SCR setup process as this is well documented in other places. It is therefore assumed at this point that you already have SCR enabled and operating correctly, and with this in mind I shall dive straight into the processes required to move the CMS across to the standby cluster.
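One way to back up the assumption that SCR is operating correctly is to check the copy status of each storage group against the standby machine. The sketch below uses the storage group and standby cluster names from my example. I am assuming it is run while the SCR source is still reachable, for example before the CMS is stopped in step 3, since the status information may be incomplete once the source is offline.

```powershell
# Assumption: run this while the SCR source is still online (for example, before step 3).
# SCR is the name of the standby cluster in this example.
# Review the copy status and the copy and replay queue lengths in the output.
Get-StorageGroupCopyStatus -Identity "E2K7\First Storage Group" -StandbyMachine SCR
Get-StorageGroupCopyStatus -Identity "E2K7\Second Storage Group" -StandbyMachine SCR
```

A healthy status and empty, or near-empty, queues suggest that the standby copy is up to date with the source.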

The Restore-StorageGroupCopy cmdlet is the one that you need to use to perform the move. This cmdlet is the key component in activating the storage group copy on the SCR standby cluster. The two main parameters to consider are the –StandbyMachine and –Force parameters. The –StandbyMachine parameter names the SCR target server that will host the CMS, whilst the –Force parameter is used in situations where the SCR source is no longer available, such as when the CCR environment has been lost. This could be either a total failure of both nodes of the CCR environment or the loss of the data centre that hosted the CCR environment. In my example scenario the CCR environment is also unavailable, because it has been shut down, so the –Force parameter is required here too. If you do not specify the –Force parameter and the SCR source is not available, you will get an error indicating that the restoration process was unable to verify the mount condition of the source database.

Note:
Data loss is assumed if you use the –Force parameter as you will see from Figure 5.

Since in my example there are two storage groups involved, I need to run the Restore-StorageGroupCopy cmdlet twice. Note that I run this cmdlet on the SCR standby cluster itself. The full cmdlets used are:

Restore-StorageGroupCopy -Identity "E2K7\First Storage Group" -StandbyMachine SCR -Force

Restore-StorageGroupCopy -Identity "E2K7\Second Storage Group" -StandbyMachine SCR -Force

You can see the results of running these cmdlets in Figure 5. As I mentioned earlier, note the data loss warnings relating to the use of the –Force parameter. Since this is a test failover in which the CCR environment was shut down cleanly, we should not experience any data loss in this test.

Figure 5: Restoring the Storage Group Copies

If you have a lot of storage groups to restore you can take advantage of a script called GetScrSources.ps1 that ships with Exchange 2007 Service Pack 1. You will find this script along with all the other scripts in the \Program Files\Microsoft\Exchange Server\Scripts folder. If you just run the script on its own you will get output similar to that shown in Figure 6.

Figure 6: GetScrSources Script Default Output

You can run this script and pipe the results into the Restore-StorageGroupCopy cmdlet, remembering to include the -StandbyMachine and -Force parameters of course. An example that I used in my lab is:

GetScrSources | Restore-StorageGroupCopy -StandbyMachine SCR -Force

Here in Figure 7 are the results of running this command in my lab. As you might expect the results are just the same, so clearly this script is the best option if you do indeed have a lot of storage groups to restore.

Figure 7: Restoring Multiple Storage Groups With GetScrSources

Step 6 – Change DNS

Before you can think about recovering the CMS to the standby cluster there are two important steps to follow. The first of these is to make a change to DNS if you are running your environment on Windows 2008. If this is the case, you must remove the existing DNS record for the CMS. To do this, just use the DNS Management snap-in and delete the DNS A record that corresponds to the CMS, as you can see from Figure 8. Remember, this is only for Windows 2008; you do not need to do this for Windows 2003. If you skip this step when using Windows 2008, you will get errors when trying to recover the CMS that relate to not being able to connect to Internet Information Services (IIS) on the SCR target server.

Figure 8: Deleting the CMS DNS Record

Once you have removed the DNS A record for the CMS, make sure not only that DNS is up-to-date across your domain controllers, but also that you have flushed the DNS cache on your SCR standby cluster via the ipconfig /flushdns command.
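As a quick sanity check, the flush and a follow-up lookup can be run from a command or PowerShell prompt on the SCR standby cluster. This is a minimal sketch that assumes the CMS name E2K7 from my example resolves via the server's default DNS suffix; once the deletion has replicated, the lookup should no longer return the old CMS address.

```powershell
# Clear the local DNS resolver cache on the SCR standby cluster node.
ipconfig /flushdns

# Query DNS for the CMS name to confirm that the old A record is no longer returned.
# Assumption: E2K7 resolves using the server's default DNS suffix.
nslookup E2K7
```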

Summary

Here in part two of this article series we are now at a point where the existing CCR environment has been shut down and the storage group copies activated on the SCR standby cluster. In the next part we’ll conclude the steps required to complete the transition to the SCR standby cluster by recovering the CMS, mounting the databases and testing mailbox access.

About The Author

Neil Hobson
