Using a Replicated Data Store for Maximum Connectivity between Remote Sites (Part 1)


If you would like to be notified when Paul Stansel releases Using a Replicated Data Store for Maximum Connectivity between Remote Sites (Part 2) please sign up to our Real time article update newsletter.


Given the structure of Citrix farms today, it is very common to see larger farms whose servers are spread across multiple sites. Although placing a Citrix server close to its users can make sense for speed, it can cause administrative problems: unreliable connectivity to the data store, and difficulty with authentication and server status updates. If you use a SQL back end for your data store, you can combat some of these problems by using a replicated data store at the remote sites. This two-part article looks at data store replication and how you can implement it to achieve maximum connectivity in your environment.


Introduction


With the advent of consolidated Citrix server farms, some very predictable issues arose. Since Citrix now encourages you to use a single farm whenever possible, distance between Citrix servers in the same farm is a reality many admins face. Servers are often organized into zones based on physical location, with a Zone Data Collector passing traffic between zones and back to the data store server. While this is certainly an acceptable method, it can raise performance concerns in some farm arrangements. If network links between zones are slow, server load figures and server event monitoring can lag well behind real time. One possible solution to that problem is to use replicated data stores. In this two-part article I will focus on replicating data stores within Microsoft SQL Server 2000. Part 1 deals with the basics of replication and defines the terms we’ll use throughout Part 2.


What is Replication?


It may seem obvious, but a data store can only be replicated if it is hosted on a database server. This is not functionality that the local data store option within Citrix can accommodate. Using multiple replicated databases is referred to in SQL terms as a Distributed Data Environment. It simply means that you are bringing the data close to the user. In this case, think of your Citrix zones as the users. There are many different levels of distributed data within SQL. One of the simpler options is Merge Replication. In Merge Replication each site maintains its own data and the sites are periodically merged with each other. The downside is that the sites are entirely independent of each other between merges, so the data at one site can be out of sync with the other sites until the merge process occurs. That is not good for a Citrix environment.
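To make the out-of-sync window concrete, here is a minimal Python sketch of the merge idea. It is only an illustration, not how SQL Server implements Merge Replication: the `MergeSite` class, the `merge` function, and the server names are all hypothetical, and conflict handling is reduced to "the last site processed wins".

```python
class MergeSite:
    """A site that keeps its own copy of the data (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def write(self, key, value):
        # Local writes are visible only at this site until the next merge.
        self.rows[key] = value


def merge(sites):
    """Periodic merge: combine every site's rows and give all sites the result.

    Simplification: if two sites changed the same key, the site that comes
    later in the list wins. Real merge replication has proper conflict rules.
    """
    combined = {}
    for site in sites:
        combined.update(site.rows)
    for site in sites:
        site.rows = dict(combined)


site_a = MergeSite("Site A")
site_b = MergeSite("Site B")

site_a.write("CTX-A01", "load 20")
site_b.write("CTX-B01", "load 75")

# Before the merge, Site A knows nothing about the server at Site B.
stale_before_merge = "CTX-B01" not in site_a.rows

merge([site_a, site_b])

# After the merge, both sites hold the same rows.
in_sync_after_merge = site_a.rows == site_b.rows
```

The gap between the two writes and the call to `merge` is exactly the period in which a data collector could be working from stale information.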


With a Citrix data store, our primary concern is consistency and real-time feedback. This means that we need a replication method that guarantees constant synchronization between the replicated sites. And lucky for us there is one! What Citrix requires is Distributed Transactions. Distributed Transactions relies on the Microsoft Distributed Transaction Coordinator (MS DTC) to guarantee that every site has the exact same data at the exact same time. It does this through a protocol known as two-phase commit, which means that every site has to agree to a change before it is committed anywhere.
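The two-phase commit idea can be sketched in a few lines of Python. This is a simplified model of the protocol, not MS DTC itself: the `Site` class, the `two_phase_commit` function, and the sample server name are hypothetical, and a real coordinator also deals with logging, timeouts, and recovery.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Site:
    """One database site taking part in a transaction (hypothetical model)."""
    name: str
    reachable: bool = True
    data: Dict[str, str] = field(default_factory=dict)
    pending: Optional[Dict[str, str]] = None

    def prepare(self, change: Dict[str, str]) -> bool:
        """Phase 1: stage the change and vote. An unreachable site votes no."""
        if not self.reachable:
            return False
        self.pending = dict(change)
        return True

    def commit(self) -> None:
        """Phase 2, all voted yes: make the staged change permanent."""
        if self.pending is not None:
            self.data.update(self.pending)
        self.pending = None

    def rollback(self) -> None:
        """Phase 2, someone voted no: discard the staged change."""
        self.pending = None


def two_phase_commit(sites: List[Site], change: Dict[str, str]) -> bool:
    """Apply the change at every site, or at none of them."""
    votes = [site.prepare(change) for site in sites]
    if all(votes):
        for site in sites:
            site.commit()
        return True
    for site in sites:
        site.rollback()
    return False


site_a = Site("Site A")
site_b = Site("Site B")

# Both sites agree, so the change lands at both.
committed = two_phase_commit([site_a, site_b], {"CTX-B01": "online"})

# Site B is cut off, so the second change is applied nowhere.
site_b.reachable = False
rejected = two_phase_commit([site_a, site_b], {"CTX-B01": "offline"})
```

Note the trade-off the second call illustrates: consistency is preserved, but a site that cannot be reached blocks the update for everyone, which is part of the added network dependency to weigh before choosing this approach.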


Why Should I Replicate?


There are two main reasons to replicate your data store. The first is the distance involved in some farms. Replicated data stores can improve login times for users in the remote zones, especially users who previously had to cross high-latency WAN links to reach the data store. With a local replicated data store, the zone data collector can rapidly retrieve the information it needs to present a client with the appropriate applications and access. Although the improvement in data collector response times can be minimal, in some cases it is significant.


The second reason to consider implementing a replicated data store is redundancy and disaster recovery. If you maintain a separate site that can be used in the case of a declared emergency, having the replicated database already online and active can be the difference between hours of restoration work and a simple DSN change. Let’s say you have two sites, A and B. Site A is your primary site with 100 Citrix servers, while Site B is a much smaller location with 20 Citrix servers. Site B has been designated as the disaster failover site by your company. A serious crisis occurs at Site A and takes down the SQL box containing your data store. If you’ve planned ahead and have a replicated data store available, repointing whatever Citrix servers are still up at Site A to the Site B SQL server takes minimal effort. This can of course result in decreased performance for Site A users, but it beats no performance at all!
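As a rough illustration of how small that DSN change is, the sketch below rewrites the `SERVER=` entry in the text of an ODBC file DSN. It assumes the data store connection is defined in a file DSN with a `SERVER=` line; the function name `repoint_dsn`, the server names, and the database name are all hypothetical, and the sketch covers only the text edit, not how each Citrix server is told to pick up the new DSN.

```python
def repoint_dsn(dsn_text, new_server):
    """Return dsn_text with its SERVER= entry pointing at new_server.

    Every other line (driver, database name, and so on) is left untouched.
    """
    out = []
    replaced = False
    for line in dsn_text.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key.strip().upper() == "SERVER":
            out.append(f"SERVER={new_server}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        raise ValueError("no SERVER= entry found in DSN text")
    return "\n".join(out) + "\n"


# Hypothetical DSN contents for a Site A server before the failover.
site_a_dsn = (
    "[ODBC]\n"
    "DRIVER=SQL Server\n"
    "SERVER=SITEA-SQL\n"
    "DATABASE=CitrixDS\n"
)

# The same DSN, now pointing at the replicated data store at Site B.
failover_dsn = repoint_dsn(site_a_dsn, "SITEB-SQL")
```

Because the function works on text rather than on a file path, it can be tried against a copy of a DSN before anything on a production server is touched.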


The reality, of course, is that the data store plays a significantly smaller role in a Citrix environment than it once did. It still makes sense to consider replication where constant uptime and availability of applications through Citrix are required. Although you can certainly operate without the data store being available, management of your Citrix environment becomes a logistical nightmare. You also have to weigh the benefits against the costs of distribution. The costs certainly include added complexity in your SQL environment, increased network resource requirements, and the planning and scripting needed for failover. So if you think you are having login issues related to WAN latency, and you have the facilities and hardware available, it makes sense to at least look at replication as a solution. Just keep in mind that if you aren’t comfortable with SQL tools this is NOT something you want to tackle.


How Distributed Transactions Works with a Data Store


Let’s look at an example of Distributed Transactions and how a data store would look. We’ll call ourselves Alpha Company, with the aforementioned Sites A and B. Site A is our regional data center for North America, and Site B is a plant location. The process for creating a replicated data store is the same as replicating any other SQL database. It can be summed up like this:



  1. Create your original database.
  2. Designate a Distributor server.
  3. Configure your Subscriber server.
  4. Designate the database server holding the first copy as the Publisher and publish the database.

Obviously there’s a bit more to it than that or we would all be doing SQL Replication and leaving our poor DBAs out in the cold. Let’s look at some of the roles identified above and talk about what they are in practical terms:


Publisher: This is the server with the original copy of the database. The Publisher is making the database available to other SQL servers in the environment. Those servers may or may not have the ability to change the data in their local copy of the database. In the case of a replicated data store they certainly do.


Subscriber: The SQL server(s) that receive the transactional updates from the Publisher and potentially from other Subscriber servers. Subscriber servers in this case are allowed to change their local copy of the replicated database and in turn replicate those changes back to the Publisher and other Subscribers. We have to allow them to do this because otherwise the Publisher would never have accurate information about any of the Citrix servers using a subscriber database as their data store. That’s not a good way to run!


Distributor: This is the server that handles the actual distribution of the transactions to the various Subscriber servers (SQL boxes). The Citrix documentation has you leave the first database server (the Publisher) defined as the Distributor. Microsoft's replication documentation recommends instead that you make one of the Subscriber servers the Distributor. This takes some of the load off the Publisher and provides more fault tolerance. While this is sound advice if you are distributing to more than one Subscriber, I don’t consider it a necessary change if you are only replicating between two sites. This one really is a personal choice.
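The two Distributor placements described above can be laid out side by side with a small Python sketch. It is purely illustrative: the `assign_roles` function and the server names are hypothetical, and nothing here talks to SQL Server.

```python
def assign_roles(publisher, subscribers, distributor=None):
    """Map each SQL server name to the replication roles it holds.

    If no distributor is named, the Publisher keeps that role as well.
    """
    if distributor is None:
        distributor = publisher
    if distributor != publisher and distributor not in subscribers:
        raise ValueError("distributor must be the publisher or a subscriber")

    roles = {publisher: ["Publisher"]}
    for server in subscribers:
        roles.setdefault(server, []).append("Subscriber")
    roles[distributor].append("Distributor")
    return roles


# Layout 1: the Publisher also acts as the Distributor.
distributor_on_publisher = assign_roles("SITEA-SQL", ["SITEB-SQL"])

# Layout 2: a Subscriber takes over the Distributor role.
distributor_on_subscriber = assign_roles(
    "SITEA-SQL", ["SITEB-SQL"], distributor="SITEB-SQL"
)
```

With only two sites the layouts differ by a single role moving from one box to the other, which is why the choice matters more as the number of Subscribers grows.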


Conclusion


Now that we’ve had a high-level overview of SQL replication, Part 2 of this article will focus on the actual steps to achieve replication. It is important that you have a firm understanding of the pieces involved in replication and the reasons you might want to implement it. Before you dive into the world of replication, make sure it is appropriate for your environment and that you will actually achieve some benefit from it.

