Understanding TMG Web Caching Concepts and Architectures


Web caching speeds up access to frequently requested web pages for internal or external users and reduces network bandwidth usage (internal or Internet). Adding Internet bandwidth can be costly, and some Internet service plans charge for bandwidth on a usage basis, so reducing bandwidth usage can result in real savings to the bottom line. Even if your organization buys bandwidth on an unlimited plan, reducing usage can improve performance for the network’s users, and reducing internal bandwidth usage benefits all users of the LAN.

There are two different types of caching that can benefit your organization, and caching servers can be deployed in groups that are arranged in one of two architectures, depending on your network’s needs. If you expected the two types of caching to be active and passive, note that TMG does not support active caching. Active caching was supported by ISA 2000 but was removed from ISA 2004 and later versions. The two types discussed here are forward and reverse caching.

In the following sections, we will look at the differences between these two types of Web caching, the architectures that are used to deploy multiple caching servers, and the protocols that are used by caching servers to communicate with one another.

Web Caching Types

As stated above, there are two basic types of Web caching:

  • Forward caching
  • Reverse caching

TMG firewalls can perform both of these, but caching is disabled by default when TMG is installed. To enable it, you have to allocate space on a cache drive. Now let’s look at each type of caching a little more closely.

Forward Caching

One way to reduce Internet bandwidth consumption is to store frequently accessed Web objects (HTML pages, images, sound files, etc.) on the local network, where internal users can retrieve them without going out to a server on the Internet every time. This is called forward Web caching. It has the added advantage of making access faster for internal users, because they retrieve the Web objects over a fast LAN connection, which typically transfers data at 1 Gbps or more, instead of over a much slower Internet connection.

Forward caching is supported by all Web caching servers. Forward caching accelerates the response to outbound requests when users on the internal network request a Web object from a server on the Internet. Those objects that are requested frequently are stored on the caching server. This means they can be retrieved via the fast local network connection instead of over a slower Internet connection.

Forward caching takes place when a user on a network that is protected by the TMG firewall makes a request for Web content. The requested content is placed in the Web cache after the first user requests it. Each subsequent user who requests the same content has it delivered from the Web cache on the TMG firewall instead of from the Internet Web server. This reduces the amount of traffic on the Internet connection and reduces overall network costs. In addition, the content is delivered to the user much more quickly from cache than from the actual Web server, which increases user satisfaction and productivity.
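The request flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how TMG implements its cache: the `ForwardCache` class, the `fake_origin` function, and the example URL are all hypothetical, and the sketch ignores expiration, HTTP cache-control headers, and cache size limits.

```python
from typing import Callable, Dict


class ForwardCache:
    """Toy forward cache that stores fetched objects keyed by URL."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        # Serve from the local cache when the object is already stored.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Otherwise go out to the origin server once and keep a copy.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


# Hypothetical stand-in for a Web server on the Internet.
origin_requests = []


def fake_origin(url: str) -> bytes:
    origin_requests.append(url)
    return f"content of {url}".encode()


forward_cache = ForwardCache(fake_origin)
first = forward_cache.get("http://example.com/logo.png")   # goes to the origin
second = forward_cache.get("http://example.com/logo.png")  # served from cache
```

Only the first request reaches the stand-in origin server; the second is answered locally, which is where the bandwidth savings come from.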

The primary “bottom line” benefit of the TMG firewall’s forward caching is cost savings realized by reduced bandwidth usage on the Internet connection.

Reverse Caching

Reverse caching, on the other hand, reduces traffic on the internal network and speeds access for external users when the company hosts its own Web sites on servers on the LAN. In this case, frequently-requested objects on the internal Web servers are cached at the network edge, on a proxy server (TMG), so that the load on the Web servers is reduced.

In generic caching documentation, reverse caches are sometimes referred to as “gateway caches” or “surrogate caches.”

Reverse caching is appropriate when your organization hosts its own internal Web sites that are made available to external Internet or intranet users. The caching server stores those objects that are frequently requested from the internal Web servers and serves them to Internet users. This speeds access for the external users, lightens the load on the internal Web servers, and reduces traffic on the internal network.

Reverse caching takes place when a user on the Internet makes a request for Web content that is located on a Web server published by a TMG firewall Web Publishing Rule. The TMG firewall retrieves the content from the Web server on an internal network or another network that is protected by the firewall and returns that information to the Internet user who requested the content. The TMG firewall caches the content it retrieves from the Web server on the internal network. When subsequent users request the same information, the content is served from the TMG cache instead of being retrieved from the originating Web site.

There are two principal benefits to this reverse caching scenario:

  • Reverse caching reduces bandwidth usage on the internal network.
  • Reverse caching allows Web content to be available to external users even when the Web servers are offline.

How Reverse Caching Reduces Bandwidth Usage

Reverse caching reduces bandwidth usage on the internal network because cached content is served directly from the TMG firewall. Requests answered from the cache generate no traffic on the internal network, so that bandwidth remains available to internal network users. Corporate networks that already have issues with insufficient bandwidth will benefit from this configuration.

How Reverse Caching Increases Availability of Web Content

There is an even more compelling advantage to reverse caching: its ability to make Web site content available when the Web server is offline. This can be part of a high-availability plan for your Web services, and it avoids disrupting external users’ experience of your Web site.

Web servers can go offline for several reasons. For example, the Web server has to be down for a time when routine maintenance is performed or after the server experiences a hardware or software crash. Regardless of the reason, this downtime can create a negative experience for Internet users who try to access content on your site, ranging from a minor inconvenience to a serious problem. The big advantage of the TMG reverse caching feature is that it makes it possible for you to take the Web server offline and still have your Web site content available to Internet users, because the content is served from the TMG cache.
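To make the availability benefit concrete, here is a minimal Python sketch of a reverse cache sitting in front of a published Web server. The `ReverseCache` class, the `OriginOffline` exception, and the `fake_web_server` function are hypothetical names used only for illustration; this is not TMG's implementation. Note that only content that was already cached before the outage can be served while the server is down.

```python
class OriginOffline(Exception):
    """Raised by the stand-in Web server when it is down."""


class ReverseCache:
    """Toy reverse cache in front of a published Web server."""

    def __init__(self, fetch_from_web_server):
        self._fetch = fetch_from_web_server
        self._store = {}

    def get(self, path):
        # Cached content is served without contacting the Web server.
        if path in self._store:
            return self._store[path]
        # Uncached content requires the Web server to be reachable.
        body = self._fetch(path)
        self._store[path] = body
        return body


# Hypothetical published Web server whose availability we can toggle.
server_state = {"online": True}


def fake_web_server(path):
    if not server_state["online"]:
        raise OriginOffline(path)
    return f"<html>{path}</html>"


reverse_cache = ReverseCache(fake_web_server)
while_online = reverse_cache.get("/index.html")   # fetched and cached

server_state["online"] = False                     # e.g. routine maintenance
while_offline = reverse_cache.get("/index.html")  # still served from cache

try:
    reverse_cache.get("/never-requested.html")
    uncached_available = True
except OriginOffline:
    uncached_available = False
```

The page requested before the outage stays available, while a page that was never requested cannot be served until the Web server returns.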

Web Caching Architectures

Multiple Web-caching servers can be used together to provide for more efficient caching. There are two basic caching architectures that use multiple caching servers working together:

  • Distributed Caching
  • Hierarchical Caching

TMG supports both. As the name implies, distributed caching distributes, or spreads, the cached Web objects across two or more TMG caching servers. These servers are all on the same level on the network. The figure below illustrates how distributed caching works.

Figure 1

Hierarchical caching works a little differently. In this configuration, caching servers are placed at different levels on the network. Upstream caching servers communicate with downstream proxies. For example, a caching server might be placed at each branch office. These servers would then communicate with the caching array at the main office. Requests would then be serviced first from the local cache, then from a centralized cache, before going out to the Internet server for the request.
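The lookup order in a hierarchy (local cache, then upstream cache, then the Internet) can be sketched as a simple chain. The `CacheLevel` class and the server names below are assumptions made for this example and do not correspond to TMG objects or settings.

```python
class CacheLevel:
    """Toy cache that falls back to an upstream (parent) cache on a miss."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.store = {}

    def get(self, url, fetch_from_internet):
        """Return (body, source), where source names who supplied the object."""
        if url in self.store:
            return self.store[url], self.name
        if self.parent is not None:
            # Ask the upstream cache before going to the Internet.
            body, source = self.parent.get(url, fetch_from_internet)
        else:
            # Top of the hierarchy: last resort is the Internet server.
            body, source = fetch_from_internet(url), "internet"
        self.store[url] = body  # keep a local copy for next time
        return body, source


def fake_internet(url):
    return f"content of {url}"


main_office = CacheLevel("main-office")
branch_a = CacheLevel("branch-a", parent=main_office)
branch_b = CacheLevel("branch-b", parent=main_office)

url = "http://example.com/report.pdf"
_, source_1 = branch_a.get(url, fake_internet)  # nobody has it yet
_, source_2 = branch_b.get(url, fake_internet)  # main office now has a copy
_, source_3 = branch_a.get(url, fake_internet)  # branch A has its own copy
```

In this sketch the object ends up stored at every level it passes through, which illustrates the trade-off: fewer requests cross the WAN link, but the same object occupies disk space on several servers.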

A hierarchical caching scenario is illustrated in the figure below. Note that hierarchical caching is more efficient in terms of bandwidth usage, but distributed caching is more efficient in terms of disk space usage.

Figure 2

Finally, you can combine the two methods to create a hybrid caching architecture. This combination gives you the “best of both worlds,” improving performance and efficiency. A hybrid caching architecture is shown in the figure below.

Figure 3

Web Caching Protocols

When multiple Web caching servers work together, they need to have a way to communicate with each other, so that if the Web object that is requested by the client isn’t found in a server’s cache, it can query other caching servers before it engages in the “last resort” of going out and retrieving the document from the Internet.

There are a number of different protocols that can be used for communications between Web caching servers. The most popular of these are the following:

  • Cache Array Routing Protocol (CARP), a hash-based protocol that allows multiple caching proxies to be grouped in an array as a single logical cache. This uses a hash function to ascertain to which cache a request should be sent. The hash function can also be used by the Web Proxy client in order to determine where the content is located in a distributed cache.
  • Internet Cache Protocol (ICP), a message-based protocol defined in RFC 2186 that is based on UDP/IP and that was originally used for hierarchical caching by the Harvest project, from which the Squid open-source caching software was derived.
  • HyperText Caching Protocol (HTCP), which permits full request and response headers to be used in cache management.
  • Cache digests, a hash-based protocol that is implemented in Squid, which uses an array of bits called a Bloom filter to code a summary of documents that are stored by each proxy.
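To illustrate the Bloom filter idea behind cache digests, the following Python sketch builds a small bit array that summarizes which URLs a proxy holds. The filter size, the number of hash functions, and the use of SHA-256 are assumptions chosen for the example; they are not Squid's actual digest format.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a compact, lossy summary of a set of keys."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive several bit positions per key by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely not cached"; True means "probably cached".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


digest_filter = BloomFilter()
digest_filter.add("http://example.com/index.html")

cached_lookup = digest_filter.might_contain("http://example.com/index.html")
empty_lookup = BloomFilter().might_contain("http://example.com/index.html")
```

A peer that receives such a summary can rule out proxies that definitely do not hold an object without sending them a query. A positive answer can occasionally be a false positive, so a peer may still find that the object is not actually there.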

Note that Web Cache Coordination Protocol (WCCP) is not really used for communication between the caches. Rather, it’s a router-based protocol that moves the distribution of requests off the caches and onto the router: the caches belong to service groups, and the router calculates the hash functions.

ICP has been around the longest, and the other protocols were developed to improve on ICP in some way. TMG uses CARP for communications between Web caching servers. CARP provides a number of benefits, including elimination of the query messaging between proxy servers that can congest the network. It’s also more scalable, and more tolerant of changes to the servers in an array (addition or removal of servers). For more information on how CARP works, check out this link.
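The hash-based routing idea can be sketched in Python as follows. This is a simplified illustration of the general technique (each URL goes to the array member with the highest combined hash score), not the real CARP algorithm: the actual protocol specifies its own hash and combination functions, which are not reproduced here. The member names and the use of SHA-256 are assumptions.

```python
import hashlib


def score(member: str, url: str) -> int:
    """Combined hash score for one (member, URL) pair."""
    digest = hashlib.sha256(f"{member}|{url}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_member(members, url):
    """Route the URL to the array member with the highest score."""
    return max(members, key=lambda m: score(m, url))


# Hypothetical three-server array and a handful of URLs.
members = ["tmg1", "tmg2", "tmg3"]
urls = [f"http://example.com/page{i}" for i in range(20)]

before = {u: pick_member(members, u) for u in urls}

# Remove one server from the array and route the same URLs again.
after = {u: pick_member(["tmg1", "tmg2"], u) for u in urls}

# Only URLs that were owned by the removed server change owner.
moved = [u for u in urls if before[u] != after[u]]
```

Because every server (or client) can compute the same scores independently, no query messages are needed to find out which member holds an object. When a member is removed, only the URLs that member owned are reassigned, which is why this style of routing tolerates changes to the array well.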


In this article, we introduced the concept of web caching and how web caching can reduce bandwidth usage and improve web performance. We saw that forward caching is primarily about reducing Internet bandwidth use while reverse caching is about reducing the load on published web servers. There are two main web caching architectures – distributed and hierarchical – and we discussed how they are used to improve performance and availability. Finally, we reviewed the popular web caching protocols and how they work.
