What’s the Buzz on Web Caching? (Part 1)

If you would like to be notified when Deb Shinder releases the next part of this article series please sign up to the ISAserver.org Real time article update newsletter.

Introduction

The TMG firewall is a lot of things to a lot of people. For some people it’s an edge firewall. For others, it’s a back-end firewall. For some, it’s just a web proxy server. And for a lot of people, it’s a remote access client VPN server and site-to-site VPN gateway. That’s why they call it the Swiss Army knife of firewalls: if you dig down deeply enough, you’ll find that the TMG firewall can do almost anything you want it to.

It’s a real shame that Microsoft decided to stop development on the TMG firewall, because according to a large number of people I’ve talked to, the ISA and subsequent TMG firewalls were some of the best software products the company had ever produced. I suppose this goes to show that while change is inevitable, not all change is good – at least not for all people.

But enough of the continuing eulogy for the TMG firewall. Since you’re reading this, you probably have a TMG firewall running in your environment already, or you’re thinking about introducing a new TMG firewall onto your network in spite of the impending end-of-life. Actually, much as it pains me to do so, if you’re a member of an enterprise IT shop I’d recommend that you look elsewhere for a new firewall. That’s because the TMG firewall will only be supported until 2020 (which is not as far away as it might seem), and there are some key features that will no longer be supported in the near future. You’ll get security updates, but anything that isn’t critical isn’t going to be fixed.

If you’re running a small business, or you’re an individual who likes to play with software, has heard about the TMG firewall and would like to give it a try while you still can, then by all means go ahead and give the TMG firewall a whirl. It’s really fun software to work with, and compared to just about any other firewall out there, it’s very easy to configure. And the really good news is that the TMG firewall rings true to the ethos of the old Microsoft, in that you can configure everything you want to configure in the beautiful and robust graphical user interface. No PowerShell skills are required (although you can use VBScript and PowerShell to administer TMG if you so desire).

Web caching

Something that small businesses and individuals with complex home networks might want to use TMG for is web caching. Web caching is one of the capabilities of the TMG firewall that was very popular back in “the day.” What web caching does is enable the TMG firewall to “remember” web pages and other web content that was requested by users. For example, suppose Joe goes to www.isaserver.org and receives the content of the web page in his browser. The content that Joe sees in his browser is stored on the TMG firewall in its web cache. Sitting in the web cache on the TMG firewall doesn’t do anyone much good at this point, though. That goodness comes later.

About an hour later, Jane decides that she wants to see what’s new at www.isaserver.org and visits that web site. What happens when Jane does this? First, the request is sent from Jane’s browser to the TMG firewall. The TMG firewall then does a quick check of the www.isaserver.org web site to see if there have been any changes to the page since the firewall cached it. If the page hasn’t changed, it is retrieved from the TMG firewall’s web cache and sent back to Jane. This is a lot faster than going back to the www.isaserver.org web site to get the page again, for two reasons: the check for changes is much quicker than retrieving the entire web page, and reading the page content from the TMG firewall’s disk drive is faster than pulling it from the actual web site over the WAN link.
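The Joe and Jane scenario can be sketched in a few lines of Python. This is only an illustration of the check-then-serve idea described above, not how the TMG firewall is actually implemented. The Origin and ForwardCache classes, and the version counter that stands in for a “has the page changed?” check, are all made up for the example. Real proxies typically also apply freshness lifetimes so they don’t have to ask the origin every time.

```python
from dataclasses import dataclass


@dataclass
class Page:
    body: str
    version: int  # stands in for a Last-Modified date or ETag


class Origin:
    """Hypothetical stand-in for the remote web site (no real network I/O)."""

    def __init__(self):
        self.pages = {}
        self.checks = 0        # cheap "has it changed?" queries
        self.full_fetches = 0  # expensive full-page downloads

    def publish(self, url, body):
        # Publishing new content bumps the version number.
        old = self.pages.get(url)
        self.pages[url] = Page(body, old.version + 1 if old else 1)

    def current_version(self, url):
        self.checks += 1
        return self.pages[url].version

    def fetch(self, url):
        self.full_fetches += 1
        page = self.pages[url]
        return Page(page.body, page.version)


class ForwardCache:
    """Hypothetical caching proxy that sits between the users and the origin."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, url):
        cached = self.store.get(url)
        # Serve the cached copy only if the origin reports the same version.
        if cached is not None and self.origin.current_version(url) == cached.version:
            return cached.body, "HIT"
        # Nothing cached, or the page changed: download it and remember it.
        page = self.origin.fetch(url)
        self.store[url] = page
        return page.body, "MISS"


origin = Origin()
origin.publish("www.isaserver.org", "front page, first edition")
cache = ForwardCache(origin)

print(cache.get("www.isaserver.org"))  # Joe's request: nothing cached yet
print(cache.get("www.isaserver.org"))  # Jane's request: served from the cache
origin.publish("www.isaserver.org", "front page, second edition")
print(cache.get("www.isaserver.org"))  # page changed, so it is fetched again
```

Joe’s request is a miss, Jane’s is a hit that costs only the quick version check, and the third request is a miss again because the page changed at the origin.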

When caching was king

Web caching was very important to enterprise IT shops in the 1990s and the first decade of the 21st century because Internet connections were slow. If you were paying for that connection by the megabyte or by the minute, caching became even more attractive. Today, however, it’s not unusual to see residential Internet connectivity in the 50-100 Mbps range, and enterprise Internet connectivity moving into the 1 Gbps and above range. With that kind of speed available, web page loading time is more related to the capabilities of the server that is actually hosting the page and routing/performance issues between the requestor and the destination than to the Internet connection itself. These days, bandwidth issues are rarely the cause of miserable web experiences.

Individuals and companies in the past had to deal with slow modem connections, and the definition of a “high speed” broadband connection was ISDN (128 Kbps), DSL (1.5 Mbps) or the at-that-time Holy Grail of a T1 line (1.544 Mbps). With Internet pipes being so narrow, it was a real challenge to get a robust web experience. Those pipes regularly got clogged up with web requests, and even more so when streaming media and “bandwidth hogging” applications like Skype came along (bandwidth hogging is in quotes since Skype wouldn’t be considered a bandwidth hogging application in the modern age).

This is why web caching was so popular in the early days of the Internet. If you had enough people in your organization sitting behind the same set of caching web proxy servers, there was a good chance that someone was going to request the same web content that someone else had requested earlier. When that happened, the request didn’t use up your valuable Internet bandwidth, and that saved bandwidth could be used to access new content that could be subsequently cached, and thus enable the company to save even more bandwidth.
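To put rough numbers on that saving, here is a back-of-the-envelope calculation in Python. The function name and the sample figures (10,000 requests, 3,000 cache hits, 50,000 bytes per object) are invented for illustration. The model also assumes every object is the same size and ignores the small cost of checking whether a cached page has changed.

```python
def bandwidth_report(total_requests, cache_hits, avg_object_bytes):
    """Estimate Internet-link traffic with and without a web cache.

    Simplifying assumptions: every object has the same size, and a cache
    hit costs no Internet bandwidth at all.
    """
    if not 0 <= cache_hits <= total_requests:
        raise ValueError("cache_hits must be between 0 and total_requests")
    misses = total_requests - cache_hits
    without_cache = total_requests * avg_object_bytes
    with_cache = misses * avg_object_bytes
    return {
        "hit_ratio": cache_hits / total_requests if total_requests else 0.0,
        "bytes_without_cache": without_cache,
        "bytes_with_cache": with_cache,
        "bytes_saved": without_cache - with_cache,
    }


report = bandwidth_report(total_requests=10_000, cache_hits=3_000,
                          avg_object_bytes=50_000)
print(report)
```

With these made-up figures, the 3,000 requests answered from the cache keep 150,000,000 bytes off the Internet link.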

When used correctly, web caching leads to a virtuous circle of increased web usage and reduced bandwidth consumption. Now that 4G data plans with low monthly data caps are becoming popular and some regular ISPs are trying to move to capped data plans as well, caching could become more important again.

Evolution of caching

The type of caching we’ve been talking about so far is referred to as “forward caching”. The reason it’s called forward caching is that the request is moving “forward” from a vantage point of the client located on the corporate network making an outbound request for web content on the Internet through the TMG firewall. There is another type of caching referred to as “reverse caching” and we’ll talk about that later. Suffice it to say that the value propositions are a bit different between forward and reverse caching.

Web caching doesn’t seem to get the attention it used to, at least at an enterprise level, most likely because of the advancements in Internet bandwidth technologies. However, that doesn’t mean that caching isn’t still a valid topic of discussion and consideration, because it is. It’s just that the nature and location of the cache has changed. Now, instead of putting the web cache on-premises, the cache is located at strategic Internet locations. This type of caching is called “pre-caching”.

Pre-caching: the Next Big Thing

When you “pre-cache” something, you actually load a web cache with the information you want it to have in advance. In contrast to the situation with Joe and Jane, where Jane only benefitted from the speed enhancements due to cached content after Joe first accessed the content, with pre-caching someone has put the information in the cache before anyone has made a request for it. This used to be done in enterprise networks to improve the performance for access to important line of business information and also to make that information available in the event that the Internet connection went down.
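The difference between ordinary caching and pre-caching is easy to see in code. The following Python sketch is purely illustrative. The PreloadedCache and FakeInternetLink classes and the lob.example.com URL are hypothetical, and nothing here reflects how any particular product loads its cache.

```python
class FakeInternetLink:
    """Hypothetical Internet connection that can be switched off."""

    def __init__(self):
        self.up = True

    def fetch(self, url):
        if not self.up:
            raise ConnectionError("Internet connection is down")
        return f"content of {url}"


class PreloadedCache:
    """Hypothetical cache that can be filled before anyone asks for content."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable(url) -> content
        self.store = {}

    def preload(self, urls):
        # Pre-caching: fetch the content before any user requests it.
        for url in urls:
            self.store[url] = self.fetch(url)

    def get(self, url):
        if url in self.store:
            return self.store[url], "HIT"
        # Ordinary caching: fetch on first request, then remember it.
        content = self.fetch(url)
        self.store[url] = content
        return content, "MISS"


link = FakeInternetLink()
cache = PreloadedCache(link.fetch)
cache.preload(["lob.example.com/price-list"])

link.up = False  # simulate the Internet connection going down
print(cache.get("lob.example.com/price-list"))  # still served, as a HIT
```

Because the price list was loaded in advance, the very first user request is already a hit, and it is still answered after the link goes down. A request for anything that wasn’t pre-loaded would fail with a ConnectionError at that point.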

These days, pre-caching is often used to improve performance for bandwidth-intensive applications such as streaming video. Akamai is currently the big dog in the pre-caching game. Companies such as Netflix and Microsoft will put large video files into an Akamai web cache. When someone makes a request for this content, the information isn’t returned from the Netflix or Microsoft web site. Instead, the information is returned from the Akamai web cache.

What makes this better than having Netflix or Microsoft host this content in their own datacenter? The key here is that Akamai runs a Content Delivery Network (CDN) where it has web caching servers located in ISP datacenters all over the world. When someone in Texas requests web content from Netflix, that request doesn’t have to cross the thousands of miles to the Netflix datacenter. More likely, that request goes to a web caching server located very close to the person in Texas. This significantly increases performance and reduces latency.
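As a toy illustration of the “closest cache wins” idea, the Python sketch below picks the nearest of three made-up edge locations by great-circle distance. The edge server list and the function names are assumptions for the example. Real CDNs generally steer requests using DNS, routing and measured network conditions rather than raw geography, so treat this as a simplification.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

# Hypothetical edge cache locations as (latitude, longitude) in degrees.
EDGE_SERVERS = {
    "dallas": (32.78, -96.80),
    "seattle": (47.61, -122.33),
    "new-york": (40.71, -74.01),
}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_edge(client, edges=EDGE_SERVERS):
    """Return the name of the edge server geographically closest to the client."""
    return min(edges, key=lambda name: distance_km(client, edges[name]))


austin = (30.27, -97.74)  # a client in Texas
print(nearest_edge(austin))
```

For the client in Austin, the sketch selects the Dallas edge server, which is a few hundred kilometers away instead of halfway across the continent.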

CDNs are a hot topic now, as more and more players in the industry are getting into the “content as a service” game. Microsoft’s Azure has a CDN service that is fast approaching the quality and distribution of the Akamai service. Amazon Web Services is also getting into this business. Google Global Cache (GGC) is the basis of its content delivery platform. Even Apple, not known for its enterprise technical prowess or ability to deliver cloud quality services, is trying to get into the pre-caching CDN game. It’ll be interesting to see how this all plays out, and you can be sure there are going to be some big winners and losers in the game.

Summary

In this article, we took a brief look at what web caching is and the history of web caching. While web caching probably isn’t going to save the online world like some might argue it did back in the golden days of the Internet, it still has a place and you can still benefit from what web caching has to offer. In subsequent articles, we’ll take a closer look at the TMG firewall’s web caching abilities and see what you can do with them to make your users happier with their web experience. See you then! –Deb.
