Measuring System Performance on a Forefront Threat Management Gateway (TMG) 2010 Firewall (Part 2)

If you would like to be notified of when Richard Hicks releases the next part in this article series please sign up to our ISAserver.org Real Time Article newsletter.

If you would like to read the first part in this article series please go to Measuring System Performance on a Forefront Threat Management Gateway (TMG) 2010 Firewall (Part 1).

Introduction

Many administrators tremble with fear when it comes to troubleshooting performance issues on the Forefront TMG firewall. As I outlined in part one of this series, a methodical and systematic approach to data gathering and analysis makes this process much less daunting. By assessing the utilization of the four main computing subsystems – CPU, memory, network, and disk – the TMG administrator can determine whether the TMG firewall is showing signs of bottlenecks or resource constraints. Frequently, a poorly performing TMG firewall can be remedied simply by adding capacity or reducing demand. However, there are times when all of the basic performance indicators appear to be within acceptable parameters, yet TMG is still running slowly. In part two of this series I’ll demonstrate how to collect and assess performance data from the Forefront TMG firewall itself.

TMG Specific Data

If TMG performance is still substandard after you’ve determined there are no bottlenecks or resource constraints in the CPU, memory, disk, or networking subsystems, you will need to dig deeper into the TMG firewall using the Windows Performance Monitor (perfmon). Here we’ll look closely at several counters for the Forefront TMG Firewall Packet Engine, Forefront TMG Firewall Service, and Forefront TMG Web Proxy objects.

Forefront TMG Firewall Packet Engine Object

Begin by opening the Windows Performance Monitor: click the Start button, then click Run, enter perfmon.exe, and click OK. After the Performance Monitor appears, click the Performance Monitor node in the navigation tree on the left. Delete the default % Processor Time counter, then right-click anywhere in the Performance Monitor window and select Add Counters from the context menu, or simply click the green plus sign at the top of the window. Scroll down and expand the Forefront TMG Firewall Packet Engine object, then select the Backlogged Packets counter and click Add.


Figure 1

There is no specific threshold for this counter, but lower is definitely better. Backlogged packets can also have an impact on the Dropped Packets/sec counter. If Backlogged Packets and Dropped Packets/sec rise together, or if a rise in Backlogged Packets immediately precedes a rise in dropped packets, this is a good indication that the TMG firewall may not have enough capacity to handle the current volume of traffic. If this consistently occurs after you observe the Active Connections counter plateau, it may also indicate a bottleneck or capacity constraint with one of TMG’s dependent systems, such as DNS or Active Directory. In addition, if the Dropped Packets/sec counter increases without a corresponding rise in backlogged packets, this might indicate that the TMG firewall is processing a lot of malicious traffic or is under attack.
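
The reasoning above can be sketched as a small Python function that works on samples you have already exported from Performance Monitor. This is only an illustration of the heuristic, not a diagnostic tool: the function names, the half-window trend calculation, and the min_rise noise floor are all assumptions made for this example, and none of them is a documented TMG threshold.

```python
from statistics import mean


def trend(samples):
    """Difference between the mean of the second half and the first half of a window."""
    half = len(samples) // 2
    return mean(samples[half:]) - mean(samples[:half])


def interpret(backlogged, dropped, min_rise=1.0):
    """Classify paired Backlogged Packets and Dropped Packets/sec samples.

    min_rise is a hypothetical noise floor chosen for illustration only.
    """
    if len(backlogged) != len(dropped) or len(backlogged) < 2:
        raise ValueError("need two equally sized windows of at least two samples")
    backlog_rising = trend(backlogged) > min_rise
    drops_rising = trend(dropped) > min_rise
    if backlog_rising and drops_rising:
        # Both counters climb together: the firewall may be short on capacity.
        return "possible capacity shortfall"
    if drops_rising:
        # Drops climb while the backlog stays flat: traffic may be malicious.
        return "possible malicious traffic or attack"
    return "no clear signal"


if __name__ == "__main__":
    print(interpret([0, 1, 2, 10, 20, 30], [0, 0, 1, 5, 9, 12]))
```

Treat the result as a prompt for further investigation. A real assessment should also take the Active Connections counter and the health of DNS and Active Directory into account, as described above.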

Forefront TMG Firewall Service Object

The most important counter to observe under the Forefront TMG Firewall Service object is the Pending DNS Resolutions counter. The TMG firewall relies heavily on DNS for name resolution and for authentication. It is vital that name resolution perform quickly and efficiently, especially for TMG firewalls that are joined to a domain. This counter should be as low as possible, and ideally zero. When sustained values for this counter are above zero, the name resolution infrastructure should be investigated closely. In addition, ensure that the Available Worker Threads counter does not remain at or near zero for sustained periods. A shortage of worker threads is an indication that the TMG firewall does not have enough capacity to handle the current workload.
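
One way to make "sustained" concrete is to look for the longest consecutive run of samples that violate the guidance above. The Python sketch below assumes you already have lists of sampled values for the two counters. The helper names, the sustained run length of six samples, and the thread_floor value standing in for "near zero" are hypothetical choices for this example, not values published for TMG.

```python
def longest_run(samples, predicate):
    """Length of the longest consecutive run of samples that satisfy predicate."""
    best = current = 0
    for value in samples:
        current = current + 1 if predicate(value) else 0
        best = max(best, current)
    return best


def check_firewall_service(pending_dns, worker_threads, sustained=6, thread_floor=2):
    """Return findings for the two Firewall Service counters.

    sustained and thread_floor are illustrative assumptions; tune them to
    your own sample interval and workload.
    """
    findings = []
    if longest_run(pending_dns, lambda v: v > 0) >= sustained:
        findings.append("investigate name resolution")
    if longest_run(worker_threads, lambda v: v <= thread_floor) >= sustained:
        findings.append("worker thread shortage")
    return findings


if __name__ == "__main__":
    dns = [0, 0, 3, 4, 2, 5, 1, 2, 0]
    threads = [50] * 9
    print(check_firewall_service(dns, threads))
```

Counting consecutive samples rather than averaging keeps a single brief spike from being reported as a sustained condition.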

Forefront TMG Web Proxy Object

When your TMG firewall is serving as a web proxy server, pay close attention to the Memory Pool for HTTP Requests (%) and the Memory Pool for SSL Requests (%) counters. There are no specific thresholds for these counters, but lower utilization (that is, a higher percentage available) is better. If either of these counters shows sustained values of less than 30%, this is an indication that the TMG firewall does not have enough capacity to handle the current workload. In addition, Total Pending Connects should be as low as possible, and ideally zero. High values here may likewise indicate insufficient capacity.

Additional Objects

There are numerous TMG-related objects and counters from which to collect data, but I want to call attention to a few of the more common ones here. Observe the Process(wspsrv)\% Processor Time counter for signs of high utilization. Typically, when a TMG firewall is consuming most or all of the CPU, the TMG firewall service (wspsrv.exe) is the culprit. However, it is often necessary to generate a memory dump of this process and analyze the data using a kernel debugger to gather the additional detail needed to find the root cause. I have had some success using the Process Explorer tool from Sysinternals to peer into the process threads for clues as to what might be causing the wspsrv.exe process to consume excessive amounts of processor time. More often than not the culprit is a third-party integration component such as a content filtering or virus scanning plug-in, so be on the alert when troubleshooting systems with these installed.

Often TMG firewall administrators will notice that the SQL Server instance used for firewall and web proxy logging is consuming what appears to be an excessive amount of memory. Typically this isn’t anything to worry about, as high memory consumption by SQL is by design. You can measure SQL memory consumption by observing the value of Process(sqlservr#1)\Working Set - Private. If you are concerned that SQL memory consumption is impacting system performance, you can make changes that will limit the amount of memory that SQL can utilize. However, I would encourage you to monitor this counter over an extended period of time before making these changes, as SQL will periodically release memory. For information on how to limit SQL memory consumption, click here.

Performance Analysis of Logs (PAL)

The process of collecting and evaluating Forefront TMG performance data is made substantially less difficult with the Performance Analysis of Logs (PAL) utility. This is an open source tool that automates the collection, processing, and evaluation of numerous performance monitor counters. The tool generates a graphical report that contains detailed analysis for many TMG specific performance monitor objects and counters. PAL can be downloaded here, and information on how to use PAL to collect and assess performance data can be found here and here.

Performance Baseline

Having a recorded performance baseline is vital when it comes to troubleshooting performance issues. Often it is extremely difficult, if not impossible, to quantify what slow system performance actually is. However, if you have collected baseline performance data when your TMG firewall is performing normally, you’ll have a valuable point of reference against which to compare your current performance data. This can save a tremendous amount of time and agony when it comes to assessing and ultimately resolving performance issues with the TMG firewall. Get in the habit of periodically taking performance snapshots using the Windows Performance Monitor and saving this data for later reference. If performance issues arise, you can compare your current information with the saved historical data to see which values are out of line with previously recorded information.
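
As a rough illustration of that comparison, the Python sketch below takes two sets of samples keyed by counter name, one saved as a baseline and one collected during the problem period, and reports the counters whose average has moved by more than a chosen fraction. How the samples get out of Performance Monitor is left out. The function name, the dictionary layout, and the 25% tolerance are assumptions made for this example only.

```python
from statistics import mean


def compare_to_baseline(baseline, current, tolerance=0.25):
    """Return {counter: (baseline_mean, current_mean)} for counters out of line.

    baseline and current map counter names to lists of sampled values.
    tolerance is a hypothetical fractional deviation, not a recommended value.
    """
    out_of_line = {}
    for counter, base_samples in baseline.items():
        now_samples = current.get(counter)
        if not base_samples or not now_samples:
            continue  # nothing to compare for this counter
        base = mean(base_samples)
        now = mean(now_samples)
        if base == 0:
            # Any movement away from a zero baseline is worth a look.
            deviates = now != 0
        else:
            deviates = abs(now - base) / abs(base) > tolerance
        if deviates:
            out_of_line[counter] = (base, now)
    return out_of_line


if __name__ == "__main__":
    saved = {"Pending DNS Resolutions": [0, 0, 0], "Backlogged Packets": [2, 4, 6]}
    today = {"Pending DNS Resolutions": [0, 0, 0], "Backlogged Packets": [10, 12, 14]}
    for name, (was, now) in compare_to_baseline(saved, today).items():
        print(f"{name}: baseline {was}, current {now}")
```

Comparing means is the simplest possible choice. For counters with bursty behavior, comparing peaks or percentiles against the baseline may be more informative.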

Summary

The Forefront TMG firewall is highly instrumented, and there are myriad objects and counters that can be used to troubleshoot Forefront TMG firewall performance issues. I’ve only scratched the surface here, showing you a few of the more common objects and counters I use to troubleshoot performance issues in the field. Although the Windows Performance Monitor can provide a wealth of detailed information about the performance and utilization of TMG and the underlying subsystems, the administrator will often have to dig deeper using more advanced tools such as the various Sysinternals tools (Process Explorer, Process Monitor, and TCPView are commonly used), XPerf, NETSH tracing, and in some instances even the Windows kernel debugging tools. Although the tools you use to collect data may vary, if you follow the procedures I’ve outlined in this series you’ll find that the process will be much easier. With regard to performance troubleshooting on the TMG firewall, nothing will be more valuable to you than experience. Be sure to use these tools and put these procedures into practice before you have a performance issue so that you’ll be comfortable with the tools and the process.

