An adage that many of us live by is, “If it ain’t broke, don’t fix it.” Unfortunately, this often leads to another life principle: “Don’t buy a tool unless you need it to fix something that’s broke.” Monitoring systems, networks, and databases is something like that. Most of us working in IT will agree that monitoring infrastructure is a good idea. It’s surprising, however, how many businesses, and not just small ones, aren’t willing to spend the money they should on a comprehensive and intelligent monitoring solution that provides a second pair of eyes for watching over their infrastructure.
So if you’re in charge of IT for one of those companies that count every penny and won’t buy a tool unless the engine is already broken, it’s time to get back to basics and understand why monitoring is important for organizations of all sizes. To help us understand this, I talked a while back with Ben Day, a senior system engineer working in the technical support group of Paessler AG, a company based in Germany that is a leading worldwide provider of network monitoring software. And while you’re at it, you might want to check out this article by TechGenix author Richard Hicks, in which he reviews Paessler PRTG Network Monitor, a comprehensive network monitoring solution with powerful capabilities. Let’s listen now to Ben as he explains the importance of monitoring your organization’s IT infrastructure.
What is infrastructure monitoring?
Today’s infrastructure, whether it’s at large enterprises or small businesses, is becoming increasingly complex. This complexity is driven by virtualization, cloud, and mobility, with increased machine-to-machine traffic right around the corner in many cases. The challenges of modern infrastructure are compounded for IT departments by the changing nature of business — companies are more global than ever, more distributed than ever, and have applications and processes that need to be available 24/7, with downtime often meaning lost productivity or revenue.
These technical and business challenges combine to create a perfect storm for IT departments. Factor in ongoing issues including the rise of shadow IT and the ever-increasing pace of technology acquisition coming from marketing departments, and your average systems administrator or IT manager is truly behind the 8-ball. The support team needs support, and for IT, that means visibility into what is happening in its infrastructure and a rules-based approach to managing and monitoring it. Deep visibility into network traffic, carefully set alerts, and dashboards and maps that can be viewed anywhere all give IT an extra hand in maintaining uptime and, more importantly, keep an eye on the infrastructure when IT can’t.
Monitoring is best viewed as preventative medicine. Tracking and benchmarking uptime, bandwidth, CPU usage, capacity, and other metrics over a long period of time provides a baseline from which alerts can be set to inform IT when systems are going awry. Outages and crashes are ultimately very costly, but when IT can identify them before they happen, major problems become minor issues, easy to resolve.
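To make the baseline idea concrete, here is a minimal Python sketch. It assumes the baseline is summarized as a mean and standard deviation, and that a reading counts as "going awry" when it is more than three standard deviations from the mean. Both choices, along with the function names and the sample CPU figures, are illustrative assumptions and not a description of how any particular monitoring product works.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarize historical samples as (mean, standard deviation)."""
    return mean(samples), stdev(samples)

def is_anomalous(value, baseline, tolerance=3.0):
    """True when value is more than `tolerance` standard deviations from the mean."""
    avg, spread = baseline
    return abs(value - avg) > tolerance * spread

# Hypothetical CPU usage percentages collected over a quiet period.
cpu_history = [41, 39, 44, 40, 42, 38, 43, 41]
baseline = build_baseline(cpu_history)

print(is_anomalous(42, baseline))  # False: within the normal range
print(is_anomalous(95, baseline))  # True: far outside the baseline
```

In practice the history would span weeks or months, and the tolerance would be tuned so that alerts stay meaningful without becoming noisy.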
Getting to specifics, a vendor-neutral monitoring tool used appropriately can help prevent a number of common IT headaches, problems that are easy to head off but taxing to fix. Here is a sampling of problems that IT can prevent, saving time and protecting productivity, with the help of a second set of eyes.
Auto-reboot failing Windows servers
When a Windows service or server fails, most systems administrators receive a text or email notification. The most common way to fix this problem is to send support to manually reboot the entire server. When it’s a service like Exchange, which Outlook users depend on for email, and it happens in the middle of the work day, this is a significant problem.
Through monitoring, a proactive administrator can reboot servers automatically by creating a simple script that executes the reboot once the server or service has been down for a predefined amount of time. Rather than wait for an alert or a ticket, the monitoring tool acts on its own to reboot a failed or hung server, and administrators can write additional scripts to program specific restart options.
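The "down for a predefined amount of time" rule can be sketched in a few lines of Python. This is only an illustration of the decision logic: the `RestartWatchdog` class, its parameters, and the five-minute grace period are hypothetical, and the actual health check and restart command are passed in as callables, so nothing here is tied to a specific monitoring tool.

```python
import time

class RestartWatchdog:
    """Triggers a restart action once a check has failed for `grace_seconds`."""

    def __init__(self, is_up, restart, grace_seconds=300, clock=time.monotonic):
        self.is_up = is_up              # callable: True when the service responds
        self.restart = restart          # callable: performs the restart (hypothetical)
        self.grace_seconds = grace_seconds
        self.clock = clock              # injectable so the logic can be tested
        self.down_since = None          # when the current outage was first seen

    def poll(self):
        """Run one check; return True if a restart was triggered."""
        if self.is_up():
            self.down_since = None
            return False
        now = self.clock()
        if self.down_since is None:
            self.down_since = now
        if now - self.down_since >= self.grace_seconds:
            self.restart()
            self.down_since = None      # start a fresh grace period after restarting
            return True
        return False

# Simulated run with a fake clock, so nothing is actually restarted.
fake_time = [0]
restarts = []
dog = RestartWatchdog(
    is_up=lambda: False,
    restart=lambda: restarts.append(fake_time[0]),
    grace_seconds=300,
    clock=lambda: fake_time[0],
)
for t in (0, 120, 240, 300):
    fake_time[0] = t
    dog.poll()
print(restarts)  # [300]
```

In a real deployment the `restart` callable might wrap a PowerShell command such as `Restart-Service` or `Restart-Computer`. The grace period matters because it keeps a brief blip from triggering an unnecessary reboot.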
Virtual machines still need attention
Virtualization is a wonderful technology, but if it has a downside, it’s that it is easy to forget about. It is critical to baseline and continuously track CPU and memory load, disk usage, and network usage. A number of problems can come up if IT loses track of VMs: wasted resources, a drain on network performance, and too many VMs overloading a single host or, conversely, too few VMs leaving a host underused. Establishing baselines and long-term usage patterns is paramount when analyzing the health and success of virtualized infrastructure.
Virtualized environments are also entirely dependent on the efficiency and uptime of their network. So, while it’s critical to monitor the VMs, it’s equally imperative to track the performance of their host servers, connections, network switches, and routers. Understanding the many metrics that go into properly performing VMs, and setting up alerts when any one metric moves outside predefined boundaries, not only prevents failure but also optimizes network resources and can help keep spending under control.
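A rule such as "alert when any one metric moves outside predefined boundaries" might look like the following Python sketch. The metric names and limits in `BOUNDS` are made-up values for illustration. Real boundaries would come from the baselines established for each host and VM.

```python
# Hypothetical (low, high) boundaries per metric; tune these from real baselines.
BOUNDS = {
    "cpu_percent": (0, 85),
    "memory_percent": (0, 90),
    "disk_percent": (0, 80),
    "network_mbps": (0, 800),
}

def out_of_bounds(reading, bounds=BOUNDS):
    """Return the sorted names of metrics that are missing or outside their bounds."""
    alerts = []
    for name, (low, high) in bounds.items():
        value = reading.get(name)
        if value is None:
            alerts.append(name)         # missing data is itself worth flagging
        elif not low <= value <= high:
            alerts.append(name)
    return sorted(alerts)

# One hypothetical reading from a VM.
vm_reading = {
    "cpu_percent": 92,
    "memory_percent": 71,
    "disk_percent": 64,
    "network_mbps": 120,
}
print(out_of_bounds(vm_reading))  # ['cpu_percent']
```

The same check can be applied to readings from hosts, switches, and routers, so that one rule set covers the whole path a VM depends on.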
Troubleshooting database performance
Identifying the root cause of poor database performance is often a difficult and time-consuming task. SQL servers, for example, require maintenance and monitoring to ensure that they are optimized. By baselining performance, database administrators can see when the server is under the most strain and track the number of simultaneous user connections to determine whether the database is simply overloaded. Additionally, properly configured monitoring can determine the percentage of page requests that are served from the buffer cache without having to read from disk, and scripts can be written to automatically increase the amount of memory when that percentage drops too low.
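The buffer cache figure is a simple ratio, and a short Python sketch shows how a script could act on it. The function names, the sample counts, and the 90 percent floor are assumptions for illustration. How the hit and lookup counts are collected depends on the database and the monitoring tool in use.

```python
def buffer_cache_hit_ratio(cache_hits, total_lookups):
    """Percentage of page lookups served from memory rather than disk."""
    if total_lookups <= 0:
        raise ValueError("total_lookups must be positive")
    return 100.0 * cache_hits / total_lookups

def needs_more_memory(ratio_percent, minimum_percent=90.0):
    """True when the hit ratio has dropped below the chosen floor (an assumed value)."""
    return ratio_percent < minimum_percent

# Hypothetical counts taken from one monitoring interval.
ratio = buffer_cache_hit_ratio(cache_hits=9_420, total_lookups=10_000)
print(round(ratio, 1), needs_more_memory(ratio))  # 94.2 False
```

A script that raises the memory allocation would typically run only after the ratio has stayed below the floor for several intervals, since a single low reading can follow a restart or a one-off large query.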
There are any number of things that can go wrong in a complex IT environment, but most of them can be prevented if IT administrators have some help. Visibility and 24/7 monitoring are essential because they give IT something they don’t otherwise have — a second set of eyes watching over their infrastructure, even when they can’t.