Preserving server hardware (Part 3)
If you would like to read the other parts in this article series please go to:
In the first article of this series we looked at some different types of airborne particulates and how they can affect the health of small business server systems, PCs, and laptops used in business environments. We then examined some solutions you can implement to keep such systems from being damaged by airborne particulates. Unfortunately these kinds of protective steps don't always work as well as expected. So in the second article we followed up with tips and recommendations I've gleaned over the years from my colleagues and from readers of our weekly newsletter WServerNews.com, describing how you can safely clean a server or client system that's gotten gunked up with dust, hair and other stuff floating around in the air of the typical cubicle office, server room, or dingy hotel room.
Despite our very best intentions and most assiduous preventive maintenance, however, server systems, PCs and business laptops may develop problems that cause applications or services running on them to misbehave in strange and often unpredictable ways, or even to stop working entirely. The result of business systems going down or behaving unreliably can range from lost revenue to angry customers, so it's important that those of us who administer and maintain such systems minimize the risk of damaging downtime.
One problem that may or may not be related to the effect of dust and airborne particulates is the overheating of system hardware. This article examines some of the causes of overheating, various problems associated with overheating, and how to identify and deal with overheating problems when they arise.
Some causes of system hardware overheating
Unfortunately the list of possible reasons why server systems, PCs and laptops may overheat is quite long. While not intended to be exhaustive, the following is a list of some of the more common causes of system hardware overheating:
- Airborne particulates have gunked up the motherboard to the extent that critical components can no longer cool sufficiently from radiant emission or from the convection of air moving around them.
- A fan in the system has become so clogged up with dust or hair that it can no longer remove excessively warm air.
- A motor driving a fan in the system fails because of shorting or friction due to the accumulation of dust or airborne particulates upon exposed portions of the motor. Another possible cause of fan motor failure is the power cable for the fan coming loose from its connector on the motherboard.
- An air vent or other opening in the case of your system has become clogged with dust or hairballs, has accidentally been covered with tape or a fallen piece of paper or clothing, or is being blocked from doing its job by virtue of the system being placed too close to a wall, a piece of furniture, an enclosure of some sort, or another piece of hardware.
- The climate control system or air conditioning system within your office or server room or rack enclosure fails, causing the room or enclosure to become excessively hot.
- The server room is too small or has inadequate airflow to ensure a proper temperature environment for all the hardware you have running in it.
- The air conditioning system in the room where your servers are located has been optimized for human comfort instead of for the health of the servers located within it.
- The air conditioning system in the server room throttles down during off-peak hours even though all of the server equipment in the room is still running continuously 24x7.
- You have tower servers instead of rack-mounted servers, and you've placed the tower servers too near one another on a table or on the floor.
- The temperature outdoors rises drastically (think Los Angeles last summer in 2015) to the point that the climate control system or air conditioning in your building is unable to maintain an acceptable working environment.
- You left your laptop sitting on a table beside the south-facing window of your hotel room on a blazingly hot summer day. Or you're staying with your grandmother and you left your laptop bag sitting on the radiator in her 100-year-old house.
- You've been streaming online video, or watching a movie on a DVD or Blu-ray disc, or playing an interactive game on your laptop in your hotel room for an excessive period of time, causing the CPU in your system to stay pegged at 100 percent for too long.
- You're working on your laptop to try and finish a quote for a customer, and you've got the laptop sitting on your lap instead of on a table, which partially blocks air from flowing freely through the vents on the sides of the machine.
- You're sitting at a table working on your laptop but there's a tablecloth covering the table that partially blocks air from flowing through the vents on the machine.
- You're on a business trip to New Orleans or somewhere in the Amazon jungle where the heat is high and the humidity is nearly 100 percent. Excessively high humidity like this can greatly exacerbate the effect of overheating caused by high temperatures.
- You bought a brand-new system when it came out and you haven't updated its BIOS since. Meanwhile, the device manufacturer has issued at least one BIOS update that has updated the outdated temperature table for that particular model of system.
- You've added three more rack-mounted servers to your rack enclosure without first evaluating whether the existing air temperature control system for the enclosure is sufficient to handle all the heat that will be generated by the additional servers.
- The thermal grease between the heat sink and the CPU has become degraded resulting in poor conductivity of heat away from the CPU to the heat sink.
- You've failed to replace some missing blanking plates on the back of your server where internal peripheral cards have previously been removed, with the result that proper airflow within the chassis cannot be maintained.
- You have too many cats sleeping on your server.
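Several of the causes above come down to whether the available cooling can absorb the heat your hardware puts out. As a rough sanity check before adding servers to a rack enclosure, you can convert the combined power draw into BTU per hour (1 watt is about 3.412 BTU/hr) and compare it with the rated cooling capacity. The following Python is only a minimal sketch: the function names are hypothetical, the wattages and the 12,000 BTU/hr capacity are made-up illustrative numbers, and it assumes that essentially all the power a server draws ends up as heat, which is a rule of thumb rather than a measurement.

```python
WATTS_TO_BTU_PER_HOUR = 3.412  # 1 watt dissipated is about 3.412 BTU/hr


def heat_load_btu_per_hour(server_watts):
    """Approximate heat output for a list of per-server power draws in watts."""
    return sum(server_watts) * WATTS_TO_BTU_PER_HOUR


def cooling_headroom(server_watts, cooling_capacity_btu_per_hour):
    """Remaining cooling capacity in BTU/hr; a negative value means the
    estimated heat load exceeds what the cooling system is rated for."""
    return cooling_capacity_btu_per_hour - heat_load_btu_per_hour(server_watts)


# Hypothetical numbers for illustration only.
existing = [450, 450, 600]            # watts drawn by servers already in the rack
planned = existing + [700, 700, 700]  # the same rack after adding three servers
capacity = 12000                      # rated cooling capacity in BTU/hr (assumed)

for label, servers in (("existing", existing), ("planned", planned)):
    headroom = cooling_headroom(servers, capacity)
    status = "OK" if headroom >= 0 else "INSUFFICIENT COOLING"
    print(f"{label}: load {heat_load_btu_per_hour(servers):.0f} BTU/hr, "
          f"headroom {headroom:.0f} BTU/hr ({status})")
```

With these made-up figures the existing rack fits comfortably within the assumed capacity, while the planned configuration slightly exceeds it. Real planning should use measured power draw and the cooling vendor's specifications.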
Problems that can result from system hardware overheating
When a server, PC or laptop overheats, a number of different symptoms may appear. Some of the more common effects of systems overheating include the following:
- The machine starts locking up when you try to use it, causing you to lose work because you have to force a hard reboot of the system.
- The system shuts down randomly while you're in the middle of working on it.
- Applications start to run more slowly on the system or perform poorly in other ways. One reason this can happen is that the CPU has detected abnormal heat levels and has automatically scaled down its clock speed to prevent further overheating, which makes programs execute more slowly.
- When you turn on or restart the system, it fails to complete the boot process and freezes up before the logon screen can be displayed.
- The system freezes or hangs from time to time, and although you can reboot the machine successfully, when you check the event logs there is no event for a bugcheck (blue screen). This is often a telltale sign that a hardware malfunction is occurring on the machine, and one possible cause of such missing bugchecks is an overheating system.
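The last symptom above, an unexpected restart with no matching bugcheck event, can be checked for systematically once you've exported the System event log. The Python below is a minimal sketch under several assumptions: the events have already been exported into a list of (timestamp, source, event ID) tuples, the sample records and the function name are hypothetical, and the ten-minute matching window is arbitrary. It also assumes the usual Windows convention that an unexpected shutdown is logged as Event ID 41 from Kernel-Power or Event ID 6008 from EventLog, while a blue screen is logged as Event ID 1001 from BugCheck. Verify those identifiers against your own logs before relying on them.

```python
from datetime import datetime, timedelta

# Assumed (source, event ID) pairs; confirm them against your own System log.
UNEXPECTED_SHUTDOWN = {("Kernel-Power", 41), ("EventLog", 6008)}
BUGCHECK = ("BugCheck", 1001)


def restarts_without_bugcheck(events, window=timedelta(minutes=10)):
    """Return timestamps of unexpected-shutdown events that have no
    bugcheck event logged within `window` of them."""
    bugcheck_times = [ts for ts, source, event_id in events
                      if (source, event_id) == BUGCHECK]
    suspicious = []
    for ts, source, event_id in events:
        if (source, event_id) not in UNEXPECTED_SHUTDOWN:
            continue
        # Keep the event only if no bugcheck was recorded close to it.
        if not any(abs(ts - b) <= window for b in bugcheck_times):
            suspicious.append(ts)
    return suspicious


# Hypothetical exported events for illustration only.
events = [
    (datetime(2016, 3, 1, 9, 15), "Kernel-Power", 41),
    (datetime(2016, 3, 1, 9, 16), "BugCheck", 1001),   # blue screen explains this restart
    (datetime(2016, 3, 4, 14, 2), "Kernel-Power", 41), # no bugcheck nearby
]

for ts in restarts_without_bugcheck(events):
    print(f"Unexpected restart with no bugcheck logged: {ts}")
```

A hit from a check like this doesn't prove overheating. It only flags restarts worth investigating for a hardware cause, of which overheating is one possibility.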
To be continued
Now that we've examined some of the causes of system hardware overheating as well as a few of the problems that can result from it, I'll continue in my next article by looking at several vendor tools you can use for identifying when system hardware is overheating. I'll also describe a few third-party tools you can use for the same purpose, as well as some practical steps you can take to counter the effects when your system hardware overheats.