It may not be part of their job description, but there is a time when every IT professional must become a detective. For me, it was during a hot summer in 2004 in Saudi Arabia. I was working with a team of network engineers on the computer network of a Saudi manufacturing company that had multiple sites. Two of the sites were in the same area, about a kilometer apart, and were connected by a DSL line. In one of my team meetings with the company, the decision was made to replace the DSL line between the two sites with a wireless point-to-point radio link. Although the company’s IT department was sufficiently large to address the information technology and networking requirements, the IT manager assigned the project to the network team, which consisted of three members, including me.
Our team was small and, unfortunately, a bit inexperienced in wireless technologies. But we took on the challenge because that’s what the technology business is all about. The vendor agreed to do some preliminary installation work on the wireless devices.
‘You are set now’
On Wednesday night, we picked up the vendor, who provided retail and installation services for network devices. We headed toward the site where the wireless equipment connecting the two branches of the company was to be installed. The facility was on the outskirts of the city in the industrial area. As we entered the building, the offices were empty, but we could still hear the roar of the machines carrying out the manufacturing work 24 hours a day. The three of us were the only IT guys in the building. The next two days were official holidays in Saudi Arabia, and we had deliberately chosen this time for the installation to give us a grace period in case we ran into a problem, which, of course, we did.
We headed to the rooftop, where we mounted one of the two transceiver devices. After the device was installed, the vendor connected the network cable to his laptop, opened a command prompt in Windows, and ran a continuous ping to check the connectivity. No replies came back. He adjusted the direction of the wireless transceiver, and after a few attempts the ping replies started coming back. The vendor looked at us and said, “You are set now.” Little did we know that the network was about to unleash havoc on us, turning us into zombies punching keys on the company’s machines in search of a solution to a mysterious problem.
The next day my colleague discovered that emails were not going through between the two sites. We were able to ping the router and other devices from our location, but when an email was sent, a horrendously haunting message came back that, in effect, stated that the email could not go through. Our first thought was that Microsoft Exchange Server 2003 was responsible for the wayward email, but troubleshooting a monstrous piece of software like Exchange is not child’s play, and it is not something done quickly. One of my colleagues said, “How can this be? We can ping, so there is a connection, and the emails should go.”
Investigating the problem
We decided to split the investigative work among the three of us. I took the Microsoft Exchange 2003 server. Another took the routers, and the third monitored the network traffic in the hope of finding a clue.
Computer networking is a piece of cake when a network is in production and there are no critical problems. But once a problem strikes, it will take control of your mind. Forget about sleep: You spend days and nights becoming a network troubleshooter. You usually are under tremendous pressure from management to find and solve the problem, because if the network is down for any period of time employees will lose productivity and the company can lose huge sums of money. Talk about a lose-lose situation! Although our network problem was taking place during off-peak hours, time was running out.
All three of us strained every nerve trying to find the problem. IT professionals know that the flow of emails in an organization is like blood flow in the human body. We knew that this problem had to be resolved before the holiday was over and most employees returned to work.
I searched the Internet, and while I couldn’t find a solution, I did pick up some useful information. My colleagues were checking the settings of Cisco routers, from small tweaks to complicated commands. All in vain, and time was flying. The pressure on our team was building.
On Friday night, one of my colleagues started to have doubts about the software installation of the transceiver that was mounted on the roof. We could access the rooftop transceiver through a graphical user interface in a browser, but the interface was filled with jargon and technical terms that were fairly new to us at that time. We spent the night testing the different settings on the web interface of the wireless device. On Saturday morning, I found two settings that were supposed to work in combination with each other. When we started playing with these settings, the emails started to pass through.
This was it: two simple settings.
Lessons for a lifetime
The actual problem was that the wireless device was configured to send only small packets to the other end. That size limit was large enough for the ping packets, but too small for the larger packets that carry email messages, so those never made it across. When the packet size was increased in the settings, the email messages immediately started to pass through the wireless link to the other site.
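For readers who want to see why a successful ping told us so little, here is a minimal Python sketch of the idea. It is a simplified model, not a record of what our device actually did: it assumes the link silently drops any packet larger than its configured size, and the 256-byte limit is a made-up figure for illustration (I no longer have the real setting). The header sizes are the standard ones for IPv4 and ICMP, and 32 bytes is the default data size of the Windows ping command.

```python
# Simplified model: a link that drops every packet above a size limit.
# The limit used in the examples (256 bytes) is hypothetical.

IP_HEADER_BYTES = 20        # IPv4 header without options
ICMP_HEADER_BYTES = 8       # ICMP echo request header
DEFAULT_PING_PAYLOAD = 32   # default data size of Windows ping


def packet_size(payload_bytes: int) -> int:
    """Total size on the wire of an ICMP echo with the given payload."""
    return IP_HEADER_BYTES + ICMP_HEADER_BYTES + payload_bytes


def passes_link(payload_bytes: int, link_max_packet: int) -> bool:
    """True if the packet fits within the link's configured maximum size."""
    return packet_size(payload_bytes) <= link_max_packet


def largest_passing_payload(link_max_packet: int, upper: int = 1472) -> int:
    """Binary-search the largest payload that still gets through.

    Returns -1 if not even an empty payload fits.
    """
    low, high = 0, upper
    if not passes_link(low, link_max_packet):
        return -1
    while low < high:
        mid = (low + high + 1) // 2
        if passes_link(mid, link_max_packet):
            low = mid          # mid fits, so the answer is mid or larger
        else:
            high = mid - 1     # mid is too big, so the answer is smaller
    return low


if __name__ == "__main__":
    limit = 256  # hypothetical misconfigured link limit
    print("default ping passes:", passes_link(DEFAULT_PING_PAYLOAD, limit))
    print("full-size packet passes:", passes_link(1472, limit))
    print("largest payload that passes:", largest_passing_payload(limit))
```

On a real network, the same search can be done by hand from a Windows command prompt with `ping -f -l <size> <host>`, where `-l` sets the payload size and `-f` forbids fragmentation. Had we tried a few larger sizes on that rooftop, the problem would likely have shown itself on the first night.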
This incident, more than 12 years ago, led me to the understanding that job experience is the greatest teacher. Many times, the underlying problem may not actually be difficult or even complex. It is our approach to finding it that ultimately leads to the solution.