My wife told me that when she typed something incorrectly on her computer, my face popped up on the screen. After some thought, I realised she was referring to the OpenDNS pages I had configured. I think OpenDNS is great. I started using it after discovering a reference to it in the Tomato firmware for my WRT54GL. OpenDNS configured on the WRT54GL provides great protection for my home network. I think it works better and is more secure than some of the web filtering software installed in some enterprise networks.
This incident reminded me of a peculiar DNS problem that I once encountered. I arrived at work and my boss directed me to the dealing desks. There I encountered chaos. The scene is imprinted in my mind because the DBA had just had a car accident and his scalp was stapled, making him look like one of the Borg collective. He stated that he had rebooted every server in sight and that, since his dealing application was still not working, it must have been me who had made a change on the network. His boss picked up on this and started scolding me about making changes without any change control authorization. Behind him, a bunch of dealers looked like they were forming a lynch mob.
Luckily my boss was standing next to me, and he placated them while I went to fetch my laptop with Ethereal (now Wireshark). I wanted to do a packet capture of the problem, as the preliminary tests showed that all network connectivity was fine. I completed a packet capture of the problem with the dealing application, which interestingly enough had a name similar to a Mexican beer.
The first thing I saw in the trace was lots of DNS packets and diddly squat else. I looked at the DNS packets and saw that the application was making a DNS query for the IP address of the main application server, "10.0.32.99", as if it were a host name. That failed, and it then proceeded to try "10.0.32.99.company.co.za", which also failed. The next query was "10.0.32.99.co.za", which returned a public Internet IP address. The application then tried to connect to this address, and the firewall kicked in and dropped these packets. The puzzle pieces started falling into place. The application was programmed to do a DNS query and, only if that did not succeed, to use the IP address directly. Bad strategy, as it turned out! It also explained why the DBA kept on complaining about slow logons, which I had always assumed were application related. I also discovered that the previous night a company had registered the domain "99.co.za", with a wildcard record for all hosts in that domain. The underlying problem was always there, waiting to bite.
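The sequence of queries in the trace can be reproduced with a small simulation. This is only a sketch of the behaviour I saw, not the application's or the resolver's actual code: the function names, the suffix-shortening rule and the answer address "203.0.113.10" (a documentation address) are all assumptions made for illustration.

```python
def candidate_names(name, primary_suffix, min_labels=2):
    """Return the names a suffix-appending resolver might try, in order.

    Assumption: the resolver tries the bare name first, then appends the
    primary DNS suffix, then shortens that suffix one label at a time
    until only `min_labels` labels remain.
    """
    candidates = [name]
    labels = primary_suffix.split(".")
    while len(labels) >= min_labels:
        candidates.append(name + "." + ".".join(labels))
        labels = labels[1:]  # company.co.za -> co.za
    return candidates


def resolve(name, wildcard_zones):
    """Pretend DNS: answer only for names that fall under a wildcard zone."""
    for zone, address in wildcard_zones.items():
        if name.endswith("." + zone):
            return address
    return None


def first_answer(name, primary_suffix, wildcard_zones):
    """Return (queried name, address) for the first candidate that resolves."""
    for candidate in candidate_names(name, primary_suffix):
        address = resolve(candidate, wildcard_zones)
        if address is not None:
            return candidate, address
    return None


# Before the registration nothing resolves, so the application falls back
# to using the IP address directly (after some wasted lookups).
before = first_answer("10.0.32.99", "company.co.za", {})

# After "99.co.za" is registered with a wildcard, the third query succeeds.
after = first_answer("10.0.32.99", "company.co.za", {"99.co.za": "203.0.113.10"})
```

In this sketch `before` is `None`, which corresponds to the slow but working logons, while `after` names "10.0.32.99.co.za" as the query that suddenly started returning an Internet address.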
We fixed the problem by setting up the application the way it should have been done the first time round. We created a DNS record for the main dealing application server of "mexicanbeer.company.co.za" and configured the application on all the client computers to use this instead of the IP address.
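For what it is worth, a client can also protect itself against this class of problem. The sketch below is not what the dealing application did; it is a hypothetical helper (the name `connection_target` is mine) that uses a literal IP address without any DNS lookup, and otherwise adds a trailing dot to mark the host name as fully qualified, which most resolvers treat as a signal not to append search suffixes.

```python
import ipaddress


def connection_target(value):
    """Classify a configured server value before any DNS lookup is made.

    Returns ("ip", value) for a literal IPv4/IPv6 address, which needs no
    DNS query at all, or ("name", fqdn) where fqdn ends in a dot so that a
    resolver should treat it as fully qualified.
    """
    try:
        ipaddress.ip_address(value)
        return ("ip", value)
    except ValueError:
        # Not an address literal, so treat it as a host name.
        fqdn = value if value.endswith(".") else value + "."
        return ("name", fqdn)


print(connection_target("10.0.32.99"))
print(connection_target("mexicanbeer.company.co.za"))
```

With a check like this, "10.0.32.99" would never have been sent to DNS in the first place.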
The dealing desks started returning to normal, and the manager of the dealing desks walked over to ask about the cause of the problem. After I had explained it to him, he replied that he had therefore been correct in stating that an unauthorised change had been made, and that we should not do that again. I tried to explain that we had no input into authorizing domains on the Internet and that this was done independently by the service providers. His response was: how dare anyone use "99.co.za" when it was his? I started trying to explain the difference between IP addresses and DNS domains, his eyes glazed over, and I knew it was a lost cause.
Finally, I told him I would phone up our service provider and tell them to cease and desist from doing it again.
These and other experiences have led me to create a network troubleshooting checklist.
This whole episode raises a question in my mind: how do you explain a technical cause to someone who has no idea what you are on about? Do you do as I did, and lie through your teeth?
Ronald is an IT firefighter who enjoys the thrill of solving and analyzing problems. He was painted into a corner to become an IT firefighter because, as a network engineer, he quickly learned that everyone blamed the network when there was a problem. He now works in the field of infrastructure architecture and service management.