Problems with ejabberd (s2s connections failing)
Branko Majic — 13. June 2010 - 14:00
In the latest turn of events, I ran into a problem with my ejabberd server (the one running for this very domain) somewhere around Friday. All of a sudden, without any change on my side, I could no longer see the status of any external contact in my Jabber client, Gajim. I didn't notice it until Saturday, when a friend of mine pointed out that I wasn't showing up on his roster. After some initial investigation (checking the firewall, which hasn't really changed in a while, and making sure the server itself was running fine), I gave up for the day, since it was my nephew's birthday.
Of course, I couldn't let this issue drag on, since I depend on my Jabber server to reach some of my contacts. I spent about 2-3 hours searching the Internet for a solution. All DNS requests seemed to work fine with the dig command, which made me rule out DNS as a possible cause (as it turned out later, that was a wrong assumption). What got me going in the right direction was Debian bug report #539409. After trying out the commands suggested in that bug report, I found that DNS queries issued that way seemed to wait indefinitely. At first I thought /etc/resolv.conf wasn't being parsed correctly, so I kept a copy of the original and reduced the file to a single nameserver directive. The issue remained, but then I decided to try querying some other DNS server for the _same_ SRV record. And... it returned a proper response. After two more dig commands, I finally found out that the DNS server listed first in /etc/resolv.conf did not respond to any query at all. I restored the original resolv.conf, commented out the offending nameserver line, restarted ejabberd, and voilà - it all started to work.
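The check that finally exposed the dead nameserver (asking each server from resolv.conf for the same SRV record, one at a time) can be sketched in a few lines of Python. This is only an illustration under some assumptions, not what I actually ran: the function names are made up for this example, the query packet is built by hand so that nothing beyond the standard library is needed, and any reply at all within the timeout counts as "the server is alive". It is roughly the same as pointing dig at each server in turn.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary ID, used to match the reply to our query


def parse_nameservers(resolv_conf_text):
    """Return nameserver addresses in file order, skipping comment lines."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def build_query(name, qtype=33):
    """Build a minimal DNS query packet (qtype 33 is SRV, class IN)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", QUERY_ID, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def server_responds(server, name, timeout=3.0):
    """Return True if the server sends any matching reply within the timeout."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    return reply[:2] == struct.pack("!H", QUERY_ID)


def report(resolv_conf_text, name, probe=server_responds):
    """Probe every configured nameserver and return (server, alive) pairs."""
    return [(s, probe(s, name)) for s in parse_nameservers(resolv_conf_text)]
```

To use it, pass the contents of /etc/resolv.conf and an SRV name to report(), for example "_xmpp-server._tcp.example.org" (a placeholder domain; substitute the domain of a contact whose server you can't reach). A server that comes back as False is a good candidate for being commented out.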
And the lesson of the day? The next time you have a problem with some network server, make sure the issue really isn't a network problem. All kinds of funny things can happen with the network and network-related services, and very often they are the main culprit when something stops working.





