The other day I had to implement a small whois client in Python. After a quick look around the web I thought this would be an easy task. Indeed, the RFC is so small it's disturbing. So it didn't take long to get some code running that could talk to a whois server:
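    import socket

    def whois(domain):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.connect(('whois.dns.be', 43))
        # A query is just the domain name followed by CRLF
        sock.sendall(domain.encode('ascii') + b"\r\n")
        response = b''
        while True:
            d = sock.recv(4096)
            if not d:  # empty read: the server closed the connection
                break
            response += d
        sock.close()
        return response.decode('utf-8', 'replace')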
However, I soon realized that the whois RFC is small for a reason: there is nothing in the specification, and so problems start to appear...
The above snippet can talk to a server but will get a proper answer only for .be domains. To get useful information one needs to know which server to talk to. So it was back to the web to look for some kind of whois server list. As it turns out, whois-servers.net provides such a list through DNS CNAME records. To find the whois server for .com, one can do a DNS query for "com.whois-servers.net".
To make the above code more generic you could connect to the whois server with the following strategy:
    tld = domain.split('.')[-1]
    sock.connect(((tld + '.whois-servers.net'), 43))
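Put together, the TLD lookup and the query could look roughly like the sketch below. The names whois_server_for, server, port and timeout are my own and purely illustrative, and the code assumes whois-servers.net still has a CNAME for the TLD in question:

```python
import socket

def whois_server_for(domain):
    """Guess the whois host for a domain from its TLD (hypothetical helper)."""
    tld = domain.rstrip('.').split('.')[-1].lower()
    return tld + '.whois-servers.net'

def whois(domain, server=None, port=43, timeout=10):
    """Send one whois query and return the raw reply as text."""
    if server is None:
        server = whois_server_for(domain)
    # create_connection resolves the host name (the resolver follows the CNAME)
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(domain.encode('ascii') + b'\r\n')
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b''.join(chunks).decode('utf-8', errors='replace')
```

Keeping the server as an optional argument means the same function can later be pointed at a different whois host without touching the lookup logic.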
While this is already a lot better, a problem remains: not all domains use the same whois server, even if they share the same TLD. To make things even easier, not all whois servers use the same syntax or return the same results. If we take google.com, we have to query com.whois-servers.net for the string "=google.com" just to be informed that the actual whois server is a different one. This information can be obtained by parsing the result in search of the following line: "Whois Server: whois.markmonitor.com"
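Searching a reply for that referral line can be sketched as follows. The name find_referral is hypothetical, and the pattern only assumes the "Whois Server: <host>" form quoted above; other servers may label the line differently or not send it at all:

```python
import re

# Matches a referral line such as "Whois Server: whois.markmonitor.com"
REFERRAL_RE = re.compile(r'^\s*Whois Server:\s*(\S+)\s*$',
                         re.IGNORECASE | re.MULTILINE)

def find_referral(response):
    """Return the whois host named in a reply, or None if there is none."""
    match = REFERRAL_RE.search(response)
    return match.group(1).lower() if match else None
```

A client would query the TLD server first and, if find_referral returns a host, send the query again to that host.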