Python effective TLD library

    I recently needed a library which would take a host name and return the domain-specific portion of the name, along with the effective TLD being used. "Effective TLD" is a term coined by the Mozilla project for something which acts like a TLD. For example, .com is a TLD and has domains allocated directly under it. However, .au is a TLD with no domains allocated directly under it. The effective TLDs for the .au domain are things like .com.au and .edu.au. Whilst there are libraries for other languages, I couldn't find anything for Python.

    I therefore wrote one. It's very simple, and not optimal. For example, I could do most of the processing with a single regexp if Python supported more than 100 match groups in a regexp, but it doesn't. I'm sure I'll end up revisiting this code sometime in the future. The code also ended up being much easier to write than I expected, mainly because the Mozilla project has gone to the trouble of building a list of rules to determine the effective TLD of a host name. This is awesome, because it saved me heaps and heaps of work.
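
    To give a feel for how a rule list like Mozilla's can be applied, here is a minimal sketch. It is not the code from etld.py: the function name split_host and the tiny RULES set are made up for illustration, and the real list is far longer. It assumes three kinds of rule: plain suffixes, wildcard rules starting with "*.", and exception rules starting with "!".

```python
# Hypothetical, hand-picked rule subset in the style of Mozilla's list.
# The real list is far longer; these entries are for illustration only.
RULES = {"com", "au", "com.au", "edu.au", "*.ck", "!www.ck"}


def split_host(host, rules=RULES):
    """Return (domain_part, effective_tld) for a host name.

    Illustrative sketch only; this is not the code from etld.py.
    """
    labels = host.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first so the most specific rule wins.
    for i in range(len(labels)):
        candidate = labels[i:]
        name = ".".join(candidate)
        if "!" + name in rules:
            # Exception rule: the effective TLD is one label shorter.
            etld_len = len(candidate) - 1
        elif name in rules or ".".join(["*"] + candidate[1:]) in rules:
            # Plain rule, or a wildcard rule covering the leftmost label.
            etld_len = len(candidate)
        else:
            continue
        break
    else:
        # No rule matched: fall back to treating the last label as the TLD.
        etld_len = 1
    split = len(labels) - etld_len
    return ".".join(labels[:split]), ".".join(labels[split:])


if __name__ == "__main__":
    for host in ("www.stillhq.com", "www.example.com.au", "foo.bar.ck", "www.ck"):
        print(host, split_host(host))
```

    Checking suffixes from longest to shortest avoids building one enormous regexp, at the cost of a few set lookups per host name.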

    The code is at http://www.stillhq.com/python/etld/etld.py if you're interested.

    Tags for this post: python etld effective tld mozilla

posted at: 06:42