Python effective TLD library

    I recently needed a library which would take a host name and return both the domain-specific portion of the name and the effective TLD being used. "Effective TLD" is a term coined by the Mozilla project for something which acts like a TLD. For example, .com is a TLD and has domains allocated directly under it. However, .au is a TLD with no domains allocated directly under it. The effective TLDs for the .au domain are things like .com.au and .edu.au. Whilst there are libraries for other languages, I couldn't find anything for Python.

    I therefore wrote one. It's very simple, and not optimal. For example, I could do most of the processing with a single regexp if Python supported more than 100 match groups in a regexp, but it doesn't. I'm sure I'll end up revisiting this code sometime in the future. The code also ended up being much easier to write than I expected, mainly because the Mozilla project has gone to the trouble of building a list of rules to determine the effective TLD of a host name. This is awesome, because it saved me heaps and heaps of work.
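
    To give a feel for how rule-based matching like this can work, here is a minimal sketch. It is not the code from etld.py. The function name split_host and the tiny SAMPLE_RULES list are made up for illustration, and the rules are a hand-picked sample written in the style of Mozilla's list (plain entries, "*." wildcards and "!" exceptions), not the real list. It also uses plain set lookups rather than a regexp.

```python
# A tiny hand-picked sample of rules in the style of Mozilla's list.
# This is an assumption for illustration; the real list is far longer.
SAMPLE_RULES = ("com", "au", "com.au", "edu.au", "*.ck", "!www.ck")


def split_host(host, rules=SAMPLE_RULES):
    """Return (domain_portion, effective_tld) for a host name.

    Hypothetical helper, not the API of etld.py.
    """
    labels = host.lower().rstrip(".").split(".")

    # Sort the rules into the three kinds the list format uses.
    exact, wildcard, exception = set(), set(), set()
    for rule in rules:
        if rule.startswith("!"):
            exception.add(rule[2 - 1:])
        elif rule.startswith("*."):
            wildcard.add(rule[2:])
        else:
            exact.add(rule)

    # Try candidate suffixes from longest to shortest.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        parent = ".".join(labels[i + 1:])
        if candidate in exception:
            # An exception rule means the suffix is one label shorter.
            start = i + 1
            break
        if candidate in exact or parent in wildcard:
            start = i
            break
    else:
        # Nothing matched: treat the last label as the effective TLD.
        start = len(labels) - 1

    return ".".join(labels[:start]), ".".join(labels[start:])
```

    With the sample rules above, split_host("www.example.com.au") gives ("www.example", "com.au"), while split_host("www.example.com") gives ("www.example", "com"). Checking the longest candidate first is what lets .com.au win over plain .au.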

    The code is at http://www.stillhq.com/python/etld/etld.py if you're interested.

posted at: 06:42 | path: /python/etld
