Content here is by Michael Still mikal@stillhq.com. All opinions are my own.


Thu, 11 May 2017



Python3 venvs for people who are old and grumpy

    I've been using virtualenvwrapper to make venvs for python2 for probably six or so years. I know it, and understand it. Now some bad man (hi Ramon!) is making me do python3, and virtualenvwrapper just isn't a thing over there as best as I can tell.

    So how do I make a venv? It's really not too bad...

    First, install the dependencies:

      git clone https://github.com/yyuu/pyenv.git ~/.pyenv
      echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
      echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
      echo 'eval "$(pyenv init -)"' >> ~/.bashrc
      git clone https://github.com/yyuu/pyenv-virtualenv.git ~/.pyenv/plugins/pyenv-virtualenv
      source ~/.bashrc
      


    Now to make a venv, do something like this (in this case, infrasot is the name of the venv):

      mkdir -p ~/.virtualenvs/pyenv-infrasot
      cd ~/.virtualenvs/pyenv-infrasot
      pyenv virtualenv system infrasot
      


    You can see your installed venvs like this:

      $ pyenv versions
      * system (set by /home/user/.pyenv/version)
        infrasot
      


    Here system is the system-installed python, not a venv. To activate and deactivate the venv, do this:

      $ pyenv activate infrasot
      $ ... stuff you're doing ...
      $ pyenv deactivate
      


    I'll probably write wrappers at some point so that this looks like virtualenvwrapper, but it's good enough for now.
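
    For comparison, python3 also ships a venv module in its standard library. That is a different mechanism from the pyenv plugin used above, so treat this as a sketch of an alternative rather than a wrapper around pyenv. The make_venv name and its arguments are made up for illustration, and the ~/.virtualenvs default just mirrors the layout used earlier:

```python
import os
import venv


def make_venv(name, root=os.path.expanduser('~/.virtualenvs')):
    """Create a venv called `name` under `root` and return its path."""
    env_dir = os.path.join(root, name)
    # with_pip=False skips bootstrapping pip, which keeps this quick;
    # pass with_pip=True instead if you want pip inside the venv.
    venv.create(env_dir, with_pip=False)
    return env_dir
```

    A venv made this way is activated with "source <env_dir>/bin/activate" and left with "deactivate", not with the pyenv commands.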

    Tags for this post: python venv virtualenvwrapper python3
    Related posts: Implementing SCP with paramiko; Packet capture in python; A pythonic example of recording metrics about ephemeral scripts with prometheus; mbot: new hotness in Google Talk bots; Calculating a SSH host key with paramiko; Twisted conch

posted at: 21:20 | path: /python | permanent link to this entry


Sun, 28 Nov 2010



Multiple file support with scp

    Paramiko doesn't provide a scp implementation, so I've been using my own for a while. http://blogs.sun.com/janp/entry/how_the_scp_protocol_works provides good documentation about the scp protocol, but it missed out on one detail I needed -- how to send more than one file in a given session. In the end I implemented a simple scp logger to see what the protocol was doing during the copying of files. My logger said this:

      >>> New command invocation: /usr/bin/scp -d -t /tmp
      O: \0
      I: C0644 21 a\n
      O: \0
      I: file a file a file a\n\0
      O: \0
      I: C0644 21 b\n
      O: \0
      I: file b file b file b\n\0
      O: \0
      >>>stdin closed
      >>> stdout closed
      >>> stderr closed
      


    It turns out it's important to wait for those zeros, by the way. So, here's my implementation of the protocol to send more than one file. Turning this into paramiko code is left as an exercise for the reader.

      #!/usr/bin/python
      
      import fcntl
      import os
      import select
      import string
      import subprocess
      import sys
      import traceback
      
      
      def printable(s):
        out = ''
      
        for c in s:
          if c == '\n':
            out += '\\n'
          elif c in string.printable:
            out += c
          else:
            out += '\\%d' % ord(c)
      
        return out
      
      
      try:
        dialog = ['C0644 21 c\n',
                  'file c file c file c\n\0',
                  'C0644 21 d\n',
                  'file d file d file d\n\0']
      
        proc = subprocess.Popen(['scp', '-v', '-d', '-t', '/tmp'],
                                stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
      
        r = [proc.stdout, proc.stderr]
        w = []
        e = [proc.stdout, proc.stderr]
      
        fl = fcntl.fcntl(proc.stdout, fcntl.F_GETFL)
        fcntl.fcntl(proc.stdout, fcntl.F_SETFL, fl | os.O_NONBLOCK)
        fl = fcntl.fcntl(proc.stderr, fcntl.F_GETFL)
        fcntl.fcntl(proc.stderr, fcntl.F_SETFL, fl | os.O_NONBLOCK)
      
        stdin_closed = False
        while proc.returncode is None:
          (readable, _, errorable) = select.select(r, w, e)
      
          for flo in readable:
            if flo == proc.stdout:
              d = os.read(proc.stdout.fileno(), 1024)
              if len(d) > 0:
                sys.stdout.write('O: %s\n' % printable(d))
      
                if len(dialog) > 0:
                  sys.stdout.write('I: %s\n' % printable(dialog[0]))
                  os.write(proc.stdin.fileno(), dialog[0])
                  dialog = dialog[1:]
      
                if len(dialog) == 0 and not stdin_closed:
                  sys.stdout.write('>>> stdin closed\n')
                  proc.stdin.close()
                  stdin_closed = True
      
              else:
                sys.stdout.write('>>> stdout closed\n')
                r.remove(proc.stdout)
                e.remove(proc.stdout)
      
            elif flo == proc.stderr:
              d = os.read(proc.stderr.fileno(), 1024)
              if len(d) > 0:
                sys.stdout.write('E: %s\n' % printable(d))
              else:
                sys.stdout.write('>>> stderr closed\n')
                r.remove(proc.stderr)
                e.remove(proc.stderr)
      
            else:
              sys.stdout.write('>>> Unknown readable: %s: %s\n'
                               %(repr(flo), flo.read()))
      
          for flo in errorable:
            sys.stdout.write('>>> Error on %s\n' % repr(flo))
            r.remove(flo)
            e.remove(flo)
      
          proc.poll()
      
        print '#: %s' % proc.returncode
      
      except:
        exc = sys.exc_info()
        for tb in traceback.format_exception(exc[0], exc[1], exc[2]):
          print tb
          del tb
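
    The exchange in the log can also be written without the select loop, given blocking read and write callables. This is a python3 sketch and not the script above: send_files, read_byte and write are hypothetical names, the 0644 mode is simply copied from the log, and treating any reply other than a zero byte as fatal is an assumption.

```python
def send_files(read_byte, write, files):
    """Send (name, data) pairs to an `scp -t` sink, waiting for each ack.

    read_byte() must return one byte from the sink; write(b) must send b.
    """

    def wait_for_ack():
        ack = read_byte()
        if ack != b'\0':
            raise IOError('scp sink replied %r instead of \\0' % ack)

    # The sink sends a zero byte as soon as it has started up.
    wait_for_ack()

    for name, data in files:
        # Header line: mode, length in bytes, file name.
        write(b'C0644 %d %s\n' % (len(data), name.encode('utf-8')))
        wait_for_ack()

        # File contents, terminated by a zero byte of our own.
        write(data + b'\0')
        wait_for_ack()
```

    With a paramiko channel, read_byte could plausibly be lambda: channel.recv(1) and write could be channel.sendall, but that pairing is untested here.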
      


    Tags for this post: python paramiko scp protocol
    Related posts: Implementing SCP with paramiko; Calculating a SSH host key with paramiko; paramiko exec_command timeout; Weird paramiko problem; Executing a command with paramiko; Packet capture in python

posted at: 02:40 | path: /python/paramiko | permanent link to this entry


Mon, 28 Jun 2010



pyconau 2010 twitter summary

posted at: 21:20 | path: /python | permanent link to this entry


Tue, 24 Nov 2009



Python effective TLD library bug fix

posted at: 13:57 | path: /python/etld | permanent link to this entry


Sun, 01 Nov 2009



Python effective TLD library update

posted at: 09:45 | path: /python/etld | permanent link to this entry


Mon, 26 Oct 2009



Python effective TLD library

posted at: 06:42 | path: /python/etld | permanent link to this entry


Mon, 05 Jan 2009



Calculating a SSH host key with paramiko

    I needed to compare a host key from something other than a known_hosts file with what paramiko reports as part of the SSH connection today. If you must know, the host keys for these machines are retrieved via an XMLRPC API... It turned out to be a lot easier than I thought. Here's how I produced the host key entry as it appears in that API (as well as in the known_hosts file):

      #!/usr/bin/python
      
      # A host key calculation example for Paramiko.
      # Args:
      #   1: hostname
      
      import base64
      import os
      import paramiko
      import socket
      import sys
      
      # Socket connection to remote host
      sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      sock.connect((sys.argv[1], 22))
      
      # Build a SSH transport
      t = paramiko.Transport(sock)
      t.start_client()
      key = t.get_remote_server_key()
      
      print '%s %s' %(key.get_name(),
                      base64.encodestring(key.__str__()).replace('\n', ''))
      
      t.close()
      sock.close()
      


    Note that I could also have constructed a paramiko key object based on the output of the XMLRPC API and then compared those two objects, but I prefer the human readable strings.
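
    The formatting step on its own does not need paramiko. Here is a python3 sketch of it, where base64.encodestring is no longer available. The known_hosts_entry name is hypothetical, and key_blob stands for the raw key bytes that key.__str__() returned under python2 (newer paramiko releases expose these as key.asbytes(), as far as I know):

```python
import base64


def known_hosts_entry(key_name, key_blob):
    """Return '<key type> <base64 blob>', as it appears in known_hosts."""
    # b64encode never inserts newlines, so there is nothing to strip out.
    encoded = base64.b64encode(key_blob).decode('ascii')
    return '%s %s' % (key_name, encoded)
```

    Comparing a host against the API is then a plain string comparison between this value and the entry the API returned.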

    Tags for this post: python paramiko host key
    Related posts: Implementing SCP with paramiko; Multiple file support with scp; paramiko exec_command timeout; Weird paramiko problem; Executing a command with paramiko; Packet capture in python

posted at: 16:28 | path: /python/paramiko | permanent link to this entry


Wed, 10 Dec 2008



Killing a blocking thread in python?

posted at: 14:03 | path: /python | permanent link to this entry


Tue, 25 Nov 2008



Packet capture in python

    I'm home sick with a cold today and got bored. I wanted to play with packet capture in python, and the documentation for pcapy is a little sparse. I therefore wrote this simple little sample script:

      #!/usr/bin/python
      
      # A simple example of how to use pcapy. This needs to be run as root.
      
      import datetime
      import gflags
      import pcapy
      import sys
      
      FLAGS = gflags.FLAGS
      gflags.DEFINE_string('i', 'eth1',
                           'The name of the interface to monitor')
      
      
      def main(argv):
        # Parse flags
        try:
          argv = FLAGS(argv)
        except gflags.FlagsError, e:
          print FLAGS
      
        print 'Opening %s' % FLAGS.i
      
        # Arguments here are:
        #   device
        #   snaplen (maximum number of bytes to capture _per_packet_)
        #   promiscuous mode (1 for true)
        #   timeout (in milliseconds)
        cap = pcapy.open_live(FLAGS.i, 100, 1, 0)
      
        # Read packets -- header contains information about the data from pcap,
        # payload is the actual packet as a string
        (header, payload) = cap.next()
        while header:
          print ('%s: captured %d bytes, truncated to %d bytes'
                 %(datetime.datetime.now(), header.getlen(), header.getcaplen()))
      
          (header, payload) = cap.next()
      
      
      if __name__ == "__main__":
        main(sys.argv)
      


    Which outputs something like this:

      2008-11-25 10:09:53.308310: captured 98 bytes, truncated to 98 bytes
      2008-11-25 10:09:53.308336: captured 66 bytes, truncated to 66 bytes
      2008-11-25 10:09:53.315028: captured 66 bytes, truncated to 66 bytes
      2008-11-25 10:09:53.316520: captured 130 bytes, truncated to 100 bytes
      2008-11-25 10:09:53.317030: captured 450 bytes, truncated to 100 bytes
      2008-11-25 10:09:53.324414: captured 124 bytes, truncated to 100 bytes
      2008-11-25 10:09:53.327770: captured 114 bytes, truncated to 100 bytes
      2008-11-25 10:09:53.328001: captured 210 bytes, truncated to 100 bytes
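
    The "truncated to" figure in that output is just the smaller of the packet's length on the wire and the snaplen passed to open_live (100 in the script). A tiny sketch of that arithmetic; captured_length is a hypothetical helper and not part of pcapy:

```python
def captured_length(wire_length, snaplen=100):
    """Bytes pcap keeps for a packet that was wire_length bytes long."""
    return min(wire_length, snaplen)


for wire_length in (98, 130, 450):
    print('captured %d bytes, truncated to %d bytes'
          % (wire_length, captured_length(wire_length)))
```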
      


    Next step, decode me some headers!

    Tags for this post: python pcapy example
    Related posts: Dear lazy web: writing to the win32 event log in Python; Implementing SCP with paramiko; A pythonic example of recording metrics about ephemeral scripts with prometheus; mbot: new hotness in Google Talk bots; Calculating a SSH host key with paramiko; Twisted conch

posted at: 10:22 | path: /python/pcapy | permanent link to this entry


Tue, 11 Nov 2008



Finding locking deadlocks in python

    I refactored some code today, and in the process managed to create a deadlock for myself. In the end it turned out that an exception was being thrown while a lock was held, so the lock was never released; adding a try / finally resolved the underlying problem. However, in the process I ended up writing this little helper that I am sure will be useful in the future.

      import gflags
      import thread
      import threading
      import traceback
      import logging
      
      ...
      
      FLAGS = gflags.FLAGS
      gflags.DEFINE_boolean('dumplocks', False,
                            'If true, dumps information about lock activity')
      ...
      
      class LockHelper(object):
        """A wrapper which makes it easier to see what locks are doing."""
      
        def __init__(self):
          # One lock per helper, so separate helpers never share a lock.
          self.lock = thread.allocate_lock()
      
        def acquire(self):
          if FLAGS.dumplocks:
            logging.info('%s acquiring lock' % threading.currentThread().getName())
            for s in traceback.extract_stack():
              logging.info('  Trace %s:%s [%s] %s' % s)
          self.lock.acquire()
      
        def release(self):
          if FLAGS.dumplocks:
            logging.info('%s releasing lock' % threading.currentThread().getName())
            for s in traceback.extract_stack():
              logging.info('  Trace %s:%s [%s] %s' % s)
          self.lock.release()
      


    Now I can just use this helper in the place of thread.allocate_lock() when I want to see what is happening with locking. It saved me a lot of staring at random code today.
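
    The same idea carries over to python3, where the thread module is gone. The sketch below is a variation and not the helper above: LoggingLock and its verbose argument are made-up names standing in for the gflags switch, and the context manager methods are an addition so that a "with" block gives the try / finally behaviour that fixed the original deadlock.

```python
import logging
import threading
import traceback


class LoggingLock(object):
    """A lock wrapper that can log who acquires and releases it."""

    def __init__(self, verbose=False):
        self.lock = threading.Lock()
        self.verbose = verbose

    def _dump(self, action):
        if not self.verbose:
            return
        logging.info('%s %s lock', threading.current_thread().name, action)
        for filename, lineno, func, text in traceback.extract_stack():
            logging.info('  Trace %s:%s [%s] %s', filename, lineno, func, text)

    def acquire(self):
        self._dump('acquiring')
        self.lock.acquire()

    def release(self):
        self._dump('releasing')
        self.lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc_value, tb):
        # Runs even when the body raised, so the lock is always released.
        self.release()
        return False
```

    Code that does "with helper:" around its critical section cannot leave the lock held when an exception escapes.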

    Tags for this post: python lock deadlock debug
    Related posts: Implementing SCP with paramiko; Packet capture in python; A pythonic example of recording metrics about ephemeral scripts with prometheus; Interesting technique for finding leaks in code; mbot: new hotness in Google Talk bots; Calculating a SSH host key with paramiko

posted at: 15:46 | path: /python | permanent link to this entry


Sun, 05 Oct 2008



paramiko exec_command timeout

posted at: 12:20 | path: /python/paramiko | permanent link to this entry


Tue, 16 Sep 2008



Weird paramiko problem

    I had a strange paramiko problem the other day. Sometimes executing a command through a channel (via the exec_command() call) would result in an exit code being returned, but no stdout or stderr. This was for a command I was absolutely sure always returns output, and it wasn't consistent -- I'd run batches of commands and about 10% of them would fail, but not always on the same machine and not always at the same time. I spent ages looking at my code, and the code for the command running at the other end of the channel.

    Then it occurred to me that this seemed a lot like a race condition. I started looking at the code for the paramiko Channel class, and ended up deciding that the answer was to check that the eof_received member variable was true before trying to close the channel.

    It turns out this just works. I've had my code running commands for a couple of days now and have had zero more instances of the "no output, but did exit" error. So, there you go. It's a shame that member variable doesn't have accessors and isn't documented though. I guess that makes my code a little more fragile than I would be happy with.
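
    A sketch of that check, not the original code. It only assumes an object with a close() method and the undocumented eof_received attribute described above; close_after_eof, the timeout and the polling interval are all made up for illustration.

```python
import time


def close_after_eof(channel, timeout=10.0, poll_interval=0.05):
    """Wait until the channel has seen EOF, then close it.

    Returns True if EOF arrived before the timeout, False otherwise.
    """
    deadline = time.time() + timeout
    while not channel.eof_received and time.time() < deadline:
        time.sleep(poll_interval)

    saw_eof = bool(channel.eof_received)
    channel.close()
    return saw_eof
```

    A False return is the interesting case to log, since it means the channel was closed without the remote end ever signalling EOF.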

    Tags for this post: python paramiko bug race condition
    Related posts: Implementing SCP with paramiko; Calculating a SSH host key with paramiko; Multiple file support with scp; paramiko exec_command timeout; Executing a command with paramiko; PNGtools 0.4

posted at: 11:41 | path: /python/paramiko | permanent link to this entry


Wed, 03 Sep 2008



Executing a command with paramiko

    I wanted to provide a simple example of how to execute a command with paramiko as well. This is quite similar to the scp example, but is nicer than executing a command in a shell because there isn't any requirement to do parsing to determine when the command has finished executing.

      #!/usr/bin/python
      
      # A simple command example for Paramiko.
      # Args:
      #   1: hostname
      #   2: username
      #   3: command to run
      
      import getpass
      import os
      import paramiko
      import socket
      import sys
      
      # Socket connection to remote host
      sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      sock.connect((sys.argv[1], 22))
      
      # Build a SSH transport
      t = paramiko.Transport(sock)
      t.start_client()
      t.auth_password(sys.argv[2], getpass.getpass('Password: '))
      
      # Start a cmd channel
      cmd_channel = t.open_session()
      cmd_channel.exec_command(sys.argv[3])
      
      data = cmd_channel.recv(1024)
      while data:
        sys.stdout.write(data)
        data = cmd_channel.recv(1024)
      
      # Cleanup
      cmd_channel.close()
      t.close()
      sock.close()
      


    Tags for this post: python paramiko exec
    Related posts: Implementing SCP with paramiko; Calculating a SSH host key with paramiko; Multiple file support with scp; paramiko exec_command timeout; Weird paramiko problem; Packet capture in python

posted at: 15:11 | path: /python/paramiko | permanent link to this entry


Implementing SCP with paramiko

    Regular readers will note that I've been interested in how scp works and paramiko for the last couple of days. There are previous examples of how to do scp with paramiko out there, but the code isn't all on one page: you have to read through the mail thread and work it out from there. I figured I might save someone some time (possibly me!) and note a complete example of scp with paramiko...

      #!/usr/bin/python
      
      # A simple scp example for Paramiko.
      # Args:
      #   1: hostname
      #   2: username
      #   3: local filename
      #   4: remote filename
      
      import getpass
      import os
      import paramiko
      import socket
      import sys
      
      # Socket connection to remote host
      sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      sock.connect((sys.argv[1], 22))
      
      # Build a SSH transport
      t = paramiko.Transport(sock)
      t.start_client()
      t.auth_password(sys.argv[2], getpass.getpass('Password: '))
      
      # Start a scp channel
      scp_channel = t.open_session()
                
      f = file(sys.argv[3], 'rb')
      scp_channel.exec_command('scp -v -t %s\n'
                               % '/'.join(sys.argv[4].split('/')[:-1]))
      scp_channel.send('C%s %d %s\n'
                       %(oct(os.stat(sys.argv[3]).st_mode)[-4:],
                         os.stat(sys.argv[3])[6],
                         sys.argv[4].split('/')[-1]))
      scp_channel.sendall(f.read())
      
      # Cleanup
      f.close()
      scp_channel.close()
      t.close()
      sock.close()
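
    The line sent before the file contents packs three things together: the permission bits in octal, the file size in bytes, and the name the file should get at the far end. A python3 sketch that spells this out; scp_file_header is a hypothetical helper, and it uses the stat module where the script above slices the output of oct():

```python
import os
import stat


def scp_file_header(local_path, remote_name):
    """Build the 'C<mode> <size> <name>' line scp expects for one file."""
    st = os.stat(local_path)
    mode = stat.S_IMODE(st.st_mode)  # permission bits only, e.g. 0o644
    return 'C%04o %d %s\n' % (mode, st.st_size, remote_name)
```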
      


    Tags for this post: python paramiko scp
    Related posts: Multiple file support with scp; Calculating a SSH host key with paramiko; paramiko exec_command timeout; Weird paramiko problem; Executing a command with paramiko; Packet capture in python

posted at: 13:28 | path: /python/paramiko | permanent link to this entry


Tue, 05 Aug 2008



SSL, X509, ASN.1 and certificate validity dates

posted at: 15:53 | path: /python/tlslite | permanent link to this entry


Thu, 10 Jul 2008



Dealing with remote HTTP servers with buggy chunking implementations

    HTTP 1.1 implements chunking as a way of servers telling clients how much content is left for a given request, which enables you to send more than one piece of content in a given HTTP connection. Unfortunately for me, the site I was trying to access has a buggy chunking implementation, and that causes the somewhat fragile python urllib2 code to throw an exception:

      Traceback (most recent call last):
        File "./mythingie.py", line 55, in ?
          xml = remote.readlines()
        File "/usr/lib/python2.4/socket.py", line 382, in readlines
          line = self.readline()
        File "/usr/lib/python2.4/socket.py", line 332, in readline
          data = self._sock.recv(self._rbufsize)
        File "/usr/lib/python2.4/httplib.py", line 460, in read
          return self._read_chunked(amt)
        File "/usr/lib/python2.4/httplib.py", line 499, in _read_chunked
          chunk_left = int(line, 16)
      ValueError: invalid literal for int(): 
      


    I muttered about this earlier today, including finding the bug tracking the problem in pythonistan. However, finding a bug marked "will not fix" wasn't satisfying enough...

    It turns out you can just have urllib2 lie to the server about what HTTP version it talks, and therefore turn off chunking. Here's my sample code for how to do that:

      import httplib
      import urllib2
      
      class HTTP10Connection(httplib.HTTPConnection):
        """HTTP10Connection -- a HTTP connection which is forced to ask for HTTP
           1.0
        """
      
        _http_vsn_str = 'HTTP/1.0'
      
      
      class HTTP10Handler(urllib2.HTTPHandler):
        """HTTP10Handler -- don't use HTTP 1.1"""
      
        def http_open(self, req):
          return self.do_open(HTTP10Connection, req)
      
      # ...
      
        request = urllib2.Request(feed)
        request.add_header('User-Agent', 'mythingie')
        opener = urllib2.build_opener(HTTP10Handler())
        
        remote = opener.open(request)
        content = remote.readlines()
        remote.close()
      


    I hereby declare myself Michael Still, bringer of the gross python hacks.
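
    For anyone on python3, where httplib and urllib2 became http.client and urllib.request, the same hack might look like the sketch below. It leans on the private _http_vsn and _http_vsn_str attributes of http.client.HTTPConnection, which could change without notice, and build_http10_opener is a made-up convenience name. It only covers plain http URLs, not https.

```python
import http.client
import urllib.request


class HTTP10Connection(http.client.HTTPConnection):
    """An HTTP connection which is forced to ask for HTTP 1.0."""

    # Private attributes of HTTPConnection; this is the gross part.
    _http_vsn = 10
    _http_vsn_str = 'HTTP/1.0'


class HTTP10Handler(urllib.request.HTTPHandler):
    """Don't use HTTP 1.1."""

    def http_open(self, req):
        return self.do_open(HTTP10Connection, req)


def build_http10_opener():
    """Return an opener whose plain http requests go out as HTTP/1.0."""
    return urllib.request.build_opener(HTTP10Handler())
```

    The opener is then used exactly as before: build a urllib.request.Request, call opener.open(request), and read from the result.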

    Tags for this post: python urllib2 buggy chunking
    Related posts: Implementing SCP with paramiko; Packet capture in python; A pythonic example of recording metrics about ephemeral scripts with prometheus; mbot: new hotness in Google Talk bots; Calculating a SSH host key with paramiko; Twisted conch

posted at: 22:27 | path: /python | permanent link to this entry


Wed, 09 Jul 2008



Universal Feedparser and XML namespaces

    I've always found python's Universal Feedparser to be a bit hard to work with when using feeds with XML namespaces. Specifically, if you don't care about the stuff in the namespaces then you're fine, but if you want that data it gets a lot harder.

    In the past I've had to do some gross hacks. For example this gem is from the MythNetTV code:

        # Modify the XML to work around namespace handling bugs in FeedParser
        lines = []
        re_mediacontent = re.compile('(.*)<media:content([^>]*)/ *>(.*)')
      
        for line in xmllines:
          m = re_mediacontent.match(line)
          count = 1
          while m:
            line = '%s<media:wannabe%d>%s</media:wannabe%d>%s' %(m.group(1), count,
                                                               m.group(2),
                                                               count, m.group(3))
            m = re_mediacontent.match(line)
            count = count + 1
      
          lines.append(line)
      
        # Parse the modified XML
        xml = ''.join(lines)
        parser = feedparser.parse(xml)
      


    Which is horrible, but works. This time around the problem is that I am having trouble getting to the gr:annotation tags in my Google reader shared items feed. How annoying.

    In the case of the Google reader feed, the problem seems to be that the annotation is presented like this:

      <gr:annotation><content type="html">Awesome. Canberra has needed
      something better than buses between the towncenters for a while, and light rail 
      seems like a great way to do it. I much prefer trains to buses, and catch a 
      light rail service to work every day when I am in Mountain View.
      </content><author gr:user-id="09387883873401903052" 
      gr:profile-id="114835605728492647856"><name>mikal</name>
      </author></gr:annotation>
      


    Feedparser can only handle simple elements (not elements that contain other elements). Therefore, this gross hack is required to get this to parse correctly:

        simplify_re = re.compile('(.*)<gr:annotation>'
                                 '<content type="html">(.*)</content>'
                                 '<author .*><name>.*</name></author>'
                                 '</gr:annotation>(.*)')
      
        new_lines = []
        for line in lines:
          m = simplify_re.match(line)
          if m:
            new_lines.append('%s<gr:annotation>%s</gr:annotation>%s'
                             %(m.group(1), m.group(2), m.group(3)))
          else:
            new_lines.append(line)
      
        d = feedparser.parse(''.join(new_lines))
      


    Gross, and fragile, but working. This is cool, because it now means that I can apply more logic in the shared links that end up in my blather feed. I'm thinking of something along the lines of only shared links with an annotation will end up in that feed, and the blather entry will include the annotation. Or something like that.

    Tags for this post: python feedparser namespace hack
    Related posts: Implementing SCP with paramiko; Packet capture in python; A pythonic example of recording metrics about ephemeral scripts with prometheus; CVS digital cameras and handy cams; mbot: new hotness in Google Talk bots; Calculating a SSH host key with paramiko

posted at: 05:22 | path: /python/feedparser | permanent link to this entry


Tue, 01 May 2007



Domain name lookup helper for python?

    Hi. I have a list of the domain portion of URLs which looks a bit like this:

    Whois lookup for fycnds.digitalpoimt.com
    Whois lookup for wvgpzdea.digitalpoimt.com
    Whois lookup for zhnsht.digitalpoimt.com
    Whois lookup for frigo25.php5.cz
    Whois lookup for handrovina.php5.cz
    Whois lookup for blabota.php5.cz
    Whois lookup for pctuzing.php5.cz
    Whois lookup for viagraviagra.php5.cz
    Whois lookup for poiu.php5.cz
    Whois lookup for flasa.php5.cz
    Whois lookup for yoy4.digitalpoimt.com
    Whois lookup for hskly.digitalpoimt.com
    Whois lookup for 2i0wjwbc.digitalpoimt.com
    Whois lookup for harnhjc.digitalpoimt.com
    Whois lookup for gqru.digitalpoimt.com
    


    I need some code which determines which portion of these hostnames is a whois-able domain name. My problem is this doesn't seem all that simple to do -- some countries have a second layer of TLDs, and some do not.

    Does anyone know of a python library, or failing that simple algorithm, which will do this for me?

    (For those left wondering, I am trying to do some analysis of the spam I get on this blog, and for that I want to know if the whois information for a domain that left a suspect comment indicates anything suspicious.)
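
    One simple algorithm is to keep a set of known suffixes under which registrations happen, find the longest one the hostname ends with, and keep exactly one more label to its left. A python3 sketch of that idea follows. The suffix set is a tiny hand-picked illustration and nowhere near complete, and registrable_domain is a hypothetical name; a real version would need a maintained suffix list.

```python
# Illustrative only: a real list has thousands of entries.
ILLUSTRATIVE_SUFFIXES = frozenset(['com', 'cz', 'au', 'com.au', 'uk', 'co.uk'])


def registrable_domain(hostname, suffixes=ILLUSTRATIVE_SUFFIXES):
    """Return the whois-able part of hostname, or None if none is found."""
    labels = hostname.lower().rstrip('.').split('.')

    # Walk from the longest candidate suffix to the shortest, so that
    # 'com.au' wins over 'au' when both are known.
    for i in range(1, len(labels)):
        if '.'.join(labels[i:]) in suffixes:
            # Keep the suffix plus the one label in front of it.
            return '.'.join(labels[i - 1:])

    return None
```

    With this, fycnds.digitalpoimt.com reduces to digitalpoimt.com and frigo25.php5.cz reduces to php5.cz, while a bare suffix such as com gives None.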

    Tags for this post: python internet whois lookup
    Related posts: I think I've worked out the problem with the hotel network; Internet access in Perth; iBurst: Qantas Club Sydney domestic terminal two; Implementing SCP with paramiko; iBurst: Coverage in Canberra still sucks; Internet outage

posted at: 21:00 | path: /python | permanent link to this entry


Thu, 14 Dec 2006



Dear lazy web: writing to the win32 event log in Python

posted at: 23:01 | path: /python | permanent link to this entry


Mon, 08 May 2006



Twisted conch

posted at: 09:55 | path: /python/twisted | permanent link to this entry