Content here is by:
Michael Still
mikal@stillhq.com

Thu, 04 Dec 2008



Blog spam

    I occasionally complain about blog spam, but it seems worth a special mention that there have now been over 400,000 spam comments on stillhq.com. Specifically, the site currently tells me "1335 comments today, 1335 of them spam. 401209 comments ever, 400220 of the spam." Dear spammers -- you suck.

    I wonder if there is anything useful which can be done with all this spam? Just in case, it's available at http://www.stillhq.com/allcomments/.

    Tags for this post: blog(S) spam(S)

posted at: 13:14 | path: /diary/spam | permanent link to this entry


Internet traffic

    I estimate (badly, I might add) that I currently use about 200 GB of Internet traffic a month on my DSL link. If I'm going to move back to Australia sometime, that's going to become a killer. Unfortunately, because my ISP doesn't bill for traffic here in the US, they don't appear to track my use at all. I think it might be time for me to do some tracking myself.

    So, one of life's little questions. Do I use pcap to snarf traffic on the DSL, or use iptables' conntrack stuff in /proc? Just one more thing to ponder.

    Tags for this post: blog(S)

posted at: 11:34 | path: /diary | permanent link to this entry


Wed, 03 Dec 2008



Blathering for Wednesday, 03 December 2008

    09:33: Mikal shared: Exploding Offer Season
      I hadn't heard of an exploding offer until I moved to the US. The other thing I would suggest doing if you're subject to one is to make sure that the recruiters at the other companies know that you've been made an exploding offer. They shouldn't view it as a threat (these things are quite common) and will quite often rearrange your interviews so that they can make you an offer before your other one explodes. Microsoft seems to be the big user of these exploding offers as best as I can tell.

    09:33: Mikal shared: Swiss precision
      It sounds like moving to Switzerland is about as much of a pain as it was moving to the US. Perhaps it's painful to move to any country?

    09:33: Mikal shared: Register of penalty notices | NSW Food Authority
      Scary. A list of restaurants in New South Wales which have been served with a penalty notice for violating food safety standards. I should remember to check this next time I eat out...

    09:33: Mikal shared: Apple's completely unsurprising Black Friday deals appear on Australian site
      Why is Apple running Thanksgiving sales in Australia? Will they be running a Queen's Birthday sale in the US soon?

    09:33: Mikal shared: Notes on Hacking the Roku Netflix Player
      This fellow has made some interesting progress on hacking the Roku Netflix player. I wonder if Roku have considered allowing a streaming frontend that either does UPnP or MythTV directly?

    09:33: Mikal shared: InstallMythBuntu - atv-bootloader - Google Code
      An alternative to Roku's box as a MythTV frontend is the AppleTV, which does currently work. This page is the install instructions for MythBuntu on an AppleTV. Pity it's twice the price of the Roku box.



    Tags for this post: blather(S)

posted at: 09:33 | path: /blather | permanent link to this entry


Tue, 02 Dec 2008



Death Bringer




    ISBN: 0747400016
    LibraryThing
    I feel sorry for Jake Olsen. It must be hard having the ensign-in-the-red-shirt role. This book is as good as the rest of the series, and the First Family truly are bastards. I really liked this one.

    Tags for this post: book(S) Patrick_Tilley(S)


posted at: 21:22 | path: /book/Patrick_Tilley | permanent link to this entry


Sun, 30 Nov 2008



Books read in November 2008

posted at: 19:15 | path: /book/read | permanent link to this entry


Sat, 29 Nov 2008



Blood River




    ISBN: 0747400008
    LibraryThing
    This is the fourth book in the Amtrak Wars series. It's pretty good, about the same level of writing as Cloud Warrior and Iron Master, which I guess means it was better than First Family (which was mostly a connector between the first and third books in the series). In this book we learn that the First Family are even more nasty than previously disclosed, and that Cadillac is possibly the most annoying person on the planet.

    This book must not have been very popular in the US, because I've never seen it for sale here, but it's reasonably common in Australia. That's a shame because the copy I have is about to fall apart, which means I'll have to wait until next time I am back home to try and find a replacement copy.

    Overall this book was quite readable, and I enjoyed it.

    Tags for this post: book(S) Patrick_Tilley(S)


posted at: 16:17 | path: /book/Patrick_Tilley | permanent link to this entry


Wed, 26 Nov 2008



Isaac Asimov's Robot City: Suspicion




    ISBN: 0441731260
    LibraryThing
    This is the second book in the Isaac Asimov's Robot City series, and follows on directly from Odyssey. In fact, it follows so closely that it feels like it should be part of that earlier book.

    I preferred this book to the first in the series, I suspect because it didn't need to use a random unexplained change to escape a dying plot line (which is what I felt happened about a third of the way through the first book). This book does feel a little juvenile though, but I forgive it.

    Tags for this post: book(S) Mike_McQuay(S)


posted at: 15:21 | path: /book/Mike_McQuay | permanent link to this entry


PNGtools 0.4

posted at: 15:16 | path: /pngtools | permanent link to this entry


Tue, 25 Nov 2008



Isaac Asimov's Robot City: Odyssey




    ISBN: 0441731228
    LibraryThing
    This is the first in a series of robot stories endorsed by Isaac Asimov. I enjoyed the first third of the book more than the last two thirds, mainly because I found the last two thirds a little hard to believe. Interestingly they were hard to believe in a similar manner to some of the Stainless Steel Rat books (such as The Stainless Steel Rat Saves the World, The Stainless Steel Rat Wants You, and The Stainless Steel Rat Goes to Hell). I won't get too specific, because I don't want to spoil the plot.

    This book felt kinda juvenile as well -- the plot lacked depth in my opinion. On the other hand, I did enjoy reading it, and it was better than I expected it to be.

    Tags for this post: book(S) Michael_P_Kube_McDowell(S)


posted at: 17:39 | path: /book/Michael_P_Kube_McDowell | permanent link to this entry


Packet capture in python

    I'm home sick with a cold today and got bored. I wanted to play with packet capture in Python, and the documentation for pcapy is a little sparse. I therefore wrote this simple little sample script:

      #!/usr/bin/python
      
      # A simple example of how to use pcapy. This needs to be run as root.
      
      import datetime
      import gflags
      import pcapy
      import sys
      
      FLAGS = gflags.FLAGS
      gflags.DEFINE_string('i', 'eth1',
                           'The name of the interface to monitor')
      
      
      def main(argv):
        # Parse flags
        try:
          argv = FLAGS(argv)
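        except gflags.FlagsError, e:
          # Report the problem along with the flag help, and give up
          print '%s\nUsage: %s ARGS\n%s' % (e, sys.argv[0], FLAGS)
          sys.exit(1)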
      
        print 'Opening %s' % FLAGS.i
      
        # Arguments here are:
        #   device
        #   snaplen (maximum number of bytes to capture _per_packet_)
        #   promiscuous mode (1 for true)
        #   timeout (in milliseconds)
        cap = pcapy.open_live(FLAGS.i, 100, 1, 0)
      
        # Read packets -- header contains information about the data from pcap,
        # payload is the actual packet as a string
        (header, payload) = cap.next()
        while header:
          print ('%s: captured %d bytes, truncated to %d bytes'
                 %(datetime.datetime.now(), header.getlen(), header.getcaplen()))
      
          (header, payload) = cap.next()
      
      
      if __name__ == "__main__":
        main(sys.argv)
      


    Which outputs something like this:

      2008-11-25 10:09:53.308310: captured 98 bytes, truncated to 98 bytes
      2008-11-25 10:09:53.308336: captured 66 bytes, truncated to 66 bytes
      2008-11-25 10:09:53.315028: captured 66 bytes, truncated to 66 bytes
      2008-11-25 10:09:53.316520: captured 130 bytes, truncated to 100 bytes
      2008-11-25 10:09:53.317030: captured 450 bytes, truncated to 100 bytes
      2008-11-25 10:09:53.324414: captured 124 bytes, truncated to 100 bytes
      2008-11-25 10:09:53.327770: captured 114 bytes, truncated to 100 bytes
      2008-11-25 10:09:53.328001: captured 210 bytes, truncated to 100 bytes
      


    Next step, decode me some headers!
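
    A minimal sketch of what that next step might look like, assuming the capture is Ethernet II carrying IPv4 and using only the standard library. It is written for Python 3, unlike the script above, and the function names and the hand-built sample frame are made up for illustration; they are not part of pcapy. The 100 byte snaplen used above is enough to cover the 14 byte Ethernet header plus the fixed 20 byte IPv4 header.

```python
import socket
import struct


def format_mac(raw):
    """Render six raw bytes as a colon separated MAC address."""
    return ':'.join('%02x' % octet for octet in bytearray(raw))


def decode_ethernet(frame):
    """Return (dst_mac, src_mac, ethertype, rest) for an Ethernet II frame."""
    dst, src, ethertype = struct.unpack('!6s6sH', frame[:14])
    return format_mac(dst), format_mac(src), ethertype, frame[14:]


def decode_ipv4(packet):
    """Decode the fixed 20 byte IPv4 header. Options are ignored."""
    (ver_ihl, _tos, total_len, _ident, _frag, ttl, protocol, _checksum,
     src, dst) = struct.unpack('!BBHHHBBH4s4s', packet[:20])
    return {
        'version': ver_ihl >> 4,
        'header_len': (ver_ihl & 0x0f) * 4,
        'total_len': total_len,
        'ttl': ttl,
        'protocol': protocol,
        'src': socket.inet_ntoa(src),
        'dst': socket.inet_ntoa(dst),
    }


# A hand-built sample frame, standing in for a payload from cap.next()
SAMPLE_FRAME = (
    b'\xff\xff\xff\xff\xff\xff'          # destination MAC
    b'\x00\x11\x22\x33\x44\x55'          # source MAC
    b'\x08\x00'                          # ethertype 0x0800 means IPv4
    + struct.pack('!BBHHHBBH4s4s', 0x45, 0, 40, 0, 0, 64, 6, 0,
                  socket.inet_aton('192.168.1.10'),
                  socket.inet_aton('10.0.0.1')))

dst_mac, src_mac, ethertype, rest = decode_ethernet(SAMPLE_FRAME)
if ethertype == 0x0800:
    print(src_mac, '->', dst_mac, decode_ipv4(rest))
```

    In the capture loop, the payload returned by cap.next() would take the place of SAMPLE_FRAME.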

    Tags for this post: python(S) pcapy(S)

posted at: 10:22 | path: /python/pcapy | permanent link to this entry


Mon, 24 Nov 2008



Daggerspell




    ISBN: 0553565214
    LibraryThing
    This is yet another book I read as a kid and liked. I must admit that I find the Celtic names used throughout the book to be quite confusing, especially when there are multiple similar names in use at the same time. Despite that, I really enjoyed this book -- even though it's a pretty classic formula story.

    Tags for this post: book(S) Katharine_Kerr(S)


posted at: 18:01 | path: /book/Katharine_Kerr | permanent link to this entry


Blathering for Monday, 24 November 2008

    12:15: Mikal shared: SOCKS - Wikipedia, the free encyclopedia
      Huh. I didn't realize there is a SOCKS proxy built into OpenSSH. Now if only there was a way to create new port forwards after the connection is opened.

    14:52: The internets strike again. I am now assured in the comments to this post that you can in fact add a new port forward to an existing ssh connection. Next, can someone tell me how to get ssh to make me a cup of tea?


    Tags for this post: blather(S)

posted at: 14:52 | path: /blather | permanent link to this entry


Sat, 22 Nov 2008



Foundation and Earth




    ISBN: 0586071105
    LibraryThing
    I really like how Asimov wraps up the extended Foundation series. Specifically, I'd previously complained while reading Pebble in the Sky that it was hard to believe that everyone simply forgets that they originated on Earth -- this book and Foundation's Edge go a long way to resolving that annoyance for me. It's also good to finally find out what happened to Aurora and Solaria -- especially given the Solaria mystery has been bothering me since Robots and Empire.

    Speaking just about this book for a moment, I do find the use of sex as a plot development method quite odd. There are three examples that bother me -- the first is when Bliss is slipped through interstellar customs with the explanation that she's just a whore and therefore not important enough to make an issue of; the second is when Trevize basically shags his way out of an awkward situation, despite the other protagonist being quite hostile initially; and the third is where he bonks someone on a rural world. I find all three of those incidents a little out of place with the rest of the book, and in fact the rest of the series. Other authors use those kinds of plot elements, but they seem out of place in Asimov's work.

    Overall, I loved this book and it was a good conclusion to the series.

    Tags for this post: book(S) Isaac_Asimov(S)


posted at: 12:05 | path: /book/Isaac_Asimov | permanent link to this entry


Fri, 21 Nov 2008



Blathering for Friday, 21 November 2008

    09:45: Mikal shared: Buy One Dodge Ram, Get One Free [Deals]
      You know the US auto industry is in trouble when they start offering buy one get one free deals on cars.

    15:00: Mikal shared: Article about backyard chicken owners
      I didn't realize that other people found chickens entertaining too. I figured it was just me. There is nothing more entertaining than throwing a mound of kitchen scraps into the coop and then watching the chickens argue over a banana peel. It's hard to explain... Perhaps when I move back to Australia I'll set up ChickenCam.



    Tags for this post: blather(S)

posted at: 15:00 | path: /blather | permanent link to this entry


Wed, 19 Nov 2008



Blathering for Wednesday, 19 November 2008

posted at: 22:00 | path: /blather | permanent link to this entry


Mon, 17 Nov 2008



Foundation's Edge




    ISBN: 0586058397
    LibraryThing
    I'm back to reading Foundation Series books actually written by Isaac Asimov. This one is the fourth in the Foundation Series if you count them in the order they were written, but is the second last in chronological terms. It's set 500 years after the failure of the first galactic empire, and follows the first Foundation's attempt to discover if the second Foundation still exists. Well, it's a bit more complicated than that, but I don't want to ruin it for you.

    As an aside, the user interface described for the ship's computer is really cool. It's a bit like augmented reality, mixed with gesture control, mixed with a direct interface into the brain. I'm not saying I want one in my house, but it's cool that a book written in 1983 still has a user interface description which isn't dated, and still seems plausible.

    This book has minor inconsistencies with the story presented in the second foundation trilogy (Foundation's Fear, Foundation and Chaos and Foundation's Triumph), but I see that more as a failure in those followup authors than in this book. In fact, I've already complained about how untrue to Asimov's vision some of those books are elsewhere.

    This is a good read, and I enjoyed it greatly.

    Tags for this post: book(S) Isaac_Asimov(S)


posted at: 18:40 | path: /book/Isaac_Asimov | permanent link to this entry


Blathering for Monday, 17 November 2008

posted at: 16:15 | path: /blather | permanent link to this entry


Automatically creating folders for mailing lists

    I've been using some simple procmail rules to automatically create folders for mailing lists for ages. Tony asked me for those rules today, so I figured I'd just put them online.

      ##########################################################################
      # Mailman
      
      :0:
      * List-Id:.*<\/[^>]*
      $MATCH
      
      :0:
      * List-Post: <mailto:\/[^>]*
      $MATCH
      
      ##########################################################################
      # Majordomo lists (sometimes don't have <>'s around the address)
      
      :0:
      * X-Mailing-List:.*<\/[^>]*
      $MATCH
      
      :0:
      * X-Mailing-List:.*\/.*
      $MATCH
      
      ##########################################################################
      # Ezmlm
      
      :0:
      * Mailing-List: .* \/[^ ;]*
      $MATCH
      
      ##########################################################################
      # I'm not sure what creates this one...
      
      :0:
      * X-Loop: \/.*
      $MATCH
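
    For anyone unfamiliar with procmail's \/ marker: whatever the regular expression matches to the right of it is stored in $MATCH, which these recipes then use as the folder name. The following is only a rough Python 3 illustration of that extraction, not a replacement for the recipes. The capture groups, the ^ anchors, the non-greedy .*? and the sample headers are all assumptions made for the sketch, and procmail's own matching rules may differ in the details.

```python
import re

# Python stand-ins for three of the recipes above. The capture group plays
# the part of everything to the right of procmail's \/ marker.
# Assumptions: the ^ anchors and the non-greedy .*? are additions made for
# this sketch, and procmail's own matching rules may differ in the details.
RULES = [
    re.compile(r'^List-Id:.*?<([^>]*)', re.IGNORECASE | re.MULTILINE),
    re.compile(r'^X-Mailing-List:.*?<([^>]*)', re.IGNORECASE | re.MULTILINE),
    re.compile(r'^Mailing-List: .*? ([^ ;]*)', re.IGNORECASE | re.MULTILINE),
]


def folder_for(headers):
    """Return the folder name the first matching rule would pick, or None."""
    for rule in RULES:
        found = rule.search(headers)
        if found:
            return found.group(1)
    return None


# Made-up headers for illustration
MAILMAN = ('From: someone@example.org\n'
           'List-Id: Example talk <talk.lists.example.org>\n')
EZMLM = 'Mailing-List: contact talk-help@example.org; run by ezmlm\n'

print(folder_for(MAILMAN))
print(folder_for(EZMLM))
```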
      


    Tags for this post: procmail(S)

posted at: 14:59 | path: /procmail | permanent link to this entry


Sun, 16 Nov 2008



Wanted: someone to edit / review some MythTV stuff for me

    I'm looking for someone with solid MythTV experience and a good grasp of the English language to help me out with a project. All I can promise in return is glory, and that will be proportional to the eventual success of the project. If you're interested in spending some time (probably around 40 hours or so, spread over a couple of months) on such a project drop me a line.

    Tags for this post: mythtv(S)

posted at: 19:00 | path: /mythtv | permanent link to this entry


Sat, 15 Nov 2008



Andrew and Matthew turn 2 and 3

    Wow, these pictures are really old. I'm in the process of going through my photo collection and regenerating all the associated HTML. In the process I found these pictures of Andrew and Matthew's shared birthday party from when they turned 3 and 2 respectively. Heck, it's only three years late, so I guess I should put them online.

    Looking back, I think I forgot to put these online because about that time I was run off my feet with LCA 2005. That's my excuse and I'm sticking to it.

                                           


    Tags for this post: events(S) pictures(S) 20050314(S)

posted at: 21:34 | path: /events/pictures/20050314 | permanent link to this entry


Blathering for Saturday, 15 November 2008

posted at: 19:30 | path: /blather | permanent link to this entry


The Riftwar Series

    This series follows two young kids who grow up in a rural castle on the edge of an empire. Pug ends up being the greatest magician to ever live, and Tomas ends up being merged with an ancient being of massive power. It's a good series, even if people accuse it of being steeped in cliche.

    Year  Title
    1982  Magician (later sold as Magician: Apprentice and Magician: Master)
    1985  Silverthorn
    1986  A Darkness at Sethanon


    Tags for this post: book(S) Raymond_E_Feist(S)

posted at: 14:39 | path: /book/Raymond_E_Feist | permanent link to this entry


A Darkness at Sethanon




    ISBN: 0553263285
    LibraryThing
    This book took longer to read than I would have liked, because I have been busy with other things. It's a good book though, and a fine conclusion to the Riftwar Series. I liked this book a lot, although I do think that Magician (Apprentice and Master) was a better book.

    Tags for this post: book(S) Raymond_E_Feist(S)


posted at: 14:33 | path: /book/Raymond_E_Feist | permanent link to this entry


Fri, 14 Nov 2008



Blathering for Friday, 14 November 2008

posted at: 22:29 | path: /blather | permanent link to this entry


On a memeomatic

    I'm on vacation today, and so I had a bit more time than usual to just think. So, when Jeff posited a meme detector for planets, I wrote one. Except it's of course never just that simple. My initial implementation only took a few minutes to write, but sucked.

    What I did was I wrote a script which scanned through the list of posts from the planet's RSS feed, and kept a tally of which sequences of words (let's call them sentences, even though they're not) appear in which posts. Then, if a sentence appears in more than four posts, and those posts are from at least two domains, we've found a meme.
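
    Boiled down, that first pass looks something like the following Python 3 sketch. It is a simplified stand-in rather than the real script (which appears further down): the posts are passed in as a plain dictionary of URL to text instead of being fetched from a feed, and the function and constant names are made up for illustration.

```python
from collections import defaultdict

SENTENCE_LENGTH = 5   # words per "sentence" window
MIN_POSTS = 4         # a sentence must appear in more than this many posts
MIN_DOMAINS = 2       # ... spread over at least this many domains


def domain_of(url):
    """Return the host part of a URL, for example planet.example.org."""
    return url.split('://', 1)[-1].split('/')[0]


def find_shared_sentences(posts):
    """posts maps url -> plain text. Returns {sentence: [urls]} for memes."""
    seen = defaultdict(list)

    # One pass over the posts, tallying which posts contain each window
    for url, text in posts.items():
        words = text.lower().split()
        for i in range(len(words) - SENTENCE_LENGTH + 1):
            key = ' '.join(words[i:i + SENTENCE_LENGTH])
            if url not in seen[key]:
                seen[key].append(url)

    # Keep only the windows shared widely enough to count as a meme
    return {sentence: urls for sentence, urls in seen.items()
            if len(urls) > MIN_POSTS
            and len({domain_of(u) for u in urls}) >= MIN_DOMAINS}


# Made-up posts: five share one window of words, across two domains
SAMPLE_POSTS = {
    'http://a.example.org/1': 'intro alpha grab the nearest book now',
    'http://a.example.org/2': 'intro bravo grab the nearest book now',
    'http://a.example.org/3': 'intro charlie grab the nearest book now',
    'http://b.example.org/1': 'intro delta grab the nearest book now',
    'http://b.example.org/2': 'intro echo grab the nearest book now',
}

print(find_shared_sentences(SAMPLE_POSTS))
```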

    That's actually a reasonable algorithm. Its big advantage is that it only has to take one pass through the posts, which means its order is linear -- O(n). Now, the problem with that algorithm is that there are small differences in some of the sentences (for example people mistype a sentence), and I ended up finding too many copies of the same meme.

    Here's some sample output from that version:



    If you look at those you'll see that they're all the same meme, but the code found it three different ways. I need an algorithm which accurately finds the meme only once.

    I should stop here and mention that I think this problem would be an excellent interview question. If you were going to ask the question in an interview you'd probably phrase it more as:

    Given a list of strings, find substrings repeated between the strings, and return a list of the substrings and the strings containing them.


    When the problem is phrased like that, I am sure that some folk think of an algorithm which compares each string with each other, looks for some sort of largest substring between the two, and then builds a table of those. However, the problem with that is that the order would be O(N^2), which is ok for a planet RSS feed, but wouldn't be so great if the set of strings you wanted to compare was something like every page on the Internet.
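
    For comparison, a minimal Python 3 sketch of that pairwise approach, using difflib from the standard library to find the longest common substring of each pair. The function names, the min_len cut-off and the sample strings are assumptions for illustration. The loop over every pair of strings is where the O(N^2) comes from.

```python
from difflib import SequenceMatcher
from itertools import combinations


def longest_shared(a, b):
    """Return the longest substring that a and b have in common."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


def pairwise_shared(strings, min_len=10):
    """Return {substring: set of indexes of the strings containing it}."""
    table = {}
    # Every string is compared with every other one: N * (N - 1) / 2 pairs
    for (i, a), (j, b) in combinations(enumerate(strings), 2):
        common = longest_shared(a, b).strip()
        if len(common) >= min_len:
            table.setdefault(common, set()).update((i, j))
    return table


# Made-up strings for illustration
SAMPLE_STRINGS = [
    'I say grab the nearest book today',
    'please grab the nearest book now',
    'nothing in common here',
]

print(pairwise_shared(SAMPLE_STRINGS))
```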

    Anyway, I think it's possible to rescue my initial implementation by providing a final pass which checks if matches overlap and combines them if they do. For example, if the only difference between two detected memes is one post, then they're probably the same meme and can be combined.

    That's an interesting problem in itself. It's easy to measure the difference in the list of matching posts for two memes, but that comparison is O(N^2), which I just said was a bad thing. However, this is a vacation day and I couldn't think of anything better, so that's what I ended up using. I guess I'll wait for a smart interview candidate to think of a better way for me.
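
    Stripped of the bookkeeping, that combining pass can be sketched in Python 3 as follows. This is a simplified illustration rather than the real code below: candidate memes are plain (sentence, urls) pairs, and the names merge_memes and max_delta are made up.

```python
def list_difference(l1, l2):
    """Count the entries of l2 that are missing from l1."""
    return len([item for item in l2 if item not in l1])


def merge_memes(candidates, max_delta=1):
    """Combine candidates whose post lists differ by at most max_delta posts.

    candidates is a list of (sentence, urls) pairs. The result is a list of
    (sentences, urls) pairs, one per combined meme.
    """
    merged = []
    for sentence, urls in candidates:
        # Each candidate is compared against every meme found so far,
        # which is what makes this pass quadratic in the worst case
        for sentences, known_urls in merged:
            if list_difference(known_urls, urls) <= max_delta:
                sentences.append(sentence)
                new_urls = [u for u in urls if u not in known_urls]
                known_urls.extend(new_urls)
                break
        else:
            merged.append(([sentence], list(urls)))
    return merged


# Made-up candidates: the first two are the same meme found two ways
SAMPLE_CANDIDATES = [
    ('grab the nearest book open', ['a', 'b', 'c', 'd', 'e']),
    ('the nearest book open it', ['a', 'b', 'c', 'd', 'e', 'f']),
    ('something else entirely shared here', ['v', 'w', 'x', 'y', 'z']),
]

for sentences, urls in merge_memes(SAMPLE_CANDIDATES):
    print(sentences, urls)
```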

    You can see output from memeomatic in this blather post for today. The blather code I wrote a while ago makes it really easy to post messages to my site, which is why I've reused it here (you just call a method on a python module, and a pre-existing Rube Goldberg machine takes care of the rest).

    My code:

      import feedparser
      import os
      import re
      import shelve
      import sys
      import unicodedata
      import urllib
      
      
      _SENTENCE_LENGTH = 5
      
      
      def Normalize(value):
        normalized = unicodedata.normalize('NFKD', unicode(value))
        normalized = normalized.encode('ascii', 'ignore')
        return normalized
      
      
      def ListDifference(l1, l2):
        delta = []
        
        for l in l2:
          if l not in l1:
            delta.append(l)
      
        return len(delta)
      
      
      plugins_dir = '%s/plugins' % os.getcwd()
      print 'Appending %s to module path' % plugins_dir
      sys.path.append(plugins_dir)
      import blather
      
      data = shelve.open('memes.slf', writeback=True)
      data.setdefault('sentences', {})
      data.setdefault('titles', {})
      data.setdefault('content', {})
      data.setdefault('content_orig', {})
      data.setdefault('memes', [])
      
      ds = blather.DataStore()
      
      changed = False
      
      # Scan feeds, looking for new posts. This just populates the database.
      for feed in data['feeds']:
        print
        print 'Fetching %s' % feed
        d = feedparser.parse(feed)
      
        # Newest entries are first
        entries = d.entries
        entries.reverse()
        
        for ent in entries:
          print '  Considering %s' % ent.title
          data['titles'][ent.link] = ent.title
          
          content = Normalize(ent.description)
          data['content_orig'][ent.link] = content
          content = ' '.join(content.split('\n'))
          content = re.sub('<[^>]*>', '', content)
          content = re.sub('[^\w]+', ' ', content)
          content = content.lower()
          data['content'][ent.link] = content
      
          words = content.split()
          # The + 1 makes sure the final window of words is included
          for i in range(len(words) - _SENTENCE_LENGTH + 1):
            key = ' '.join(words[i:i + _SENTENCE_LENGTH])
            data['sentences'].setdefault(key, [])
      
            if ent.link not in data['sentences'][key]:
              data['sentences'][key].append(ent.link)
      
      # Now we have a database of sentences and the posts which share them. What we
      # really want is a collection of shared sentences that form a meme, and the
      # posts which contain those sentences.
      for sentence in data['sentences']:
        found = False
      
        if len(data['sentences'][sentence]) > 4:
          domains = {}
          
          # It's possible that they're all from one domain...
          for url in data['sentences'][sentence]:
            # strip() removes a set of characters rather than a prefix, so
            # split off the scheme explicitly to get the host name
            domain = url.split('://', 1)[-1].split('/')[0]
            domains[domain] = True
      
          # It's not a meme unless the sentence is shared by more than four posts.
          # Try to find an existing meme which contains these posts.
          for (sentences, urls, published) in data['memes']:
            if not found and ListDifference(urls, data['sentences'][sentence]) < 2:
              data['memes'].remove((sentences, urls, published))
      
              if sentence not in sentences:
                sentences.append(sentence)
      
              new_titles = []
              for u in data['sentences'][sentence]:
                if not u in urls:
                  urls.append(u)
                  new_titles.append('<a href="%s">%s</a>'
                                    %(u, data['titles'][u]))
      
              data['memes'].append((sentences, urls, published))
              found = True
      
              if published and new_titles:
                print 'Added posts to an existing meme'
                ds.AddMessage('Memeomatic extended an existing meme: %s'
                              % ', '.join(new_titles))
                changed = True
      
          if not found and len(domains) > 1:
            print ('Created a new meme for "%s" with %s'
                   %(sentence, data['sentences'][sentence]))
            data['memes'].append(([sentence], data['sentences'][sentence], False))
      
      # Publish new memes
      # Iterate over a copy, because the loop body modifies data['memes']
      for meme in list(data['memes']):
        (sentences, urls, published) = meme
        if not published:
          titles = []
          for url in urls:
            titles.append('<a href="%s">%s</a>' %(url, data['titles'][url]))
          
          ds.AddMessage('Memeomatic found a new meme: %s' % ', '.join(titles))
          data['memes'].remove((sentences, urls, published))
          data['memes'].append((sentences, urls, True))
          print 'Published a new meme'
          changed = True
      
      if changed:
        ds.Save()
      data.close()
      


    So there you go. I haven't set this as a cron job yet, as I want to babysit it for a while to make sure it's doing the right thing. I might one day get around to trusting it enough to just turn it on.

    Tags for this post: meme(S)

posted at: 22:04 | path: /meme | permanent link to this entry


It seems planet is a bit too trusting with dates?

    It seems that planet is a bit too trusting with dates. For example, if you have a post with a date well into the future, then you can keep that post at the top of the planet output until that date comes around. It's interesting that no one has used that maliciously yet.

    You can see an example of what I'm talking about at Planet Linux Australia, where some forward dated posts sit at the top of the page...
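
    One possible guard, sketched in Python 3, would be for an aggregator to distrust any date that is further in the future than some tolerance, and to sort such a post by the time it was first fetched instead. This is an assumption about how a fix could look, not a description of how planet actually works, and the function name, the tolerance and the sample posts are made up.

```python
from datetime import datetime, timedelta


def effective_date(claimed, fetched, tolerance=timedelta(hours=1)):
    """Return the date to sort a post by.

    claimed is the date from the feed, fetched is when the aggregator
    first saw the post. Dates further ahead than tolerance are distrusted.
    """
    if claimed > fetched + tolerance:
        return fetched
    return claimed


# Made-up posts, all first seen at the same time
FETCHED = datetime(2008, 11, 14, 21, 0)
SAMPLE_POSTS = [
    ('honest post', datetime(2008, 11, 14, 20, 30)),
    ('forward dated post', datetime(2009, 6, 1, 0, 0)),
    ('older post', datetime(2008, 11, 13, 9, 0)),
]

ordered = sorted(SAMPLE_POSTS,
                 key=lambda post: effective_date(post[1], FETCHED),
                 reverse=True)
print([title for title, _ in ordered])
```

    With a guard like this the forward dated post is treated as if it were published when it was fetched, so it sinks down the page as newer posts arrive, just like any other post.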

    Tags for this post: blog(S)

posted at: 21:55 | path: /diary | permanent link to this entry


Book meme du jour

    I don't normally get involved in this whole meme thing, but I want to test memeomatic some more. So, here goes...

    Instructions:
    • Grab the nearest book.
    • Open it to page 56.
    • Find the fifth sentence.
    • Post the text of the sentence in your journal along with these instructions.
    • Don't dig for your favorite book, the cool book, or the intellectual one: pick the CLOSEST.


    So, I'm currently reading A Darkness at Sethanon, which means it's close to hand. The sentence is "They are correct as written, Commander."

    Tags for this post: meme(S)

posted at: 21:29 | path: /meme | permanent link to this entry


Thu, 13 Nov 2008



Bypassing Australia's imminent internet filter

    Paul has thoughts on how to avoid Rudd's internet filter. I am left wondering why he doesn't just suggest Tor though. It's designed for exactly this sort of censorship, requires no account in another country, and is cross-platform. The only catch is that Tor does block some traffic (for example BitTorrent), so you can't just use it for all your traffic.

    Tags for this post: blog(S)

posted at: 21:25 | path: /diary | permanent link to this entry


Another dynamic element to the site

    I got adventurous tonight, and whipped up some JavaScript which updates the sentence at the end of each post listing how many comments that post has. This means that the site is always up to date, even though all the HTML is static files on disk. It also means I can finally kill that silly hourly regenerate cron job.

    Oh, and this is post 3,000 on this site.

    Tags for this post: site(S)

posted at: 19:55 | path: /site | permanent link to this entry