Content here is by:
Michael Still
mikal@stillhq.com

Mon, 08 Feb 2010



Measuring the popularity of SMTP server implementations on the Internet

    I'm interested in measuring the performance of SMTP servers connected to the Internet. Before I can poke around inside an SMTP implementation, I want to ensure that I am using one which lots of people use. To that end I have been running a series of SMTP server surveys for the last several years. This work has been alluded to in the past, but I haven't published any results. This has mainly been because, while I have written a number of papers on the topic, I am yet to have one accepted by an academic conference. I've been hesitant to comment about my results because of the requirement that academic publications not be previously published work.

    I've decided to change that policy. I'm going to reserve a lot of the deeper analysis for academic publication (if I can make such a thing happen), but I am going to start talking about the work I am doing more in public. To start that off, I should mention what I've been doing...

    There have been a number of previous surveys of SMTP servers connected to the Internet, each using a different methodology. Although the results are therefore not directly comparable, a comparison still provides some insight into how the server landscape has changed over the last 12 years. A comparison of published surveys is presented in the table below. For each survey the table shows the sample size, which is the number of IP addresses surveyed; the sample method, which is the methodology used to determine which IP addresses to sample, and which introduces bias into the sampling; and the number of responses, which is the number of SMTP servers that responded. The majority of these surveys have relied on random sampling of the IP address space, perhaps with a selection algorithm to limit the results selected. Few of the more recent surveys provide complete information on their probing implementation or the rules they used to identify specific implementations from their observations. It should be noted that non-response from a surveyed IP generally indicates that it is not in fact running an SMTP server accessible from the Internet.

    Date          Surveyor      Sample size   Sample method            Responses
    7 Nov 1996    Bernstein     500,000       Selective random         25,121
    14 Aug 1997   Bernstein     200,000       Selective random         8,056
    11 May 1998   Bernstein     20,310        MX walk                  17,592
    2 Apr 2000    Bernstein     12,595        Selective random         10,087
    5 Oct 2000    Bernstein     25,777        Random                   859
    27 Sep 2001   Bernstein     39,206        Random                   937
    1 Dec 2002    Credentia     4,096         Random                   1,837
    1 Jan 2003    Credentia     30,000        Random                   17,540
    1 Apr 2003    Credentia     37,563        Random                   20,410
    1 May 2007    MailChannels  400,000       Corporate domain names   254,400


    The surveys that I have been running, with the assistance of my ever patient PhD supervisor Dr Eric McCreath, have been quite a bit larger. Larger isn't necessarily better with these sorts of surveys, but my methodology aims for completeness, and the relative power of PlanetLab makes these computations surprisingly cheap. Details of my surveys so far:

    Date           Surveyor          Sample size   Sample method   Responses
    January 2008   Still / McCreath  46,136,113    Exhaustive      1,973,748
    April 2008     Still / McCreath  92,286,998    Exhaustive      1,609,111
    July 2008      Still / McCreath  97,545,668    Exhaustive      1,579,507
    October 2008   Still / McCreath  109,661,889   Exhaustive      1,801,081
    January 2009   Still / McCreath  110,397,428   Exhaustive      1,916,719
    April 2009     Still / McCreath  110,706,130   Exhaustive      1,925,760
    October 2009   Still / McCreath  111,209,212   Exhaustive      1,800,573


    Our survey attempts to identify the MTA software running on an SMTP server using the SMTP connection banner. In other words, the survey connects to each of a collection of IP addresses on the SMTP port (TCP 25), and attempts to determine from the early stages of the SMTP protocol interaction what SMTP server software is running on that host. An SMTP server will often reply to the connection with a status 220 line, referred to as the SMTP banner, which tells the connecting client that the server is ready. The SMTP banner also frequently states what software the server is running. Even if the software in use isn't explicitly named, the banner is often a string which is unique to a given SMTP implementation. This technique simply connects on the SMTP port, and logs any lines starting with 220. The connection is then closed, with no attempt made to transfer an email.

    So what results have I found so far? I'm trying to keep these blog posts to fewer than 1,000 words each, so that's too big a question to answer here. I've found some quite unexpected things along the way, such as an accurate technique for measuring the occurrence of domain parking on the Internet, and I'll discuss those in future posts. Instead, let me leave you with this short graphical summary of the results so far:



    This is the history over time of the five implementations that are currently most popular. You can see that Sendmail has fallen from a position of market dominance, and Exim is currently the most popular SMTP server implementation.

    I have a lot more to say about all this work, but as I mentioned earlier I want to keep the length of these posts down. I'll say more in future posts.

    Tags for this post: research(S)
    Related posts: Initial SMTP survey poster results in a pie chart; Interesting paper: "YouTube Traffic Characterization: A View From the Edge"; RemoteWorker v74; RemoteWorker v70; The witty worm with Vern Paxson; Microsoft Exchange the most popular SMTP server on the Internet?; I think I've worked out the problem with the hotel network; Domain name lookup helper for python?; Mikal, the massive domain squatter; Why does every man and his dog put man pages online?; Internet traffic; Sensis Australian search; Normalising mail server package names; Long time not much write; Satellite internet at Walmart; Announcing early results of my survey of SMTP servers; Noticed that smtpsurvey.stillhq.com is down?; Mikal, tell something I didn't know about SMTP servers on the Internet

posted at: 21:29 | path: /research | permanent link to this entry


Books read in January 2010

posted at: 17:41 | path: /book/read | permanent link to this entry


Sun, 07 Feb 2010



Home power measurement

    I've been spending some quality time with a Current Cost CC128 and my existing home sensor network. So far I've discovered that I use quite a bit of power, and that I can remotely monitor how many times a day my wife makes a cup of tea. Some example data:



    You can see that today was relatively cool compared with days a few weeks ago. That's more obvious in the graph showing the last two weeks though:



    However, it was quite humid today:



    Which is why we didn't have the evaporative cooler on, just the fan. That doesn't seem to really affect our power usage, which really needs more analysis:



    The 500 watt minimum power draw makes me unhappy. You can see over a week it never goes away:



    Tags for this post: blog(S)
    Related posts: Extreme Machines: Eirik Raude; The environmental friendliness of shrimp trawling?; More on burial methods; It seems to me

posted at: 23:55 | path: /diary | permanent link to this entry


Blathering for Sunday, 07 February 2010

posted at: 04:03 | path: /blather | permanent link to this entry


Dogs of War

posted at: 02:25 | path: /book/Anthology | permanent link to this entry


Thu, 04 Feb 2010



Body Armor: 2000

posted at: 22:06 | path: /book/Anthology | permanent link to this entry


Wed, 03 Feb 2010



Blathering for Wednesday, 03 February 2010

posted at: 04:39 | path: /blather | permanent link to this entry


Tue, 02 Feb 2010



Lyonesse




    ISBN: 0441505309
    Ace (1987), Paperback
    LibraryThing
    This is another book I read as a child, except in this case I didn't really remember much of it -- the only bit I remembered was the punishment of Madouc's mother, but that might have been because I was a teenaged boy at the time. Overall this is a very good book. It took me a while to read because of being distracted with other projects, but the ongoing oppression of Princess Suldrun didn't really help either -- it was interesting at first, but got depressing after a while. It's also disturbing how many times sexual assault is used as a plot element in this book...

    Tags for this post: book(S) Jack_Vance(S)


posted at: 21:25 | path: /book/Jack_Vance | permanent link to this entry


Sun, 31 Jan 2010



Blathering for Monday, 01 February 2010

    09:40: Mikal shared: Kulula Air with New Funny Livery
      This plane is awesome. I feel a "business trip" to South Africa coming on.

    09:43: Mikal shared: Were in trouble...
      Does the US federal government really employ over 10% of the population?



    Tags for this post: blather(S)

posted at: 14:43 | path: /blather | permanent link to this entry


Thu, 28 Jan 2010



Blathering for Friday, 29 January 2010

posted at: 11:02 | path: /blather | permanent link to this entry


Wed, 27 Jan 2010



Blathering for Wednesday, 27 January 2010

posted at: 04:36 | path: /blather | permanent link to this entry


Mon, 25 Jan 2010



Blathering for Tuesday, 26 January 2010

posted at: 13:14 | path: /blather | permanent link to this entry


Fri, 22 Jan 2010



Building a hygrometer with a HS1101

    The next sensor I wanted to add to my home was a set of hygrometers. Specifically I wanted an exterior one, and a matching interior one. This would be useful as we have evaporative cooling, and if the humidity level outside is already high, then it doesn't make a lot of sense to put extra water into the air. Worse than that, it can also damage my books and make the house really clammy. So, adding some sensors was the first step in some form of alerting.

    I picked up two HS1101s from eBay quite cheaply (about $4 each IIRC). These devices are capacitors whose capacitance varies proportionally with relative humidity. You also need to measure the temperature at the sensor to correct the value, although the correction is pretty minor so I guess you could skip this if you really wanted to cut costs. Given I have plenty of code for Dallas 1820s now, I just dropped one of those onto the board too.

    I just used the circuit from the data sheet for my design, with a few simple tweaks (like the DS1820). Here's my surprisingly unprofessional circuit diagram:



    The DS1820 stuff that's not on the data sheet is in red. When built, it looks like this (note the crazy amount of jumper wire):



    This gives us temperature on a 1-Wire pin, and an oscillator on another pin whose frequency relates to the current humidity. You'll notice that my circuit has some extra wires. That's because I power down the 555 / HS1101 when I'm not taking a sample. I do this because Peter H. Anderson suggested that noise would be a problem otherwise. This circuit was actually quite hard to build and get working. There are a few reasons for that:
    • The large number of jumpers on the prototype PCB.
    • The lack of documentation from other Arduino hackers (with the notable exception of the rather good Peter H. Anderson page).
    • The HS1101 data sheet forgets to mention that connecting pins 1 and 8 on the 555 is assumed knowledge.
    • The values for R1 and R2 vary depending on what model 555 you are using, and are crazily specific. For the LMC555 that I used, R1 is 1238K and R2 is 562K. I got close to these values, but not exactly, and that did seem to affect accuracy.
    • You must use a CMOS 555. That's buried in a six word sentence in the middle of a page on the data sheet, and I didn't notice it for a while. With a bipolar (non-CMOS) 555, you get effectively random numbers out of the circuit. Worse than that, CMOS 555s are actually a little hard to find, and I had to get mine from Farnell.
    • I attempted to calibrate with the government weather data from the next suburb over. Unfortunately, as best as I can tell, that data is wrong. It claims that it's currently as humid here as it is in Cairns in the wet season, which I deny. Calibration is an ongoing issue for me, although I have some ideas on how to progress there. It might also not matter, as I am building an identical sensor for inside the house and as long as they are both equally wrong I can still detect the "turn off the water to the evap" state that I want to.


    Here's a comparison between my data for today and the weather service's:



    The code to run the HS1101 is relatively simple:

    // Temperature and humidity sensor node. Based on the FridgeControlWeb project of mine
    // as well as http://www.phanderson.com/picaxe/relhum_count.html
    
    #include <enc28j60.h>
    #include <etherShield.h>
    #include <ip_arp_udp_tcp.h>
    #include <ip_config.h>
    #include <net.h>
    #include <websrv_help_functions.h>
    #include <avr/io.h>
    #include <math.h>
    
    #include <OneWire.h>
    #include <DallasTemperature.h>
    
    #define ONEWIRE 3
    #define HS1101DATA 5
    #define HS1101POWER 7
    
    int count_transitions(int ms);
    
    // How long between measurement cycles
    #define SLEEP_SEC 10
    
    OneWire oneWire(ONEWIRE);
    DallasTemperature sensors(&oneWire);
    
    unsigned long last_checked = 0, this_check = 0;
    
    // Web server setup
    #define MYWWWPORT 80
    #define BUFFER_SIZE 550
    #define ERROR_500 "HTTP/1.0 500 Error\r\nContent-Type: text/html\r\n\r\n<h1>500 Error</h1>"
    
    static uint8_t mymac[6] = {0x54, 0x55, 0x58, 0x10, 0x00, 0x25}; 
    static uint8_t myip[4] = {192, 168, 1, 252};
    static uint8_t buf[BUFFER_SIZE + 1];
    char data[BUFFER_SIZE + 1];
    
    // The ethernet shield
    EtherShield es = EtherShield();
    
    uint16_t http200ok(void)
    {
      return(es.ES_fill_tcp_data_p(buf, 0, PSTR("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n"
                                                "Pragma: no-cache\r\n\r\n")));
    }
    
    // prepare the webpage by writing the data to the tcp send buffer
    uint16_t print_webpage(uint8_t *buf)
    {
      uint16_t plen;
      plen = http200ok();
      plen = es.ES_fill_tcp_data_p(buf, plen, PSTR("<html><head><title>Temperature sensor</title>"
                                                   "</head><body><pre>"));
      plen = es.ES_fill_tcp_data(buf, plen, data);
      plen = es.ES_fill_tcp_data_p(buf, plen, PSTR("</pre></body></html>"));
    
      return(plen);
    }
    
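    // Float support is hard on arduinos
    // (http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1164927646) with tweaks.
    // Writes f into a with the given number of decimal places and returns a.
    char *ftoa(char *a, double f, int precision)
    {
      long p[] = {0,10,100,1000,10000,100000,1000000,10000000,100000000};
      char *ret = a;
      long heiltal = (long)f;    // integer part, truncated towards zero
      
      // Values between -1 and 0 truncate to 0, so add the sign by hand
      if (f < 0 && heiltal == 0) *a++ = '-';
      ltoa(heiltal, a, 10);
      while (*a != '\0') a++;
      *a++ = '.';
      long desimal = abs((long)((f - heiltal) * p[precision]));    // fractional digits
      // Zero pad the fraction, otherwise 25.05 would come out as 25.5
      for (long scale = p[precision] / 10; scale > 1 && desimal < scale; scale /= 10)
        *a++ = '0';
      ltoa(desimal, a, 10);
      return ret;
    }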
    
    #define cbi(sfr, bit) (_SFR_BYTE(sfr) &= ~_BV(bit))
    #define sbi(sfr, bit) (_SFR_BYTE(sfr) |= _BV(bit))
    
    int count_transitions(int ms)
    {
         // configure Counter 1
         cbi(TCCR1A, WGM11);
         cbi(TCCR1A, WGM10);
    
         cbi(TCCR1B, WGM12);  // WGM12::WGM10 000 - Normal mode
    
         sbi(TCCR1B, CS12);   // CS12::CS10 111 - External clock, count on rising edge.
         sbi(TCCR1B, CS11);
         sbi(TCCR1B, CS10);
    
         TCNT1 = 0x0000;      // note that TCNT1 is 16-bits
         delay(ms);
         // not sure if should turn off the counter
         return(TCNT1);
    }
    
    void setup()   {                
      Serial.begin(9600);
      sensors.begin();
      
      es.ES_enc28j60Init(mymac);
      es.ES_init_ip_arp_udp_tcp(mymac, myip, MYWWWPORT);
      
      // The 555 / HS1101 is powered from a digital pin so it can be switched off between samples
      pinMode(HS1101POWER, OUTPUT);
    }
    
    void loop()                     
    {
      int i, j, data_inset;
      char float_conv[10];
      float t = 25.0;  // default makes the humidity correction neutral if no DS1820 is found
      DeviceAddress addr;
      uint16_t plen, dat_p;
      float relhum_raw, relhum_corrected;
      int relhum_count;
    
      // Read temperatures, we dump the state to a buffer so we can serve it.
      // Unsigned subtraction keeps this comparison correct when millis() wraps around.
      this_check = millis();
      if(this_check - last_checked > SLEEP_SEC * 1000UL)
      {
        data_inset = 0;
        sensors.requestTemperatures();
        sprintf(data + data_inset, "Sensor count: %d\n",
                sensors.getDeviceCount());
        data_inset = strlen(data);
     
        for(i = 0; i < sensors.getDeviceCount(); i++)
        {
          t = sensors.getTempCByIndex(i);
          sensors.getAddress(addr, i);
          
          data[data_inset++] = 't';
          for (j = 0; j < 8; j++)
          {
            sprintf(data + data_inset, "%02x", addr[j]);
            data_inset += 2;
          }
    
          sprintf(data + data_inset, ": %s\n", ftoa(float_conv, t, 2));
          data_inset = strlen(data);
        }
          
        // Power up the 555 / HS1101, and take a measurement. Power it down again afterwards.
        digitalWrite(HS1101POWER, HIGH);
        delay(500);
        relhum_count = count_transitions(1000);
        digitalWrite(HS1101POWER, LOW);
        sprintf(data + data_inset, "HS1101 cycles: %d\n", relhum_count);
        data_inset = strlen(data);
        
        relhum_raw = 557.7 - 0.0759 * relhum_count;
        sprintf(data + data_inset, "Raw humidity: %s\n", ftoa(float_conv, relhum_raw, 2));
        data_inset = strlen(data);
        
        relhum_corrected = (1.0 + 0.001 * (t - 25.00)) * relhum_raw;
        sprintf(data + data_inset, "Corrected humidity: %s\n",
                ftoa(float_conv, relhum_corrected, 2));
        
        Serial.println(data);
        
        last_checked = this_check;
      }
      
      // Handle network packets
      dat_p = es.ES_packetloop_icmp_tcp(buf, es.ES_enc28j60PacketReceive(BUFFER_SIZE, buf));
      if(dat_p != 0)
      {
        if (strncmp("GET ", (char *)&(buf[dat_p]), 4) != 0){
          // head, post and other methods:
          dat_p = http200ok();
          dat_p = es.ES_fill_tcp_data_p(buf, dat_p, PSTR("<h1>200 OK</h1>"));
        }
        
        // just one web page in the "root directory" of the web server
        else if (strncmp("/ ", (char *)&(buf[dat_p+4]), 2) == 0){
          dat_p = print_webpage(buf);
          Serial.println("Served temperature web page");
        }
        
        else{
          dat_p = es.ES_fill_tcp_data_p(buf, 0, PSTR(ERROR_500));
        }
        
        es.ES_www_server_reply(buf, dat_p);
      }
    }
    


    As with previous circuits, I'm going to have to thank Doug for hints and tips along the way, as well as letting me steal his entire collection of 8 pin DIP sockets.

    Tags for this post: arduino(S)
    Related posts: Beer fridge controller 0.1; The Beer Fridge saga continues; Beer fridge controller 0.3; Beer fridge controller 0.2; Arduino with the kids: Cricket Noise Door Bell; Phoenix for business; Thinking about arduino as a prototyping platform

posted at: 02:17 | path: /arduino | permanent link to this entry


Thu, 21 Jan 2010



Blathering for Friday, 22 January 2010

posted at: 05:22 | path: /blather | permanent link to this entry


Blathering for Thursday, 21 January 2010

posted at: 02:43 | path: /blather | permanent link to this entry


Sat, 16 Jan 2010



The Man in the Rubber Mask

posted at: 15:58 | path: /book/Robert_Llewellyn | permanent link to this entry


Fri, 15 Jan 2010



The Renegades of Pern




    ISBN: 0345369335
    Del Rey (1990), Mass Market Paperback, 352 pages
    LibraryThing
    This book starts off in quite a disjointed manner, with the introduction of a variety of seemingly unrelated characters. The only thing that they all have in common is that they're holdless. However, as the book progresses these characters are all woven together into a relatively cohesive story line. I say relatively because there are gaps in the storytelling, which can be a little jarring.

    Interestingly, this book also clarifies some of the events of the others in the series. Most satisfyingly it includes more detail of the buried settlement at Landing than The White Dragon did, which ties in nicely with the introduction provided in Dragonsdawn. This gives me hope that later books will take the science fiction track I've been wanting them to for a while.

    Tags for this post: book(S) Anne_McCaffrey(S)
    Related posts: The White Dragon; Dragonsdawn; Nerilka's Story; Dragonsinger; Dragondrums; Dragonquest; The Dragonlover's Guide to Pern; Dragonsong; Dragonflight; Moreta: Dragonlady of Pern


posted at: 19:50 | path: /book/Anne_McCaffrey | permanent link to this entry


Blathering for Friday, 15 January 2010

posted at: 01:35 | path: /blather | permanent link to this entry


Tue, 12 Jan 2010



Blathering for Wednesday, 13 January 2010

posted at: 09:40 | path: /blather | permanent link to this entry


Mon, 11 Jan 2010



Blathering for Tuesday, 12 January 2010

posted at: 13:42 | path: /blather | permanent link to this entry