stillhq.com : Mikal, a geek from Canberra living in Silicon Valley (no blather posts) http://www.stillhq.com The life, times, travel and software of Michael Still (no blather posts) en Copyright (c) Michael Still 2000 - 2006 blosxom simplerss20 v20050208hh 180 http://blogs.law.harvard.edu/tech/rss Noticed that smtpsurvey.stillhq.com is down? /research/smtp/survey Thu, 07 Aug 2008 09:11:00 GMT smtpsurvey.stillhq.com has been down for a couple of days now. This is because the machine at ANU which hosts the data has a hardware fault, and service techs have not yet arrived on site. The ever-helpful admins at ANU are aware of the problem, and are pursuing it as rapidly as they can. <br/><br/><i>Tags for this post: research(<a href="http://www.stillhq.com/research"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000004&tag=research&format=.png" border="0" alt="S"></a>) smtp(<a href="http://www.stillhq.com/smtp"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000004&tag=smtp&format=.png" border="0" alt="S"></a>) survey(<a href="http://www.stillhq.com/survey"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000004&tag=survey&format=.png" border="0" alt="S"></a>) </i> <br/><br/> <a href="http://www.stillhq.com/research/smtp/survey/000004.commentform.html">Comment</a> http://www.stillhq.com/research/smtp/survey/000004.html http://www.stillhq.com/research/smtp/survey/000004.html Normalising mail server package names /research/smtp/survey Fri, 21 Mar 2008 18:37:00 GMT While starting to look at mail server deployment trends, it came to my attention that I needed to normalise the names used for various mail servers across the mail server surveys for which I have data. In some cases the other guys' name for a given mail server was more accurate than mine, so you might notice over the next couple of days that mail server names are a bit variable in the results I have online. 
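<br/><br/> As a rough illustration of this kind of normalisation, here is a minimal sketch that assumes a hand-maintained alias table. It is not the survey's actual code: the table entries and the names <code>CANONICAL_NAMES</code>, <code>normalise_name</code> and <code>merge_counts</code> are made up for the example.

```python
import re
from collections import Counter

# Hypothetical alias table: keys are lower-cased variants seen in different
# surveys, values are the single canonical name to report.
CANONICAL_NAMES = {
    "exchange": "Microsoft Exchange",
    "ms exchange": "Microsoft Exchange",
    "microsoft exchange": "Microsoft Exchange",
    "postfix": "Postfix",
    "sendmail": "Sendmail",
    "exim": "Exim",
}


def normalise_name(raw):
    """Map a raw server label to its canonical name.

    Whitespace is collapsed and the lookup is case-insensitive. Labels that
    are not in the table are returned cleaned up but otherwise unchanged.
    """
    cleaned = re.sub(r"\s+", " ", raw).strip()
    return CANONICAL_NAMES.get(cleaned.lower(), cleaned)


def merge_counts(*surveys):
    """Sum per-server counts from several surveys under canonical names."""
    totals = Counter()
    for survey in surveys:
        for name, count in survey.items():
            totals[normalise_name(name)] += count
    return dict(totals)
```

Once every survey's labels pass through the same table, counts from different sources can be added together, for example <code>merge_counts({"Exchange": 3}, {"MS Exchange": 1})</code> reports both under "Microsoft Exchange".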
<br/><br/><i>Tags for this post: research(<a href="http://www.stillhq.com/research"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000003&tag=research&format=.png" border="0" alt="S"></a>) smtp(<a href="http://www.stillhq.com/smtp"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000003&tag=smtp&format=.png" border="0" alt="S"></a>) survey(<a href="http://www.stillhq.com/survey"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000003&tag=survey&format=.png" border="0" alt="S"></a>) </i> <br/><br/> <a href="http://www.stillhq.com/research/smtp/survey/000003.commentform.html">Comment</a> http://www.stillhq.com/research/smtp/survey/000003.html http://www.stillhq.com/research/smtp/survey/000003.html Announcing early results of my survey of SMTP servers /research/smtp/survey Wed, 19 Mar 2008 08:54:00 GMT Since June 2007, I have been building a survey of SMTP servers connected to the Internet that is as close to exhaustive as possible. This has involved coming up with a method for finding IP addresses to probe, probing those IP addresses, and generating results from the data collected. That code has been "finished" for a while now, and I am now ready to make it available to the public. <br/><br/> The current data set includes 46,135,101 IP addresses, with 1,942,603 successfully identified servers. The <a href="http://smtpsurvey.stillhq.com/smtp-survey.cgi?dashboard=1">results for the survey are online</a>, as well as <a href="http://smtpsurvey.stillhq.com/smtp-survey.cgi?dashboard=2">status information for the machines running the measurement system</a> (<a href="http://smtpsurvey.stillhq.com/smtp-survey.cgi?dashboard=3">a different view of that data is available as well</a>). You can even <a href="http://smtpsurvey.stillhq.com/smtp-survey.cgi?dashboard=4">look up your favourite domain name to see what software it's running</a>. <br/><br/> This is the most recent open survey of SMTP servers that I am aware of. 
<a href="http://smtpsurvey.stillhq.com/">There have been other surveys</a>, but they are either quite old or don't make their data publicly accessible. It's quite possible there are bugs in the web site which displays the data, so <a href="mailto:mikal@stillhq.com?subject=SMTP Survey">please let me know if you find one</a>. Apart from that, I hope this data is useful to others. <br/><br/> <form name="lookup" action="http://smtpsurvey.stillhq.com/smtp-survey.cgi" method="post">Use this form to look up what mail server software a given domain is using. Remember to enter a domain name (like ibm.com), not a hostname (like www.ibm.com).<br/><br/>Lookup: <input type="text" name="lookup" size=50> <input type="submit" value="Submit"></form> <br/><br/><i>Tags for this post: research(<a href="http://www.stillhq.com/research"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000002&tag=research&format=.png" border="0" alt="S"></a>) smtp(<a href="http://www.stillhq.com/smtp"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000002&tag=smtp&format=.png" border="0" alt="S"></a>) survey(<a href="http://www.stillhq.com/survey"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000002&tag=survey&format=.png" border="0" alt="S"></a>) </i> <br/><br/> <a href="http://www.stillhq.com/research/smtp/survey/000002.commentform.html">Comment</a> http://www.stillhq.com/research/smtp/survey/000002.html http://www.stillhq.com/research/smtp/survey/000002.html Initial SMTP survey poster results in a pie chart /research/smtp/survey Thu, 06 Dec 2007 16:31:00 GMT <div align=center> <img src="http://chart.apis.google.com/chart?cht=p3&chd=t:31,19,15,9,5,56&chs=600x300&chl=Exchange|Postfix|Sendmail|Anonymous|Exim|Other&chco=0000ff"> </div> <br/><br/> Graph generated with the <a href="http://code.google.com/apis/chart/">Google Chart API</a>, which <a 
href="http://google-code-updates.blogspot.com/2007/12/embed-charts-in-webpages-with-one-of.html">was announced today</a>. <br/><br/><i>Tags for this post: research(<a href="http://www.stillhq.com/research"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000001&tag=research&format=.png" border="0" alt="S"></a>) smtp(<a href="http://www.stillhq.com/smtp"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000001&tag=smtp&format=.png" border="0" alt="S"></a>) survey(<a href="http://www.stillhq.com/survey"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/000001&tag=survey&format=.png" border="0" alt="S"></a>) </i> <br/><br/> <a href="http://www.stillhq.com/research/smtp/survey/000001.commentform.html">Comment</a> http://www.stillhq.com/research/smtp/survey/000001.html http://www.stillhq.com/research/smtp/survey/000001.html Microsoft Exchange the most popular SMTP server on the Internet? /research/smtp/survey Sat, 01 Dec 2007 15:27:00 GMT <a href="http://cs.anu.edu.au/~Eric.McCreath/">Eric McCreath</a> from the <a href="http://cs.anu.edu.au">Department of Computer Science</a> at the <a href="http://www.anu.edu.au">Australian National University</a> and I presented a poster entitled "Inferring Relative Popularity of SMTP Servers" at <a href="http://www.usenix.org">USENIX</a> <a href="http://www.usenix.org/event/lisa07/">LISA 2007</a>. This blog post is a brief discussion of the content of the poster, as well as a landing page for both the <a href="http://www.stillhq.com/research/smtp/survey/poster-lisa2007.pdf">paper version of the poster</a> and the <a href="http://www.stillhq.com/research/smtp/survey/poster-lisa2007-poster.pdf">PDF of the actual poster</a>. For more detail on the measurement techniques used, please check out the complete paper. <br/><br/> We conducted this research because there is little data on the relative popularity of the various available SMTP server implementations. 
This data is of interest because it aids the development of systems which interact with these servers. For example, a potential DDoS protection system should be tested with the most common SMTP servers, as these are the ones that it is most likely to encounter in everyday use. <br/><br/> Many businesses rely on email of some form for their day-to-day operation. This is especially true for product support organisations, who are largely unable to perform their role in the company if their in-boxes are unavailable. Allman in "<a href="http://doi.acm.org/10.1145/945131.945157">Spam, Spam, Spam, Spam, Spam, the FTC, and Spam</a>" states that Nuclear Research studies estimate that spam costs US businesses $87 billion a year. It seems reasonable to assume that if a low level attack is costing that much, then a complete outage would impose an even greater burden on an enterprise. <br/><br/> There has been little research conducted into the current state of SMTP servers on the Internet, perhaps because this area of research has not been particularly fashionable in comparison to the HTTP metrics which are commonly collected. This is an important area of research, however, given that the level of traffic served by these systems has been growing for years. Barracuda Networks cite Radicati research which indicates that in 2009, 228 billion emails will be sent per day, with the vast majority being spam (see <a href="http://www.barracudanetworks.com/ns/products/spam_features.php">Barracuda's site for more details</a>). Afergan and Beverly in "<a href="http://doi.acm.org/10.1145/1052812.1052822">The state of the email address</a>" evaluate the state of email servers in an attempt to determine how SMTP servers are coping with the growth in traffic. Their approach involved sending out probe emails to a variety of domains. Each email was addressed to an invalid address, so that it was very likely to bounce. The authors then monitored the bounce traffic. 
They concluded that corporate SMTP servers are under surprising levels of strain and do not bounce undeliverable emails in a predictable manner. <br/><br/> We have therefore started to undertake research into SMTP servers as they appear on the Internet, with our first study being a simple survey of which SMTP implementations are most commonly deployed. Our poster discussed the current state of that survey, and provided some early results. <br/><br/> The challenge with determining the popularity of various SMTP server implementations is twofold -- firstly, not all of the SMTP servers which interact with the Internet can be probed from the public Internet (for example SMTP routers which route email that came from the Internet, but are not themselves accessible from the Internet); and secondly, the sheer number of SMTP servers connected to the network. We have therefore used both passive and active measurements to survey these servers. Each of these measurement techniques is described below. <br/><br/> Bearing in mind that our survey is quite new, and that only 34.6 million IP addresses have been probed so far, the initial results are quite interesting. <br/><br/> <div align=center> <img src="/research/smtp/survey/smaller-poster-lisa2007-graph.png"> </div> <br/><br/> You can see from the graph that the most popular SMTP server in our dataset is Microsoft Exchange, followed by Postfix and then Sendmail. <br/><br/> Additional analysis of our existing data, as well as further development of the email parser, will improve the accuracy of our survey and increase the number of machines included in it. The survey also needs a wider set of inputs for possible IP addresses to probe -- one example of another possible source of probable SMTP servers is MX records for registered domain names. 
The distributed probing system needs further development to handle the scale of the probing required for a large number of SMTP servers to be included in the survey, and improvements to the reliability of the central server are also required. <br/><br/> This SMTP survey is in its early stages, and there is much work still to do. However, research of this nature is likely to produce results which are of interest to the research community, as well as to software developers and systems administrators. So far a small dataset has been analysed, which has resulted in a reasonably robust distributed probing system being constructed. Further work on the survey will continue in the future, with updated results being published from time to time. <br/><br/><i>Tags for this post: research(<a href="http://www.stillhq.com/research"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/poster-lisa2007&tag=research&format=.png" border="0" alt="S"></a>) smtp(<a href="http://www.stillhq.com/smtp"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/poster-lisa2007&tag=smtp&format=.png" border="0" alt="S"></a>) survey(<a href="http://www.stillhq.com/survey"><img src="http://www.stillhq.com/tagicon.cgi?post=/research/smtp/survey/poster-lisa2007&tag=survey&format=.png" border="0" alt="S"></a>) </i> <br/><br/> <a href="http://www.stillhq.com/research/smtp/survey/poster-lisa2007.commentform.html">Comment</a> http://www.stillhq.com/research/smtp/survey/poster-lisa2007.html http://www.stillhq.com/research/smtp/survey/poster-lisa2007.html