Are license tags common in web pages?
Ben Bildstein talks about his attempts to determine if license tags are common on web pages. This seems like a perfect use of PlanetLab to me, where downloading a few million web pages and performing an analysis isn't hard. For example, I downloaded over a million web pages in a few days a little while ago.
Ben's problem seems easier than the parking analysis though, as I presume that he doesn't need to actually store the downloaded pages. If a simple regexp check of the content is sufficient, then storage (which is the slow bit) goes away as an issue.
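As a rough illustration of the "simple regexp check" idea, here is a minimal Python sketch that scans each page as it arrives and keeps only two counters, so nothing is written to disk. The pattern, the function names (`has_license_tag`, `count_licensed`), and the choice to look for `rel="license"` attributes and Creative Commons license URLs are all assumptions made for the example; they are not taken from Ben's work, and a real survey would probably need a broader pattern.

```python
import re
from typing import Iterable, Tuple

# Hypothetical pattern (an assumption for this sketch): a rel attribute whose
# value contains the word "license", or a link to a Creative Commons license.
LICENSE_PATTERN = re.compile(
    r"""rel\s*=\s*["']?[^"'>]*\blicense\b|creativecommons\.org/licenses/""",
    re.IGNORECASE,
)


def has_license_tag(html: str) -> bool:
    """Return True if the page text contains something that looks like a license tag."""
    return LICENSE_PATTERN.search(html) is not None


def count_licensed(pages: Iterable[str]) -> Tuple[int, int]:
    """Consume pages one at a time and return (licensed, total).

    Each page is discarded after it is checked, so memory and disk use
    stay constant no matter how many pages are processed.
    """
    licensed = 0
    total = 0
    for html in pages:
        total += 1
        if has_license_tag(html):
            licensed += 1
    return licensed, total


if __name__ == "__main__":
    sample_pages = [
        '<a rel="license" href="http://creativecommons.org/licenses/by/2.0/">CC</a>',
        "<p>no license info here</p>",
        '<link rel="stylesheet" href="a.css">',
    ]
    licensed, total = count_licensed(sample_pages)
    print(f"{licensed} of {total} pages carry a license tag")
```

Because `count_licensed` accepts any iterable, the pages could come from a generator that fetches URLs on each node, with only the two counts sent back for aggregation.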
Tags for this post: research
posted at: 02:30 | path: /research | permanent link to this entry
