Thinking about IDL-style descriptions of document formats

    I've been mulling over IDL-style definitions of document formats in the background for the last few days. Specifically, I'm interested in ways of expressing the structure of a document outside of code, and then generating code to process documents in that format. Sort of like lex and yacc, but more flexible and not tied to one language. This would mean that when you wanted to process a document in your chosen language, you wouldn't have to deal with things like SWIG -- you'd just generate the native code and go for it.
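
    To make that concrete, here is a minimal sketch of the sort of thing I mean, not a real tool. The description format, the "Header" record and its field names are all made up for illustration, and I'm borrowing Python's struct format codes as a stand-in for a proper description language. The description is plain data, and a small generator turns it into parser source code:

```python
import struct

# Hypothetical description of a tiny binary record format. In practice this
# would live in its own file, independent of any one programming language.
DESCRIPTION = {
    "name": "Header",
    "byte_order": "<",          # little-endian, no padding
    "fields": [
        ("magic", "4s"),        # 4 raw bytes
        ("version", "H"),       # unsigned 16-bit integer
        ("page_count", "I"),    # unsigned 32-bit integer
    ],
}


def generate_parser(description):
    """Return Python source for a parse function built from the description."""
    fmt = description["byte_order"] + "".join(
        code for _, code in description["fields"]
    )
    names = [name for name, _ in description["fields"]]
    func = "parse_" + description["name"].lower()
    lines = [
        "import struct",
        "",
        f"def {func}(data):",
        f"    values = struct.unpack_from({fmt!r}, data)",
        f"    return dict(zip({names!r}, values))",
    ]
    return "\n".join(lines) + "\n"


# Generate the parser source, then load it so it can be called directly.
source = generate_parser(DESCRIPTION)
namespace = {}
exec(source, namespace)
parse_header = namespace["parse_header"]

# Build a sample record by hand and run it through the generated parser.
sample = struct.pack("<4sHI", b"DOC1", 2, 17)
record = parse_header(sample)
print(record)
```

    This generator only emits Python, but since the description is just data, a second backend could in principle emit C or Ruby from the same description, which is the part that interests me.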

    Obviously these ideas aren't new. DCE RPC's IDL is like this, as are Google's Protocol Buffers. However, I want something more generic. Has anyone seen something like this?

    Tags for this post: blog document processing description language

posted at: 16:58 | path: /diary