November 13, 2003

Order from Chaos

Way back, we at the office ran smack-dab into the problem of categories for work blogs: how do you come up with a list of categories that is short, obvious, and comprehensive? Or do you just settle for a huge list of overlapping categories that captures information imperfectly but well enough? Even better is this solution, discussed by Jon Udell: use the metadata.

Martinez's insight is that in a Web services network, the packets (XML payloads) tend to accrete metadata that can usefully be mined. Relative to the SemWeb discussion, I'd add that this contextual metadata arises naturally, without extra effort, when a business process has been automated—or, to be more realistic, semi-automated. When Jack routes a purchase order to Jill through the BizTalk pipeline, the context is explicitly encoded in the transaction.

What happens if Jack detaches the purchase order from the BizTalk pipeline, as an InfoPath document, and routes it to Jill via email? Now the context is only implicitly encoded in the transaction. The trick is going to be figuring out how to make the implicit context explicit, without interfering with the natural flow of the transaction.

So the question really is how to track metadata of standard transactions and pull filtered data up through some search mechanism, without having real categories at all.

Phil Windley ties this in with the idea of virtually assembling bits of information on the fly through search-based RSS feeds:

The search-based RSS feeds on this site are virtual views of the news headlines. I think there's more to this idea of not trying to categorize things, but simply create views into the data, files, emails, whatever. It's more flexible than hard coded categories and search has clearly won out on the Internet over categorization for this very reason. As the amount of stuff on my hard drive grows, why not apply the same principle there as well? From the RESTian standpoint, this is an example of Web services, an example of how standards enable intermediaries.

Given that computers have no commonly known and widely used organizational scheme (nothing like Dewey decimal), why not have virtual views? Keep everything in one large container and provide a filtering mechanism based on metadata and content (if RSS-like), and from the user's perspective, the actual location of information becomes irrelevant. Among other blessings, this might also make systems administration a bit simpler: there are files which display to users and those which don't.
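
To make the virtual-view idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `Item` class, the `virtual_view` function, the metadata keys, and the sample files are hypothetical and do not come from any of the systems mentioned above. It only shows the shape of the idea: one flat container, with views produced by filtering on metadata and content instead of by filing things into categories.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One thing in the big container: a file, an email, a transaction."""
    name: str
    content: str
    metadata: dict = field(default_factory=dict)


def virtual_view(store, text=None, **criteria):
    """Return items whose metadata equals every criterion and, if text is
    given, whose content contains it (case-insensitive)."""
    results = []
    for item in store:
        # Skip the item if any requested metadata key is missing or different.
        if any(item.metadata.get(key) != value for key, value in criteria.items()):
            continue
        # Skip the item if a text search was requested and it does not match.
        if text is not None and text.lower() not in item.content.lower():
            continue
        results.append(item)
    return results


# Hypothetical sample data: no folders, no categories, just metadata.
store = [
    Item("po-1042.xml", "Purchase order for 20 monitors",
         {"sender": "Jack", "recipient": "Jill", "kind": "purchase-order"}),
    Item("notes.txt", "Meeting notes about the monitor budget",
         {"sender": "Jill", "kind": "note"}),
    Item("pagefile.sys", "", {"kind": "system", "visible": False}),
]

from_jack = virtual_view(store, sender="Jack")
about_monitors = virtual_view(store, text="monitor")
# Items are shown to users unless their metadata explicitly hides them.
user_visible = [item for item in store if item.metadata.get("visible", True)]
```

In this sketch a "category" is nothing more than a saved query, so a new view costs one function call, and the files-users-see versus files-users-don't split is just another metadata filter.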

Posted at November 13, 2003 04:34 PM