Dan Connolly has created an online Dublin Core Extraction Service, which uses XSLT to extract RDF Dublin Core metadata from XHTML pages.
For pages that are not well-formed XHTML, the page to be processed can first be piped through Dave Raggett's HTML Tidy, courtesy of the online tidy service.
This proof-of-concept service demonstrates that useful metadata can be extracted from today's real-world HTML pages, and follows up on a post
Connolly sent to the W3C RDF interest list in March this year.
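
To illustrate the kind of transformation involved, here is a minimal sketch in Python rather than XSLT. It is not Connolly's stylesheet. It assumes the page is already well-formed XHTML (for example, after a pass through Tidy) and that the metadata is embedded using the common `<meta name="DC.title" content="...">` convention. The function name `extract_dc`, the sample page, and the example URI are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample page, used only for illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example page</title>
    <meta name="DC.title" content="Example page" />
    <meta name="DC.creator" content="A. Author" />
    <meta name="keywords" content="not, dublin, core" />
  </head>
  <body><p>Hello.</p></body>
</html>"""


def extract_dc(xhtml_text, page_uri):
    """Return RDF/XML describing page_uri, built from DC.* meta elements.

    Assumption: input is well-formed XHTML in the XHTML namespace.
    Simplification: qualifiers such as DC.date.created are reduced
    to the base element (date).
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)

    root = ET.fromstring(xhtml_text)
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": page_uri}
    )

    for meta in root.iter(f"{{{XHTML_NS}}}meta"):
        parts = meta.get("name", "").split(".")
        content = meta.get("content")
        # Keep only names of the form DC.<element>[.<qualifier>].
        if len(parts) >= 2 and parts[0].lower() == "dc" and parts[1] and content:
            prop = ET.SubElement(desc, f"{{{DC_NS}}}{parts[1].lower()}")
            prop.text = content

    return ET.tostring(rdf, encoding="unicode")


if __name__ == "__main__":
    print(extract_dc(SAMPLE, "http://example.org/page"))
```

Run on the sample page, this yields a single `rdf:Description` for the page URI with one `dc:` property per matching `meta` element; non-Dublin Core `meta` elements such as `keywords` are ignored.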