Java theory and practice, Screen-scraping with XQuery
XQuery makes light work of HTML extraction and transformation
From the developerWorks archives
Date archived: February 27, 2017 | First published: March 22, 2005
XQuery is a W3C standard for extracting information from XML documents, currently spanning 14 working drafts. While the majority of interest in XQuery is centered around querying large bases of semi-structured document data, XQuery can be surprisingly effective for some much more mundane uses as well. In this month's Java theory and practice, columnist Brian Goetz shows you how XQuery can be used effectively as an HTML screen-scraping engine.
This content is no longer being updated or maintained. The full article is provided "as is" in a PDF file. Given the rapid evolution of technology, some steps and illustrations may have changed.