Content classification is any process that enriches data to organize it in a manner that makes the data easier to search, archive, manage, and integrate into other processes. By generating such metadata, you can in turn derive more value from existing content.
One of the big issues with classification is that people make mistakes and arrive at different classifications based on their own logic. When you define a classification system, take into account all stakeholders' opinions and aim for a consistent approach to categorizing data, despite the many challenges. For example, people in one department might not be aware of what metadata is important to people in another. In addition, training people to understand and consistently apply a classification can be time-consuming.
As ever-increasing volumes of data (which some call the digital landfill) are generated, it becomes nearly impossible to classify data manually. You must turn to automated methods of analyzing content across a wide variety of formats and inputs.
Automating classification provides many benefits:
- You save money.
- You save time.
- The classification provides consistency by offering a common mechanism through which metadata is added.
- Organizations derive greater value from existing content.
Installing and running the code examples
I wrote the code examples in this article for use with the eXist XML Database or the Zorba XQuery processor. To use them with the eXist XML Database, you need to have the database installed; otherwise, use the Zorba XQuery processor, which is available through an online sandbox.
Install the eXist XML Database
With the eXist XML Database installed, perform the following steps to run the examples:
- Download and extract the example code.
- Upload the extracted code directory into the database collection, for example, /db/content-classification.
- Run the code example in a browser.
Using the Zorba XQuery processor
Alternatively, you can use the online version of the Zorba XQuery processor to run the code examples by performing the following steps:
- Download and extract the example code.
- Cut and paste the code example into the Zorba XQuery processor online sandbox at http://try.zorba-xquery.com/.
- Click Execute to run the code.
Notice that the difference is small between the eXist and Zorba examples.
Sharp-eyed readers, however, will notice one difference in their respective use
of the EXPath HTTP Client library: Zorba has this library built in by default,
and the eXist database does not, which is why I supply a stand-alone
http-client.xqm XQuery library designed specifically
for use with eXist. The examples in this article use the EXPath HTTP Client
library to access remote data and web services. In the second part of this
article, you integrate more advanced processing using the Yahoo! Query
Language (YQL) and AlchemyAPI tools.
Note: Be aware that you might be required to sign up to receive an API key before using these services.
Simple classification with XQuery
This first part of the article shows how you can use pure XQuery to start classifying content.
Text analytics: Defining word frequency in unstructured context
The term text analytics (or text mining) defines a set of machine learning and linguistic techniques to extract and model information metadata from textual sources. Text analytics applies natural language processing (NLP) and analytical methods on textual content and extracts useful metadata, such as:
- Language type. Analysis of character encoding, words, and content style can easily determine with high confidence which language textual data is in.
- Keywords. Text analysis can extract a set of keywords that characterize the document.
- Common entities. Algorithms that scan text for common patterns, such as email addresses, phone numbers, and people and place names, are useful for named entity extraction.
- Semantic relationships. A wide variety of approaches is available for scanning content in the hope of gleaning more compelling and deeper insights.
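To make the common entities item more concrete, here is a minimal sketch of pattern-based extraction in pure XQuery. The local:emails function, the sample sentence, and the deliberately loose address pattern are all assumptions for illustration; real named entity extraction is considerably more involved.

```xquery
xquery version "1.0";

(: Hypothetical helper: returns the tokens of $text that look like email addresses.
   The pattern is intentionally loose and is an assumption, not a full address grammar. :)
declare function local:emails($text as xs:string) as xs:string* {
  for $token in tokenize($text, '\s+')
  (: strip punctuation that trails a token at the end of a clause or sentence :)
  let $clean := replace($token, '[.,;:!?)]+$', '')
  where matches($clean, '^[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}$')
  return $clean
};

(: made-up sample text :)
local:emails('Contact alice@example.org, or write to bob@example.com.')
```

For the sample sentence, the function should return the two addresses without their trailing punctuation. The same tokenize-then-match shape works for other simple patterns, such as phone numbers, if you swap in a different regular expression.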
One such case of text mining is determining the frequency of words contained within a document, the assumption being that the more often a word is used, the more relevant this word is to the entire document.
The most common words could be construed as document keywords, but be aware that the term keywords is usually applied to the output from more sophisticated algorithms, which goes farther than just defining word frequency. For example, keyword analysis typically cross-references common words with synonym lookup tables and can also analyze the distance between words to help determine the importance of the word in context of the entire document.
In any text analysis, the first step is to generate a corpus from the textual content, with the subsequent analysis being applied to the corpus. One of the reasons for generating a corpus is to normalize text and remove anything that isn't relevant.
Listing 1 shows an XQuery program that consumes an HTML page (using the EXPath HTTP Client library) and extracts all paragraph elements from the web page. Because the case of a word does not matter for counting, the program converts every word to lowercase as it builds the corpus.
Listing 1. XQuery program that generates a word frequency list
xquery version "1.0";
import module namespace http = "http://expath.org/ns/http-client";
let $content-url := 'http://en.wikipedia.org/wiki/Asteroid_impact_avoidance'
let $content-request :=
<http:request href="{$content-url}" method="get" follow-redirect="true"/>
let $content :=
fn:string-join(http:send-request($content-request)[2],' ')
return
let $corpus := for $w in tokenize($content, '\W+') return lower-case($w)
let $wordList := distinct-values($corpus)
return
<words> {
for $w in $wordList
let $freq := count($corpus[. eq $w])
order by $freq descending
return <word word="{$w}" frequency="{$freq}"/>
}</words>
The next step is to derive all unique words from the corpus, for which you use
a FLWOR to process each word, generating word count (by referring back to the
corpus, which contains all the words), and then output a
<word/> element.
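If you want to try the counting logic without fetching a web page, the following sketch applies the same steps to a short string. The sample sentence is made up, and the where clause that drops empty tokens is my addition: tokenize() yields an empty string when the text begins or ends with a separator, and Listing 1 does not filter it out.

```xquery
xquery version "1.0";

(: Made-up sample text that stands in for the fetched page. :)
let $content := 'The asteroid missed the Earth. The Earth is safe.'
(: split on non-word characters, lowercase each token, and drop the empty
   tokens that tokenize() yields when the text ends with a separator :)
let $corpus :=
  for $w in tokenize($content, '\W+')
  where $w ne ''
  return lower-case($w)
return
  <words>{
    for $w in distinct-values($corpus)
    let $freq := count($corpus[. eq $w])
    (: the secondary sort key keeps words with equal counts in a stable order :)
    order by $freq descending, $w
    return <word word="{$w}" frequency="{$freq}"/>
  }</words>
```

In this sample, "the" is counted three times and "earth" twice, because lowercasing merges "The" with "the" and "Earth" with "earth".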
Note: I use the same web URL
(http://en.wikipedia.org/wiki/Asteroid_impact_avoidance)
as the text source for all examples in the article to illustrate how effective
each approach is.
The result of running the program in Listing 1 is an XML
document that has a <word/> element containing
the frequency and word, ordered by the most frequent words contained in the
Wikipedia page on asteroid impact avoidance. Listing 2
shows the list.
Listing 2. Word frequency list
<words>
<word word="the" frequency="377"/>
<word word="of" frequency="236"/>
<word word="a" frequency="193"/>
<word word="to" frequency="167"/>
<word word="and" frequency="141"/>
<word word="in" frequency="124"/>
<word word="earth" frequency="121"/>
<word word="â" frequency="109"/>
<word word="asteroid" frequency="102"/>
....
</words>
As you can see, the analysis returned a lot of words, and many of the most frequent ones are irrelevant because they are simply common in the English language. You can reduce this noise by defining a few simple rules, such as removing all words of three letters or fewer and removing any words with a frequency of 3 or less.
Listing 3 shows the same code with logic added that tests for word string length and frequency.
Listing 3. Amended XQuery program that generates a word frequency list
xquery version "1.0";
import module namespace http = "http://expath.org/ns/http-client";
let $content-url := 'http://en.wikipedia.org/wiki/Asteroid_impact_avoidance'
let $content-request :=
<http:request href="{$content-url}" method="get" follow-redirect="true"/>
let $response := http:send-request($content-request)[2]
let $content := fn:string-join($response,' ')
return
let $corpus := for $w in tokenize($content, '\W+') return lower-case($w)
let $wordList := distinct-values($corpus)
return
<words> {
for $w in $wordList
let $freq := count($corpus[. eq $w])
order by $freq descending
return
if(string-length($w) gt 3 and $freq gt 3) then
<word word="{$w}" frequency="{$freq}"/>
else
()
}</words>
With different datasets, you might have to adjust or enhance these settings to exclude words of greater length or higher frequencies, but as Listing 4 shows, the minimal settings have omitted a lot of the noise, leaving a much more relevant set of terms.
Listing 4. Revised word frequency list
<words>
<word word="earth" frequency="121"/>
<word word="asteroid" frequency="102"/>
<word word="impact" frequency="58"/>
<word word="near" frequency="56"/>
<word word="with" frequency="55"/>
<word word="that" frequency="53"/>
<word word="space" frequency="49"/>
<word word="nasa" frequency="43"/>
<word word="object" frequency="36"/>
<word word="from" frequency="34"/>
<word word="this" frequency="32"/>
...
</words>
Clearly, this approach has limitations. But it's a good start and shows you that with a small amount of XQuery, it's possible to get a basic set of keywords characterizing a document's textual content.
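One limitation is visible in Listing 4: common words such as "with" and "that" survive the length and frequency tests. A possible refinement, sketched here, is to filter against a stop-word list. The list below is tiny and hand-picked, and both the $stop-words variable and the local:keywords function are hypothetical names that I introduce for illustration.

```xquery
xquery version "1.0";

(: A tiny, hand-picked stop-word list; an assumption for illustration only. :)
declare variable $stop-words as xs:string* :=
  ('the', 'of', 'a', 'to', 'and', 'in', 'with', 'that', 'from', 'this');

(: Hypothetical helper: keeps only the <word/> elements whose word
   is not in the stop-word list. :)
declare function local:keywords($words as element(word)*) as element(word)* {
  $words[not(@word = $stop-words)]
};

(: sample input taken from the values in Listing 4 :)
local:keywords((
  <word word="earth" frequency="121"/>,
  <word word="asteroid" frequency="102"/>,
  <word word="with" frequency="55"/>,
  <word word="that" frequency="53"/>,
  <word word="space" frequency="49"/>
))
```

With this input, the entries for "with" and "that" are dropped and the other three are returned unchanged. A production stop-word list would be much longer and language-specific.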
Adding structure to word frequency
Textual analysis applied to semi-structured documents like HTML or XML provides limited insights if it completely ignores structure. But what if you could make deeper inferences by weighting the importance of textual analysis by relating it to element structure?
In terms of HTML, wouldn't it be nice if you could somehow score words based on where they appear inside a nested structure? For example:
- Words that appear in <title> elements are more important.
- Words that appear in <noscript> or <script> elements are less important.
- Words that appear in <h1> and <h2> elements are more important.
To achieve this, add a fitness attribute to each word. The value of this attribute records whether the word shows up in any of these elements. Listing 5 shows the added logic, which checks whether the word is contained in any of the elements deemed important.
Listing 5. Add fitness to the XQuery program that generates your word frequency list
xquery version "1.0";
import module namespace http = "http://expath.org/ns/http-client";
let $content-url := 'http://en.wikipedia.org/wiki/Asteroid_impact_avoidance'
let $content-request :=
<http:request href="{$content-url}"
method="get" follow-redirect="true"/>
let $response := http:send-request($content-request)[2]
let $content := fn:string-join($response,' ')
let $corpus := for $w in tokenize($content, '\W+') return lower-case($w)
let $wordList := distinct-values($corpus)
return
<words> {
for $w in $wordList
let $fitness := if ( $response//*:title[contains(lower-case(.),$w)]) then
5
else if ($response//*:h1[contains(lower-case(.),$w)]) then
4
else if ($response//*:h2[contains(lower-case(.),$w)]) then
3
else if ($response//*:h3[contains(lower-case(.),$w)]) then
2
else if ($response//*:noscript[contains(lower-case(.),$w)]) then
-2
else if ($response//*:script[contains(lower-case(.),$w)]) then
-1
else
1
let $freq := count($corpus[. eq $w])
order by $freq descending
return
if ($freq gt 4 and string-length($w) gt 3) then
<word word="{$w}" frequency="{$freq}" fitness="{$fitness}"/>
else ()
}</words>
Now, you have a second metric that you can use to gather more information about the importance of a word:
- <word word="asteroid" frequency="102" fitness="5"/> appeared in the <title> element.
- <word word="deflect" frequency="11" fitness="3"/> appeared in an <h2> element.
- <word word="false" frequency="7" fitness="-1"/> appeared in a <script> element, so you give it a negative fitness.
This fitness metric is simplistic because it might just happen that an important word
somehow also appears in a <script> section, or
it might be that a word that appears in a <title>
element is not as important to the body of the document as your assumption.
You can make additional improvements for scoring documents and
generating more appropriate keywords, but let's move on to integrating some
heavier-duty tools for performing text analysis.
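As one example of such an improvement, the two metrics can be combined into a single ranking. The sketch below multiplies frequency by fitness and discards words with a score of zero or less; this scoring formula is my assumption, not an established measure, and the input reuses the three sample words discussed above.

```xquery
xquery version "1.0";

(: Sample input in the shape that Listing 5 produces. :)
let $words :=
  <words>
    <word word="asteroid" frequency="102" fitness="5"/>
    <word word="deflect" frequency="11" fitness="3"/>
    <word word="false" frequency="7" fitness="-1"/>
  </words>
return
  <words>{
    for $w in $words/word
    (: hypothetical score: frequency scaled by fitness :)
    let $score := xs:integer($w/@frequency) * xs:integer($w/@fitness)
    (: a negative fitness yields a negative score, so the word is dropped :)
    where $score gt 0
    order by $score descending
    return <word word="{$w/@word}" score="{$score}"/>
  }</words>
```

Here "asteroid" scores 510 and "deflect" scores 33, while "false" is removed because its score is negative.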
Textual analysis using web services
Many commercial and open source tools are available that perform natural language processing (NLP). Here are some of the most popular open source packages:
- GATE. A natural language processing and engineering tool.
- Apache Unstructured Information Management Architecture. Originally developed by IBM.
- RapidMiner. Data and text mining software.
- Carrot2. Text and search results framework (with clustering).
In addition, several web services provide useful textual analysis. The second half of this article focuses on how to use these services in your XQuery files. You use the EXPath HTTP Client library to access them.
YQL is a SQL-like language that lets you query data across a range of Yahoo! web services. Yahoo! used to expose a lot of its data and services through a suite of web services, each with different endpoints and access methods; now, it provides access to these services through a single interface: YQL.
With YQL, you can now access data across the Internet through one simple language,
eliminating the need to learn how to call different APIs. One such service is
search.termextract, which extracts common
terms from a set of textual content. You can try it out through the browser
by using the online YQL console:
http://developer.yahoo.com/yql/console/?q=select%20*%20from%20search.termextract%20where%20context%3D%22Italian%20sculptors%20and%20painters%20of%20the%20renaissance%20favored%20the%20Virgin%20Mary%20for%20inspiration%22
The operative YQL statement selects from a table called search.termextract, using the text supplied in the context variable.
select * from search.termextract where context=
Click Test to generate resultant XML containing a
<query/> element, with the results and
some diagnostics as in Listing 6.
Listing 6. YQL result
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="5" yahoo:created="2010-12-05T14:36:25Z" yahoo:lang="en-US">
<diagnostics>
<publiclyCallable>true</publiclyCallable>
<user-time>14</user-time>
<service-time>11</service-time>
<build-version>9962</build-version>
</diagnostics>
<results>
<Result xmlns="urn:yahoo:cate">italian sculptors</Result>
<Result xmlns="urn:yahoo:cate">virgin mary</Result>
<Result xmlns="urn:yahoo:cate">painters</Result>
<Result xmlns="urn:yahoo:cate">renaissance</Result>
<Result xmlns="urn:yahoo:cate">inspiration</Result>
</results>
</query>
As it's easy to use the EXPath HTTP Client library from within XQuery, let's use it to access the YQL web service within your own content classification processes. Listing 7 shows how you can call this web service from within XQuery.
Listing 7. Access the YQL web service from XQuery
xquery version "1.0";
import module namespace http = "http://expath.org/ns/http-client";
let $content-url := 'http://en.wikipedia.org/wiki/Asteroid_impact_avoidance'
let $content-request :=
<http:request href="{$content-url}"
method="get" follow-redirect="true"/>
let $response := http:send-request($content-request)[2]
let $content := fn:string-join(subsequence(($response//*:title,$response//*:p),1,10),' ')
(: build the YQL statement once, quote the content, and URI-encode the whole statement :)
let $query :=
fn:encode-for-uri(
fn:concat("select * from search.termextract where context='",$content,"'")
)
(: an ampersand inside an XQuery string literal must be written as &amp; :)
let $yahoo-url :='http://query.yahooapis.com/v1/public/yql?diagnostics=true&amp;q='
let $term-extraction-url := fn:concat($yahoo-url,$query)
let $term-extraction-request := <http:request href="{$term-extraction-url}" method="get"/>
return
http:send-request($term-extraction-request)[2]
The above XQuery code takes care to encode your query string using the
fn:encode-for-uri() function.
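To see what the encoding step does on its own, you can run the function against a short statement. The context value below is made up for illustration.

```xquery
xquery version "1.0";

(: made-up YQL statement; only the encoding step is demonstrated :)
let $statement := "select * from search.termextract where context='virgin mary'"
return fn:encode-for-uri($statement)
```

The function percent-encodes every character except letters, digits, hyphen, underscore, period, and tilde, so each space becomes %20, the asterisk becomes %2A, the equals sign becomes %3D, and each single quote becomes %27. That is why it is suited to encoding a whole query parameter value but not a complete URL.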
YQL analysis generates a much higher-quality set of terms, as Listing 8 shows.
Listing 8. YQL term results
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="20"
yahoo:created="2010-12-05T20:14:37Z" yahoo:lang="en-US">
<diagnostics>
<publiclyCallable>true</publiclyCallable>
<url execution-time="433"
>http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction
</url>
<javascript execution-time="436" instructions-used="0"
table-name="search.termextract"/>
<user-time>437</user-time>
<service-time>433</service-time>
<build-version>9962</build-version>
</diagnostics>
<results>
<Result xmlns="urn:yahoo:cate">tertiary extinction event</Result>
<Result xmlns="urn:yahoo:cate">shoemaker levy 9</Result>
<Result xmlns="urn:yahoo:cate">spaceguard survey</Result>
<Result xmlns="urn:yahoo:cate">near earth objects</Result>
<Result xmlns="urn:yahoo:cate">period comet</Result>
<Result xmlns="urn:yahoo:cate">nasa report</Result>
<Result xmlns="urn:yahoo:cate">extinction level event</Result>
<Result xmlns="urn:yahoo:cate">deep impact probe</Result>
<Result xmlns="urn:yahoo:cate">inner solar system</Result>
<Result xmlns="urn:yahoo:cate">mitigation strategies</Result>
<Result xmlns="urn:yahoo:cate">65 million years</Result>
<Result xmlns="urn:yahoo:cate">material composition</Result>
<Result xmlns="urn:yahoo:cate">impact winter</Result>
<Result xmlns="urn:yahoo:cate">chicxulub crater</Result>
<Result xmlns="urn:yahoo:cate">impact speed</Result>
<Result xmlns="urn:yahoo:cate">catastrophic impact</Result>
<Result xmlns="urn:yahoo:cate">catastrophic damage</Result>
<Result xmlns="urn:yahoo:cate">planetary defense</Result>
<Result xmlns="urn:yahoo:cate">impact events</Result>
<Result xmlns="urn:yahoo:cate">astronomical events</Result>
</results>
</query>
YQL also has limitations. For example, you must ensure that content passed to YQL does not exceed request limits. In addition, because these requests are sent as HTTP GET requests, the content must be correctly encoded.
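A minimal sketch of guarding against both problems follows. The local:yql-query function and the 1500-character limit are hypothetical; check the service documentation for the real request limits. Removing apostrophes is also a blunt assumption on my part: it simply prevents the content from ending the quoted literal early and is not a documented YQL escaping rule.

```xquery
xquery version "1.0";

(: Hypothetical limit; the real figure depends on the service. :)
declare variable $max-chars as xs:integer := 1500;

(: Hypothetical helper: builds an encoded YQL statement from raw text. :)
declare function local:yql-query($content as xs:string) as xs:string {
  (: collapse whitespace, then drop apostrophes so that the text cannot
     close the single-quoted literal early :)
  let $safe := translate(normalize-space($content), "'", "")
  (: cut the text down to the assumed maximum length :)
  let $short := substring($safe, 1, $max-chars)
  return encode-for-uri(
    concat("select * from search.termextract where context='", $short, "'"))
};

(: made-up sample text :)
local:yql-query("NASA's survey   tracks near-Earth objects")
```

The result is a string that can be appended to the q= parameter of the request URL. Truncating by character count can cut a word in half, which is usually acceptable for term extraction but worth knowing about.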
Textual analysis with AlchemyAPI
AlchemyAPI is a company that provides an interesting set of content analysis tools (see Resources). All of the company's tools are available as a suite of web services. In this article, you use their term and named entity extraction services to perform text analysis.
Keyword extraction with Alchemy
AlchemyAPI provides a web service for extracting topic keywords from any
publicly accessible web page. Using a straightforward HTTP GET
request, you access the AlchemyAPI web service, instruct it to retrieve a
particular URL, and extract topic keywords. As a bonus, AlchemyAPI URL
processing calls automatically fetch the desired web page, normalize and clean
it (removing ads, navigation links, and other unimportant content), and
extract topic keywords. Listing 9 shows how this is done.
Listing 9. URL for accessing the AlchemyAPI topic-extraction web service
http://access.alchemyapi.com/calls/url/URLGetRankedKeywords?
apikey=PLACE_YOUR_APIKEY_HERE&
url=http://en.wikipedia.org/wiki/Asteroid_impact_avoidance
AlchemyAPI requires two URL parameters:
- A URL on which to make the analysis
- An apikey, which is required for any call made on the web service
You can obtain an AlchemyAPI apikey through a registration form from the AlchemyAPI site.
As AlchemyAPI gets the URL for you, calling the web service from XQuery is slightly simpler than the previous examples invoking YQL. Listing 10 shows the code.
Listing 10. XQuery generating keywords using AlchemyAPI
xquery version "1.0";
import module namespace http = "http://expath.org/ns/http-client";
let $url := 'http://en.wikipedia.org/wiki/Asteroid_impact_avoidance'
let $apikey := 'PLACE_YOUR_APIKEY_HERE'
let $alchemey_uri := 'http://access.alchemyapi.com/calls/url/URLGetRankedKeywords?'
(: an ampersand inside an XQuery string literal must be written as &amp; :)
let $href := fn:concat($alchemey_uri,'apikey=',$apikey,'&amp;url=',$url)
let $content-request := <http:request href="{$href}" method="get" follow-redirect="true"/>
return
http:send-request($content-request)[2]
Listing 11 shows the result containing keywords for your test web page.
Listing 11. Result from the topic-extraction web service
<results>
<status>OK</status>
<usage>By accessing AlchemyAPI or using information
generated by AlchemyAPI, you are agreeing to be bound by
the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>
<url>http://en.wikipedia.org/wiki/Asteroid_impact_avoidance</url>
<language>english</language>
<keywords>
<keyword>
<text>asteroid</text>
<relevance>0.983321</relevance>
</keyword>
<keyword>
<text>NASA</text>
<relevance>0.376168</relevance>
</keyword>
<keyword>
<text>comet</text>
<relevance>0.370371</relevance>
</keyword>
<keyword>
<text>near-earth object</text>
<relevance>0.363529</relevance>
</keyword>
<keyword>
<text>survey program</text>
<relevance>0.3417</relevance>
</keyword>
.... more keywords ....
</keywords>
</results>
Because the keywords come with a relevance score (and there are a lot more relevant results), the output from the AlchemyAPI web service is better in quality than the output from YQL.
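The relevance score also makes it easy to post-process the results in XQuery. The following sketch keeps only the keywords above a cut-off. It uses a trimmed copy of the response from Listing 11 so that it runs offline, and the 0.37 threshold is an arbitrary assumption.

```xquery
xquery version "1.0";

(: Trimmed copy of the response in Listing 11. :)
let $results :=
  <results>
    <keywords>
      <keyword><text>asteroid</text><relevance>0.983321</relevance></keyword>
      <keyword><text>NASA</text><relevance>0.376168</relevance></keyword>
      <keyword><text>comet</text><relevance>0.370371</relevance></keyword>
      <keyword><text>survey program</text><relevance>0.3417</relevance></keyword>
    </keywords>
  </results>
(: hypothetical cut-off; tune it for your own content :)
let $threshold := 0.37
return
  for $k in $results/keywords/keyword
  (: relevance is untyped text, so cast it before comparing numerically :)
  where xs:decimal($k/relevance) ge $threshold
  order by xs:decimal($k/relevance) descending
  return string($k/text)
```

With this sample, "asteroid", "NASA", and "comet" pass the threshold and "survey program" does not. When you work with a live response, the value returned by http:send-request is typically a document node, so the path would start from that document rather than from a <results> element.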
Entity extraction with AlchemyAPI
You can step up a level in sophistication by using the AlchemyAPI named entity extraction web service, which is capable of identifying people, companies, organizations, cities, geographic features, and other typed entities within your content. Some heavy-duty NLP occurs here to extract entities with meaning.
As with the topic keyword web service, all you have to do is supply an apikey and URL that contains the content you want to analyze, as in Listing 12.
Listing 12. URL for accessing the AlchemyAPI named entity extraction web service
http://access.alchemyapi.com/calls/url/URLGetRankedNamedEntities?
apikey=PLACE_YOUR_APIKEY_HERE&
url=http://en.wikipedia.org/wiki/Asteroid_impact_avoidance
You do exactly the same thing in terms of calling the web service from XQuery, as Listing 13 shows.
Listing 13. XQuery generating entities using AlchemyAPI
xquery version "1.0";
import module namespace http = "http://expath.org/ns/http-client";
let $url := 'http://en.wikipedia.org/wiki/Asteroid_impact_avoidance'
let $apikey := 'PLACE_YOUR_APIKEY_HERE'
let $alchemey_uri := 'http://access.alchemyapi.com/calls/url/URLGetRankedNamedEntities?'
(: an ampersand inside an XQuery string literal must be written as &amp; :)
let $href := fn:concat($alchemey_uri,'apikey=',$apikey,'&amp;url=',$url)
let $content-request := <http:request href="{$href}" method="get" follow-redirect="true"/>
return
http:send-request($content-request)[2]
The result of the textual analysis is quite lengthy and, as Listing 14 shows, compelling.
Listing 14. Result from named entity extraction web service
<results>
<status>OK</status>
<usage>By accessing AlchemyAPI or using information generated by AlchemyAPI,
you are agreeing to be bound by the AlchemyAPI Terms of
Use: http://www.alchemyapi.com/company/terms.html</usage>
<url>http://en.wikipedia.org/wiki/Asteroid_impact_avoidance</url>
<language>english</language>
<entities>
<entity>
<type>GeographicFeature</type>
<relevance>0.667231</relevance>
<count>44</count>
<text>Earth</text>
</entity>
<entity>
<type>Organization</type>
<relevance>0.472053</relevance>
<count>25</count>
<text>NASA</text>
<disambiguated>
<name>NASA</name>
<subType>Company</subType>
<subType>GovernmentAgency</subType>
<subType>AirportOperator</subType>
<subType>AwardPresentingOrganization</subType>
<subType>SoftwareDeveloper</subType>
<subType>SpaceAgency</subType>
<subType>SpacecraftManufacturer</subType>
<geo>38.88305555555556 -77.01638888888888</geo>
<website>http://www.nasa.gov/home/index.html</website>
<dbpedia>http://dbpedia.org/resource/NASA</dbpedia>
<umbel>http://umbel.org/umbel/ne/wikipedia/NASA</umbel>
<yago>http://mpii.de/yago/resource/NASA</yago>
</disambiguated>
</entity>
.... entities ....
</entities>
</results>
The AlchemyAPI named entity extraction web service has identified all kinds of things. For example, it knows that:
- Earth is a geographical feature.
- NASA is an organization, and the service provides several related links for it.
- The United States is a country.
- Representative George E. Brown is a person, and the service identifies him as a politician.
In this sense, text mining almost seems magical with respect to what can be gleaned from the content, but it's best to keep an eye on the relevance scoring. No system is 100 percent accurate, and you will find that some content responds better to textual analysis than other content.
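One way to keep an eye on the relevance scoring is to let it decide which entities end up in your classification. The sketch below groups entities by type and ignores those under a cut-off. The input is a trimmed copy of Listing 14, and the <classification>, <category>, and <term> element names, as well as the 0.5 threshold, are assumptions that I introduce for illustration.

```xquery
xquery version "1.0";

(: Trimmed copy of the response in Listing 14. :)
let $results :=
  <results>
    <entities>
      <entity>
        <type>GeographicFeature</type>
        <relevance>0.667231</relevance>
        <count>44</count>
        <text>Earth</text>
      </entity>
      <entity>
        <type>Organization</type>
        <relevance>0.472053</relevance>
        <count>25</count>
        <text>NASA</text>
      </entity>
    </entities>
  </results>
(: hypothetical cut-off :)
let $threshold := 0.5
return
  <classification>{
    for $type in distinct-values($results/entities/entity/type)
    (: entities of this type that are relevant enough to keep :)
    let $hits :=
      $results/entities/entity[type eq $type][xs:decimal(relevance) ge $threshold]
    where exists($hits)
    return
      <category name="{$type}">{
        for $e in $hits return <term>{string($e/text)}</term>
      }</category>
  }</classification>
```

With this sample and threshold, only the GeographicFeature category containing Earth is emitted, because NASA's relevance of 0.472053 falls below the cut-off. Lowering the threshold trades precision for coverage.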
This article covers a number of techniques for beginning to classify your own documents. The first attempts were focused on how to build your own XQuery text-mining techniques based on determining word frequencies. I then showed you how to integrate powerful external web services, provided by Yahoo! and AlchemyAPI, for text analysis.
Clearly, the text analysis that the web services provided was higher in quality, but even with the primitive word frequency examples, it's possible to use pure XQuery to get useful inferences from your data.
All the methods presented have some limitations. For example, only one document was analyzed. Performing textual analysis across a set of related documents can result in higher-quality categorization, as you can cross-reference from a larger corpus and glean deeper relations between documents. Overall, I hope that this article has shown you how powerful XQuery is for automating content categorization, and I would love to hear your feedback on your own attempts to apply XQuery in the same manner.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample scripts for this article | content_catigorisation_src.zip | 20KB | HTTP |
Learn
- Text mining: Read more about text mining (sometimes called text data mining) in this Wikipedia entry.
- Yahoo Query Language(YQL): Learn more about an expressive SQL-like language that lets you query, filter, and join data across web services.
- AlchemyAPI: Check out the range of web services for analyzing text.
- EXPath: Learn more about this suite of specifications.
- More articles by this author (James R. Fuller, developerWorks, June 2008-current): Read articles about XProc, XQuery, Atom XML, Firefox XUL, and other technologies.
- XML area on developerWorks: Get the resources you need to advance your skills in the XML arena.
- New to web development: Start learning about dynamic web applications and more.
- My developerWorks: Personalize your developerWorks experience.
- IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
- XML technical library: See the developerWorks XML Zone library for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks. Also, read more XML tips.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- developerWorks on Twitter: Join today to follow developerWorks tweets.
- developerWorks podcasts: Listen to interesting interviews and discussions for software developers.
- developerWorks on-demand demos: Watch demos ranging from product installation and setup for beginners to advanced functionality for experienced developers.
Get products and technologies
- EXPath HTTP Client module: Find module implementations and examples for this set of functions to send HTTP and HTTPS requests and handle responses.
- Zorba XQuery Processor online demo: Visit the sandbox and try out XQuery in Zorba online.
- AlchemyAPI: Learn more about and download this suite of content analysis and meta-data annotation tools.
- IBM product evaluation versions: Download or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- EXPath mailing lists: Discuss and contribute to EXPath.
- Yahoo! Groups related to XML: Join the discussions.
- XML zone discussion forums: Participate in any of several XML-related discussions.
- The developerWorks community: Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.

James Fuller has been a professional developer for more than 15 years, working with several software blue-chip companies in both his native USA and the UK. He has co-written a few technology-related books and regularly speaks and writes articles focusing on XML technologies. He is a founding committee member for XML Prague and was in the gang responsible for EXSLT. He spends most of his time working with XML databases and XQuery. You can reach James at jim.fuller@webcomposite.com.




