Query syntax
Extensive query syntax allows you to find specific documents.
Simple query syntax characters
The following list describes the characters that you can use in enterprise search and content mining applications to refine query results.
- Free style query syntax
- Free style query syntax is used to describe queries that do not have an explicit interpretation
and for which there is no default behavior defined. The default implementation for this type of
query is to return documents only if they match all terms in the free style query.
Query:
computer software
Result: This query returns documents that include the term computer and the term software, or something else depending on the semantics implemented in the application.
- ~ (prefix)
- Precede a term with a tilde sign (~) to indicate that a match occurs anytime a document contains
the word or one of its synonyms.
Query:
~fort
Result: This query finds documents that include the term fort or one of its synonyms (such as garrison and stronghold).
- ~ (postfix)
- Follow a single term with a tilde sign (~) to indicate that a match occurs anytime a document
contains a term that has the same linguistic base form as the query term (also known as a lemma or
stem).
Query:
run~
Result: This query finds documents that include the term run, running, or ran because run is the base form of the verb.
- +
- Precede a term with a plus sign (+) to indicate that a document must contain the term for a
match to occur. Because the plus sign is the default, it is usually omitted. The plus sign is not
needed because documents are included in the search results only if they match all terms in a free
style query. In a free text query (without the plus sign) only matches in exact form are
returned.
Query:
+computer +software
Result: This query returns documents that include the term computer and the term software.
- -
- Precede a term with a minus sign (-) to indicate that the term must be absent from a document
for a match to occur. The minus sign acts as a filter to remove documents and must be associated
with a query that returns positive results.
Query:
computer -hardware
Result: This query returns documents that include the term computer and not the term hardware.
Query:
$language::en -url:qa
Result: This query returns documents in the English language minus documents that have URLs with the string qa.
Query:
url:com -url:support
Result: This query returns documents that have URLs with the string com minus those documents that have URLs with the string support.
- =
- Precede a term with the equal sign (=) to indicate that the document must contain an exact match
of the term for a match to occur. (Lemmatization is disabled.)
Query:
=apples
Result: This query returns documents if and only if they include the plural term apples.
- \
- Precede a character in a term with the backslash (\) escape character to find terms or phrases
that contain restricted characters, such as backslashes and double quotation marks in phrases.
Reserved query syntax terms can also be escaped with the backslash character. For example, you can
escape terms such as AND, ANY, INORDER, NOT, OR, SENTENCE, and WITHIN with a backslash character. You
cannot escape wildcard characters (* and ?).
Query:
"program files\\ibm"
Result: This query returns documents that contain the phrase program files\ibm.
- *:*
- Use this special query syntax to retrieve all available documents in the collection without
performing score computation. To use this syntax with enterprise search collections, the
Enable the query to return all documents (*:*) check box must be selected on
the Search Server Options page. This check box is selected by
default.
Query:
*:*
Result: This query returns all available documents in the collection.
- *
- Place a wildcard character (*) anywhere in, before, or after a term or a field to indicate that the document can contain any word that matches any of the possible combinations. A term with a wildcard character is interpreted as equivalent to an OR of all its applicable expansions. Wildcard support applies the following rules:
- The set of expansions contains at most the configured maximum number of expansions. If the index holds more expansions than that maximum, the additional expansions are ignored, and the query result indicates that some expansions were ignored.
- The set of expansions contains all terms in the index that can be obtained by replacing the wildcard characters with arbitrary sequences of characters.
- Wildcard characters are supported only for plain text terms. Wildcard characters are not supported for XML element names, attribute names, or attribute values.
- A term that consists solely of a wildcard is not supported.
- Wildcard characters are supported within phrases.
- If the number of expansions for a wildcard term exceeds the configured maximum number of expansions, the expansions that exceed that maximum are ignored by the query evaluation. In that case, the ResultSet object's isEvaluationTruncated() method returns true. A return value of true does not uniquely identify this situation, because the method also returns true if the evaluation was terminated early due to a timeout.
Query:
app*
Result: This query finds documents that include the terms apple, apples, application, and so on, because these words begin with app.
Query:
DB2 info*
Result: This query finds all documents that contain DB2 followed by a word that begins with info.
Query:
title:tech*
Result: This query finds all documents with titles that begin with tech.
Remember: To specify queries with wildcard characters, an administrator must enable wildcard support when configuring search options for the collection in the administration console.
- ?
- Replace a character in a term with the question mark (?) wildcard character to find terms that
match all other characters in the term.
Query:
m?re
Result: This query returns documents that contain the terms mare, mere, mire, and more.
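The wildcard rules for * and ? can be illustrated with a small simulation. This is only a sketch and not the product's implementation: the function name expand_wildcard, the index_terms list (standing in for the terms in the index), and the max_expansions parameter (standing in for the configured maximum) are all hypothetical. It also assumes that ? matches exactly one character and that, when the limit is exceeded, the alphabetically first expansions are the ones kept; the source does not say which expansions survive.

```python
import re


def expand_wildcard(pattern, index_terms, max_expansions):
    """Return (expansions, truncated) for a wildcard term.

    Hypothetical sketch: index_terms stands in for the terms in the index,
    max_expansions for the configured maximum number of expansions.
    """
    # A term made up only of wildcard characters is rejected.
    if pattern.strip("*?") == "":
        raise ValueError("a term that consists solely of wildcards is not supported")

    # Translate the wildcard term into a regular expression:
    # '*' matches any sequence of characters, '?' matches one character.
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    regex = re.compile("".join(parts))

    # The wildcard term behaves like an OR of all matching index terms.
    matches = sorted(t for t in set(index_terms) if regex.fullmatch(t))

    # Expansions beyond the maximum are ignored and the caller is told so
    # (assumption: the alphabetically first ones are kept).
    truncated = len(matches) > max_expansions
    return matches[:max_expansions], truncated
```

With an index that contains apple, apples, and application, the term app* expands to all three terms; with a limit of two expansions, only two are kept and the truncated flag is set, which mirrors the situation that isEvaluationTruncated() reports.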
- " "
- Use double quotation marks (") to indicate that a document must contain the exact phrase within the double quotation marks for a match to occur. Words inside phrases are never lemmatized.
You can also add wildcard characters (* or ?) within phrases. The wildcard character must be next to a letter or word. Stand-alone wildcard characters are not supported. Wildcard character support must be enabled in the administration console.
Query:
"computer software programming"
Result: This query finds documents that include the exact phrase computer software programming.
Phrases are designated as required by default. Hence the two queries building "new york" and building +"new york" are equivalent. Phrases can also be forbidden (-) and required but insufficient (^).
Query:
"app* pea*"
Result: This query finds documents that include the terms apples pears, appears peaceful, appreciate peas, and so on, because these words begin with app and pea. This query does not find documents with apples and pears or other such combinations.
Query:
"apples * pears"
Result: This query matches apples and pears or apples or pears, but it does not match apples pears.
Restriction: Using double quotation marks for URL or email address strings does not return appropriate results. To search for URL or email strings such as www.ibm.com or somebody@mycompany.com, do not enclose the string in double quotation marks.
To search for phrases that contain double quotation marks (") or backslash characters (\), use the backslash character to escape the restricted character. For example, "\"The Godfather\"" or "hardware\\software requirements".
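The escaping rule for phrases can be sketched as a small helper that an application might use before it submits a query. The function name quote_phrase is hypothetical and is not part of the product API. The sketch assumes that the backslash and the double quotation mark are the only characters that need escaping inside a phrase.

```python
def quote_phrase(text):
    """Wrap text in double quotation marks, escaping restricted characters.

    Hypothetical helper: assumes that backslash and double quotation mark
    are the only characters that must be escaped inside a phrase.
    """
    # Escape backslashes first so that the backslashes added for the
    # quotation marks are not escaped a second time.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


if __name__ == "__main__":
    # Prints the phrase query for the text: program files\ibm
    print(quote_phrase("program files\\ibm"))
    # Prints the phrase query for the text: "The Godfather"
    print(quote_phrase('"The Godfather"'))
```

Escaping the backslash before the quotation mark matters: in the opposite order, the backslash that was added in front of each quotation mark would itself be doubled.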
- /facet_name/value_level_1/.../value_level_n
- If you search a collection that contains facets, you can search for documents that contain a
specific facet or facet value. For facets with multiple value levels, such as hierarchical and date
facets, you can search for multiple-level facet values.
Query:
/country/Japan
Result: This query finds documents that include the facet country with the facet value Japan.
Query:
/date/2009/1/15 /location/US/California
Result: This query finds documents that include the facet date with the multiple-level facet values 2009, 1, and 15, and the facet location with the multiple-level facet values US and California.
- ^boost
- Follow a search term by a boost value to influence how documents that contain a specified term
are ranked in the search results.
Query:
ibm Germany^5.0
Result: This query finds documents that include the terms IBM and Germany, and increases the relevance of these documents by a factor of 5 in the search results.
- ~ambiguity
-
Query:
ibm analytics~0.5
Result: This query does a fuzzy search and finds documents that include the terms IBM and analytics, IBM and analyze, IBM and analysis, and so on.
- ( )
- Use parentheses ( ) to indicate that a document must contain one or more of the terms within the
parentheses for a match to occur. Use OR or a vertical bar ( | ) to separate the terms in
parentheses.
Do not use plus signs (+) or minus signs (-) within the parentheses.
Query:
+computer (hardware OR software)
Query:
+computer (hardware | software)
Result: Both of these queries find documents that include the term computer and at least one of the terms hardware or software.
An OR of terms is designated as required (+) by default. Therefore, the previous queries are equivalent to +computer +(hardware | software).
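The way required terms (+), forbidden terms (-), and a parenthesized OR group combine can be sketched with a toy matcher. This is an illustration under simplifying assumptions, not the search engine's logic: the function name matches and its parameters are hypothetical, documents are treated as whitespace-separated lowercase words, and lemmatization, synonyms, phrases, and ranking are ignored.

```python
def matches(doc_text, required=(), forbidden=(), any_of=()):
    """Decide whether a document matches a simple query.

    Hypothetical sketch of the semantics described above:
      required  - terms that must all be present (the + prefix, the default)
      forbidden - terms that must all be absent (the - prefix)
      any_of    - an OR group; at least one term must be present
    """
    words = set(doc_text.lower().split())

    # A minus term only filters; without a positive part nothing matches.
    if not required and not any_of:
        return False

    has_required = all(term.lower() in words for term in required)
    has_forbidden = any(term.lower() in words for term in forbidden)
    # The OR group is itself required: one of its terms must occur.
    has_group = (not any_of) or any(term.lower() in words for term in any_of)

    return has_required and not has_forbidden and has_group
```

For example, the query +computer (hardware | software) corresponds to required=["computer"] and any_of=["hardware", "software"], and computer -hardware corresponds to required=["computer"] and forbidden=["hardware"].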
Query syntax for query keywords
The following list describes keywords that you can use to limit a search to specific documents or specific parts of documents.
- IN contextual view
- If a content analytics collection contains contextual views, you can include the IN keyword with
other query operators and keywords to search only the documents that belong to a specific contextual
view.
Query:
computer IN question "software maintenance" IN answer
Result: This query returns documents that contain the term computer in the question view and contain the phrase software maintenance in the answer view.
Query:
/keyword$._word.noun/computer IN question IN answer
Result: This query returns documents that include the noun facet with the facet value computer in the intersection of the question and answer views.
Query:
(software maintenance) WITHIN 5 IN answer
Result: This query returns documents that contain the words software and maintenance, or matching forms of the words, in any order, within 5 words of each other in the answer view.
Query:
@xmlf2::'<title>IBM computers</title>' IN question
Result: This query returns documents that contain the phrase IBM computers in the <title> element of an XML fragment in the question view.
- (terms) WITHIN context INORDER
- Follow a search term or phrase by proximity search operators to find documents that contain
terms within a specified number of words of each other, in the same sentence, or in a specified
order within a sentence. The INORDER option is optional and specifies that words must appear in the
same order that you specify them in the query. The context can be:
- A positive number. For example, (a b c) WITHIN 5 matches documents that contain the three specified words or matching forms of the words, in any order, within 5 words of each other (that is, up to two words between them).
The query ("a" "b" "c") WITHIN 5 INORDER means that the three words must appear in the same order, and in their exact form, within five words of each other. No lemmatization is performed for the terms a, b, or c.
- WITHIN SENTENCE means that the terms must appear in the same sentence. Lemmatization does not occur if the terms are specified in quotation marks.
Sample proximity queries:
( x y z ) WITHIN 5
("x" y z ) WITHIN SENTENCE
( x "y z") WITHIN SENTENCE
subject:(world star) WITHIN SENTENCE (lemmatization is done of world and star, in any order)
("Hello" "World") WITHIN SENTENCE INORDER (no lemmatization and order is maintained)
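The numeric form of the proximity operator can be sketched as a window check over word positions. This is one possible reading, not the engine's algorithm: the function name within is hypothetical, "within n words of each other" is interpreted as all terms falling inside n consecutive word positions (which agrees with the "up to two words between them" remark for three terms and a context of 5), and terms are compared in exact form, as for quoted terms, with no lemmatization.

```python
from itertools import product


def within(tokens, terms, n, in_order=False):
    """Check whether all terms occur within n consecutive word positions.

    Hypothetical sketch: tokens is the document as a list of words, terms
    are matched in exact form, and in_order models the INORDER option.
    """
    if not terms:
        return False

    # Collect every position at which each query term occurs.
    positions = [[i for i, tok in enumerate(tokens) if tok == term]
                 for term in terms]

    # Try every way of picking one position per term (brute force,
    # which is adequate for a short illustration).
    for combo in product(*positions):
        if len(set(combo)) != len(combo):
            continue  # each query term needs its own word position
        if max(combo) - min(combo) + 1 > n:
            continue  # the terms do not fit in a window of n words
        if in_order and list(combo) != sorted(combo):
            continue  # INORDER: positions must follow the query order
        return True
    return False
```

For the word sequence a x b y c, the terms a, b, and c span exactly five positions, so the check succeeds for a context of 5 and fails for a context of 4.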
- (terms) ANY number
- Use the ANY keyword to find documents that contain a certain number of the specified query terms.
Query:
(x y z) ANY 2
Result: This query returns documents that contain at least two of the specified query terms.
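The "at least" reading of the ANY keyword can be shown in a few lines. The function name any_n is hypothetical, and the sketch assumes exact term matching against the set of words in a document.

```python
def any_n(doc_terms, query_terms, n):
    """Return True if at least n distinct query terms occur in the document.

    Hypothetical sketch: doc_terms is the set of words in the document.
    """
    present = {term for term in query_terms if term in doc_terms}
    return len(present) >= n
```

A document that contains x and z satisfies (x y z) ANY 2, and so does a document that contains all three terms; a document that contains only x does not.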
- site:text
- If you search a collection that contains web content, use the site keyword to search a specific domain. For example, you can return all pages from a particular website.
Do not include the prefix http:// in a site query.
Query:
+laptop site:www.ibm.com
Result: This query finds all documents on the www.ibm.com domain that contain the word laptop.
- url:text
- If you search a collection that contains web content, use the url keyword to find documents that contain specific words anywhere in the URL.
Query:
url:support
Result: This query finds documents that have a URL with the word support, such as http://www.ibm.com/support/fr/.
Query:
url:support url:fr
Result: This query finds documents that have a URL with the words support and fr in any order.
Query:
url:support&fr
Result: This query finds documents that have a URL with the phrase support fr. This query is similar to using double quotation marks to search for an exact phrase.
- link:text
- If you search a collection that contains web content, use the link keyword to find documents that contain at least one hypertext link to a specific web page.
Query:
link:http://www.ibm.com/us
Result: This query finds all documents that include one or more links to the page http://www.ibm.com/us.
- field:text
- If the documents in a collection include fields (or columns), and the collection administrator
made those fields searchable by field name, you can query specific fields in the
collection.
Query:
lastname:smith div:software
Result: This query returns all documents about employees with the last name Smith (lastname:smith) who work for the Software division (div:software).
- docid:documentid
- Use the docid keyword to find documents that have a specific URI (or document ID). Typically, there is at most one document in a collection that matches a specific URI.
Query:
(docid:http://www.ibm.com/solutions/us/ OR docid:http://www.ibm.com/products/us/)
Result: This query finds all documents with the URI http://www.ibm.com/solutions/us/ or the URI http://www.ibm.com/products/us/.
- samegroupas:URI
- By default, IBM® Watson Explorer Content Analytics treats the URLs with the same host name as if they belong to the same group, and treats the news articles from the same thread as if they belong to the same group. For URIs from all other data sources, each URI forms its own group. However, an administrator can organize URIs that match specific prefixes into groups. For example, consider the following group definitions:
http://mycompany.server1.com/hr/ hr
http://mycompany.server2.com/hr/ hr
http://mycompany.server3.com/hr/ hr
http://mycompany.server1.com/finance/ finance
file:///myfileserver1.com/db2/sales/ sale
file:///myfileserver1.com/websphere/sales/ sale
file:///myfileserver2.com/db2/sales/ sale
file:///myfileserver2.com/websphere/sales/ sale
In this example, all the URIs with the prefix http://mycompany.server1.com/hr/ or http://mycompany.server2.com/hr/ or http://mycompany.server3.com/hr/ belong to one group: hr. All URIs with the prefix http://mycompany.server1.com/finance/ belong to another group: finance. And all the URIs with prefix file:///myfileserver1.com/db2/sales/ or file:///myfileserver1.com/websphere/sales/ or file:///myfileserver2.com/db2/sales/ or file:///myfileserver2.com/websphere/sales/ belong to yet another group: sale. If file:///myfileserver2.com/websphere/sales/mypath/mydoc.txt is a URI in the collection, a query with the following search term will restrict the search to the URIs in the sale group:
samegroupas:file:///myfileserver2.com/websphere/sales/mypath/mydoc.txt
All results for this query will have one of the following prefixes:
file:///myfileserver1.com/db2/sales/
file:///myfileserver1.com/websphere/sales/
file:///myfileserver2.com/db2/sales/
file:///myfileserver2.com/websphere/sales/
Query:
samegroupas:http://www.ibm.com/solutions/us/
Result: This query finds all documents with URIs, in this case URLs, that belong to the same group as http://www.ibm.com/solutions/us/.
- facetName::/facet_name_1/.../facet_name_n
- In a content analytics collection, you can search for documents that contain a specific
facet.
Query:
facetName::/"Part of Speech"/Noun/"General Noun"
Result: This query finds documents that include the facet General Noun in a content analytics collection.
- facetValue::/facet_name_1/.../facet_name_n/value
- In a content analytics collection, you can search for documents that contain a specific facet
value.
Query:
facetValue::/"Part of Speech"/Noun/"General Noun"/Car
Result: This query finds documents that include the value Car of the facet General Noun in a content analytics collection.
- date::/facet_name/time_scale/value
- In a content analytics collection, you can search for documents that contain a specific date
facet value.
Query:
date::/date/Year/2010
Result: This query finds documents that include the value 2010 for the year time scale of the default date facet in a content analytics collection.
Query:
date::/modifieddate/Month/200905
Result: This query finds documents that include the value 200905 for the month time scale of the modifieddate date facet in a content analytics collection.
- facet::/facet_name/value_level_1/.../value_level_n
- In an enterprise search collection, you can search for documents that contain a specific facet
or facet value. For facets with multiple value levels, such as hierarchical and date facets, you can
search for multiple-level facet values.
Query:
facet::/country/Japan
Result: This query finds documents that include the facet country with the facet value Japan in an enterprise search collection.
Query:
facet::/date/2009/1/15 facet::/location/US/California
Result: This query finds documents that include the facet date with the multiple-level facet values 2009, 1, and 15, and the facet location with the multiple-level facet values US and California.
- flag::/flag_name
- If an administrator configured document flags for the collection, you can use the flag prefix to search for documents that are assigned a particular flag.
Query:
flag::/"Important"
Result: This query finds documents that are flagged as Important.
- scope::/scope_name
- If an administrator configured scopes for the collection, you can use the scope prefix to search for documents that are in a particular scope.
Query:
scope::/TechSupport
Result: This query finds documents that are in the TechSupport scope.
- rulebased::category_ID
- Use the rulebased keyword to find documents that belong to a specific rule-based category.
Sample category tree:
Root juice lemon apple
Query:
rulebased::.juice.lemon
Result: This query returns documents that belong to the rule-based category juice.lemon.
- $source::source_type
- Use the $source keyword to find documents that come from a specific data source type. Source queries are useful in collections that contain documents from multiple sources.
To obtain a list of the available source types for a collection, call the getAvailableAttributeValues(Searchable.ATTRIBUTE_SOURCE) method of that collection's Searchable object.
Query:
$source::DB2 "computer science"
Result: This query finds documents that were added to a collection by the DB2 crawler and that contain the phrase computer science.
- $language::language_id
- Use the $language keyword to find documents that were written in a specific language.
To obtain a list of the available language IDs for a collection, call the getAvailableAttributeValues(Searchable.ATTRIBUTE_LANGUAGE) method of that collection's Searchable object.
Query:
$language::en "computer science"
Result: This query finds documents in English that contain the phrase computer science.
- $doctype::document_type
- Use the $doctype keyword to find documents that have a specific document format or MIME type.
To obtain a list of the available document types for a collection, call the getAvailableAttributeValues(Searchable.ATTRIBUTE_DOCTYPE) method of that collection's Searchable object.
Query:
$doctype::application/pdf "computer science"
Result: This query finds Portable Document Format (PDF) documents that contain the phrase computer science.
- $similar::document_id~similarity
- Use the $similar keyword to find documents that are near duplicates of the specified document. The similarity value specifies the level of strictness to apply. The valid range is 0.0 to 1.0. Specifying 1.0 does not mean that exact content matching is performed. It means that the search for similar documents is based on the highest level of similarity. The higher the similarity value, the closer the documents must be to being near duplicates of each other.
Query:
$similar::http://www.ibm.com/solutions/us~1.0
Result: This query finds documents that are highly similar to http://www.ibm.com/solutions/us.
- #field::=value
- Use parametric constraint syntax to find documents that have a numeric field with a value equal
to the specified number.
Query:
#price::=1700 laptop
Result: This query finds documents that contain the term laptop and a price field with a value equal to 1700.
- #field::>value
- Use parametric constraint syntax to find documents that have a numeric field with a value
greater than the specified number.
Query:
#price::>1700 laptop
Result: This query finds documents that contain the term laptop and a price field with a value greater than 1700.
- #field::<value
- Use parametric constraint syntax to find documents that have a numeric field with a value less
than the specified number.
Query:
#price::<1700 laptop
Result: This query finds documents that contain the term laptop and a price field with a value less than 1700.
- #field::>=value
- Use parametric constraint syntax to find documents that have a numeric field with a value
greater than or equal to the specified number.
Query:
#price::>=1700 laptop
Result: This query finds documents that contain the term laptop and a price field with a value greater than or equal to 1700.
- #field::<=value
- Use parametric constraint syntax to find documents that have a numeric field with a value less
than or equal to the specified number.
Query:
#price::<=1700 laptop
Result: This query finds documents that contain the term laptop and a price field with a value less than or equal to 1700.
- #field::>value1<value2
- Use parametric constraint syntax to find documents that have a numeric field with a value that
falls between a range of specified numbers.
Query:
#price::>1700<3900 laptop
Result: This query finds documents that contain the term laptop and a price field with a value greater than 1700 and less than 3900.
- #field::>=value1<=value2
- Use parametric constraint syntax to find documents that have a numeric field with a value that
matches or falls between a range of specified numbers.
Query:
#price::>=1700<=3900 laptop
Result: This query finds documents that contain the term laptop and a price field with a value greater than or equal to 1700 and less than or equal to 3900.
- #field::>value1<=value2
- Use parametric constraint syntax to find documents that have a numeric field with a value that
matches the criteria in the specified range of numbers.
Query:
#price::>1700<=3900 laptop
Result: This query finds documents that contain the term laptop and a price field with a value greater than 1700 and less than or equal to 3900.
- #field::>=value1<value2
- Use parametric constraint syntax to find documents that have a numeric field with a value that
matches the criteria in the specified range of numbers.
Query:
#price::>=1700<3900 laptop
Result: This query finds documents that contain the term laptop and a price field with a value greater than or equal to 1700 and less than 3900.
- #field::>"Date"
- Use parametric constraint syntax to find documents that match a specific date or date range.
Query:
#date::>"2007-12-01"
Result: This query finds documents that were created or last modified on 1 December, 2007 or later.
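The numeric forms of the parametric constraint syntax can be summarized with a small parser that turns a constraint term into a field name and a test function. This is a sketch for illustration only: the name parse_constraint is hypothetical, the accepted number format (digits with an optional decimal part) is an assumption, and quoted date values are not handled.

```python
import operator
import re

# Comparison operators that appear in the constraint forms listed above.
_OPS = {"=": operator.eq, ">": operator.gt, ">=": operator.ge,
        "<": operator.lt, "<=": operator.le}
_NUM = r"\d+(?:\.\d+)?"  # assumption: plain or decimal numbers only
_CONSTRAINT = re.compile(
    rf"#(?P<field>\w+)::(?P<op1>>=|<=|=|>|<)(?P<v1>{_NUM})"
    rf"(?:(?P<op2><=|<)(?P<v2>{_NUM}))?"
)


def parse_constraint(term):
    """Parse a term such as '#price::>=1700<3900' into (field, predicate).

    Hypothetical sketch: the predicate takes a number and reports whether
    it satisfies the constraint.
    """
    m = _CONSTRAINT.fullmatch(term)
    if m is None:
        raise ValueError(f"not a numeric parametric constraint: {term!r}")

    checks = [(_OPS[m["op1"]], float(m["v1"]))]
    if m["op2"]:
        # A range starts with a lower bound (> or >=) and ends with an
        # upper bound (< or <=), as in the four range forms above.
        if m["op1"] not in (">", ">="):
            raise ValueError("a range must start with > or >=")
        checks.append((_OPS[m["op2"]], float(m["v2"])))

    def predicate(value):
        return all(op(value, bound) for op, bound in checks)

    return m["field"], predicate
```

For example, parse_constraint("#price::>=1700<3900") yields the field name price and a predicate that accepts 1700 and 2000 but rejects 3900.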
- ACL constraints: (security_tokens)
- For security, you cannot specify access control constraints in the query string. Use the
setACLConstraints(String aclConstraints) method of the Query
interface to specify access control constraints for the query. You can specify parentheses, plus
signs (+), minus signs (-), circumflexes (^), and an XML security context string in the ACL
constraints string (@SecurityContext::'securityContext'). For information about the
securityContext string syntax, see the Javadoc documentation that describes the
setACLConstraints method. The symbols have the same meaning as described in the
previous syntax descriptions.
ACL constraints string in setACLConstraints method:
(michelle_c | dev_group)
ACL constraints string in setACLConstraints method:
michelle_c @SecurityContext::'securityContext'
Query:
thinkpad
Result: This query finds documents that include the term thinkpad and the security tokens michelle_c or dev_group in the first case, and michelle_c and the specified security context constraints in the second case.
Query syntax characters for opaque terms
You can create query syntax for two types of opaque terms. An opaque term is one that is expressed and handled by another query language, such as the XML query languages XML Fragment and XPath. XML Fragment can also be used to query UIMA structures. The sign for an opaque term is expressed with @xmlf2:: (XML fragment) or @xmlxp:: (XPath query). The XML fragment or the XPath query is enclosed in single quotation marks (' ').
The expression xmlf2 is used for XML fragments, and xmlxp is used for XPath terms. An opaque term has the following syntax: @syntax_name::'value'. The expression starts with the @ sign, followed by the syntax name (xmlf2 or xmlxp), two colons (::), and a value that is enclosed in single quotation marks (' '). The value parameter is sometimes preceded by -, +, or ^. If you need to use a single quotation mark in the value section of the expression, escape the single quotation by using a backslash (\), for example, \'.
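The structure of an opaque term, including the escaping of single quotation marks, can be sketched as a small string builder. The function name opaque_term is hypothetical and not part of the search and index API. The sketch assumes that the single quotation mark is the only character that needs escaping in the value.

```python
def opaque_term(syntax_name, value):
    """Build an opaque term of the form @syntax_name::'value'.

    Hypothetical helper: escapes single quotation marks in the value with
    a backslash and assumes nothing else needs escaping.
    """
    if syntax_name not in ("xmlf2", "xmlxp"):
        raise ValueError("syntax name must be xmlf2 or xmlxp")
    escaped = value.replace("'", "\\'")
    return f"@{syntax_name}::'{escaped}'"


if __name__ == "__main__":
    print(opaque_term("xmlf2", "<title>IBM computers</title>"))
    print(opaque_term("xmlf2", "<title>O'Reilly</title>"))
```

The first call prints @xmlf2::'<title>IBM computers</title>'. The second call escapes the apostrophe so that it does not end the quoted value early.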
For negative terms, use a minus sign (-) before the @ symbol, for example,
-@xmlf2::'<person>michelle</person>'. However, Watson Explorer Content Analytics does not accept a query that consists only of a negative term. The query
-@xmlf2::'<person>michelle</person>' does not return results. To get results,
include at least one positive term in the query, for example, documentation
-@xmlf2::'<person>michelle</person>'.
@xmlf2::'<Element>IBM +computers</Element>'
@xmlf2::'<Element>IBM =computers</Element>'
@xmlf2::'<Element>IBM computers~</Element>'
@xmlxp::'personarecord[country contains("Germany") or title contains("IBM")]'
- @xmlf2::'<tag1> text1 </tag1>'
- Use the @xmlf2:: prefix and enclose the query in single quotation marks to indicate a fragment
query as a new search and index API opaque term.
Query:
@xmlf2::'<title>"Data Structures"</title>'
Result: This query finds documents that contain the phrase Data Structures within the span of an indexed annotation called title.
- @xmlf2::<tag1><.depth value="$number"><tag2> ... </tag2></.depth></tag1>
- @xmlf2::<tag1><.depth value='$number'><tag2> ... </tag2></.depth></tag1>
-
The first query uses double quotation marks. The second query uses single quotation marks. However, each query returns the same results. This query syntax looks for occurrences of tag2 exactly $number levels under tag1.
$number is a positive integer. You can use single quotation marks (' ') or double quotation marks (" ") around the numerical value. This query syntax is not applicable to Unstructured Information Management Architecture (UIMA).
Query: (This query should appear on one line.)
@xmlf2::'<author>Albert Camus<.depth value='1'> <publisher>Carey Press</publisher></.depth></author>'
Result: This query finds documents in which the publisher element occurs exactly one level under the author element. A document with the following XML elements will not be returned with the example query:
<author>Albert Camus <ISBN>002-12345</ISBN> <country>USA <publisher>Carey Press</publisher> </country> </author>
The document is not returned because the publisher (<publisher>) element occurs two levels under the author (<author>) element.
- @xmlf2::'<tag1> ... </tag1>'
- You can distinguish between elements and attributes. Attributes are written explicitly
within the element.
You can define words and phrases within attributes in the same way as for the normal terms of the query, and they support the same features as normal terms. However, you can write expressions only of words and phrases, not of tags.
Query:
@xmlf2::'<author country="USA"></author>'
Result: This query finds documents where the author originates from the USA.
Query:
@xmlf2::'<author country="USA"> <firstName>Michelle</firstName> <lastName>Ropelatto</lastName></author>'
Result: This query finds documents where the author name is Michelle Ropelatto and is from the USA.
- @xmlf2::'+text1 ... +text2 -text3 ... -text4 text5'
- Use a plus sign (+) or a minus sign (-) as prefixes to words or phrases (always between
quotation marks (" ")). At each query level, whether for the text or the tag name, "+" means that
the terms must appear and "-" means that the terms must not appear. All other terms are optional and
contribute only to ranking. If no "+" terms exist, then at least one of the optional terms must
appear. The data under elements creates a new nested query level.
Query:
@xmlf2::'+"Graph Theory" -network'
Result: This query finds documents that contain the phrase Graph Theory, and do not contain the term network.
Query:
@xmlf2::'<book><author>hemingway</author> -<title>old man</title></book>'
Result: This query finds documents that contain a book by Hemingway but not the book The Old Man and the Sea.
- @xmlf2::'<tag1> <.or> ... </.or> <.and> ... </.and> </tag1>'
- Use Boolean syntax for AND (<.and>) and OR (<.or>) expressions in a query.
Query:
@xmlf2::'<book><.or><author>Sylvia Plath</author><title>XML -Microsoft</title></.or></book>'
Result: This query finds documents that specify a book whose author is Sylvia Plath or where the title of the book includes the word XML but not Microsoft.
- @xmlf2::'<annotation1+annotation2> ... </annotation1+annotation2>'
- You can express the concatenation of consecutive annotations in a fragment query by using the
plus sign (+) between the start and end tags of the element. The consecutive annotations must
overlap by at least one word (they must intersect). The concatenation of two or more overlapping
annotations is a new virtual annotation that spans the sum of the text spanned by the
annotations.
Query:
@xmlf2::'<Report+HoldsDuring> +Pakistan +March +Reuters</Report+HoldsDuring>'
Result: This query finds documents from Reuters about events in Pakistan in March that are contained in the concatenated annotation formed by the Report and HoldsDuring annotations.
- @xmlf2::'<annotation1*annotation2> ... </annotation1*annotation2>'
- You can express the intersection of annotations in a fragment query using the asterisk sign (*)
between the start and end tags of an element. The intersection of two or more overlapping
annotations is a new virtual annotation that spans just the text that is covered by the intersection
of the overlapping annotations.
Query:
@xmlf2::'<Inhibits*Activates>Aspirin</Inhibits*Activates>'
Result: This query finds documents in which Aspirin occurs in both the 'Inhibits' and 'Activates' annotations.
- @xmlxp::'/tag1/@tag1'
- You can distinguish between elements (XML start and end tags) and attributes. Attributes are
written explicitly with a leading @ sign. The @ sign enables you to distinguish between elements and
attributes that might have the same name. Concatenations and intersections are applicable only to
UIMA documents, and not to pure XML documents, where spans do not cross over by definition.
Query:
@xmlxp::'/author[@country="USA"]'
Result: This query finds documents in which USA is included in the character string that is the value of the attribute country that is associated with author.
- @xmlxp::'/tag1[tag2 or tag3 and tag4]'
- Use full Boolean syntax to express AND and OR scope in an XPath query.
Query:
@xmlxp::'book[author ftcontains("Jose Perez") or title ftcontains("XML -Microsoft")]'
Result: This query finds documents that specify a book whose author is Jose Perez or where the title of the book includes the word XML, but not Microsoft.
- @xmlxp::'tag1//tag2/tag3'
- You can distinguish between descendant nodes (//) and child nodes (/).
Query:
@xmlxp::'/books//book/name'
Result: This query finds documents that specify a book element as a descendant of a books element and that specify a name element as a direct child of the book.
- @xmlxp::'tag1/.../tagn'
- Use the @xmlxp:: prefix and enclose the query in single quotation marks to indicate an XPath
query as a search and index API opaque term.
Query:
@xmlxp::'books[booktitle ftcontains("Data Structures")]'
Result: This query finds documents that contain the phrase "Data Structures" within the span of an indexed annotation called "title."