Documents are retrieved from BDB XML containers using XQuery expressions. XQuery is a language designed to query XML documents. Using XQuery, you can retrieve entire documents, subsections of documents, or values from one or more individual document nodes. You can also use XQuery to manipulate or transform values returned by document queries.
Note that XQuery represents a superset of XPath 2.0, which in turn is based on XPath 1.0. If you have prior experience with BDB XML 1.x, then you should be familiar with XPath as that was the query language offered by that library.
BDB XML supports the entire W3C XQuery specification. As of this printing, the specification is dated July 2004. However, BDB XML will be updated to track any changes in the working specification that may occur. You can find the XQuery specification at http://www.w3.org/XML/Query.
Beyond the W3C specifications, there are several good books on the market today that fully describe XQuery. In addition, there are many freely available resources on the web that provide a good introduction to the language. Searching for 'XQuery' in the Web search engine of your choice ought to return a wealth of information and pointers on the language.
That said, this chapter begins with a very thin introduction to XQuery that should be enough for you to understand any BDB XML concepts required to proceed with usage of the library. In particular, the next section of this manual highlights those aspects of XQuery that have unique meanings relative to BDB XML usage. Be aware, however, that the following introduction is not meant to be complete — a full treatment of XQuery is beyond the scope of an introductory manual such as this.
We follow this brief introduction to XQuery with a general description of querying documents stored in BDB XML containers, and examining the results of those queries. See Retrieving BDB XML Documents using XQuery for that information.
XQuery can be used to:
Query for a document. Note that queries can be formed against an individual document, or against multiple documents.
Query for document subsections, including values found on individual document nodes.
Manipulate and transform the results of a query.
Modify a document (see Modifying XML Documents for more information).
To do this, XQuery views an XML document as a collection of element, text, and attribute nodes. For example, consider the following XML document:
<?xml version="1.0"?>
<Node0>
    <Node1 class="myValue1">Node1 text </Node1>
    <Node2>
        <Node3>Node3 text</Node3>
        <Node3>Node3 text 2</Node3>
        <Node3>Node3 text 3</Node3>
        <Node4>300</Node4>
    </Node2>
</Node0>
In the above document, <Node0> is the document's root node, and <Node1> is an element node. Further, the element node <Node1> contains a single attribute node whose name is class and whose value is myValue1. Finally, <Node1> contains a text node whose value is Node1 text.
A document's root can always be referenced using a single forward slash:
/.
Subsequent element nodes in the document can be referenced using Unix-style path notation. Because the path starts at the document's root, it must include the root element, <Node0>:
/Node0/Node1
To reference an attribute node, prefix the attribute node's name with '@':
/Node0/Node1/@class
To return the value contained in a node's text node (remember that not all element nodes contain a text node), use the distinct-values() function:
distinct-values(/Node0/Node1)
To return the value assigned to an attribute node, you also use the distinct-values() function:
distinct-values(/Node0/Node1/@class)
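The path expressions above can be tried without a BDB XML installation. The following is a minimal sketch that evaluates them against the sample document using the JDK's built-in XPath 1.0 engine (javax.xml.xpath), not the BDB XML API. The class name PathBasics and the helper eval() are hypothetical. Because distinct-values() is an XPath 2.0 function that the JDK engine does not provide, the sketch relies on the engine's string-value conversion instead.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustration only: uses the JDK's XPath 1.0 engine, not the BDB XML API.
class PathBasics {
    // The sample document from this section (whitespace removed).
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<Node0>"
        + "<Node1 class=\"myValue1\">Node1 text</Node1>"
        + "<Node2>"
        + "<Node3>Node3 text</Node3>"
        + "<Node3>Node3 text 2</Node3>"
        + "<Node3>Node3 text 3</Node3>"
        + "<Node4>300</Node4>"
        + "</Node2>"
        + "</Node0>";

    // Evaluates an expression against SAMPLE and returns the string value
    // of the first node it selects (or "" when nothing is selected).
    static String eval(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("/Node0/Node1"));        // text held by Node1
        System.out.println(eval("/Node0/Node1/@class")); // value of the class attribute
        System.out.println(eval("/Node1"));              // empty: Node1 is not the root element
    }
}
```

The last call shows why the root element has to appear in an absolute path: a path that skips it selects nothing.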
When you provide an XQuery path, what you receive back is a result set. You can further filter this result set by using predicates. Predicates are always contained in brackets ([]) and there are two types of predicates that you can use: numeric and boolean.
Numeric predicates allow you to select a node based on its position relative to another node in the document (that is, based on its context).
For example, consider the document presented in XQuery: A Brief Introduction. This document contains three <Node3> elements. If you simply enter the XQuery expression:
/Node0/Node2/Node3
all <Node3> elements in the document are returned. To return, say, the second <Node3> element, use a predicate:
/Node0/Node2/Node3[2]
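To see the effect of a predicate on a result set, the following sketch runs the unfiltered path, a numeric predicate, and a boolean predicate against the sample document. It assumes the JDK's XPath 1.0 engine rather than BDB XML. The class PredicateDemo, the helper select(), and the boolean predicate shown are illustrative and do not come from the BDB XML documentation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustration only: uses the JDK's XPath 1.0 engine, not the BDB XML API.
class PredicateDemo {
    // The Node2 subtree of the sample document, wrapped in its root element.
    static final String SAMPLE =
        "<Node0><Node2>"
        + "<Node3>Node3 text</Node3>"
        + "<Node3>Node3 text 2</Node3>"
        + "<Node3>Node3 text 3</Node3>"
        + "<Node4>300</Node4>"
        + "</Node2></Node0>";

    // Returns the text content of every node selected by the expression,
    // in document order.
    static List<String> select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // No predicate: all three Node3 elements.
        System.out.println(select("/Node0/Node2/Node3"));
        // Numeric predicate: only the second Node3 (positions start at 1).
        System.out.println(select("/Node0/Node2/Node3[2]"));
        // Boolean predicate: only the Node3 whose text equals the given string.
        System.out.println(select("/Node0/Node2/Node3[. = 'Node3 text 3']"));
    }
}
```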
The meaning of an XQuery expression can change depending on the current context. Within XQuery expressions, context is usually only important if you want to use relative paths or if your documents use namespaces. Do not confuse XQuery contexts with BDB XML contexts. The two are related, but a BDB XML context is a data structure that allows you to define namespaces, define variables, and identify the type of information that is returned as the result of a query (all of these topics are discussed later in this chapter).
Just like Unix filesystem paths, any path that does not begin with a slash (/) is relative to your current location in a document. Your current location in a document is determined by your context. Thus, if in the document presented in XQuery: A Brief Introduction your context is set to Node2, you can refer to Node3 with the simple notation:
Node3
Further, you can refer to a parent node using the following familiar notation:
..
and to the current node using:
.
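As a rough illustration of how a context node changes what a relative path means, the sketch below first selects Node2 in the sample document and then evaluates relative expressions from there. It uses the JDK's XPath 1.0 engine, where the context is passed as the second argument to evaluate(). How the context is set in BDB XML itself is not shown here. RelativePathDemo and nameFromNode2() are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustration only: uses the JDK's XPath 1.0 engine, not the BDB XML API.
class RelativePathDemo {
    static final String SAMPLE =
        "<Node0>"
        + "<Node1 class=\"myValue1\">Node1 text</Node1>"
        + "<Node2><Node3>Node3 text</Node3><Node4>300</Node4></Node2>"
        + "</Node0>";

    // Evaluates a relative expression with Node2 as the context node and
    // returns the name of the first node selected, or null if none.
    static String nameFromNode2(String relativeExpression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Establish the context: the Node2 element.
            Node context = (Node) xpath.evaluate("/Node0/Node2", doc, XPathConstants.NODE);
            // Evaluate the relative path starting from that context.
            Node result = (Node) xpath.evaluate(relativeExpression, context, XPathConstants.NODE);
            return result == null ? null : result.getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + relativeExpression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameFromNode2("Node3")); // child of the context node
        System.out.println(nameFromNode2(".."));    // parent of the context node
        System.out.println(nameFromNode2("."));     // the context node itself
        System.out.println(nameFromNode2("Node1")); // null: Node1 is not a child of Node2
    }
}
```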
Natural language and, therefore, tag names can be imprecise. Two different tags can have identical names and yet hold entirely different sorts of information. Namespaces are intended to resolve any such sources of confusion.
Consider the following document:
<?xml version="1.0"?>
<definition>
    <ring>
        Jewelry that you wear.
    </ring>
    <ring>
        A sound that a telephone makes.
    </ring>
    <ring>
        A circular space for exhibitions.
    </ring>
</definition>
As constructed, this document makes it difficult (though not impossible) to select the node for, say, a ringing telephone.
To resolve any potential confusion in your schema or supporting code, you can introduce namespaces to your documents. For example:
<?xml version="1.0"?>
<definition>
    <jewelry:ring xmlns:jewelry="http://myDefinition.dbxml/jewelry">
        Jewelry that you wear.
    </jewelry:ring>
    <sounds:ring xmlns:sounds="http://myDefinition.dbxml/sounds">
        A sound a telephone makes.
    </sounds:ring>
    <showplaces:ring xmlns:showplaces="http://myDefinition.dbxml/showplaces">
        A circular space for exhibitions.
    </showplaces:ring>
</definition>
Now that the document has defined namespaces, you can precisely query any given node:
/definition/sounds:ring
In order to perform queries against a document stored in BDB XML that makes use of namespaces, you must declare the namespace to your query. You do this using XmlQueryContext.setNamespace(). See Defining Namespaces for more information.
By identifying the namespace to which the node belongs, you are declaring a context for the query.
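The idea of declaring a namespace to the query, separately from the document, can be sketched with the JDK's XPath 1.0 engine, where a NamespaceContext plays a role loosely comparable to the namespace bindings held by a BDB XML query context. This is an analogy under that assumption, not the BDB XML API. NamespaceDemo and its eval() helper are hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustration only: uses the JDK's XPath 1.0 engine, not the BDB XML API.
class NamespaceDemo {
    static final String DEFINITIONS =
        "<definition>"
        + "<jewelry:ring xmlns:jewelry=\"http://myDefinition.dbxml/jewelry\">"
        + "Jewelry that you wear.</jewelry:ring>"
        + "<sounds:ring xmlns:sounds=\"http://myDefinition.dbxml/sounds\">"
        + "A sound a telephone makes.</sounds:ring>"
        + "</definition>";

    // Evaluates the expression with a single prefix-to-URI binding declared
    // to the query, and returns the trimmed string value of the first match.
    static String eval(String expression, final String prefix, final String uri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace matching
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(DEFINITIONS)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String p) {
                    return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
                }
                @Override
                public String getPrefix(String namespaceUri) {
                    return null; // reverse lookup is not needed for this sketch
                }
                @Override
                public Iterator<String> getPrefixes(String namespaceUri) {
                    return Collections.emptyIterator();
                }
            });
            return xpath.evaluate(expression, doc).trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String soundsUri = "http://myDefinition.dbxml/sounds";
        System.out.println(eval("/definition/sounds:ring", "sounds", soundsUri));
        // The prefix used in the query need not match the one in the document;
        // only the URI has to agree.
        System.out.println(eval("/definition/s:ring", "s", soundsUri));
    }
}
```

Matching is done on the namespace URI, which is why the second query still finds the telephone definition under a different prefix.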
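The URI used in the namespace definition is not required to actually resolve to anything. The only criterion is that it be unique within the scope of any document set(s) in which it might be used.
Also, a namespace does not have to be declared on every element that uses it. A declaration is in scope for the element on which it appears and for all of that element's descendants, so a prefix declared once on a common ancestor (such as the document's root element) can be used by every element beneath it. For example, had the prefixes in our previous document been declared on the <definition> element, we could have added the following to it: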
<jewelry:diamond>
    The centerpiece of many rings.
</jewelry:diamond>
<showplaces:diamond>
    A place where baseball is played.
</showplaces:diamond>
Finally, namespaces can be used with attributes too. For example:
<clubMembers xmlns:school="http://myExampleDefinitions.dbxml/school"
             xmlns:social="http://myExampleDefinitions.dbxml/social">
    <surveyResults school:class="English" number="200"/>
    <surveyResults school:class="Mathematics" number="165"/>
    <surveyResults social:class="Middle" number="543"/>
</clubMembers>
Once you have declared a namespace for an attribute, you can query the attribute in the following way:
/clubMembers/surveyResults/@school:class
And to retrieve the value set for the attribute:
distinct-values(/clubMembers/surveyResults/@school:class)
XQuery allows you to use wildcards when document elements are unknown. For example:
/Node0/*/Node6
selects all the Node6 nodes that are 3 nodes deep in the document and whose path starts with Node0. Other wildcard matches are:
Selects all of the nodes in the document:
//*
Selects all of the Node6 nodes that have three ancestors:
/*/*/*/Node6
Selects all the nodes immediately beneath Node5:
/Node0/Node5/*
Selects all of Node5's attributes:
/Node0/Node5/@*
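The wildcard expressions above can be checked by counting what each one selects. The sketch below does this with the JDK's XPath 1.0 engine against a small made-up document. The document layout (including the Node7 and Node8 elements and the id and kind attributes), the class WildcardDemo, and the helper count() are all hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustration only: uses the JDK's XPath 1.0 engine, not the BDB XML API.
class WildcardDemo {
    // Hypothetical document: seven elements in total, two attributes on Node5.
    static final String SAMPLE =
        "<Node0>"
        + "<Node5 id=\"a\" kind=\"b\"><Node6/><Node7/></Node5>"
        + "<Node8><Node6><Node6/></Node6></Node8>"
        + "</Node0>";

    // Returns how many nodes the expression selects in SAMPLE.
    static int count(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//*"));             // every element in the document
        System.out.println(count("/Node0/*/Node6"));  // Node6 elements two steps below Node0
        System.out.println(count("/*/*/*/Node6"));    // Node6 elements with three element ancestors
        System.out.println(count("/Node0/Node5/*"));  // element children of Node5
        System.out.println(count("/Node0/Node5/@*")); // attributes of Node5
    }
}
```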
It is possible to perform a case-insensitive and diacritic-insensitive match using BDB XML's built-in function, dbxml:contains(). This function takes two parameters, both strings. The first identifies the attribute or element that you want to examine, and the second provides the string you want to match.
For example, the search:
collection('myCollection.dbxml')/book[dbxml:contains(title, "Résumé")]
matches "resume", "Resume", "Resumé" and so forth.
Note that searches performed using dbxml:contains() can be backed by BDB XML's substring indexes.
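To make the matching behavior concrete, the following sketch approximates a case-insensitive, diacritic-insensitive substring test in plain Java. It is not BDB XML's implementation of dbxml:contains(), whose exact folding rules are not described here. It assumes that stripping combining marks after Unicode decomposition and lower-casing both strings is a reasonable stand-in. FoldedContains, fold(), and containsFolded() are hypothetical names.

```java
import java.text.Normalizer;
import java.util.Locale;

// Illustration only: an approximation of case- and diacritic-insensitive
// matching, not the algorithm used by dbxml:contains().
class FoldedContains {
    // Decomposes accented characters (NFD), drops the combining marks,
    // and lower-cases the result.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    // True if needle occurs in haystack once both have been folded.
    static boolean containsFolded(String haystack, String needle) {
        return fold(haystack).contains(fold(needle));
    }

    public static void main(String[] args) {
        String pattern = "R\u00e9sum\u00e9"; // "Résumé", written with Unicode escapes
        System.out.println(containsFolded("Writing a resume", pattern));       // true
        System.out.println(containsFolded("RESUM\u00c9 workshop", pattern));   // true
        System.out.println(containsFolded("Summary of findings", pattern));    // false
    }
}
```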
XQuery provides several functions that can be used for global navigation to a specific document or collection of documents. From the perspective of this manual, two of these are interesting because they have specific meaning from within the context of BDB XML.
Within XQuery, collection() is a function that allows you to create a named sequence. From within BDB XML, however, it is also used to navigate to a specific container. In this case, you must identify to collection() the literal name of the container. You do this either by passing the container name directly to the function, or by declaring a default container name using the XmlQueryContext.setDefaultCollection() method.
Note that the container must have already been opened by the XmlManager in order for collection() to reference that container. The exception to this is if XmlManager was opened using the XmlManagerConfig.setAllowAutoOpen() method.
For example, suppose you want to perform a query against a container named container1.dbxml. In this case, first open the container using XmlManager.openContainer() and then specify the collection() function on the query. For example:
collection("container1.dbxml")/Node0
Note that this is actually short-hand for:
collection("dbxml:/container1.dbxml")/Node0
dbxml:/ is the default base URI for BDB XML. You can change the base URI using XmlQueryContext.setBaseURI().
If you want to perform a query against multiple containers, use the union ("|") operator. For example, to query against containers c1.dbxml and c2.dbxml, you would use the following expression:
(collection("c1.dbxml") | collection("c2.dbxml"))/Node0
See Retrieving BDB XML Documents using XQuery for more information on how to prepare and perform queries.
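When the set of containers is only known at run time, the union expression shown above has to be assembled as a string. The following sketch shows one way to do that in plain Java. It only builds the query text and does not call BDB XML. The class CollectionQueries and the method unionQuery() are hypothetical, and the sketch assumes container names contain no double quotes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: builds query text, does not execute it.
class CollectionQueries {
    // Joins collection("name") calls with the union operator and appends
    // the given path. Assumes names contain no double quotes.
    static String unionQuery(List<String> containerNames, String path) {
        if (containerNames.isEmpty()) {
            throw new IllegalArgumentException("At least one container name is required");
        }
        String union = containerNames.stream()
            .map(name -> "collection(\"" + name + "\")")
            .collect(Collectors.joining(" | "));
        // Parentheses are only needed when there is more than one operand.
        return containerNames.size() == 1 ? union + path : "(" + union + ")" + path;
    }

    public static void main(String[] args) {
        System.out.println(unionQuery(Arrays.asList("c1.dbxml", "c2.dbxml"), "/Node0"));
        System.out.println(unionQuery(Arrays.asList("container1.dbxml"), "/Node0"));
    }
}
```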
XQuery provides the doc() function so that you can trivially navigate to the root of a named document. doc() is required to take a URI.
To use doc() to navigate to a specific document stored in BDB XML, provide an XQuery path that uses the dbxml: base URI, and that identifies the container in which the document can be found. The actual document name that you provide is the same name that was set for the document when it was added to the container (see Adding Documents for more information).
For example, suppose you have a document named "mydoc1.xml" in container "container1.dbxml". Then to perform a query against that specific document, first open container1.dbxml and then provide a query something like this:
doc("dbxml:/container1.dbxml/mydoc1.xml")/Node0
See Retrieving BDB XML Documents using XQuery for more information on how to prepare and perform queries.
XQuery offers iterative and transformative capabilities through FLWOR (pronounced "flower") expressions. FLWOR is an acronym that stands for the five major clauses in a FLWOR expression: for, let, where, order by and return. Using FLWOR expressions, you can iterate over sequences (frequently result sets in BDB XML), use variables, and filter, group, and sort sequences. You can even use FLWOR to perform joins of different data sources.
For example, suppose you had documents in your container that looked like this:
<product>
    <name>Widget A</name>
    <price>0.83</price>
</product>
In this case, queries against the container for these documents return the documents in order by their document name. But suppose you wanted to see all such documents in your container, ordered by price. You can do this with a FLWOR expression:
for $i in collection("myContainer.dbxml")/product
order by $i/price descending
return $i
Note that from within BDB XML, you must provide FLWOR expressions in a single string. Lines can be separated either by a newline ("\n") or by a space. Thus, the above expression would become:
String flwor = "for $i in collection('myContainer.dbxml')/product\n";
flwor += "order by $i/price descending\n";
flwor += "return $i";
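As an alternative to repeated concatenation, the clauses can be kept as separate strings and joined in one step, which keeps each clause readable and avoids a forgotten separator. This is only a sketch of building the query text. FlworQuery and orderedByPrice() are hypothetical names, and nothing here calls the BDB XML API.

```java
// Illustration only: builds the FLWOR query text, does not execute it.
class FlworQuery {
    // Returns a single-string FLWOR expression whose clauses are
    // separated by newlines. Assumes the container name has no quotes.
    static String orderedByPrice(String containerName) {
        return String.join("\n",
            "for $i in collection('" + containerName + "')/product",
            "order by $i/price descending",
            "return $i");
    }

    public static void main(String[] args) {
        System.out.println(orderedByPrice("myContainer.dbxml"));
    }
}
```

One point to keep in mind when adapting the query: XQuery's order by compares untyped values as strings, so if the price elements carry no schema type, a numeric sort may call for an explicit conversion such as xs:decimal($i/price). Whether this applies depends on how the stored documents are typed.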