Queries Involving Document Structure

Notice that the parts container can hold documents with different structures. The ability to manage structured data flexibly is one of the fundamental differences between XML and relational databases. In this example, a single container manages documents of two different structures that share certain common elements. Because the documents partially overlap in structure, they can share common indices and be queried together efficiently. This can be used to model a union of related data, and structural queries exploit such natural unions in XML data. Here are some example structural queries.

First, select all part records that contain parent-part nodes in their document structure. In English, the following XQuery reads: "from the container named parts, select all part elements that contain a parent-part element as a direct child". As XQuery code, it is:

dbxml> query '
collection("parts.dbxml")/part[parent-part]'

300 objects returned for eager expression '
collection("parts.dbxml")/part[parent-part]'

To examine the query results, use the 'print' command:

dbxml> print
<part number="540"><description>Description of 540</description>
<category>0</category><parent-part>0</parent-part></part>
<part number="30"><description>Description of 30</description>
<category>0</category><parent-part>0</parent-part></part>
...
<part number="990"><description>Description of 990</description>
<category>0</category><parent-part>0</parent-part></part>
<part number="480"><description>Description of 480</description>
<category>0</category><parent-part>0</parent-part></part>
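
The effect of the [parent-part] predicate can also be sketched outside the dbxml shell. The following Python fragment is only an illustration: it assumes a few made-up documents in place of the parts container, the names SAMPLE_DOCS and parts_with_parent are hypothetical, and it uses the Python standard library rather than the Berkeley DB XML API.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for documents in the parts container.
SAMPLE_DOCS = [
    '<part number="0"><description>Description of 0</description>'
    '<category>0</category><parent-part>0</parent-part></part>',
    '<part number="1"><description>Description of 1</description>'
    '<category>1</category></part>',
    '<part number="10"><description>Description of 10</description>'
    '<category>0</category><parent-part>1</parent-part></part>',
]


def parts_with_parent(docs):
    """Mimic /part[parent-part]: keep root part elements that have
    a parent-part element as a direct child."""
    roots = [ET.fromstring(doc) for doc in docs]
    return [
        root for root in roots
        if root.tag == "part" and root.find("parent-part") is not None
    ]


matches = parts_with_parent(SAMPLE_DOCS)
print([part.get("number") for part in matches])
```

The test is purely structural: a document qualifies because of the elements it contains, regardless of the values stored in them.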

To display only the parent-part element without displaying the rest of the document, the query changes only slightly:

dbxml> query '
collection("parts.dbxml")/part/parent-part'

300 objects returned for eager expression '
collection("parts.dbxml")/part/parent-part'

dbxml> print
<parent-part>0</parent-part>
<parent-part>0</parent-part>
...
<parent-part>2</parent-part>
<parent-part>2</parent-part>

Alternatively, to retrieve only the string value of the parent-part element, the query becomes:

dbxml> query '
collection("parts.dbxml")/part/parent-part/string()'

300 objects returned for eager expression '
collection("parts.dbxml")/part/parent-part/string()'

dbxml> print
0
0
...
2
2
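
The difference between selecting the element and selecting its string value can be illustrated with a short Python sketch. It assumes a single made-up document; the variable names are hypothetical, and the standard-library calls only approximate what the two XQuery expressions return.

```python
import xml.etree.ElementTree as ET

# Hypothetical document standing in for one entry of the parts container.
doc = ET.fromstring(
    '<part number="10"><description>Description of 10</description>'
    '<category>0</category><parent-part>1</parent-part></part>'
)

parent = doc.find("parent-part")

# Comparable to /part/parent-part: the element itself, markup included.
as_element = ET.tostring(parent, encoding="unicode")

# Comparable to /part/parent-part/string(): only the text content.
as_string = "".join(parent.itertext())

print(as_element)
print(as_string)
```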

Invert the earlier example to select all part elements that do not have a parent-part child:

dbxml> query '
collection("parts.dbxml")/part[not(parent-part)]'

2700 objects returned for eager expression '
collection("parts.dbxml")/part[not(parent-part)]'

dbxml> print
<part number="22"><description>Description of 22</description>
<category>2</category></part>
<part number="1995"><description>Description of 1995</description>
<category>5</category></part>
...
<part number="2557"><description>Description of 2557</description>
<category>7</category></part>
<part number="2813"><description>Description of 2813</description>
<category>3</category></part>
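
Taken together, the predicate and its negation split the part elements into two disjoint groups. A minimal Python sketch of that split follows, again assuming made-up sample documents; SAMPLE_DOCS and split_by_parent are hypothetical names and are not part of Berkeley DB XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for documents in the parts container.
SAMPLE_DOCS = [
    '<part number="0"><description>Description of 0</description>'
    '<category>0</category><parent-part>0</parent-part></part>',
    '<part number="1"><description>Description of 1</description>'
    '<category>1</category></part>',
    '<part number="2"><description>Description of 2</description>'
    '<category>2</category></part>',
]


def split_by_parent(docs):
    """Return (with_parent, without_parent), mimicking
    /part[parent-part] and /part[not(parent-part)]."""
    with_parent, without_parent = [], []
    for doc in docs:
        root = ET.fromstring(doc)
        if root.tag != "part":
            continue  # only part documents are considered
        if root.find("parent-part") is not None:
            with_parent.append(root)
        else:
            without_parent.append(root)
    return with_parent, without_parent


with_parent, without_parent = split_by_parent(SAMPLE_DOCS)
print(len(with_parent), len(without_parent))
```

Every part element lands in exactly one of the two lists, so the two result counts always add up to the number of part documents examined.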

Structural queries are somewhat like relational joins, except that they are easier to express and manage over time. Some structural queries are even impossible or impractical to model with more traditional relational databases. This is in part due to the nature of XML as a self-describing, yet flexible, data representation. Collections of XML documents attain commonality based on the similarity in their structures just as much as the similarity in their content. Essentially, relationships are implicitly expressed within the XML structure itself. The utility of this feature becomes more apparent when you start combining structural queries with value-based queries.