Tuesday, October 4, 2011

Writing SQL queries in XML

Introduction

This article provides a quick reference for querying XML data with DB2 pureXML. This article reviews the syntax associated with some common queries, including the XQuery with SQL format and the SQL/XML with XQuery format. And even once you select between those two formats, there are often multiple ways in which you can write most queries to achieve the same results.
This article is primarily targeted at software architects, designers, and developers that have familiarity with XML. This article assumes you are using IBM development tools such as Rational® Software Architect, Rational Application Developer, Optim® Development Studio, or InfoSphere™ Data Architect to work with XML data within DB2.
Setting up the environment
Begin by setting up your environment to work with the queries.
Accessing the templates
The syntax used for each query example in this article is derived from the content found within the editor templates provided with the IBM tools. To access the set of templates within the tools, complete the following steps:
  1. Navigate to Window > Preferences.
  2. Within Preferences, navigate to Data Management > SQL Development > SQL and XQuery Editor > Templates, as shown in Figure 1.
  3. Review the available templates, import templates provided by others, or update the templates and export them to share with others.

Figure 1. View of SQL and XQuery templates
List of SQL and XQuery templates
When you edit SQL or XQuery scripts, you can access and use the available templates by using the CTRL+Space key-combination within the editor window.
Setting up the database tables
The examples in this article use the SAMPLE database tables with XML data that DB2 provides. You can install these examples as part of DB2 First Steps, as shown in Figure 2, or by using db2sampl -xml from the command line after installation.

Figure 2. Configuring sample data using First Steps
Configuring Sample Data via First Steps with XML and SQL object                     and data selected
The example queries in this article use the CUSTOMER table from the SAMPLE database, as shown in Figure 3.

Figure 3. View of CUSTOMER table from the SAMPLE database
Customer table icon
Listing 1 shows a sample XML document as found in the INFO column of the CUSTOMER table.

Listing 1. INFO column
<customerinfo Cid="1001">
<name>Kathy Smith</name>
<addr country="Canada">
<street>25 EastCreek</street>
<city>Markham</city>
<prov-state>Ontario</prov-state>
<pcode-zip>N9C 3T6</pcode-zip>
</addr>
<phone type="work">905-555-7258</phone>
</customerinfo>

Creating queries using XQuery
This section describes how to create queries using XQuery.
Types of XQuery expressions
The following list shows some of the different types of XQuery expressions that can be used within a query.
Primary
Use of basic primitives of the language, including literals, variable references, parenthesized expressions, function calls, and so on
Path
Identify nodes within an XML tree using syntax of XPath 2.0
FLWOR
Iterate over sequences, and bind variables to intermediate results.
Comparison
Compare two values
Constructors
Create XML structures within a query
Logical
Use and/or to compute Boolean values.
Conditional
Use if, then, and else to evaluate whether the value of a test expression is true or false.
Basic queries
There are two basic approaches to querying XML data in DB2 with XQuery. You can use the DB2 sqlquery function to use SQL to access specific XML data, or you can use the DB2 xmlcolumn function to use XQuery to access all XML data in a column. Listing 2 shows the basic syntax.

Listing 2. Basic syntax
xquery db2-fn:sqlquery(${sql_query})${xquery_expression}

Listing 3 shows a simple query whereby you use SQL to return a subset of the documents in the INFO column. At the end of the query, include an XPath expression to retrieve the name element from within the XML document.

Listing 3. Return subset of documents
xquery db2-fn:sqlquery("select INFO from CUSTOMER where CID = 1000")/customerinfo/name

In this case, you are interested in all of the XML documents in the INFO column. Then for each document, you use a Path expression to return the name elements.
Listing 4 shows the xmlcolumn syntax.

Listing 4. XMLcolumn syntax
xquery db2-fn:xmlcolumn('${schema}.${table}.${column}')${xquery_expression}

Listing 5 gives all of the XML documents in the INFO column. Then for each document, you use a Path expression to return the name elements.

Listing 5. XML documents in INFO column
xquery db2-fn:xmlcolumn('CUSTOMER.INFO')/customerinfo/name

FLWOR
A common approach in using XQuery is to write queries that include the following clauses, some of which are optional: For, Let, Where, Order by, and Return. These are abbreviated as FLWOR (and pronounced flower). Keep in mind that XQuery expects XML input, and it returns XML. The syntax is shown in Listing 6.

Listing 6. FLWOR syntax
xquery
for $$i in ${xquery_expression}
let $$x := ${xquery_expression}
where ${xquery_expression}
order by ${xquery_expression}
return ${xquery_expression}

The syntax in Listing 6 includes all of the available FLWOR expression elements. Following are examples that show that you can modify the structure of the query with the optional elements.
Listing 7 is an example of a simple FLWOR query to retrieve the city element associated with each address for each customerinfo element.

Listing 7. Simple FLWOR
xquery
for $c in db2-fn:xmlcolumn("CUSTOMER.INFO")/customerinfo
return $c/addr/city

Listing 8 adds to the previous query, adding a restriction to bring back the city for only the customer with attribute Cid = 1001.

Listing 8. FLWOR that brings back city
xquery
for $c in db2-fn:xmlcolumn("CUSTOMER.INFO")/customerinfo
where $c/@Cid = 1001
return $c/addr/city

Listing 9 extracts the names for the customers for whom prov-state is Ontario, and it alphabetizes the results by the name.

Listing 9. FLWOR for Ontario
xquery
for $c in db2-fn:xmlcolumn("CUSTOMER.INFO")/customerinfo
where $c/addr/prov-state = "Ontario"
order by $c/@name
return $c/name

Listing 10 uses Let to assign a variable to hold the pcode-zip value for customers that live in Canada and then alphabetizes the results by Cid. The code then constructs a new XML element, <contact>, which contains the Cid and pcode-zip.

Listing 10. FLWOR for pcode-zip
xquery
for $c in db2-fn:xmlcolumn("CUSTOMER.INFO")/customerinfo
let $pc := $c/addr/pcode-zip
where $c/addr/@country = "Canada"
order by $c/@Cid
return <contact>{$c/@Cid}{$pc}</contact>

Those query examples were quite simple, and they do not account for namespaces. To support namespaces, you can add the namespace to the elements listed in the query; or you can declare a default namespace using the command declare default element namespace '${namespace_uri}';.
Listing 11 shows an XML document that uses a namespace declaration.

Listing 11. XML document with namespace
<customerinfo xmlns="http://poindustry.org" Cid="1001">
<name>Kathy Smith</name>
<addr country="Canada">
<street>25 EastCreek</street>
<city>Markham</city>
<prov-state>Ontario</prov-state>
<pcode-zip>N9C 3T6</pcode-zip>
</addr>
<phone type="work">905-555-7258</phone>
</customerinfo>

Because you are dealing with a simple case of just a single namespace declaration, you can take either of two approaches in querying this document.
Listing 12 declares a default namespace. You do not have to modify element references in the query to include the namespace, because by default, they access this single namespace.

Listing 12. Default namespace
xquery
declare default element namespace = "http://poindustry.org";
for $c in db2-fn:xmlcolumn("CUSTOMER.INFO")/customerinfo
return $c/addr/city

Listing 13 leverages multiple namespaces by defining a namespace prefix, such as cust-ns, to be used as a shorthand reference to the namespace. Once defined, use this namespace prefix along with any element names that you reference.

Listing 13. Namespace prefix
xquery
declare namespace cust-ns = "http://poindustry.org";
for $c in db2-fn:xmlcolumn("CUSTOMER.INFO")/cust-ns:customerinfo
return $c/cust-ns:addr/cust-ns:city

For the namespace examples in Listing 12 and Listing 13, remember that the semicolon is a typical statement terminator for queries. When working with namespaces, change the statement terminator to a different character, such as the hash symbol (#).
Transform and updates
This section describes queries for updating specific content (elements or attributes) within an XML document. The queries discussed leverage the XQuery Update Facility, which is an update to the language and semantics of XQuery 1.0. You can use this type of query to update, replace, delete, or insert content into an XML document. This change can be made to the data in the database or applied to the data returned from a query. The query syntax changes slightly as you perform different actions. Listing 14 shows an example of the basic syntax for a replace query.

Listing 14. Basic syntax (replace)
UPDATE ${table}
SET ${xml_col} = XMLQUERY('
transform
copy $$x := ${xquery_expression}
modify do replace value of ${xquery_expression}
with ${xquery_expression}
return ${xquery_expression}')
WHERE ${expression}

The following are examples that modify the syntax to show support for the other Transform actions.
Listing 15 updates the value of the city element for a specific customer. Note that transform is an optional part of the query. As such, you can omit it for the next few examples.

Listing 15. Replace example
UPDATE customer
SET info = XMLQUERY('
transform
copy $c := $INFO
modify do replace value of $c/customerinfo/addr/city
with "Toronto"
return $c')
WHERE CID =1001

Listing 16 deletes the city element for a specific customer.

Listing 16. Delete example
UPDATE customer
SET info = XMLQUERY('
copy $c := $INFO
modify do delete $c/customerinfo/addr/city
return $c')
WHERE CID =1001

Inserts are interesting, as you need to give thought on where in the document you want to insert information. Listing 17 is non-specific about where in the document to insert the new element. The actual place of insertion is non-deterministic.

Listing 17. Insert example
UPDATE customer
SET info = XMLQUERY('
copy $c := $INFO
modify do insert <city>Toronto</city>
into $c/customerinfo/addr
return $c')
WHERE CID =1001

If you need a specific placement, you can look at one of a number of alternate approaches to inserting data. Listing 18 inserts data after the phone element.

Listing 18. Insert after example
UPDATE customer
SET info = XMLQUERY('
copy $c := $INFO
modify do insert <phone type="cell">905-555-7272</phone>
after $c/customerinfo/phone
return $c')
WHERE CID =1001

Listing 19 inserts an element as the first subelement within addr.

Listing 19. Insert first example
UPDATE customer
SET info = XMLQUERY('
copy $c := $INFO
modify do insert <company-name>ACME Co</company-name>
as first into $c/customerinfo/addr
return $c')
WHERE CID =1001

Listing 20 inserts an element as the last subelement within addr.

Listing 20. Insert last example
UPDATE customer
SET info = XMLQUERY('
copy $c := $INFO
modify do insert <company-name>ACME Co</company-name>
as last into $c/customerinfo/addr
return $c')
WHERE CID =1001

Listing 21 inserts an element into the document before the addr element.

Listing 21. Insert before example
UPDATE customer
SET info = XMLQUERY('
copy $c := $INFO
modify do insert <email>ksmith@acme.com</email>
before $c/customerinfo/addr
return $c')
WHERE CID =1001

Listing 22 renames an existing element by changing the addr element to become address.

Listing 22. Rename example
UPDATE customer
SET info = XMLQUERY('
copy $c := $INFO
modify do rename $c/customerinfo/addr
as "address"
return $c')
WHERE CID =1001

Listing 23 shows a transform query that only modifies the result of the query and does not alter the contents of the database.

Listing 23. XQuery rename example
xquery
copy $c := db2-fn:sqlquery('select INFO from CUSTOMER where CID =1001')
modify do rename $c/customerinfo/addr
as "address"
return $c

Creating queries using SQL/XML
This section offers examples of creating queries using SQL/XML.
Basic SQL/XML query
A good place to start is to look at the barebones syntax for a SQL/XML query, as shown in Listing 24.

Listing 24. Syntax of SQL/XML query
SELECT XMLQUERY('${xquery_expression}') 
FROM ${table_name}
WHERE XMLExists('${xquery_expression}')

There are a couple of approaches to using this syntax with different types of XQuery expressions. First look at an example that incorporates a path expression
Listing 25 shows a query to get the name of the customers who live in the city of Markham. The XMLExists predicate is used to determine whether an XQuery expression returns a sequence of one or more items. Note that you need to use square brackets ([ ]) to surround any value predicates within the XQuery expression. Doing so ensures that the evaluation of the expression is in accordance with what is semantically expected. If you omit the square brackets, the result of the XQuery expression will always return a sequence, and in turn XMLExists will always be true.

Listing 25. Using predicates
SELECT XMLQUERY('$INFO/customerinfo/name')
FROM CUSTOMER
WHERE XMLEXISTS('$INFO/customerinfo/addr[city= "Markham"]')

Next, look at one more basic example that uses a FLWOR expression along with XMLQUERY. Listing 26 returns the addr element for the customer with an attribute Cid > 1002 and where the country = "Canada".

Listing 26. FLWOR with XMLQUERY
SELECT XMLQUERY('for $i in $INFO/customerinfo/addr 
return $i')
FROM CUSTOMER
WHERE XMLEXISTS('let $i := $INFO/customerinfo
where $i/@Cid > 1002
and $i/addr/@country = "Canada"
return $i')

XMLTable
There are times when you need to return a relational table based on information from an XML document. You can use XMLTable to create such a result.
At its most basic, the XMLTable syntax is shown in Listing 27.

Listing 27. XMLTable syntax
SELECT X.${new_col1_name}, X.${new_col2_name}, ${table}.${col}
FROM XMLTable(${row_generating_xquery_expression}
COLUMNS
${new_col1_name} ${new_col1_data_type} PATH '${column_generating_xquery_expression}',
${new_col2_name} ${new_col2_data_type} PATH '${column_generating_xquery_expression}'
) AS X

Listing 28 offers a simple example of using XMLTable. You are returning rows based on two values from an XML document. Each of the two elements returned from the XML document appears in its own column. Use X.* to indicate that you want to return all columns that XMLTable provides.

Listing 28. Using X*
SELECT X.*
FROM customer,
XMLTABLE('$INFO/customerinfo'
COLUMNS
custname VARCHAR(20) PATH 'name',
city VARCHAR(20) PATH 'addr/city') AS X

In addition to returning columns based on information from the XML document, you can also return columns from relational data, as shown in Listing 29.

Listing 29. Return columns from relational data
SELECT customer.CID, X.*
FROM customer,
XMLTABLE('$INFO/customerinfo'
COLUMNS
custname VARCHAR(20) PATH 'name',
city VARCHAR(20) PATH 'addr/city') AS X

Listing 30 adds a new column that is of type XML where you are constructing a new XML document on the fly.

Listing 30. Adding a new column
SELECT customer.CID, X.*
FROM customer,
XMLTABLE('$INFO/customerinfo'
COLUMNS
custname VARCHAR(20) PATH 'name',
city VARCHAR(20) PATH 'addr/city',
newXML XML PATH '<newXML>{addr/pcode-zip}</newXML>') AS X