Querying the Complex XML database with XQuery
Posted On June 30, 2010 by Sneha Philipose filed under Programming
Many enterprise applications prefer to store XML data as a rich data type, i.e. a sequence of bytes, in a relational database system. This avoids the complexity of decomposing the data into a large number of tables and the cost of reassembling the XML data. XQuery is a W3C Recommendation for querying XML data. It provides a set of language constructs (FLWOR expressions), the ability to dynamically shape the query result, and a large set of functions and operators. It includes the W3C Recommendation XPath 2.0 for path-based navigational access. XQuery's type system is compatible with that of XML Schema and allows static type checking. This article covers the basic aspects of XQuery.
Introduction
XQuery is designed to query XML data - not just XML files, but anything that can appear as XML, including databases. XQuery is to XML what SQL is to database tables. Storing XML data as a sequence of bytes representing a rich data type has several advantages. XML schemas for real-life applications are complex so that decomposing XML data conforming to those schemas into the relational data model results in a large number of tables. This makes the decomposition logic complex, the re-assembly cost high, and the queries very complicated. Furthermore, changes to the XML schema require a significant amount of maintenance of the database schema and the application. XML as a rich data type also permits structural characteristics of the XML data, such as document order and recursive structures, to be preserved more faithfully.
The XQuery language was designed for querying and processing XML. Just as a traditional SQL query takes a set of tables as input and returns a table as its result, XQuery takes sequences of XML nodes as input and evaluates to a sequence of XML nodes. However, from the very beginning, XQuery was designed to allow XML views of non-XML data, as well as serialized forms of non-XML data. The reason for this is simple: XML is used to represent almost any conceivable kind of information, and it is easiest to integrate information if it is given a common view.
XQuery is a language designed for integrating data from multiple sources, including XML sources like documents or web messages and databases. It does this by leveraging the ability of XML to model virtually any kind of data. To query anything with XQuery, it must be presented as though it were XML, either by serializing it as XML or by creating an XML view of the data through some form of middleware. For relational data, most systems use the SQL/XML mappings for the XML view, since they are quite suitable and have been specified in detail.

Components of XQuery
XML is the basis of XQuery's type system and data model. The fundamental types of XQuery include the kinds of nodes found in XML documents: document nodes, elements, attributes, processing instructions, comments, and text nodes. XQuery also supports the built-in datatypes of W3C XML Schema for representing integers, strings, dates, and other datatypes - these built-in data types are predefined in XQuery, and are available with or without a schema. Most modern programming languages provide some form of complex user-defined types, such as structures or objects. In XQuery, the only complex types are XML documents, elements, attributes, and W3C XML Schema complex types. There is no need to write a schema to create and manipulate complex XML structures in XQuery. However, if a query needs to ensure consistent use of the types in a schema, a schema may be imported into a query. This has an effect analogous to importing structure or class definitions in an object oriented language.
XML has established itself as the ubiquitous format for data exchange on the Internet. An imminent development is that of streams of XML data being exchanged and queried. Data management scenarios where XQuery is evaluated on XML streams are becoming increasingly important and realistic, e.g. in e-commerce settings.



XQuery Applications
Below are a few examples of how XQuery can be used:
1. Extracting information from a database for use in a web service.
2. Generating summary reports on data stored in an XML database.
3. Searching textual documents on the Web for relevant information and compiling the results.
4. Selecting and transforming XML data to XHTML to be published on the Web.
5. Pulling data from databases to be used for application integration.
6. Splitting up an XML document that represents multiple transactions into multiple XML documents.
XQuery can be used to:
· Extract information to use in a Web Service
· Generate summary reports
· Transform XML data to XHTML
· Search Web documents for relevant information
XQuery Environment and Querying with XML Data
XQuery is case-sensitive and XQuery elements, attributes, and variables must be valid XML names.
XQuery Basic Syntax Rules
Some basic syntax rules:
· XQuery is case-sensitive
· XQuery elements, attributes, and variables must be valid XML names
· An XQuery string value can be in single or double quotes
· An XQuery variable is defined with a $ followed by a name, e.g. $bookstore
· XQuery comments are delimited by (: and :), e.g. (: XQuery Comment :)
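As a small illustration of the rules above, the following sketch binds two variables whose names differ only in case, uses both quote styles, and includes comments. The variable names are made up for this example.

```xquery
(: $title and $Title are two different variables because names are case-sensitive. :)
let $title := "Learning XML"        (: double-quoted string :)
let $Title := 'XQuery Kick Start'   (: single-quoted string :)
return ($title, $Title)
```

The query evaluates to a sequence of two strings, one per variable.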
XQuery : Numeric Data Types
Name | Description
byte | A signed 8-bit integer
decimal | A decimal value
int | A signed 32-bit integer
integer | An integer value
long | A signed 64-bit integer
negativeInteger | An integer containing only negative values (..,-2,-1)
nonNegativeInteger | An integer containing only non-negative values (0,1,2,..)
nonPositiveInteger | An integer containing only non-positive values (..,-2,-1,0)
positiveInteger | An integer containing only positive values (1,2,..)
short | A signed 16-bit integer
unsignedLong | An unsigned 64-bit integer
unsignedInt | An unsigned 32-bit integer
unsignedShort | An unsigned 16-bit integer
unsignedByte | An unsigned 8-bit integer
XQuery : Primitive data types
The primitive data types in XQuery are the same as those of XML Schema.
· Numbers, including integers and floating-point numbers.
· The boolean values true and false.
· Strings of characters, for example: "Hello world!". These are immutable - i.e. you cannot modify a character in a string.
· Various types to represent dates, times, and durations.
· A few XML-related types. For example, a QName is a pair of a local name (like template) and a namespace URI, which is used to represent a tag name like xsl:template after it has been namespace-resolved.
Derived types are variations or restrictions of other types, for example range types. Primitive types and the types derived from them are known as atomic types, because an atomic value does not contain other values. Thus a string is considered atomic because XQuery does not have character values.
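The atomic types described above can be tried out directly with constructor functions. The following is a minimal sketch that relies only on the predeclared xs: prefix; the literal values are arbitrary examples.

```xquery
(
  xs:integer("42") + 1,                        (: integer arithmetic: 43 :)
  xs:byte(127),                                (: largest value of a signed 8-bit integer :)
  xs:date("2010-06-30") instance of xs:date,   (: true :)
  xs:byte(1) instance of xs:integer,           (: true: byte is derived from integer :)
  string-length("Hello world!")                (: 12 :)
)
```

Because byte is a restriction of integer, a byte value can be used wherever an integer is expected, while a value outside the range, such as xs:byte(128), is rejected with an error.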
XQuery Conditional Expressions
"If-Then-Else" expressions are allowed in XQuery.
Look at the following example:
for $x in doc("books.xml")/bookstore/book
return if ($x/@category="CHILDREN")
then <child>{data($x/title)}</child>
else <adult>{data($x/title)}</adult>
Notes on the "if-then-else" syntax: parentheses around the if expression are required. else is required, but it can be just else ().
The result of the example above will be:
<adult>Everyday Italian</adult>
<child>Harry Potter</child>
<adult>XQuery Kick Start</adult>
<adult>Learning XML</adult>
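The note that else may be just else () can be illustrated with a variant of the same query. This is a sketch against the article's "books.xml"; it returns the empty sequence for every book that is not in the CHILDREN category, so those books simply do not appear in the result.

```xquery
for $x in doc("books.xml")/bookstore/book
return if ($x/@category = "CHILDREN")
       then <child>{data($x/title)}</child>
       else ()   (: the empty sequence contributes nothing to the result :)
```

With the sample document, only the Harry Potter book produces a child element.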
XQuery Comparisons
In XQuery there are two ways of comparing values.
1. General comparisons: =, !=, <, <=, >, >=
2. Value comparisons: eq, ne, lt, le, gt, ge
The difference between the two comparison methods is shown below.
The following expression returns true if any q attributes have a value greater than 10:
$bookstore//book/@q > 10
The following expression returns true if there is only one q attribute returned by the expression, and its value is greater than 10. If more than one q is returned, an error occurs:
$bookstore//book/@q gt 10
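The difference can be tried without any input document. The sketch below uses the four prices from the article's "books.xml" as decimal literals; the variable name $prices is chosen only for this example.

```xquery
(: The four prices from books.xml, written as decimal literals :)
let $prices := (30.00, 29.99, 49.99, 39.95)
return (
  $prices > 40,       (: general comparison: true, because 49.99 is greater than 40 :)
  $prices[1] gt 40    (: value comparison on a single item: false, 30.00 is not greater than 40 :)
)
(: Writing $prices gt 40 instead would raise a type error, because gt requires
   each operand to be at most one item. :)
```

When the operands come from a document without a schema, a value comparison treats the untyped values as strings, so an explicit cast such as xs:decimal($x/price) is typically needed before comparing with a number using gt.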
XQuery Adding Elements and Attributes
Adding Elements and Attributes to the Result
We may include elements and attributes from the input document ("books.xml") in the result:
for $x in doc("books.xml")/bookstore/book/title
order by $x
return $x
The XQuery expression above will include both the title element and the lang attribute in the result, like this:
<title lang="en">Everyday Italian</title>
<title lang="en">Harry Potter</title>
<title lang="en">Learning XML</title>
<title lang="en">XQuery Kick Start</title>
The XQuery expression above returns the title elements the exact same way as they are described in the input document.
We now want to add our own elements and attributes to the result!
Add HTML Elements and Text
Now, we want to add some HTML elements to the result. We will put the result in an HTML list - together with some text:
<html>
<body>
<h1>Bookstore</h1>
<ul>
{
for $x in doc("books.xml")/bookstore/book
order by $x/title
return <li>{data($x/title)}. Category: {data($x/@category)}</li>
}
</ul>
</body>
</html>
The XQuery expression above will generate the following result:
<html>
<body>
<h1>Bookstore</h1>
<ul>
<li>Everyday Italian. Category: COOKING</li>
<li>Harry Potter. Category: CHILDREN</li>
<li>Learning XML. Category: WEB</li>
<li>XQuery Kick Start. Category: WEB</li>
</ul>
</body>
</html>
Add Attributes to HTML Elements
Next, we want to use the category attribute as a class attribute in the HTML list:
<html>
<body>
<h1>Bookstore</h1>
<ul>
{
for $x in doc("books.xml")/bookstore/book
order by $x/title
return <li class="{data($x/@category)}">{data($x/title)}</li>
}
</ul>
</body>
</html>
The XQuery expression above will generate the following result:
<html>
<body>
<h1>Bookstore</h1>
<ul>
<li class="COOKING">Everyday Italian</li>
<li class="CHILDREN">Harry Potter</li>
<li class="WEB">Learning XML</li>
<li class="WEB">XQuery Kick Start</li>
</ul>
</body>
</html>
XQuery Functions
XQuery 1.0, XPath 2.0, and XSLT 2.0 share the same functions library.
XQuery includes over 100 built-in functions. There are functions for string values, numeric values, date and time comparison, node and QName manipulation, sequence manipulation, Boolean values, and more. You can also define your own functions in XQuery.
The default prefix for the function namespace is fn:.
Functions are often called with the fn: prefix, such as fn:string(). However, since fn: is the default prefix of the namespace, the function names do not need to be prefixed when called.
Examples of Function Calls
A call to a function can appear where an expression may appear. Look at the examples below:
Example 1: In an element
<name>{upper-case($booktitle)}</name>
Example 2: In the predicate of a path expression
doc("books.xml")/bookstore/book[substring(title,1,5)='Harry']
Example 3: In a let clause
let $name := (substring($booktitle,1,4))
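The three kinds of call can be combined into one complete query. This is a sketch with a hard-coded title; $booktitle is bound to a literal here only so that the query can run without an input document.

```xquery
let $booktitle := "Harry Potter"
let $name := substring($booktitle, 1, 5)   (: first five characters: "Harry" :)
return <name>{upper-case($name)}</name>    (: upper-case is the standard function name :)
```

The query evaluates to <name>HARRY</name>. Writing fn:upper-case and fn:substring instead would give the same result, since fn: is the default function namespace.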
XQuery User-Defined Functions
If you cannot find the XQuery function you need, you can write your own.
User-defined functions can be defined in the query or in a separate library.
Syntax
declare function prefix:function_name($parameter as datatype)
as returnDatatype
{
...function code here...
};
Notes on user-defined functions:
· Use the declare function keywords
· The name of the function must be prefixed
· The data types of the parameters are mostly the same as the data types defined in XML Schema
· The body of the function must be surrounded by curly braces
Example of a User-defined Function Declared in the Query
declare function local:minPrice($p as xs:decimal?,$d as xs:decimal?)
as xs:decimal?
{
let $disc := ($p * $d) div 100
return ($p - $disc)
};
Below is an example of how to call the function above:
<minPrice>{local:minPrice($book/price,$book/discount)}</minPrice>
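Put together, a complete query consists of the declaration, terminated by a semicolon, followed by the query body. The sketch below calls the function with literal values, because the sample "books.xml" has no discount element; the price 49.99 and the discount of 10 percent are assumptions made for this example.

```xquery
declare function local:minPrice($p as xs:decimal?, $d as xs:decimal?)
as xs:decimal?
{
  (: $d is a percentage, so the discount is price * percentage / 100 :)
  let $disc := ($p * $d) div 100
  return ($p - $disc)
};

<minPrice>{local:minPrice(49.99, 10)}</minPrice>
```

The discount is 4.999, so the query evaluates to <minPrice>44.991</minPrice>. The keyword as must be written in lower case, since XQuery is case-sensitive.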
XQuery 1.0 and XPath 2.0 share the same data model and support the same functions and operators.
XQuery: An XML Example
We will use the following XML document in the examples below.
"books.xml":
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
Selecting Nodes From "books.xml"
Functions
XQuery uses functions to extract data from XML documents.
The doc() function is used to open the "books.xml" file:
doc("books.xml")
Path Expressions
XQuery uses path expressions to navigate through elements in an XML document.
The following path expression is used to select all the title elements in the "books.xml" file:
doc("books.xml")/bookstore/book/title
(/bookstore selects the bookstore element, /book selects all the book elements under the bookstore element, and /title selects all the title elements under each book element)
The XQuery above will extract the following:
<title lang="en">Everyday Italian</title>
<title lang="en">Harry Potter</title>
<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML</title>
Predicates
XQuery uses predicates to limit the extracted data from XML documents.
The following predicate is used to select all the book elements under the bookstore element that have a price element with a value that is less than 30:
doc("books.xml")/bookstore/book[price<30]
The XQuery above will extract the following:
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
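The same selection can also be written with the FLWOR constructs mentioned at the start of the article (for, let, where, order by, return). The following is a sketch against the "books.xml" document above; it returns only the titles of the matching books instead of the whole book elements.

```xquery
for $b in doc("books.xml")/bookstore/book
where $b/price < 30      (: same condition as the predicate [price<30] :)
order by $b/title
return $b/title
```

With the sample document this yields <title lang="en">Harry Potter</title>, because 29.99 is the only price below 30.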
Conclusion
This article gives an overview of some of the major features of XQuery and covers only the basic concepts. For core technical aspects and consultancy, consult the following references or e-mail skphind@yahoo.co.uk
References & Web Links
· XQuery from the Experts, Howard Katz et al., Addison-Wesley.
· XQuery, Priscilla Walmsley, O'Reilly.
· XPath 2.0 Programmer's Reference, Michael Kay, Wrox.
· www.xqueryfunctions.com
About the Authors
1. Sunil Kr.Pandey
Chief Consultant, VNS-Unit-2 SKPSoft Consultancy services
Varanasi (UP)
India.
E-mail:skphind@rediffmail.com
2. R.B.Mishra
Professor
Department of Computer Engineering
Institute of Technology(IT),
Banaras Hindu University(BHU),
Varanasi(UP)
India.





