Definition:
The main purpose of XPath is to address parts of an XML document. It also provides basic services for the manipulation of strings, numbers and Booleans. XPath uses compact, non-XML syntax to facilitate the use of XPath within URIs and XML attribute values. XPath operates on the abstract and logical structure of an XML document, rather than its surface syntax.
In addition to its use for addressing, XPath is also designed so that it has a natural subset that can be used to match (test whether or not a node matches a pattern); that is why it is used in Alchemy Catalyst to check for conditions within a document.
XPath grew out of efforts to share a common syntax between XSL (XSLT) and XPointer transformations. It allows the search and retrieval of information within an XML document structure. XPath is not an XML syntax: rather, it uses a syntax that refers to the logical structure of an XML document.
Table of Contents
Why Use XPath
XML is not a magic wand; it does not specify how the data is transmitted over the cable or how the data is stored. XML simply determines the format of the data: what is done with the data depends on the programmer. The real power behind XML isn’t just its ability to represent data: the real power of XML lies in helper technologies that, when combined with XML, provide robust solutions, and XPath is one such ancillary technology.
XPath is used in conjunction with XPointer and XSLT for document search. XPath declarations can also be used individually by using the representation of an XML document in a Document Object Model (DOM). Chapter 6 of XML and ASP.NET,“Exploring the System.Xml Namespace,”delves into XPath in the .NET Framework and discusses how XPath declarations can be issued using the System.Xml.XPath namespace. Chapter 5 of XML and ASP.NET,“MSXML Parser,”also shows how to use the MSXML parser to query using XPath declarations.
Xpath: An Analogy to SQL
Considering the possibility of a relational database: is the real power of a database simply the ability to store the data, index it, and specify the relationships between the data tables? After all, a database is supposed to contain the data, so is the ability to handle the data the real advantage behind a relational database? If so, a simple file would suffice for this. It’s easy to see that the real power of a database is the ability to use Structured Query Language (SQL) statements to retrieve subsets of data. To take this a step further example, the fact that SQL is an ANSI standard makes your knowledge of SQL applicable to different databases running on different platforms.
Using this same logic, XML would simply be a data storage format without a prescribed form of data retrieval. This is exactly what XPath is: XPath is the XML document query language. XPath is the common name used for the XML Path Language. Using XPath statements can retrieve subsets of complex data from XML documents using syntax that is universal across implementations. The same XPath declarations that work within the System.Xml and System.Xml.XPath namespaces must work exactly the same as the XPath declarations in the MSXML parser, and both must work exactly the same as other parsing programs that implement the W3C XPath recommendation.