What is XPath?
XPath is a query language for selecting nodes and computing values from XML documents using path expressions to navigate through the hierarchical structure of XML data.
Quick Facts
| Full Name | XML Path Language |
|---|---|
| Created | 1999 by W3C (XPath 1.0) |
| Specification | Official Specification |
How It Works
XPath provides a powerful syntax for addressing and extracting data from XML documents. It uses path expressions similar to file system paths to navigate through XML's tree structure, selecting elements, attributes, and text content. XPath supports various node selection methods including absolute paths from the root, relative paths from the current context, and predicates for filtering. The language includes built-in functions for string manipulation, numeric calculations, and boolean operations, making it essential for XML processing, XSLT transformations, and web scraping applications.
Key Characteristics
- Uses path expressions to navigate XML tree structure
- Supports absolute (/) and relative (//) path selection
- Predicates in brackets [] filter node selections
- Axis specifiers define navigation direction (child, parent, sibling)
- Built-in functions for strings, numbers, and node sets
- Wildcards (*) match any element at current level
Common Use Cases
- Extracting data from XML configuration files
- Web scraping and HTML parsing
- XSLT stylesheet transformations
- XML document validation and testing
- Automated UI testing with Selenium
Example
Loading code...Frequently Asked Questions
What is the difference between / and // in XPath?
Single slash (/) selects from the root or immediate children only, while double slash (//) selects matching elements anywhere in the document tree. For example, /bookstore/book selects direct children, while //book selects all book elements at any depth.
How do XPath predicates work?
Predicates are conditions in square brackets that filter node selections. They can use position (book[1] for first book), attributes (book[@category='fiction']), or comparisons (book[price>30]). Multiple predicates can be chained for complex filtering.
Can XPath be used with HTML documents?
Yes, XPath is commonly used for HTML parsing and web scraping. However, HTML must often be converted to XHTML or parsed with an HTML-tolerant parser since HTML isn't always well-formed XML. Tools like lxml in Python handle this automatically.
What are XPath axes and how are they used?
Axes define the direction of navigation from the current node: child, parent, ancestor, descendant, following-sibling, preceding-sibling, etc. For example, ancestor::div selects all div ancestors, and following-sibling::p selects all following p siblings.
How is XPath different from CSS selectors?
XPath is more powerful, supporting parent/ancestor navigation, complex predicates, and built-in functions. CSS selectors only traverse downward and are simpler. XPath uses path-like syntax (//div/p), while CSS uses selector syntax (div > p).