How to Use XPath for Querying XML in Python

XPath is a powerful query language for extracting specific data from XML documents. Python's lxml library offers comprehensive support for XPath, making it ideal for working with deeply nested or complex XML files. This tutorial explains how to use XPath in Python with practical examples.

Code Breakdown:


from lxml import etree

Step 1: Import the etree module from the lxml library, which provides support for XPath queries.


# Load the XML data
xml_data = """
<students>
    <student>
        <id>1</id>
        <name>John Deo</name>
        <mark>75</mark>
    </student>
    <student>
        <id>2</id>
        <name>Jane Smith</name>
        <mark>85</mark>
    </student>
</students>
"""
root = etree.fromstring(xml_data)

Step 2: Define your XML data as a string and parse it using etree.fromstring() to create an XML tree.

Querying with XPath


# Extract all student names
names = root.xpath("//student/name/text()")
print(names)

Explanation: The XPath expression //student/name/text() retrieves the text content of all <name> elements within <student> nodes. The output will be a list of names.

Filtering with Attributes


# Extract student with specific id
student = root.xpath("//student[id='2']/name/text()")
print(student)

Explanation: This query selects the <name> of the student whose <id> is "2". The result will be a list containing "Jane Smith".

Advanced Query Example


# Get marks greater than 80
high_scorers = root.xpath("//student[mark > 80]/name/text()")
print(high_scorers)

Explanation: This query filters students with marks greater than 80 and retrieves their names. The result will include students who meet the condition.

Sample Output XML:


<students>
    <student>
        <id>1</id>
        <name>John Deo</name>
        <mark>75</mark>
    </student>
    <student>
        <id>2</id>
        <name>Jane Smith</name>
        <mark>85</mark>
    </student>
</students>

Use Cases:

  • Extracting specific data points from large XML documents.
  • Filtering XML data based on conditions.
  • Navigating complex or deeply nested XML structures.

Conclusion:

XPath queries enable precise data extraction from XML documents, making it a vital tool for working with structured data. Python's lxml library simplifies these queries, allowing developers to handle XML efficiently.


Subhendu Mohapatra — author at plus2net
Subhendu Mohapatra

Author

🎥 Join me live on YouTube

Passionate about coding and teaching, I publish practical tutorials on PHP, Python, JavaScript, SQL, and web development. My goal is to make learning simple, engaging, and project‑oriented with real examples and source code.



Subscribe to our YouTube Channel here



plus2net.com







Python Video Tutorials
Python SQLite Video Tutorials
Python MySQL Video Tutorials
Python Tkinter Video Tutorials
We use cookies to improve your browsing experience. . Learn more
HTML MySQL PHP JavaScript ASP Photoshop Articles Contact us
©2000-2025   plus2net.com   All rights reserved worldwide Privacy Policy Disclaimer