How to Read an XML File in Python: A Complete Guide

Reading and parsing XML files is a common task in Python, especially when dealing with configuration files, API responses, or data files. This guide explains how to use Python's ElementTree library to read XML files step by step.

Code Breakdown:


import xml.etree.ElementTree as ET

Step 1: Import the ElementTree module for reading and parsing XML files in Python.


# Load the XML file
tree = ET.parse("E:\\testing\\sqlite\\students.xml")
root = tree.getroot()

Step 2: Use ET.parse() to load the XML file into memory and getroot() to access the root element of the XML tree.


# Iterate through the XML elements
for student in root.findall("student"):
    id = student.find("id").text
    name = student.find("name").text
    mark = student.find("mark").text
    print(f"ID: {id}, Name: {name}, Mark: {mark}")

Step 3: Use findall() to locate all <student> elements in the XML. Access individual child elements like <id>, <name>, and <mark> using find() and extract their text values.

Sample XML File:


<students>
    <student>
        <id>1</id>
        <name>John Deo</name>
        <mark>75</mark>
    </student>
    <student>
        <id>2</id>
        <name>Max Ruin</name>
        <mark>85</mark>
    </student>
</students>

Explanation of Output:

  • The root element <students> contains multiple <student> child elements.
  • Each <student> element includes information such as ID, name, and marks as its children.
  • The code prints each student’s data in a formatted string.

Use Cases:

  • Extracting configuration settings from XML files for applications.
  • Processing XML-based API responses, such as SOAP messages.
  • Parsing hierarchical data for analysis or transformation.

Full Code Example:


import xml.etree.ElementTree as ET

# Load the XML file
tree = ET.parse("E:\\testing\\sqlite\\students.xml")
root = tree.getroot()

# Iterate through the XML elements
for student in root.findall("student"):
    id = student.find("id").text
    name = student.find("name").text
    mark = student.find("mark").text
    print(f"ID: {id}, Name: {name}, Mark: {mark}")

Using Pandas Library

import pandas as pd

# Path to the XML file
xml_file = "F:\\testing\\student.xml"

# Read the XML file into a DataFrame
df = pd.read_xml(xml_file, parser="etree")

# Display the DataFrame
print(df)

# Save to a CSV or JSON file (optional)
df.to_csv("data.csv", index=False)
df.to_json("data.json", orient="records", indent=4)

Code Explanation:

  • import pandas as pd: Imports the Pandas library for data manipulation and analysis.
  • xml_file = "F:\\testing\\student.xml": Specifies the path to the XML file you want to read.
  • df = pd.read_xml(xml_file, parser="etree"): Reads the XML file and converts its data into a Pandas DataFrame. The parser="etree" option ensures the built-in XML parser is used.
  • print(df): Displays the contents of the DataFrame to the console.
  • df.to_csv("data.csv", index=False): Saves the DataFrame to a CSV file named data.csv. The index=False parameter excludes the index column from the CSV file.
  • df.to_json("data.json", orient="records", indent=4): Exports the DataFrame to a JSON file named data.json. The orient="records" option converts rows into a list of dictionaries, and indent=4 pretty-prints the JSON with an indentation of 4 spaces.

Conclusion:

Reading XML files in Python is straightforward with the ElementTree library. This guide provides a foundational example, but the same principles can be extended to more complex XML processing tasks. Explore additional functionalities like XPath and namespaces for advanced use cases.


Download sample XML file: student.xml

Subscribe to our YouTube Channel here


Subscribe

* indicates required
Subscribe to plus2net

    plus2net.com







    Python Video Tutorials
    Python SQLite Video Tutorials
    Python MySQL Video Tutorials
    Python Tkinter Video Tutorials
    We use cookies to improve your browsing experience. . Learn more
    HTML MySQL PHP JavaScript ASP Photoshop Articles FORUM . Contact us
    ©2000-2024 plus2net.com All rights reserved worldwide Privacy Policy Disclaimer