Reading and parsing XML files is a common task in Python, especially when dealing with configuration files, API responses, or data files. This guide explains how to use Python's ElementTree library to read XML files step by step.
import xml.etree.ElementTree as ET
Step 1: Import the ElementTree module for reading and parsing XML files in Python.
# Load the XML file
tree = ET.parse("E:\\testing\\sqlite\\students.xml")
root = tree.getroot()
Step 2: Use ET.parse() to load the XML file into memory and getroot() to access the root element of the XML tree.
# Iterate through the XML elements
for student in root.findall("student"):
id = student.find("id").text
name = student.find("name").text
mark = student.find("mark").text
print(f"ID: {id}, Name: {name}, Mark: {mark}")
Step 3: Use findall() to locate all <student> elements in the XML. Access individual child elements like <id>, <name>, and <mark> using find() and extract their text values.
<students>
<student>
<id>1</id>
<name>John Deo</name>
<mark>75</mark>
</student>
<student>
<id>2</id>
<name>Max Ruin</name>
<mark>85</mark>
</student>
</students>
import xml.etree.ElementTree as ET
# Load the XML file
tree = ET.parse("E:\\testing\\sqlite\\students.xml")
root = tree.getroot()
# Iterate through the XML elements
for student in root.findall("student"):
id = student.find("id").text
name = student.find("name").text
mark = student.find("mark").text
print(f"ID: {id}, Name: {name}, Mark: {mark}")
import pandas as pd
# Path to the XML file
xml_file = "F:\\testing\\student.xml"
# Read the XML file into a DataFrame
df = pd.read_xml(xml_file, parser="etree")
# Display the DataFrame
print(df)
# Save to a CSV or JSON file (optional)
df.to_csv("data.csv", index=False)
df.to_json("data.json", orient="records", indent=4)
Reading XML files in Python is straightforward with the ElementTree library. This guide provides a foundational example, but the same principles can be extended to more complex XML processing tasks. Explore additional functionalities like XPath and namespaces for advanced use cases.