Converting XML data to CSV is a common task for data analysis and processing. Python offers multiple ways to perform this conversion. This guide explains two approaches: using the ElementTree module and the Pandas library, with detailed examples and comparisons.
Here’s a sample XML file, data.xml:
<data>
  <record>
    <id>1</id>
    <name>John Doe</name>
    <age>30</age>
    <city>New York</city>
  </record>
  <record>
    <id>2</id>
    <name>Jane Smith</name>
    <age>25</age>
    <city>Los Angeles</city>
  </record>
  <record>
    <id>3</id>
    <name>Emily Davis</name>
    <age>35</age>
    <city>Chicago</city>
  </record>
</data>
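The Pandas library provides the read_xml() function (available since pandas 1.3), which parses XML data into a DataFrame that can then be exported as a CSV file. By default it uses the lxml parser, so lxml must be installed unless you pass parser='etree'.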
import pandas as pd
# Input and output file paths
input_file = 'data.xml'
output_file = 'data.csv'
# Read the XML file into a Pandas DataFrame
df = pd.read_xml(input_file)
# Export the DataFrame to a CSV file
df.to_csv(output_file, index=False)
# Output confirmation
print(f"Data exported to {output_file} successfully.")
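When lxml is not installed, or the rows live under a specific element, `read_xml()` accepts `parser` and `xpath` arguments. The following is a minimal sketch, assuming pandas 1.3 or later; it reads a shortened copy of the sample from an in-memory string (the `xml_text` variable is illustrative) instead of from data.xml, and returns the CSV as a string instead of writing a file.

```python
from io import StringIO

import pandas as pd

# Illustrative in-memory copy of part of the sample data.xml
xml_text = """<data>
  <record><id>1</id><name>John Doe</name><age>30</age><city>New York</city></record>
  <record><id>2</id><name>Jane Smith</name><age>25</age><city>Los Angeles</city></record>
</data>"""

# xpath selects the elements that become rows; parser="etree" uses the
# standard-library parser instead of the default lxml
df = pd.read_xml(StringIO(xml_text), xpath="./record", parser="etree")

# With no path argument, to_csv() returns the CSV text as a string
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file path instead of `StringIO(xml_text)` works the same way, so the two extra arguments can be added directly to the script above.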
The ElementTree module, part of Python’s standard library, allows you to parse XML files and write the data to a CSV file.
import xml.etree.ElementTree as ET
import csv

# Input and output file paths
input_file = 'data.xml'
output_file = 'data.csv'

# Parse the XML file
tree = ET.parse(input_file)
root = tree.getroot()

# Extract data from XML
rows = []
headers = []

# Iterate through records
for record in root.findall('record'):
    row = {}
    for element in record:
        if element.tag not in headers:
            headers.append(element.tag)  # Collect headers dynamically
        row[element.tag] = element.text
    rows.append(row)

# Write data to CSV
with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=headers)
    writer.writeheader()  # Write headers
    writer.writerows(rows)  # Write rows

print(f"Data exported to {output_file} successfully.")
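Because the headers are collected dynamically, records do not need to share the same child elements: `csv.DictWriter` writes an empty string (its default `restval`) for any key a row is missing. The sketch below wraps the same logic in a function that works on strings, so it can be tried without any files. The function name `xml_to_csv_text` and the sample input are illustrative and not part of the original script.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv_text(xml_text, record_tag="record"):
    """Convert XML text to CSV text, one row per <record_tag> child of the root."""
    root = ET.fromstring(xml_text)
    rows, headers = [], []
    for record in root.findall(record_tag):
        row = {}
        for element in record:
            if element.tag not in headers:
                headers.append(element.tag)  # keep first-seen column order
            row[element.tag] = element.text
        rows.append(row)

    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=headers, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty strings
    return buffer.getvalue()


# Hypothetical input: the first record has no <age>, the second no <city>
sample = """<data>
  <record><id>1</id><name>John Doe</name><city>New York</city></record>
  <record><id>2</id><name>Jane Smith</name><age>25</age></record>
</data>"""

print(xml_to_csv_text(sample))
```

Columns appear in the order their tags are first seen, so a field that only shows up in later records ends up as a trailing column.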
`root.findall('record')` returns all <record> elements that are direct children of the root in the XML file.
Feature | Pandas `read_xml()` | ElementTree |
---|---|---|
Ease of Use | Very simple and concise | Requires manual element iteration |
Dependencies | Requires Pandas (and lxml with the default parser) | Built into Python |
Customization | Limited | Full control over XML parsing |
Performance | Faster for simple XML | Handles complex XML structures better |
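To make the "Customization" row concrete: with ElementTree you decide how attributes and nested elements map to CSV columns. The sketch below uses one possible convention, not a standard one: attributes become columns prefixed with `@`, and nested element names are joined with a dot. The `flatten` helper and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def flatten(element, prefix=""):
    """Return a flat dict of column name -> text for one record element."""
    row = {}
    # Attributes of this element, e.g. <record id="1"> becomes "@id"
    for name, value in element.attrib.items():
        row[f"{prefix}@{name}"] = value
    for child in element:
        key = f"{prefix}{child.tag}"
        if len(child):  # child has its own children: recurse with a dotted prefix
            row.update(flatten(child, prefix=f"{key}."))
        else:
            # Leaf element: keep its text (attributes on leaves are ignored here)
            row[key] = (child.text or "").strip()
    return row


sample = """<data>
  <record id="1">
    <name>John Doe</name>
    <address><city>New York</city><zip>10001</zip></address>
  </record>
</data>"""

root = ET.fromstring(sample)
rows = [flatten(record) for record in root.findall("record")]
print(rows)
```

The resulting dictionaries can be handed to `csv.DictWriter` in the same way as the `rows` list in the ElementTree script.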
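Use Pandas `read_xml()` if the XML is flat and record-like, Pandas is already a dependency, and you want the shortest code.
Use ElementTree if you want to avoid third-party dependencies or need full control over how elements are mapped to columns.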
Both methods are effective for converting XML to CSV in Python. Pandas is best for quick and simple tasks, while ElementTree offers flexibility for more complex requirements. Choose the approach that fits your specific use case.