How to Validate XML Files in Python: A Comprehensive Guide

XML validation ensures that an XML document conforms to a predefined structure using XSD schemas. This tutorial demonstrates how to validate XML files in Python with lxml.

XML and XSD Data:


<!-- Define XML Data -->
<students>
    <student>
        <id>1</id>
        <name>John Deo</name>
        <mark>75</mark>
    </student>
</students>

<!-- Define XSD Schema -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="students">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="student" maxOccurs="unbounded">
                    <xsd:complexType>
                        <xsd:sequence>
                            <xsd:element name="id" type="xsd:integer"/>
                            <xsd:element name="name" type="xsd:string"/>
                            <xsd:element name="mark" type="xsd:integer"/>
                        </xsd:sequence>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

Python Code for Validation:


from lxml import etree

# Define XML data
xml_data = """
<students>
    <student>
        <id>1</id>
        <name>John Deo</name>
        <mark>75</mark>
    </student>
</students>
"""

# Define XSD schema
xsd_data = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="students">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="student" maxOccurs="unbounded">
                    <xsd:complexType>
                        <xsd:sequence>
                            <xsd:element name="id" type="xsd:integer"/>
                            <xsd:element name="name" type="xsd:string"/>
                            <xsd:element name="mark" type="xsd:integer"/>
                        </xsd:sequence>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
"""

# Parse XML and XSD
xml_doc = etree.fromstring(xml_data)
xsd_doc = etree.XMLSchema(etree.fromstring(xsd_data))

# Validate XML
if xsd_doc.validate(xml_doc):
    print("XML is valid.")
else:
    print("XML is invalid.")
    print(xsd_doc.error_log)

Output:


XML is valid.

Explanation:

  • etree.fromstring(): Parses XML and XSD strings into tree structures.
  • etree.XMLSchema(): Creates an XSD schema object for validation.
  • validate(): Returns True if the XML conforms to the schema; otherwise, it provides an error log.

Use Cases:

  • Ensuring compatibility of XML files with APIs.
  • Validating XML configuration files before deployment.
  • Verifying data received from external sources.

Conclusion:

Validating XML ensures its structure and data integrity. Python's lxml library offers a robust solution for this, enabling error-free XML processing.


Subscribe to our YouTube Channel here


Subscribe

* indicates required
Subscribe to plus2net

    plus2net.com







    Python Video Tutorials
    Python SQLite Video Tutorials
    Python MySQL Video Tutorials
    Python Tkinter Video Tutorials
    We use cookies to improve your browsing experience. . Learn more
    HTML MySQL PHP JavaScript ASP Photoshop Articles FORUM . Contact us
    ©2000-2024 plus2net.com All rights reserved worldwide Privacy Policy Disclaimer