Read and write data using Pandas Dataframe from different sources

read_csv()Data input to Pandas DataFrameto_csv()
read_clipboard() to_clipboard()
read_excel() to_excel()
read_html() to_html()
read_json()to_json()
read_pickle()to_pickle()
read_sql()to_sql()
read_table()to_string()
read_xml()to_xml()
to_dict()
To read and store data there are different methods available in Pandas. By using these tools we can read from sources like Excel , CSV file , Tabular data from HTML pages or from Database. Similarly after processing we can save data in Excel, CSV file, Generate HTML table or save directly in a Database table.

We can even read or write to our Clipboard. So Next time if you like a table in any webpage then just select and copy the table and run few lines of standard Python code to save the copied table to a Excel or CSV file.

Input to Pandas ( Reading to DataFrame )
read_csvReading data from csv file
read_clipboardReading data from Clipboard and storing in csv file
read_sqlUsing MySql records to create Pandas DataFrame
read_excelReading data from Excel file
read_jsonReading data from Json file
read_tableReading Tab separated data from .tsv file
read_sql_tableReading from Database table
read_htmlReading Tabular data from Web page
read_pickleReading DataFrame object from pickled file
Output from Pandas
to_clipboardData From DataFrame copied to clipboard
to_csvSaving data to CSV file
to_dictSaving data to Dictionary
to_excelSaving data to Excel file
to_jsonSaving / output data in Json format
to_htmlSaving DataFrame data as HTML table in file
to_pickleSaving DataFrame data by pickling in a file
to_sqlUsing DataFrame managing MySql database
to_stringSaving DataFrame data to String or in a file
insertAdd data to MySQL table from DataFrame

Creating XML file by Exporting data from Excel file

1. Importing Required Libraries

import pandas as pd  # For reading the Excel file
import xml.etree.ElementTree as ET  # For creating and manipulating XML files
  • Pandas: A powerful library for handling structured data like Excel and CSV files.
  • xml.etree.ElementTree: A Python library for creating and editing XML files.

2. Loading the Excel File

# Step 1: Load the Excel file
excel_file = "E:\\testing\\python_dtl.xlsx"  
# Replace with the path to your Excel file
output_file = "E:\\python_dtl.xml"  
# Replace with the desired output file name

data = pd.read_excel(excel_file)  # Load the Excel file into a Pandas DataFrame
  • Excel File Path: Specify the path to the Excel file you want to convert to XML.
  • read_excel(): Reads the Excel file into a Pandas DataFrame for easy manipulation.

3. Creating the Root Element

# Step 2: Create the root element for the XML file
root = ET.Element("root")  # Root element for the XML file
  • ET.Element: Creates the root element of the XML file.
  • Root Element: Acts as the parent node for all data rows in the XML file.

4. Iterating Through Rows and Creating XML Elements

# Step 3: Iterate through each row in the DataFrame and create XML elements
for _, row in data.iterrows():
    # Create a parent element for each row
    row_element = ET.SubElement(root, "row")
    
    # Create child elements for each column
    for col_name in data.columns:
        col_element = ET.SubElement(row_element, col_name)
        col_element.text = str(row[col_name])  # Add column value as text
  • iterrows(): Iterates through each row of the DataFrame.
  • ET.SubElement: Creates child elements for the XML structure based on column names.
  • Child Elements: Each column in the DataFrame becomes a child element under its corresponding row in the XML.

5. Writing the XML File

# Step 4: Create an XML tree and save it to a file
tree = ET.ElementTree(root)
tree.write(output_file, encoding="utf-8", xml_declaration=True)
print(f"XML file '{output_file}' created successfully!")
  • ET.ElementTree.write(): Saves the XML tree to a file with the specified encoding and XML declaration.
  • UTF-8 Encoding: Ensures compatibility with most systems and languages.
  • Output Confirmation: Prints a success message with the output file name.
Full code is here.
import pandas as pd
import xml.etree.ElementTree as ET

# Step 1: Load the Excel file
# Replace with the path to your Excel file
excel_file = "E:\\testing\\python_dtl.xlsx"  
output_file = "E:\\python_dtl.xml"  # Replace with the desired output file name
data = pd.read_excel(excel_file)  # Load the Excel file into a Pandas DataFrame

# Step 2: Create the root element for the XML file
root = ET.Element("root")  # Root element for the XML file

# Step 3: Iterate through each row in the DataFrame and create XML elements
for _, row in data.iterrows():
    # Create a parent element for each row
    row_element = ET.SubElement(root, "row")
    
    # Create child elements for each column
    for col_name in data.columns:
        col_element = ET.SubElement(row_element, col_name)
        col_element.text = str(row[col_name])  # Add column value as text

# Step 4: Create an XML tree and save it to a file
tree = ET.ElementTree(root)
tree.write(output_file, encoding="utf-8", xml_declaration=True)
print(f"XML file '{output_file}' created successfully!")

Explanation of the For Loop

The following for loop is a critical part of the code that processes the DataFrame and generates the XML structure:


for _, row in data.iterrows():
    # Create a parent element for each row
    row_element = ET.SubElement(root, "row")
    
    # Create child elements for each column
    for col_name in data.columns:
        col_element = ET.SubElement(row_element, col_name)
        col_element.text = str(row[col_name])  # Add column value as text

Explanation:

  • data.iterrows(): Generates an iterator over the rows of the DataFrame. Each row is returned as a tuple containing:
    • The row's index (stored in `_`, not used here).
    • A Pandas Series object (row) representing the data of the row.

    Example: If we print row for a DataFrame with the following data:

    
    | Name  | Age | City         |
    |-------|-----|--------------|
    | John  | 30  | New York     |
    | Alice | 25  | Los Angeles  |
            

    The output of print(row) during the first iteration will be:

    
    Name    John
    Age      30
    City    New York
    Name: 0, dtype: object
            
  • ET.SubElement(root, "row"): Creates a new XML element named row under the root element for each row in the DataFrame.
  • for col_name in data.columns: Iterates through the column names of the DataFrame to process each cell of the current row.
  • ET.SubElement(row_element, col_name): Creates a child XML element for the column under the row element. The column name is used as the tag name.
  • col_element.text = str(row[col_name]): Sets the text content of the XML element to the value of the current cell. The value is converted to a string to ensure compatibility.

Example:

If the DataFrame contains the following data:


| Name  | Age | City         |
|-------|-----|--------------|
| John  | 30  | New York     |
| Alice | 25  | Los Angeles  |

The loop will generate the following XML structure:


<row>
    <Name>John</Name>
    <Age>30</Age>
    <City>New York</City>
</row>
<row>
    <Name>Alice</Name>
    <Age>25</Age>
    <City>Los Angeles</City>
</row>

This loop dynamically converts each row of the DataFrame into XML, making it highly flexible for varying data structures.



Sample student DataFrame DataFrame to SQlite table at Colab platform using to_sql()
SQLite table to DataFrame at Colab platform using read_sql()
Pandas read_clipboard() read_html() read_csv() read_excel() to_excel()
Subscribe to our YouTube Channel here


Subscribe

* indicates required
Subscribe to plus2net

    plus2net.com







    Python Video Tutorials
    Python SQLite Video Tutorials
    Python MySQL Video Tutorials
    Python Tkinter Video Tutorials
    We use cookies to improve your browsing experience. . Learn more
    HTML MySQL PHP JavaScript ASP Photoshop Articles FORUM . Contact us
    ©2000-2024 plus2net.com All rights reserved worldwide Privacy Policy Disclaimer