Reading data from Excel file and creating Pandas DataFrame using read_excel() in Python with options
Options :
Reading the worksheet
We can read any worksheet of the Excel file by using the option sheet_name. We have used one Excel file my_file.xlsx with two worksheets my_Sheet_1 and my_Sheet_2. You can create your own sample my_file.xlsx by using the code at the end of this tutorial.
Now we can read the data of my_Sheet_2 by using above code, sheet_name=0 is for reading the data of first sheet i.e my_Sheet_1. We can also use the sheet names.
By default it reads the first sheet or sheet_name=0
index_col
By default value is None. This will add one index column to the DataFrame. We can specify one column to use as Index.
import pandas as pd
df = pd.read_excel('D:\\my_file.xlsx',index_col='ID')
print(df)
header
By default the column header is 0. We can give integer values. Here is the code to start from 1st row.
df = pd.read_excel('D:\\my_file.xlsx',header=1)
header=None
If our file is not having header then we must keep header=None, by using header=None we can see the index will start from 0th row. Here is the output with header=None.
0 1 2 3 4
0 NaN NAME ID MATH ENGLISH
1 0.0 Ravi 1 30 20
2 1.0 Raju 2 40 30
3 2.0 Alex 3 50 40
As the number are used as headers , we can create our own names by using a list.
names
We will create a list and use as our headers. We will also exclude the file header by using header=None
import pandas as pd
my_list=['header1','header2','header3','header4','header5']
df = pd.read_excel('D:\\my_file.xlsx',header=None,names=my_list)
print(df)
Output is here
header1 header2 header3 header4 header5
0 NaN NAME ID MATH ENGLISH
1 0.0 Ravi 1 30 20
2 1.0 Raju 2 40 30
3 2.0 Alex 3 50 40
Unique values of a column
Download the sample student excel file. There are around 35 records. One of the column is class. Display the distinct or unique class names from this column. We will use unique() for this.
import pandas as pd
df = pd.read_excel('D:\student.xlsx')
my_classes=df['class'].unique()
print(my_classes)
import pandas as pd
my_dict={
'NAME':['Ravi','Raju','Alex'],
'ID':[1,2,3],'MATH':[30,40,50],
'ENGLISH':[20,30,40]
}
my_dict2={
'NAME':['Ron','Pick','Rabin'],
'ID':[4,5,6],'MATH':[60,40,30],
'ENGLISH':[30,40,50]
}
df = pd.DataFrame(data=my_dict)
df2 = pd.DataFrame(data=my_dict2)
with pd.ExcelWriter('D:my_file.xlsx') as my_excel_obj: #Object created
df.to_excel(my_excel_obj,sheet_name='my_Sheet_1')
df2.to_excel(my_excel_obj,sheet_name='my_Sheet_2')
Selecting Excel file to create DataFrame using Tkinter
Tkinter filedialog to browse and select excel file to create Pandas DataFrame using read_excel()
Tkinter is a python library used to create GUI applications. By using Tkinter filedialog we can browse and select excel file in local system. Using the selected excel file we can create a DataFrame and show the first 5 rows inside a text widget.
import tkinter as tk
from tkinter import *
from tkinter import filedialog
import pandas as pd
my_w = tk.Tk()
my_w.geometry("410x300") # Size of the window
my_w.title('www.plus2net.com')
my_font1=('times', 18, 'bold')
l1 = tk.Label(my_w,text='Create DataFrame',width=30,font=my_font1)
l1.grid(row=0,column=1)
b1 = tk.Button(my_w, text='Upload Excel File',
width=20,command = lambda:upload_file())
b1.grid(row=1,column=1)
t1 = tk.Text(my_w, height=7, width=45,bg='yellow') # added one text box
t1.grid(row=2,column=1,pady=10) #
def upload_file():
file = filedialog.askopenfilename(
filetypes=[("Excel file", ".xlsx")])
df=pd.read_excel(file,index_col='id') # creating DataFrame
t1.delete('1.0',END) # Delete previous data from position 0 till end
t1.insert(tk.END, df.head()) # adding data to text widget
my_w.mainloop() # Keep the window open
Reading Excel file and creating table in MySQL or SQLite database
Tkinter file browser to select Excel file for Inserting data to MySQL or SQLite database table
Use student.xlsx file to create a DataFrame.
Using Tkinter GUI, take input from user as Student id. Using the id to get the all details as per the excel record.
Display the record details in Tkinter window.
Use DataFrame loc to get row details.
import pandas as pd
df = pd.read_excel('E:\\data\\student.xlsx',index_col='id') # Excel path
#print(df.loc[id].tolist()) # as list
import tkinter as tk
my_w = tk.Tk()
my_w.geometry("300x200") # with and height of the window
my_w.title("plus2net.com") # Adding a title
l1 = tk.Label(my_w, text='ID', width=5,font=18) # added one Label
l1.grid(row=1,column=1,padx=2,pady=10)
e1 = tk.Entry(my_w, width=5,bg='yellow',font=18) # added one Entry box
e1.grid(row=1,column=2,padx=2)
b1=tk.Button(my_w,text='GO', width=5,font=18,command=lambda:my_upd())
b1.grid(row=1,column=3,padx=5)
l2 = tk.Label(my_w, text='Details', font=10) # added one Label
l2.grid(row=2,column=1,padx=2,pady=10,columnspan=3)
def my_upd():
id=int(e1.get()) # read the user entered data or id
l2.config(text=df.loc[id]) # update the Label
my_w.mainloop()