We use a DataFrame of 35 rows, created by reading an Excel file.
Download student.xlsx file ⇓
Using the Excel file above, you can try these questions on the basics of data handling.
- Read the Excel file to create the DataFrame.
- How many rows and columns are there?
- Read the first 4 rows
- How many rows are there?
- Display the columns of the DataFrame
- Display the first 5 records of the NAME column of the DataFrame
- Display the 5 highest records based on the MARK column.
- Display all classes in the CLASS column (unique classes)
More questions at the end of this page.
Read the Excel file to create the DataFrame.
Read more on how to read an Excel file here
import pandas as pd
my_data = pd.read_excel(r'D:\student.xlsx', index_col='id')  # raw string keeps the backslash literal
print(my_data)
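Windows paths contain backslashes, which Python treats as the start of an escape sequence inside ordinary string literals. The sketch below shows a few ways to write the path safely. It assumes the file sits at D:\student.xlsx as in the example above, and that the openpyxl package, which Pandas uses to read .xlsx files, is installed. The read is guarded so the snippet only reports a message when the file is missing.

```python
from pathlib import Path

import pandas as pd

# Three ways to write the same Windows path
raw_path = r"D:\student.xlsx"       # raw string: backslash kept literally
escaped_path = "D:\\student.xlsx"   # backslash escaped explicitly
forward_path = "D:/student.xlsx"    # forward slashes also work on Windows

excel_path = Path(raw_path)  # assumed location; change it to match your machine

if excel_path.exists():
    # index_col='id' makes the id column the row index
    my_data = pd.read_excel(excel_path, index_col="id")
    print(my_data.head())
else:
    print(f"File not found: {excel_path}")
```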
How many rows and columns are there?
print(my_data.shape) # (35, 4): id is the index, so it is not counted as a column
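Whether id counts as a column depends on whether it was used as the index. A minimal sketch of the difference, using four sample rows copied from the output on this page instead of the full 35-row file; set_index('id') is used here as a stand-in for passing index_col='id' to read_excel().

```python
import pandas as pd

# Sample rows taken from the output shown on this page (not the full file)
rows = [
    (1, "John Deo", "Four", 75, "female"),
    (2, "Max Ruin", "Three", 85, "male"),
    (3, "Arnold", "Three", 55, "male"),
    (4, "Krish Star", "Four", 60, "female"),
]
flat = pd.DataFrame(rows, columns=["id", "name", "class", "mark", "sex"])

# Moving id into the index removes it from the column count
indexed = flat.set_index("id")

print(flat.shape)     # (4, 5)
print(indexed.shape)  # (4, 4)

# shape is a (rows, columns) tuple, so it can be unpacked
n_rows, n_cols = indexed.shape
print(n_rows, n_cols)  # 4 4
```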
Read the first 4 rows
Read more on DataFrame.head()
print(my_data.head(4))
Output
          name  class  mark     sex
id
1     John Deo   Four    75  female
2     Max Ruin  Three    85    male
3       Arnold  Three    55    male
4   Krish Star   Four    60  female
How many rows are there?
print(len(my_data)) # 35
Display the columns of the DataFrame
print(my_data.columns)
Output
Index(['name', 'class', 'mark', 'sex'], dtype='object')
Display the first 5 records of the NAME column of the DataFrame
print(my_data['name'][:5]) # Five rows of name column
Output
id
1      John Deo
2      Max Ruin
3        Arnold
4    Krish Star
5     John Mike
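The first rows of one column can be selected in more than one way. The sketch below compares three of them on four sample rows copied from this page (not the full file), with id set as the index. Note that loc[] works on index labels and includes the end label, while iloc[] works on positions and excludes the end position.

```python
import pandas as pd

# Sample rows taken from the output shown on this page (not the full file)
rows = [
    (1, "John Deo", "Four", 75, "female"),
    (2, "Max Ruin", "Three", 85, "male"),
    (3, "Arnold", "Three", 55, "male"),
    (4, "Krish Star", "Four", 60, "female"),
]
my_data = pd.DataFrame(
    rows, columns=["id", "name", "class", "mark", "sex"]
).set_index("id")

# 1. head() on the selected column
by_head = my_data["name"].head(3)

# 2. Position based: first three rows, position of the name column
by_iloc = my_data.iloc[:3, my_data.columns.get_loc("name")]

# 3. Label based: ids up to and including 3
by_loc = my_data.loc[:3, "name"]

print(by_head)
```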
Display the 5 highest records based on the MARK column.
Read more about sort_values() to arrange rows in increasing or decreasing order
my_dt=my_data.sort_values(['mark'],ascending=False)
print(my_dt[:5])
Output
         name  class  mark     sex
id
33  Kenn Rein    Six    96  female
12      Recky    Six    94  female
32  Binn Rott  Seven    90  female
11     Ronald    Six    89  female
25   Giff Tow    Six    88    male
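As an alternative to sorting the whole DataFrame, DataFrame.nlargest() returns the top rows for a column directly. A small sketch comparing the two approaches on nine sample rows copied from this page (not the full file):

```python
import pandas as pd

# Sample rows taken from the outputs shown on this page (not the full file)
rows = [
    (1, "John Deo", "Four", 75, "female"),
    (2, "Max Ruin", "Three", 85, "male"),
    (3, "Arnold", "Three", 55, "male"),
    (4, "Krish Star", "Four", 60, "female"),
    (11, "Ronald", "Six", 89, "female"),
    (12, "Recky", "Six", 94, "female"),
    (25, "Giff Tow", "Six", 88, "male"),
    (32, "Binn Rott", "Seven", 90, "female"),
    (33, "Kenn Rein", "Six", 96, "female"),
]
my_data = pd.DataFrame(
    rows, columns=["id", "name", "class", "mark", "sex"]
).set_index("id")

# Sort in descending order of mark, then keep the first three rows
top_sorted = my_data.sort_values("mark", ascending=False).head(3)

# Ask directly for the three rows with the largest mark
top_nlargest = my_data.nlargest(3, "mark")

print(top_nlargest)
```

With no tied marks in the sample, both approaches return the same rows in the same order.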
Display all classes in the CLASS column (unique classes)
print(my_data['class'].unique())
Output
['Four' 'Three' 'Five' 'Six' 'Seven' 'Nine' 'Eight']
To get the number of unique classes
print(len(my_data['class'].unique())) # 7
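Series.nunique() gives the same count without building the array first, and value_counts() shows how many rows fall in each class. A short sketch on nine sample rows copied from this page, so the counts below describe the sample and not the full file:

```python
import pandas as pd

# Sample rows taken from the outputs shown on this page (not the full file)
rows = [
    (1, "John Deo", "Four", 75, "female"),
    (2, "Max Ruin", "Three", 85, "male"),
    (3, "Arnold", "Three", 55, "male"),
    (4, "Krish Star", "Four", 60, "female"),
    (11, "Ronald", "Six", 89, "female"),
    (12, "Recky", "Six", 94, "female"),
    (25, "Giff Tow", "Six", 88, "male"),
    (32, "Binn Rott", "Seven", 90, "female"),
    (33, "Kenn Rein", "Six", 96, "female"),
]
my_data = pd.DataFrame(
    rows, columns=["id", "name", "class", "mark", "sex"]
).set_index("id")

classes = my_data["class"].unique()        # distinct values, in order of appearance
n_classes = my_data["class"].nunique()     # number of distinct values
per_class = my_data["class"].value_counts()  # rows per class

print(list(classes))  # ['Four', 'Three', 'Six', 'Seven']
print(n_classes)      # 4
print(per_class)
```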
Exercise
Here is a list of queries you can answer using what you learned above; display the result of each.
List of Queries
Don’t run the queries directly against a MySQL database. Instead, create a DataFrame from the Excel file and apply Pandas methods to get the answers.
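As an illustration of the approach, the sketch below answers two SQL-style questions with Pandas methods. Both questions are hypothetical examples and are not taken from the list of queries; the data is nine sample rows copied from this page, not the full file.

```python
import pandas as pd

# Sample rows taken from the outputs shown on this page (not the full file)
rows = [
    (1, "John Deo", "Four", 75, "female"),
    (2, "Max Ruin", "Three", 85, "male"),
    (3, "Arnold", "Three", 55, "male"),
    (4, "Krish Star", "Four", 60, "female"),
    (11, "Ronald", "Six", 89, "female"),
    (12, "Recky", "Six", 94, "female"),
    (25, "Giff Tow", "Six", 88, "male"),
    (32, "Binn Rott", "Seven", 90, "female"),
    (33, "Kenn Rein", "Six", 96, "female"),
]
my_data = pd.DataFrame(
    rows, columns=["id", "name", "class", "mark", "sex"]
).set_index("id")

# Hypothetical query 1: students of class Six with mark above 90
# (SQL: SELECT * FROM student WHERE class='Six' AND mark > 90)
six_top = my_data[(my_data["class"] == "Six") & (my_data["mark"] > 90)]
print(six_top)

# Hypothetical query 2: average mark of each class
# (SQL: SELECT class, AVG(mark) FROM student GROUP BY class)
avg_mark = my_data.groupby("class")["mark"].mean()
print(avg_mark)
```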