name class1 mark gender
id
1.0 John Deo Four 75.0 female
2.0 Max Ruin Three 85.0 male
NaN Arnold Three 55.0 male
4.0 Krish Star Four 60.0 female
NaN John Mik Four 60.0 female
6.0 Alex Jo Four 55.0 NaN
7.0 My John Five 78.0 male
NaN NaN NaN NaN NaN
9.0 Tes Qry Six 78.0 male
10.0 NaN Four 55.0 female
11.0 Ronald Six NaN female
12.0 Recky Six 94.0 female
13.0 Ron NaN 55.0 NaN
14.0 King NaN NaN NaN
You can create a DataFrame from the Excel file. The table above shows that some cells are blank (NaN).
import pandas as pd
# A raw string (r'...') keeps the backslash in the Windows path literal.
df = pd.read_excel(r'D:\student-dropna_1.xlsx',index_col='id')
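If the Excel file is not at hand, the same sample data can be built in memory. This is only a sketch: the values are copied from the table above, and the NaN alias is a convenience introduced here, not part of the original script.

```python
import numpy as np
import pandas as pd

NaN = np.nan  # short alias for missing values (an assumption for readability)

# Values copied from the table above; the 'id' column becomes the index.
data = {
    "id": [1, 2, NaN, 4, NaN, 6, 7, NaN, 9, 10, 11, 12, 13, 14],
    "name": ["John Deo", "Max Ruin", "Arnold", "Krish Star", "John Mik",
             "Alex Jo", "My John", NaN, "Tes Qry", NaN, "Ronald", "Recky",
             "Ron", "King"],
    "class1": ["Four", "Three", "Three", "Four", "Four", "Four", "Five",
               NaN, "Six", "Four", "Six", "Six", NaN, NaN],
    "mark": [75, 85, 55, 60, 60, 55, 78, NaN, 78, 55, NaN, 94, 55, NaN],
    "gender": ["female", "male", "male", "female", "female", NaN, "male",
               NaN, "male", "female", "female", "female", NaN, NaN],
}
df = pd.DataFrame(data).set_index("id")

print(df.shape)         # (14, 4): the index is not counted as a column
print(df.isna().sum())  # number of missing values in each column
```

Because id is used as the index, it does not count towards thresh in the examples that follow.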
axis=0 is the default value for axis, so rows are dropped.
thresh=2 means a row needs 2 or more valid (non-missing) values to be kept.
df=df.dropna(how='any',axis=0,thresh=2)
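The line above raises an error, because how and thresh cannot be used together:
TypeError: You cannot set both the how and thresh arguments at the same time.
Remove how='any' and keep only thresh.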
df=df.dropna(axis=0,thresh=2)
Output
name class1 mark gender
id
1.0 John Deo Four 75.0 female
2.0 Max Ruin Three 85.0 male
NaN Arnold Three 55.0 male
4.0 Krish Star Four 60.0 female
NaN John Mik Four 60.0 female
6.0 Alex Jo Four 55.0 NaN
7.0 My John Five 78.0 male
9.0 Tes Qry Six 78.0 male
10.0 NaN Four 55.0 female
11.0 Ronald Six NaN female
12.0 Recky Six 94.0 female
13.0 Ron NaN 55.0 NaN
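Since how and thresh cannot be combined, it may help to see that both settings of how can be expressed with thresh alone. The sketch below uses a small hypothetical DataFrame (not the student data) to keep it short.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame: row 0 is complete, row 1 has one valid
# value, row 2 is entirely missing.
df = pd.DataFrame({
    "a": [1, np.nan, np.nan],
    "b": [2, 5, np.nan],
    "c": [3, np.nan, np.nan],
})
n_cols = df.shape[1]

# how='any' drops a row if any value is missing, so only complete rows stay.
any_rows = df.dropna(how="any")
same_as_any = df.dropna(thresh=n_cols)

# how='all' drops a row only if every value is missing.
all_rows = df.dropna(how="all")
same_as_all = df.dropna(thresh=1)

print(any_rows.equals(same_as_any))
print(all_rows.equals(same_as_all))
```

Any value of thresh between 1 and the number of columns gives a rule that sits between how='all' and how='any'.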
df=df.dropna(axis=0,thresh=3)
Output
name class1 mark gender
id
1.0 John Deo Four 75.0 female
2.0 Max Ruin Three 85.0 male
NaN Arnold Three 55.0 male
4.0 Krish Star Four 60.0 female
NaN John Mik Four 60.0 female
6.0 Alex Jo Four 55.0 NaN
7.0 My John Five 78.0 male
9.0 Tes Qry Six 78.0 male
10.0 NaN Four 55.0 female
11.0 Ronald Six NaN female
12.0 Recky Six 94.0 female
In the output above, the row with id 13 is also removed, because it has only two valid values (name and mark).
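To check why a given row is kept or dropped, the number of valid values in each row can be counted with count(axis=1). A minimal sketch follows, assuming the sample data is rebuilt in memory from the table at the top of the page instead of being read from the Excel file.

```python
import numpy as np
import pandas as pd

NaN = np.nan  # alias introduced here for readability

# Sample data copied from the first table; 'id' is the index.
df = pd.DataFrame({
    "id": [1, 2, NaN, 4, NaN, 6, 7, NaN, 9, 10, 11, 12, 13, 14],
    "name": ["John Deo", "Max Ruin", "Arnold", "Krish Star", "John Mik",
             "Alex Jo", "My John", NaN, "Tes Qry", NaN, "Ronald", "Recky",
             "Ron", "King"],
    "class1": ["Four", "Three", "Three", "Four", "Four", "Four", "Five",
               NaN, "Six", "Four", "Six", "Six", NaN, NaN],
    "mark": [75, 85, 55, 60, 60, 55, 78, NaN, 78, 55, NaN, 94, 55, NaN],
    "gender": ["female", "male", "male", "female", "female", NaN, "male",
               NaN, "male", "female", "female", "female", NaN, NaN],
}).set_index("id")

# Non-missing values in each row (the index is not counted).
valid_per_row = df.count(axis=1)
print(valid_per_row.tolist())

# A row is kept when its count is greater than or equal to thresh.
for t in (2, 3):
    kept = df.dropna(axis=0, thresh=t)
    print(t, len(kept))
```

The row with id 13 has a count of 2, so it survives thresh=2 but not thresh=3.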
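axis=1
With axis=1, columns are dropped instead of rows. thresh=11 keeps only the columns that have 11 or more valid values.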
df=df.dropna(axis=1,thresh=11)
Output
name class1 mark
id
1.0 John Deo Four 75.0
2.0 Max Ruin Three 85.0
NaN Arnold Three 55.0
4.0 Krish Star Four 60.0
NaN John Mik Four 60.0
6.0 Alex Jo Four 55.0
7.0 My John Five 78.0
NaN NaN NaN NaN
9.0 Tes Qry Six 78.0
10.0 NaN Four 55.0
11.0 Ronald Six NaN
12.0 Recky Six 94.0
13.0 Ron NaN 55.0
14.0 King NaN NaN
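The same check works for columns: count() with its default axis returns the number of valid values in each column, which shows why only gender is dropped. This sketch again assumes the sample data is rebuilt in memory from the first table.

```python
import numpy as np
import pandas as pd

NaN = np.nan  # alias introduced here for readability

# Sample data copied from the first table; 'id' is the index.
df = pd.DataFrame({
    "id": [1, 2, NaN, 4, NaN, 6, 7, NaN, 9, 10, 11, 12, 13, 14],
    "name": ["John Deo", "Max Ruin", "Arnold", "Krish Star", "John Mik",
             "Alex Jo", "My John", NaN, "Tes Qry", NaN, "Ronald", "Recky",
             "Ron", "King"],
    "class1": ["Four", "Three", "Three", "Four", "Four", "Four", "Five",
               NaN, "Six", "Four", "Six", "Six", NaN, NaN],
    "mark": [75, 85, 55, 60, 60, 55, 78, NaN, 78, 55, NaN, 94, 55, NaN],
    "gender": ["female", "male", "male", "female", "female", NaN, "male",
               NaN, "male", "female", "female", "female", NaN, NaN],
}).set_index("id")

# Non-missing values in each column.
valid_per_col = df.count()
print(valid_per_col.to_dict())

# Columns with fewer than 11 valid values are dropped.
kept = df.dropna(axis=1, thresh=11)
print(list(kept.columns))
```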
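df.shape[0] : Number of rows
df.shape[1] : Number of columns
Multiply these by a fraction to set thresh as a percentage. Below, a row is kept only if at least 70% of its columns have valid values.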
df=df.dropna(axis=0,thresh=df.shape[1]*0.7)
Output
name class1 mark gender
id
1.0 John Deo Four 75.0 female
2.0 Max Ruin Three 85.0 male
NaN Arnold Three 55.0 male
4.0 Krish Star Four 60.0 female
NaN John Mik Four 60.0 female
6.0 Alex Jo Four 55.0 NaN
7.0 My John Five 78.0 male
9.0 Tes Qry Six 78.0 male
10.0 NaN Four 55.0 female
11.0 Ronald Six NaN female
12.0 Recky Six 94.0 female
df=df.dropna(axis=1,thresh=df.shape[0]*0.8)
Output
name
id
1.0 John Deo
2.0 Max Ruin
NaN Arnold
4.0 Krish Star
NaN John Mik
6.0 Alex Jo
7.0 My John
NaN NaN
9.0 Tes Qry
10.0 NaN
11.0 Ronald
12.0 Recky
13.0 Ron
14.0 King
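The pandas documentation describes thresh as an integer, so one option is to round the percentage up before passing it. The sketch below does this with math.ceil; the rounding step is an assumption added here, not part of the original code, and the sample data is rebuilt in memory from the first table.

```python
import math

import numpy as np
import pandas as pd

NaN = np.nan  # alias introduced here for readability

# Sample data copied from the first table; 'id' is the index.
df = pd.DataFrame({
    "id": [1, 2, NaN, 4, NaN, 6, 7, NaN, 9, 10, 11, 12, 13, 14],
    "name": ["John Deo", "Max Ruin", "Arnold", "Krish Star", "John Mik",
             "Alex Jo", "My John", NaN, "Tes Qry", NaN, "Ronald", "Recky",
             "Ron", "King"],
    "class1": ["Four", "Three", "Three", "Four", "Four", "Four", "Five",
               NaN, "Six", "Four", "Six", "Six", NaN, NaN],
    "mark": [75, 85, 55, 60, 60, 55, 78, NaN, 78, 55, NaN, 94, 55, NaN],
    "gender": ["female", "male", "male", "female", "female", NaN, "male",
               NaN, "male", "female", "female", "female", NaN, NaN],
}).set_index("id")

n_rows, n_cols = df.shape

# 70% of 4 columns is 2.8, rounded up to 3 valid values per row.
row_thresh = math.ceil(n_cols * 0.7)
# 80% of 14 rows is 11.2, rounded up to 12 valid values per column.
col_thresh = math.ceil(n_rows * 0.8)

rows_kept = df.dropna(axis=0, thresh=row_thresh)
cols_kept = df.dropna(axis=1, thresh=col_thresh)

print(row_thresh, len(rows_kept))
print(col_thresh, list(cols_kept.columns))
```

Rounding up selects the same rows and columns as the outputs above, because a count can only meet 2.8 or 11.2 by reaching the next whole number.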
df.dropna(axis=0,thresh=2,inplace=True)
print(df) # source dataframe is changed.
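With inplace=True the method changes the DataFrame itself and returns None, so the result should not be assigned back to df. A short sketch on a small hypothetical DataFrame (not the student data):

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame; only row 0 has two or more valid values.
df = pd.DataFrame({
    "a": [1, np.nan, np.nan],
    "b": [2, 5, np.nan],
    "c": [3, np.nan, np.nan],
})

# Without inplace, a new DataFrame is returned and df is left as it was.
copy = df.dropna(axis=0, thresh=2)
rows_before = len(df)

# With inplace=True, df itself is modified and the return value is None.
result = df.dropna(axis=0, thresh=2, inplace=True)
rows_after = len(df)

print(result)
print(rows_before, rows_after, len(copy))
```

Writing df = df.dropna(..., inplace=True) would therefore replace df with None.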