isnull(): detect None, NaN or NA values and map them to True
Returns a boolean mask of the same shape as the input, with one True or False value for each element.
Examples
import pandas as pd
import numpy as np
# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
my_dict={'NAME':['Ravi','Raju','Alex',None,'King',None],
         'ID':[1,2,np.nan,4,5,6],
         'MATH':[80,40,70,70,82,30],
         'ENGLISH':[81,70,40,50,np.nan,30]}
df = pd.DataFrame(data=my_dict)
print(df.isnull())
Output: all None and NaN values are shown as True and all other values as False
    NAME     ID   MATH  ENGLISH
0  False  False  False    False
1  False  False  False    False
2  False   True  False    False
3   True  False  False    False
4  False  False  False     True
5   True  False  False    False
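As a side note, here is a small sketch of what isnull() does and does not treat as missing. The DataFrame check and its column names are made up for illustration. In pandas, isna() is another name for isnull(), and by default an empty string or infinity is not counted as a missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical data: an empty string and infinity next to real missing values
check = pd.DataFrame({'text': ['', None, 'abc'],
                      'num': [np.inf, np.nan, 1.0]})

mask = check.isnull()
print(mask)  # only the middle row (None / NaN) is marked True

# isna() gives the same mask as isnull()
print(mask.equals(check.isna()))
```

If empty strings should count as missing in your data, they have to be converted to NaN first.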
In real-life situations we usually get data from an Excel file, a CSV file or a database. Here we will read data from an Excel file and create our DataFrame from it.
import pandas as pd
import numpy as np
# Check the path to your Excel file.
# A raw string (r'...') keeps the backslash from being read as an escape character.
my_data = pd.read_excel(r'D:\student-isnull.xlsx')
print(my_data)
Output
      id         name  class1  mark     sex
0    1.0     John Deo    Four  75.0  female
1    2.0     Max Ruin   Three  85.0    male
2    NaN       Arnold   Three  55.0    male
3    4.0   Krish Star    Four  60.0  female
4    NaN    John Mike    Four  60.0  female
5    6.0    Alex John    Four  55.0     NaN
6    7.0  My John Rob    Five  78.0    male
7    NaN          NaN     NaN   NaN     NaN
8    9.0      Tes Qry     Six  78.0    male
9   10.0          NaN    Four  55.0  female
10  11.0       Ronald     Six   NaN  female
11  12.0        Recky     Six  94.0  female
Checking if NaN is there or not
We can check whether our DataFrame contains any NaN value at all.
print(my_data.isnull().values.any())
Output ( returns True if any value in DataFrame is NaN or None )
True
We can also check a single column for the presence of NaN or None values. Here we check only the name column.
print(my_data['name'].isnull().values.any())
To check two columns, name and mark, for NaN or None values, pass a list of column names.
print(my_data[['name','mark']].isnull().values.any())
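The checks above return a single True or False for everything selected. To see which columns actually contain missing values, any() can be called on the mask directly, without .values. This is a minimal sketch using the data from the first example; the variable name col_has_nan is only for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the first example on this page
my_dict = {'NAME': ['Ravi', 'Raju', 'Alex', None, 'King', None],
           'ID': [1, 2, np.nan, 4, 5, 6],
           'MATH': [80, 40, 70, 70, 82, 30],
           'ENGLISH': [81, 70, 40, 50, np.nan, 30]}
df = pd.DataFrame(data=my_dict)

# any() on the mask works column by column and returns one boolean per column
col_has_nan = df.isnull().any()
print(col_has_nan)

# Names of the columns that contain at least one missing value
print(col_has_nan[col_has_nan].index.tolist())  # ['NAME', 'ID', 'ENGLISH']
```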
Showing rows that have a NaN value
Display the rows where the id column has a NaN value
print(my_data[my_data['id'].isnull()])
Output
    id       name  class1  mark     sex
2  NaN     Arnold   Three  55.0    male
4  NaN  John Mike    Four  60.0  female
7  NaN        NaN     NaN   NaN     NaN
Display the rows where the name column has a NaN value
print(my_data[my_data['name'].isnull()])
Output
     id  name  class1  mark     sex
7   NaN   NaN     NaN   NaN     NaN
9  10.0   NaN    Four  55.0  female
Similarly other column names can be used.
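The same filtering idea can be extended to all columns at once. The sketch below assumes the sample data has been typed in by hand from the printed output of the Excel file, so it runs without the file. It uses any(axis=1) and all(axis=1) on the mask to pick rows with at least one missing value and rows where every value is missing.

```python
import numpy as np
import pandas as pd

# Sample data typed in by hand from the printed output of the Excel file
# (a stand-in for reading student-isnull.xlsx)
my_data = pd.DataFrame({
    'id': [1, 2, np.nan, 4, np.nan, 6, 7, np.nan, 9, 10, 11, 12],
    'name': ['John Deo', 'Max Ruin', 'Arnold', 'Krish Star', 'John Mike',
             'Alex John', 'My John Rob', np.nan, 'Tes Qry', np.nan,
             'Ronald', 'Recky'],
    'class1': ['Four', 'Three', 'Three', 'Four', 'Four', 'Four', 'Five',
               np.nan, 'Six', 'Four', 'Six', 'Six'],
    'mark': [75, 85, 55, 60, 60, 55, 78, np.nan, 78, 55, np.nan, 94],
    'sex': ['female', 'male', 'male', 'female', 'female', np.nan, 'male',
            np.nan, 'male', 'female', 'female', 'female'],
})

# Rows with a missing value in at least one column
rows_any = my_data[my_data.isnull().any(axis=1)]
print(rows_any.index.tolist())  # [2, 4, 5, 7, 9, 10]

# Rows where every column is missing
rows_all = my_data[my_data.isnull().all(axis=1)]
print(rows_all.index.tolist())  # [7]
```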
Counting Number of NaN elements
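We will count the total number of NaN values and also find the number of NaN or missing values in each column. Since True counts as 1 and False as 0, applying sum() to the mask returned by isnull() gives the number of missing values.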
Number of NaN values in each column of our sample Excel file
print(my_data['id'].isnull().sum()) # output 3
print(my_data['name'].isnull().sum()) # output 2
print(my_data['class1'].isnull().sum()) # output 1
print(my_data['mark'].isnull().sum()) # output 2
print(my_data['sex'].isnull().sum()) # output 2
Next we will count the NaN values for the whole DataFrame. First we display the count for each column.
print(my_data.isnull().sum()) # NaN count for each column
Output
id        3
name      2
class1    1
mark      2
sex       2
To get the total number of NaN values in the whole DataFrame, apply sum() twice.
print(my_data.isnull().sum().sum()) # Output 10
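The counts can also be taken row by row with sum(axis=1). This is a sketch, assuming the sample data has been typed in by hand from the printed output of the Excel file; the variable name per_row is only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data typed in by hand from the printed output of the Excel file
# (a stand-in for reading student-isnull.xlsx)
my_data = pd.DataFrame({
    'id': [1, 2, np.nan, 4, np.nan, 6, 7, np.nan, 9, 10, 11, 12],
    'name': ['John Deo', 'Max Ruin', 'Arnold', 'Krish Star', 'John Mike',
             'Alex John', 'My John Rob', np.nan, 'Tes Qry', np.nan,
             'Ronald', 'Recky'],
    'class1': ['Four', 'Three', 'Three', 'Four', 'Four', 'Four', 'Five',
               np.nan, 'Six', 'Four', 'Six', 'Six'],
    'mark': [75, 85, 55, 60, 60, 55, 78, np.nan, 78, 55, np.nan, 94],
    'sex': ['female', 'male', 'male', 'female', 'female', np.nan, 'male',
            np.nan, 'male', 'female', 'female', 'female'],
})

# axis=1 adds up the True values across the columns of each row
per_row = my_data.isnull().sum(axis=1)
print(per_row.tolist())  # [0, 0, 1, 0, 1, 1, 0, 5, 0, 1, 1, 0]

# The row counts add up to the same total as the column counts
print(per_row.sum())  # 10
```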
Filling NaN values by fillna()
isnull() can identify or count the missing values in a DataFrame. We can replace these values by using fillna()
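A minimal sketch of this replacement step follows. The small DataFrame sample and the replacement values 'Unknown' and 0 are assumptions chosen for illustration, not values from the sample file. isnull() is used before and after fillna() to confirm that no missing values remain.

```python
import numpy as np
import pandas as pd

# Small made-up frame with missing values in both columns
sample = pd.DataFrame({'name': ['Arnold', np.nan, 'Ronald'],
                       'mark': [55.0, np.nan, np.nan]})
print(sample.isnull().sum().sum())  # 3 missing values before filling

# A dictionary gives a different replacement value for each column
filled = sample.fillna({'name': 'Unknown', 'mark': 0})
print(filled.isnull().sum().sum())  # 0 missing values after filling
```

fillna() returns a new DataFrame here, so sample itself still holds the NaN values.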