notnull() checks every element for a non-null value and maps it to True.
It returns a boolean mask of the same shape as the input: True where a value is present, False where it is None or NaN.
Examples
import pandas as pd
import numpy as np
my_dict={'NAME':['Ravi','Raju','Alex',None,'King',None],
'ID':[1,2,np.nan,4,5,6],
'MATH':[80,40,70,70,82,30],
'ENGLISH':[81,70,40,50,np.nan,30]}
df = pd.DataFrame(data=my_dict)
print(df.notnull())
Output : every value that is neither None nor NaN is masked as True; None and NaN are given as False
NAME ID MATH ENGLISH
0 True True True True
1 True True True True
2 True False True True
3 False True True True
4 True True True False
5 False True True True
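To make it clearer what counts as "not null", here is a minimal sketch using a small Series whose values are made up for illustration. It shows that an empty string is treated as real data, while None and NaN are not, and that notna() is the same check under another name.

```python
import numpy as np
import pandas as pd

# Hypothetical values, chosen only to show what counts as missing
s = pd.Series(['Ravi', '', None, np.nan])

# True for real data, False for None / NaN
mask = s.notnull()
print(mask.tolist())

# notna() performs the same check as notnull()
same = s.notna().equals(mask)
print(same)
```

The empty string in the second position is masked as True, so blank text has to be handled separately if it should count as missing.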
In real-life situations we get data from an Excel or CSV file, or from a database. Here we read data from an Excel file and use it to create our DataFrame.
Read more on how to create a DataFrame by reading an Excel file.
Download the student-isnull.xlsx file
import pandas as pd
import numpy as np
# Check your path for excel file
my_data = pd.read_excel(r'D:\student-isnull.xlsx')  # raw string keeps the backslash literal
print(my_data)
Output
id name class1 mark sex
0 1.0 John Deo Four 75.0 female
1 2.0 Max Ruin Three 85.0 male
2 NaN Arnold Three 55.0 male
3 4.0 Krish Star Four 60.0 female
4 NaN John Mike Four 60.0 female
5 6.0 Alex John Four 55.0 NaN
6 7.0 My John Rob Five 78.0 male
7 NaN NaN NaN NaN NaN
8 9.0 Tes Qry Six 78.0 male
9 10.0 NaN Four 55.0 female
10 11.0 Ronald Six NaN female
11 12.0 Recky Six 94.0 female
Checking if any real (not NaN) value is there or not
We can check whether our DataFrame holds at least one actual ( not NaN ) value.
print(my_data.notnull().values.any())
Output ( any() returns True if at least one value in the DataFrame is real data )
True
We can check a single column for the presence of any value that is not NaN or None.
Here we check the name column only.
print(my_data['name'].notnull().values.any())
We can check two columns, name and mark, for any value that is not NaN or None.
print(my_data[['name','mark']].notnull().values.any())
In the above cases, we can use all() in place of any() to check that every value is real data.
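The difference between any() and all() can be sketched as follows. The small DataFrame here is a hypothetical stand-in for the Excel data, so the values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data
df = pd.DataFrame({
    'name': ['John Deo', None, 'Arnold'],
    'mark': [75.0, 85.0, np.nan],
    'class1': ['Four', 'Three', 'Three'],
})

# any(): is at least one value in the column real data?
any_real = df['name'].notnull().values.any()

# all(): is every value in the column real data?
all_real = df['name'].notnull().values.all()

# class1 has no missing value, so all() is True here
class_all_real = df['class1'].notnull().values.all()

print(any_real, all_real, class_all_real)
```

A column with even one None or NaN gives False with all(), while any() stays True as long as one real value exists.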
Showing rows that have no NaN value ( in a column )
Display the rows where the id column is not NaN
print(my_data[my_data['id'].notnull()])
Output ( all rows with NaN in the id column are removed )
id name class1 mark sex
0 1.0 John Deo Four 75.0 female
1 2.0 Max Ruin Three 85.0 male
3 4.0 Krish Star Four 60.0 female
5 6.0 Alex John Four 55.0 NaN
6 7.0 My John Rob Five 78.0 male
8 9.0 Tes Qry Six 78.0 male
9 10.0 NaN Four 55.0 female
10 11.0 Ronald Six NaN female
11 12.0 Recky Six 94.0 female
Display the rows where the name column is not NaN
print(my_data[my_data['name'].notnull()])
Output ( rows 7 and 9 are removed as they have NaN in the name column )
id name class1 mark sex
0 1.0 John Deo Four 75.0 female
1 2.0 Max Ruin Three 85.0 male
2 NaN Arnold Three 55.0 male
3 4.0 Krish Star Four 60.0 female
4 NaN John Mike Four 60.0 female
5 6.0 Alex John Four 55.0 NaN
6 7.0 My John Rob Five 78.0 male
8 9.0 Tes Qry Six 78.0 male
10 11.0 Ronald Six NaN female
11 12.0 Recky Six 94.0 female
Other column names can be used in the same way.
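The same mask can be combined across columns. The sketch below uses a small hypothetical DataFrame (not the Excel file) to keep rows where two chosen columns both hold real data, and then rows where every column does.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data
df = pd.DataFrame({
    'id': [1.0, np.nan, 3.0, 4.0],
    'name': ['John Deo', 'Arnold', None, 'Krish Star'],
    'mark': [75.0, 55.0, 60.0, 60.0],
})

# Rows where both id and name hold real data
both = df[df['id'].notnull() & df['name'].notnull()]

# Rows where every column holds real data
complete = df[df.notnull().all(axis=1)]
print(complete)

# dropna() with default arguments selects the same rows
same_as_dropna = complete.equals(df.dropna())
print(same_as_dropna)
```

The parentheses-free `&` works here because each side is already a boolean Series; when a comparison such as `df['mark'] > 50` is involved, each condition needs its own parentheses.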
Counting the number of not NaN elements
We will count the actual ( not NaN ) values present in each column and then in the whole DataFrame.
Read more on sum().
Total number of not NaN values in different columns ( of our sample excel file )
print(my_data['id'].notnull().sum()) # output 9
print(my_data['name'].notnull().sum()) # output 10
print(my_data['class1'].notnull().sum()) # output 11
print(my_data['mark'].notnull().sum()) # output 10
print(my_data['sex'].notnull().sum()) # output 10
We will count the number of not NaN values for the whole DataFrame. First we will display the breakup against each column.
print(my_data.notnull().sum()) # output each column wise
Output
id 9
name 10
class1 11
mark 10
sex 10
To get the total number of not NaN values in the DataFrame:
print(my_data.notnull().sum().sum()) # Output 50
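As a cross-check on the counts above, the built-in count() method gives the same per-column numbers as notnull().sum(). This is a minimal sketch on a hypothetical two-column DataFrame, not on the Excel file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data
df = pd.DataFrame({
    'id': [1.0, np.nan, 3.0],
    'name': ['A', None, None],
})

# Real values per column, counted from the boolean mask
per_column = df.notnull().sum()

# count() reports the same numbers directly
matches_count = bool((per_column == df.count()).all())

# Total real values, and missing values per column
total = int(per_column.sum())
missing = len(df) - per_column

print(per_column.tolist(), matches_count, total, missing.tolist())
```

Subtracting the not NaN count from the number of rows gives the number of missing values in each column.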
Filling NaN values by fillna()
isnull() can identify or count the missing values in a DataFrame. We can replace these values by using fillna()
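A minimal sketch of that replacement, assuming a small hypothetical DataFrame and made-up fill values ('Unknown' for name, 0 for mark):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data
df = pd.DataFrame({
    'name': ['John Deo', None],
    'mark': [75.0, np.nan],
})

# Fill each column with its own replacement value (values are assumptions)
filled = df.fillna({'name': 'Unknown', 'mark': 0})
print(filled)

# After filling, notnull() reports real data everywhere
no_missing = filled.notnull().values.all()
print(no_missing)
```

fillna() returns a new DataFrame here, so df itself still contains the original None and NaN.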