Python Pandas DataFrame filtering rows based on conditions

We will use logical operators and & , or | and not ~ in our filters

import pandas as pd 
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],'MATH':[30,40,50,60,70,80],
         'ENGLISH':[20,30,40,50,60,70]}
my_data = pd.DataFrame(data=my_dict)
print(my_data)

Output

   NAME  ID  MATH  ENGLISH
0  Ravi   1    30       20
1  Raju   2    40       30
2  Alex   3    50       40
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70

We will add our conditions to above code.
List all who scored more than or equal to 50 in MATH

print(my_data[my_data['MATH']>=50])

Output

   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70

List all who scored more than or equal to 50 in MATH and ENGLISH

print(my_data[(my_data['MATH']>=50) & (my_data['ENGLISH']>=50)])

Output

   NAME  ID  MATH  ENGLISH
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70

List all who scored less than 50 in both subjects ( MATH and ENGLISH)

print(my_data[(my_data['MATH'] <50) & (my_data['MATH'] <50) ])

Output

   NAME  ID  MATH  ENGLISH
0  Ravi   1    30       20
1  Raju   2    40       30

List all who scored more than or equal to 50 in MATH or ENGLISH ( in any one subject they should get 50 or more )

print(my_data[(my_data['MATH']>=50) | (my_data['ENGLISH']>=50)])

Output

   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70

Scored equal to 50 in Math (just pass mark )

print(my_data[(my_data['MATH']==50)])

Output

   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40

Score is not equal ( != ) to 50 and not equal to 60

print(my_data[(my_data['MATH'] !=50) & (my_data['MATH'] !=60) ])

Output

   NAME  ID  MATH  ENGLISH
0  Ravi   1    30       20
1  Raju   2    40       30
4  King   5    70       60
5  Jack   6    80       70

Sum of ENGLISH and MATH is more than 70

print(my_data[(my_data[['MATH','ENGLISH']].sum(axis=1)>70)])

Output

   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70

We can add one not condition ~ to this and get the (false matching ) records.

print(my_data[~(my_data[['MATH','ENGLISH']].sum(axis=1)>70)])

Comparing columns

List all rows where MATH is greater than ENGLISH.

print(my_data[my_data['MATH']>my_data['ENGLISH']])

using endswith string matching

All names ending with 'x'

print(my_data[my_data['NAME'].str.endswith('x')])

Output

   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40

Adding one or | condition

print(my_data[my_data['NAME'].str.endswith('x') | my_data['NAME'].str.endswith('ck')])

Output

   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
5  Jack   6    80       70

Using isin()

We will check name if present inside a list.

print(my_data[my_data['NAME'].isin(['Raju','King'])])

Output

   NAME  ID  MATH  ENGLISH
1  Raju   2    40       30
4  King   5    70       60

Using apply

Length of the name column is less than 4 ( more on Pandas apply )

print(my_data[my_data['NAME'].apply(lambda x: len(x) < 4)])

Output

  NAME  ID  MATH  ENGLISH
3  Ron   4    60       50

Updating all rows of a column in DataFrame

import pandas as pd 
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],'MATH':[30,40,50,60,70,80],
         'ENGLISH':[20,30,40,50,60,70]}
my_data = pd.DataFrame(data=my_dict)
#print(my_data)
my_data['ENGLISH']=50 # assign value to all rows of DAtaframe
print(my_data)

Output

   NAME  ID  MATH  ENGLISH
0  Ravi   1    30       50
1  Raju   2    40       50
2  Alex   3    50       50
3   Ron   4    60       50
4  King   5    70       50
5  Jack   6    80       50

Filtering columns and create new DataFrame

my_new = my_data.filter(['equipment','category'],axis=1)

cols=['MATH','ENGLISH']
my_new=my_data[cols]

Displaying columns of the DataFrame

print(my_new.columns)

Number of rows in DataFrame.

print(len(my_new))

Highlight rows based on condition by style property
Pandas Pandas DataFrame
contains() to display and delete row based on Conditions

Numpy arrays Python & MySQL Python- Tutorials

Subscribe to our YouTube Channel here

Filtering rows based on Conditions

Comparing columns

using endswith string matching

Using isin()

Using apply

Updating all rows of a column in DataFrame

Subscribe