Filtering rows based on Conditions


Youtube Live session on Tkinter

We will use logical operators and & , or | and not ~ in our filters
import pandas as pd 
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],'MATH':[30,40,50,60,70,80],
         'ENGLISH':[20,30,40,50,60,70]}
my_data = pd.DataFrame(data=my_dict)
print(my_data)
Output
   NAME  ID  MATH  ENGLISH
0  Ravi   1    30       20
1  Raju   2    40       30
2  Alex   3    50       40
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70
We will add our conditions to above code.
List all who scored more than or equal to 50 in MATH
print(my_data[my_data['MATH']>=50])
Output
   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70
List all who scored more than or equal to 50 in MATH and ENGLISH
print(my_data[(my_data['MATH']>=50) & (my_data['ENGLISH']>=50)])
Output
   NAME  ID  MATH  ENGLISH
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70
List all who scored less than 50 in both subjects ( MATH and ENGLISH)
print(my_data[(my_data['MATH'] <50) & (my_data['MATH'] <50) ])
Output
   NAME  ID  MATH  ENGLISH
0  Ravi   1    30       20
1  Raju   2    40       30
List all who scored more than or equal to 50 in MATH or ENGLISH ( in any one subject they should get 50 or more )
print(my_data[(my_data['MATH']>=50) | (my_data['ENGLISH']>=50)])
Output
   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70
Scored equal to 50 in Math (just pass mark )
print(my_data[(my_data['MATH']==50)])
Output
   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
Score is not equal ( != ) to 50 and not equal to 60
print(my_data[(my_data['MATH'] !=50) & (my_data['MATH'] !=60) ])
Output
   NAME  ID  MATH  ENGLISH
0  Ravi   1    30       20
1  Raju   2    40       30
4  King   5    70       60
5  Jack   6    80       70
Sum of ENGLISH and MATH is more than 70
print(my_data[(my_data[['MATH','ENGLISH']].sum(axis=1)>70)])
Output
   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
3   Ron   4    60       50
4  King   5    70       60
5  Jack   6    80       70
We can add one not condition ~ to this and get the (false matching ) records.
print(my_data[~(my_data[['MATH','ENGLISH']].sum(axis=1)>70)])

Comparing columns

List all rows where MATH is greater than ENGLISH.
print(my_data[my_data['MATH']>my_data['ENGLISH']])

using endswith string matching

All names ending with 'x'
print(my_data[my_data['NAME'].str.endswith('x')])
Output
   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
Adding one or | condition
print(my_data[my_data['NAME'].str.endswith('x') | my_data['NAME'].str.endswith('ck')])
Output
   NAME  ID  MATH  ENGLISH
2  Alex   3    50       40
5  Jack   6    80       70

Using isin()

We will check name if present inside a list.
print(my_data[my_data['NAME'].isin(['Raju','King'])])
Output
   NAME  ID  MATH  ENGLISH
1  Raju   2    40       30
4  King   5    70       60

Using apply

Length of the name column is less than 4 ( more on Pandas apply )
print(my_data[my_data['NAME'].apply(lambda x: len(x) < 4)])
Output
  NAME  ID  MATH  ENGLISH
3  Ron   4    60       50

Updating all rows of a column in DataFrame

import pandas as pd 
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],'MATH':[30,40,50,60,70,80],
         'ENGLISH':[20,30,40,50,60,70]}
my_data = pd.DataFrame(data=my_dict)
#print(my_data)
my_data['ENGLISH']=50 # assign value to all rows of DAtaframe
print(my_data)
Output
   NAME  ID  MATH  ENGLISH
0  Ravi   1    30       50
1  Raju   2    40       50
2  Alex   3    50       50
3   Ron   4    60       50
4  King   5    70       50
5  Jack   6    80       50
Filtering columns and create new DataFrame
my_new = my_data.filter(['equipment','category'],axis=1)
OR
cols=['MATH','ENGLISH']
my_new=my_data[cols]
Displaying columns of the DataFrame
print(my_new.columns)
Number of rows in DataFrame.
print(len(my_new))
Highlight rows based on condition by style property
Pandas Pandas DataFrame
contains() to display and delete row based on Conditions
Subscribe to our YouTube Channel here


Subscribe

* indicates required
Subscribe to plus2net

    plus2net.com



    Post your comments , suggestion , error , requirements etc here





    Python Video Tutorials
    Python SQLite Video Tutorials
    Python MySQL Video Tutorials
    Python Tkinter Video Tutorials
    We use cookies to improve your browsing experience. . Learn more
    HTML MySQL PHP JavaScript ASP Photoshop Articles FORUM . Contact us
    ©2000-2024 plus2net.com All rights reserved worldwide Privacy Policy Disclaimer