apply() runs a function on each element of a Series, or on each row or column of a DataFrame. We can use it to clean data or to apply any custom logic to the values.
Options
func: Function to apply
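axis : Default is 0 (or 'index'), the function is applied to each column. Use 1 (or 'columns') to apply it to each row.
raw : Default is False, each row or column is passed to the function as a Series. If True, it is passed as a NumPy ndarray.
result_type : Default is None, values are 'expand', 'reduce', 'broadcast', None. Used only when axis=1.
args : tuple, positional arguments to pass to func in addition to the row or column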
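The options raw, result_type and args are not used in the examples on this page, so here is a minimal sketch of each. The small DataFrame and the helper add_bonus() are made up for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'MATH': [80, 40, 70], 'ENGLISH': [81, 70, 40]})

# args: extra positional arguments are passed to the function after the column
def add_bonus(values, bonus):  # hypothetical helper, not part of pandas
    return values + bonus

with_bonus = df.apply(add_bonus, args=(5,))

# raw=True: each row arrives as a NumPy ndarray instead of a Series
row_max = df.apply(lambda arr: arr.max(), axis=1, raw=True)

# result_type='expand': a list returned for each row becomes separate columns
expanded = df.apply(lambda row: [row['MATH'], row['ENGLISH'] * 2],
                    axis=1, result_type='expand')

print(with_bonus)
print(row_max)
print(expanded)
```

With raw=True the function cannot use column labels such as arr['MATH'], because an ndarray only supports access by position.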
Examples using options
Sum of MATH and ENGLISH using axis=1
import pandas as pd
import numpy as np
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
'ID':[1,2,3,4,5,6],
'MATH':[80,40,70,70,82,30],
'ENGLISH':[81,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
my_data['MATH_NEW']=my_data[['MATH','ENGLISH']].apply(np.sum, axis=1)
print(my_data)
Output
NAME ID MATH ENGLISH MATH_NEW
0 Ravi 1 80 81 161
1 Raju 2 40 70 110
2 Alex 3 70 40 110
3 Ron 4 70 50 120
4 King 5 82 60 142
5 Jack 6 30 30 60
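To see what the axis option changes, the sketch below runs the same sum with axis=0 and axis=1 on the two mark columns. It uses a lambda calling the Series sum() method in place of np.sum; that is only a choice made for this sketch.

```python
import pandas as pd

my_data = pd.DataFrame({'MATH': [80, 40, 70, 70, 82, 30],
                        'ENGLISH': [81, 70, 40, 50, 60, 30]})

# axis=0 (default): the function receives one column at a time
column_totals = my_data.apply(lambda s: s.sum(), axis=0)

# axis=1: the function receives one row at a time
row_totals = my_data.apply(lambda s: s.sum(), axis=1)

print(column_totals)  # one value per column
print(row_totals)     # one value per row
```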
Using lambda
We will add 5 marks to each value of the MATH column.
my_data['MATH_NEW']=my_data['MATH'].apply(lambda x:x+5)
Output
NAME ID MATH ENGLISH MATH_NEW
0 Ravi 1 80 81 85
1 Raju 2 40 70 45
2 Alex 3 70 40 75
3 Ron 4 70 50 75
4 King 5 82 60 87
5 Jack 6 30 30 35
Using function
A new function my_check() is used to add 5 marks to the MATH column.
import pandas as pd
def my_check(a):
    # add 5 marks to the value received
    total=a+5
    return total
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
'ID':[1,2,3,4,5,6],
'MATH':[80,40,70,70,82,30],
'ENGLISH':[81,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
my_data['MATH_NEW']=my_data['MATH'].apply(lambda x:my_check(x))
print(my_data)
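A named function can also be passed to apply() directly, without wrapping it in a lambda. The sketch below assumes a hypothetical function add_marks() with a bonus parameter, to show how the args option of Series.apply() supplies an extra value.

```python
import pandas as pd

def add_marks(mark, bonus=5):  # hypothetical helper for this sketch
    # return the mark increased by the bonus
    return mark + bonus

my_data = pd.DataFrame({'MATH': [80, 40, 70, 70, 82, 30]})

# pass the function itself, each element is sent as the first argument
my_data['MATH_NEW'] = my_data['MATH'].apply(add_marks)

# args supplies the second argument (bonus) for every element
my_data['MATH_10'] = my_data['MATH'].apply(add_marks, args=(10,))
print(my_data)
```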
Use one function to add MATH and ENGLISH
def my_fun(a,b):
    # return the sum of the two marks
    total=a+b
    return total
import pandas as pd
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
'ID':[1,2,3,4,5,6],
'MATH':[80,40,70,70,82,30],
'ENGLISH':[81,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
my_data['total']=my_data.apply(lambda x:my_fun(x['MATH'],x['ENGLISH']),axis=1)
print(my_data)
Output
NAME ID MATH ENGLISH total
0 Ravi 1 80 81 161
1 Raju 2 40 70 110
2 Alex 3 70 40 110
3 Ron 4 70 50 120
4 King 5 82 60 142
5 Jack 6 30 30 60
Using lambda (the same calculation can be written with a lambda alone)
my_data['total']=my_data.apply(lambda x:(x['MATH']+x['ENGLISH']),axis=1)
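For a plain sum like this one, the same result can usually be obtained by adding the two columns directly, which avoids calling a Python function for every row. The sketch below compares both forms on the marks used above.

```python
import pandas as pd

my_data = pd.DataFrame({'MATH': [80, 40, 70, 70, 82, 30],
                        'ENGLISH': [81, 70, 40, 50, 60, 30]})

# row by row with apply()
total_apply = my_data.apply(lambda x: x['MATH'] + x['ENGLISH'], axis=1)

# column arithmetic, no apply() needed
total_direct = my_data['MATH'] + my_data['ENGLISH']

print(total_apply.tolist() == total_direct.tolist())
```

apply() with axis=1 remains useful when the logic for each row is too complex to express as column arithmetic.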
Check whether the sum of MATH and ENGLISH is 120 or more (Pass) or less than 120 (Fail).
import pandas as pd
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
'ID':[1,2,3,4,5,6],
'MATH':[80,40,70,70,82,30],
'ENGLISH':[81,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
my_data['total']=my_data.apply(lambda x:(x['MATH']+x['ENGLISH']),axis=1)
my_data['status']=my_data['total'].apply(lambda x: x>=120)
#my_data['status']=my_data['total'].apply(lambda x: x>=120 and 'Pass' or 'Fail' )
print(my_data)
Output
NAME ID MATH ENGLISH total status
0 Ravi 1 80 81 161 True
1 Raju 2 40 70 110 False
2 Alex 3 70 40 110 False
3 Ron 4 70 50 120 True
4 King 5 82 60 142 True
5 Jack 6 30 30 60 False
To display Pass or Fail in place of True or False, use this line
my_data['status']=my_data['total'].apply(lambda x: x>=120 and 'Pass' or 'Fail' )
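The and / or form works here because 'Pass' is a non-empty string. It would give the wrong answer if the first choice were an empty string or 0. As a sketch of two alternatives, the code below uses a conditional expression inside the lambda and numpy.where() on the whole column. The column name status_np is used only for this illustration.

```python
import numpy as np
import pandas as pd

my_data = pd.DataFrame({'total': [161, 110, 110, 120, 142, 60]})

# conditional expression: value_if_true if condition else value_if_false
my_data['status'] = my_data['total'].apply(
    lambda x: 'Pass' if x >= 120 else 'Fail')

# numpy.where() evaluates the condition for the whole column at once
my_data['status_np'] = np.where(my_data['total'] >= 120, 'Pass', 'Fail')
print(my_data)
```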