apply(): apply a function to the elements of a Series or to the rows or columns of a DataFrame

apply() calls a function on each element of a Series, or on each row or column of a DataFrame. We can use it to clean data or to run any custom logic on the values.

Options

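func : Function to apply to each column or row.
axis : Default is 0 (or 'index'), the function is applied to each column. Use 1 (or 'columns') to apply the function to each row.
raw : Default is False, each row or column is passed to the function as a Series. If True, it is passed as a NumPy ndarray instead.
result_type : Default is None. Values are 'expand', 'reduce', 'broadcast' or None. These only act when axis=1.
args : tuple, positional arguments to pass to func in addition to the row or column.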
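
The options listed above can be tried out on a small DataFrame. This is only a minimal sketch: the helper names add_bonus and min_max, the column names LOW and HIGH, and the three-row DataFrame are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'MATH': [80, 40, 70], 'ENGLISH': [81, 70, 40]})

# args: extra positional arguments are passed to the function after the value
def add_bonus(mark, bonus):
    return mark + bonus

math_bonus = df['MATH'].apply(add_bonus, args=(5,))
print(math_bonus.tolist())   # [85, 45, 75]

# raw=True: each row reaches the function as a NumPy array, not a Series
row_sum = df.apply(lambda a: a.sum(), axis=1, raw=True)
print(row_sum.tolist())      # [161, 110, 110]

# result_type='expand' with axis=1: the list returned for each row
# is spread across columns instead of staying as a single list
def min_max(row):
    return [row.min(), row.max()]

expanded = df.apply(min_max, axis=1, result_type='expand')
expanded.columns = ['LOW', 'HIGH']
print(expanded)
```

Without result_type='expand' the last call would return a Series whose values are the lists themselves.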

Examples using options

Sum of MATH and ENGLISH using axis=1

import pandas as pd 
import numpy as np
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],
         'MATH':[80,40,70,70,82,30],
         'ENGLISH':[81,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
# axis=1 applies np.sum across each row, so MATH_NEW holds MATH + ENGLISH
my_data['MATH_NEW']=my_data[['MATH','ENGLISH']].apply(np.sum, axis=1)
print(my_data)

Output

   NAME  ID  MATH  ENGLISH  MATH_NEW
0  Ravi   1    80       81       161
1  Raju   2    40       70       110
2  Alex   3    70       40       110
3   Ron   4    70       50       120
4  King   5    82       60       142
5  Jack   6    30       30        60

Using lambda

We will add 5 marks to each value in the MATH column.

my_data['MATH_NEW']=my_data['MATH'].apply(lambda x:x+5)

Output

   NAME  ID  MATH  ENGLISH  MATH_NEW
0  Ravi   1    80       81        85
1  Raju   2    40       70        45
2  Alex   3    70       40        75
3   Ron   4    70       50        75
4  King   5    82       60        87
5  Jack   6    30       30        35

Using function

The function my_check() adds 5 marks to each value in the MATH column.

import pandas as pd

def my_check(a):
    # return the mark increased by 5
    return a+5

my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],
         'MATH':[80,40,70,70,82,30],
         'ENGLISH':[81,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
# pass the function itself, apply() calls it once for each value
my_data['MATH_NEW']=my_data['MATH'].apply(my_check)
print(my_data)

We can add a condition to the above function, for example to add 5 marks only for those who scored less than 50.

def my_check(a):
    # avoid the name sum, it hides the built-in sum() function
    if a<50:
        result=a+5
    else:
        result=a
    return result
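
To see the condition in action, the function can be passed to apply() in the same way as before. The sketch below repeats the function so that it runs on its own and uses only the MATH marks from the examples above. The variable name marks is chosen just for illustration.

```python
import pandas as pd

def my_check(a):
    # add 5 marks only when the mark is below 50
    if a < 50:
        return a + 5
    return a

marks = pd.DataFrame({'MATH': [80, 40, 70, 70, 82, 30]})
marks['MATH_NEW'] = marks['MATH'].apply(my_check)
print(marks['MATH_NEW'].tolist())   # [80, 45, 70, 70, 82, 35]
```

Only the marks 40 and 30 are below 50, so only those two values change.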

Use a function to add MATH and ENGLISH

import pandas as pd

def my_fun(a,b):
    # return the sum of the two marks
    return a+b

my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],
         'MATH':[80,40,70,70,82,30],
         'ENGLISH':[81,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
# axis=1 passes each row to the lambda as x
my_data['total']=my_data.apply(lambda x:my_fun(x['MATH'],x['ENGLISH']),axis=1)
print(my_data)

Output

   NAME  ID  MATH  ENGLISH  total
0  Ravi   1    80       81    161
1  Raju   2    40       70    110
2  Alex   3    70       40    110
3   Ron   4    70       50    120
4  King   5    82       60    142
5  Jack   6    30       30     60

Using lambda (the same calculation can be written directly as a lambda)

my_data['total']=my_data.apply(lambda x:(x['MATH']+x['ENGLISH']),axis=1)
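
As a side note, for simple arithmetic like this pandas can also add the two columns directly, without apply(). The sketch below assumes the same MATH and ENGLISH columns as above and computes the total both ways so the results can be compared. The variable names total_apply and total_direct are made up for illustration.

```python
import pandas as pd

my_data = pd.DataFrame({'MATH': [80, 40, 70, 70, 82, 30],
                        'ENGLISH': [81, 70, 40, 50, 60, 30]})

# row-wise apply, as in the line above
total_apply = my_data.apply(lambda x: x['MATH'] + x['ENGLISH'], axis=1)

# column arithmetic: the two columns are added in one step
total_direct = my_data['MATH'] + my_data['ENGLISH']

print(total_apply.tolist())    # [161, 110, 110, 120, 142, 60]
print(total_direct.tolist())   # [161, 110, 110, 120, 142, 60]
```

apply() is still the natural choice when the logic for each row cannot be written as plain column arithmetic.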

If the sum of MATH and ENGLISH is 120 or more then the status is Pass, otherwise it is Fail.

import pandas as pd 
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],
         'MATH':[80,40,70,70,82,30],
         'ENGLISH':[81,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
my_data['total']=my_data.apply(lambda x:(x['MATH']+x['ENGLISH']),axis=1)
# status is True when total is 120 or more, otherwise False
my_data['status']=my_data['total'].apply(lambda x: x>=120)
print(my_data)

Output

   NAME  ID  MATH  ENGLISH  total  status
0  Ravi   1    80       81    161    True
1  Raju   2    40       70    110   False
2  Alex   3    70       40    110   False
3   Ron   4    70       50    120    True
4  King   5    82       60    142    True
5  Jack   6    30       30     60   False

To display Pass or Fail in place of True or False, use this line

my_data['status']=my_data['total'].apply(lambda x: 'Pass' if x>=120 else 'Fail')
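
The same check can also be written as a named function, with the pass mark supplied through the args option instead of being fixed inside a lambda. This is only a sketch: get_status and pass_mark are hypothetical names, and the totals are copied from the output above.

```python
import pandas as pd

def get_status(total, pass_mark):
    # hypothetical helper: compare one total against the pass mark
    return 'Pass' if total >= pass_mark else 'Fail'

totals = pd.Series([161, 110, 110, 120, 142, 60], name='total')

# args=(120,) is passed to get_status as pass_mark for every value
status = totals.apply(get_status, args=(120,))
print(status.tolist())   # ['Pass', 'Fail', 'Fail', 'Pass', 'Pass', 'Fail']
```

Changing the pass mark then only needs a different value in args.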
