We will use the option axis=1 by changing the above code. ( axis=0 is the default. )
( Only the last line is changed )
print(my_data.std(axis=1))
Along each horizontal row ( axis=1 ) the standard deviation of the values in the two columns ( id and mark ) is calculated. For example, for the third row [3,55] the result is 36.769553.
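As a minimal sketch of this row-wise calculation, the snippet below assumes a small hypothetical DataFrame (only the third row [3,55] comes from the text above, the other values are made up for illustration) and compares the result of axis=1 with a manual calculation.

```python
import pandas as pd

# Hypothetical sample data; only the third row [3, 55] is taken from the text.
my_data = pd.DataFrame({'id': [1, 2, 3], 'mark': [80, 60, 55]})

# One value per row; pandas uses the sample standard deviation (ddof=1) by default.
row_std = my_data.std(axis=1)

# Manual check for the third row: the mean is 29, and each value is 26 away from it.
manual = (((3 - 29) ** 2 + (55 - 29) ** 2) / (2 - 1)) ** 0.5

print(row_std)
print(manual)
```

The last value of row_std and the manual result should agree, since both divide the sum of squared differences by n - 1.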
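ddof = 0 : this gives the Population Standard Deviation ( the sum of squared differences is divided by n ).
ddof = 1 ( default ) : this gives the Sample Standard Deviation ( the sum is divided by n - 1 ).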
print(my_data.std(ddof=0))
Output
id 1.309307
mark 11.866606
dtype: float64
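To see how the two ddof settings relate, here is a short sketch using a hypothetical Series of marks (the values are chosen only for illustration). It assumes nothing beyond the standard ddof argument of std().

```python
import math
import pandas as pd

# Hypothetical marks, used only for illustration.
marks = pd.Series([80, 40, 70, 70, 70, 30])
n = len(marks)

pop_std = marks.std(ddof=0)     # divides by n
sample_std = marks.std(ddof=1)  # divides by n - 1 (the pandas default)

print(pop_std, sample_std)

# The two results differ by a factor of sqrt(n / (n - 1)).
print(math.isclose(sample_std, pop_std * math.sqrt(n / (n - 1))))
```

The sample value is always the larger of the two, and the gap shrinks as the number of rows grows.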
Handling NA data using skipna option
We will use skipna=True ( the default ) to ignore null or NA data while calculating the standard deviation. Let us check what happens when it is set to True ( skipna=True )
import numpy as np
import pandas as pd
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
'ID':[1,2,3,4,5,6],
'MATH':[80,40,70,70,70,30],
'ENGLISH':[80,70,np.nan,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
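# numeric_only=True leaves out the text column NAME ( required in pandas 2.0 and later )
print(my_data.std(skipna=True, numeric_only=True))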
Output
ID 1.870829
MATH 20.000000
ENGLISH 19.235384
dtype: float64
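For comparison, the sketch below sets skipna=False on the same data. numeric_only=True is added here as an assumption, so that the text column NAME is left out regardless of the pandas version in use.

```python
import numpy as np
import pandas as pd

my_dict = {'NAME': ['Ravi', 'Raju', 'Alex', 'Ron', 'King', 'Jack'],
           'ID': [1, 2, 3, 4, 5, 6],
           'MATH': [80, 40, 70, 70, 70, 30],
           'ENGLISH': [80, 70, np.nan, 50, 60, 30]}
my_data = pd.DataFrame(data=my_dict)

# With skipna=False a single NaN makes the result for that column NaN.
result = my_data.std(skipna=False, numeric_only=True)
print(result)
```

ID and MATH are unchanged because they have no missing values, while ENGLISH becomes NaN.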
numeric_only
In older versions of pandas the default value is None; from pandas 2.0 the default is False. We can set it to True ( numeric_only=True ) to include only float, int and boolean columns. We can include all columns by setting it to False ( numeric_only=False ). Let us see the outputs.
print(my_data.std(numeric_only=True))
The output is the same as above, as only the ID, MATH and ENGLISH columns are considered. By changing it to False we will get an error message, because the text column NAME is then included.
print(my_data.std(numeric_only=False))
TypeError: could not convert string to float: 'Ravi'
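One way to avoid the error above is to pick the numeric columns before calling std(). The sketch below is one possible approach rather than the only one; it uses select_dtypes() on the same data and compares the result with numeric_only=True.

```python
import numpy as np
import pandas as pd

my_dict = {'NAME': ['Ravi', 'Raju', 'Alex', 'Ron', 'King', 'Jack'],
           'ID': [1, 2, 3, 4, 5, 6],
           'MATH': [80, 40, 70, 70, 70, 30],
           'ENGLISH': [80, 70, np.nan, 50, 60, 30]}
my_data = pd.DataFrame(data=my_dict)

# Keep only the numeric columns, then calculate the standard deviation.
numeric_cols = my_data.select_dtypes(include='number')
by_selection = numeric_cols.std()

# Same calculation, letting std() drop the non-numeric columns itself.
by_flag = my_data.std(numeric_only=True)

print(by_selection)
print(np.allclose(by_selection, by_flag))
```

Selecting the columns first is useful when the same numeric subset is needed for several calculations.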