describe(): Details of DataFrame

We can get descriptive statistics of a DataFrame or a Series by using describe(). By default only the numeric columns of a DataFrame are summarized.

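percentiles : List of numbers between 0 and 1. Default is [.25, .5, .75], shown as 25%, 50% and 75%. We can specify our own list, for example [.45,.68,.89].
include : 'all', a list of datatypes, or None (default). Datatypes to be included in the output.
exclude : A list of datatypes or None (default). Datatypes to be excluded from the output.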

Examples

We will use the options and check the output.
import numpy as np  # needed for np.number in the include / exclude examples
import pandas as pd
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],
         'MATH':[80,40,70,70,60,30],
         'ENGLISH':[80,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
print(my_data['MATH'].describe())
Output
count     6.000000
mean     58.333333
std      19.407902
min      30.000000
25%      45.000000
50%      65.000000
75%      70.000000
max      80.000000
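Each row of this output can be reproduced with an individual Series method. The following is a minimal sketch using the same MATH marks; the variable names math, summary and manual are only for illustration. It also shows that std is the sample standard deviation (divided by n-1), which is the default of Series.std().

```python
import pandas as pd

# Same MATH marks as in the example above
math = pd.Series([80, 40, 70, 70, 60, 30], name='MATH')

summary = math.describe()

# Each statistic computed separately with its own Series method
manual = {
    'count': math.count(),
    'mean': math.mean(),
    'std': math.std(),   # sample standard deviation, ddof=1 by default
    'min': math.min(),
    'max': math.max(),
}

# Print the manual value next to the value returned by describe()
for label, value in manual.items():
    print(label, round(float(value), 6), round(float(summary[label]), 6))
```

The two printed columns should match for every row.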
We can get the same statistics for the full DataFrame. Only the numeric columns are included, so NAME is not shown.
print(my_data.describe())
Output
             ID       MATH    ENGLISH
count  6.000000   6.000000   6.000000
mean   3.500000  58.333333  55.000000
std    1.870829  19.407902  18.708287
min    1.000000  30.000000  30.000000
25%    2.250000  45.000000  42.500000
50%    3.500000  65.000000  55.000000
75%    4.750000  70.000000  67.500000
max    6.000000  80.000000  80.000000

percentiles

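By default we get the values for 25%, 50% and 75%. We can select our own percentiles by passing a list of numbers between 0 and 1, like this: percentiles=[.45,.68,.89]. The 50% value (median) is shown in the output along with our values.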
print(my_data['MATH'].describe(percentiles=[.45,.68,.89]))
Output
count     6.000000
mean     58.333333
std      19.407902
min      30.000000
45%      62.500000
50%      65.000000
68%      70.000000
89%      74.500000
max      80.000000
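The percentile rows can be cross-checked with Series.quantile(), which uses linear interpolation between the sorted values by default. This is a small sketch with the same MATH marks; the names custom, summary and direct are only for illustration.

```python
import pandas as pd

# Same MATH marks as in the example above
math = pd.Series([80, 40, 70, 70, 60, 30], name='MATH')

custom = [.45, .68, .89]

# Percentile rows as reported by describe()
summary = math.describe(percentiles=custom)

# The same percentiles computed directly (default interpolation is linear)
direct = math.quantile(custom)

print(summary[['45%', '68%', '89%']])
print(direct)
```

Both prints should show the values 62.5, 70.0 and 74.5 from the output above.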

include

With include='all', columns of all datatypes are included in the output. Statistics that do not apply to a column are shown as NaN.
print(my_data.describe(include='all'))
Output
        NAME        ID       MATH    ENGLISH
count      6  6.000000   6.000000   6.000000
unique     6       NaN        NaN        NaN
top     King       NaN        NaN        NaN
freq       1       NaN        NaN        NaN
mean     NaN  3.500000  58.333333  55.000000
std      NaN  1.870829  19.407902  18.708287
min      NaN  1.000000  30.000000  30.000000
25%      NaN  2.250000  45.000000  42.500000
50%      NaN  3.500000  65.000000  55.000000
75%      NaN  4.750000  70.000000  67.500000
max      NaN  6.000000  80.000000  80.000000
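In our DataFrame every name is unique, so freq is 1 and the value shown for top is just one of the tied names; it may differ between pandas versions. To see top and freq more clearly, here is a small sketch with hypothetical data (not part of the example above) in which one name is repeated.

```python
import pandas as pd

# Hypothetical data: 'Ravi' appears twice, so top and freq are unambiguous
names = pd.Series(['Ravi', 'Raju', 'Ravi', 'Alex'], name='NAME')

summary = names.describe()
print(summary)

# top is the most frequent value, freq is how many times it occurs
print(summary['top'], summary['freq'])
```

Here count is 4, unique is 3, top is Ravi and freq is 2.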

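include=[object]

Use the built-in object type to select the string (object) columns. The old alias np.object was removed in NumPy 1.24.
print(my_data.describe(include=[object]))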
Output
        NAME
count      6
unique     6
top     King
freq       1

include=[np.number]

print(my_data.describe(include=[np.number]))
Output
             ID       MATH    ENGLISH
count  6.000000   6.000000   6.000000
mean   3.500000  58.333333  55.000000
std    1.870829  19.407902  18.708287
min    1.000000  30.000000  30.000000
25%    2.250000  45.000000  42.500000
50%    3.500000  65.000000  55.000000
75%    4.750000  70.000000  67.500000
max    6.000000  80.000000  80.000000

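exclude

When only exclude is given, every column whose datatype is not in the list is described. Our DataFrame has no category column, so all columns are shown.
print(my_data.describe(exclude=['category']))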
Output
        NAME        ID       MATH    ENGLISH
count      6  6.000000   6.000000   6.000000
unique     6       NaN        NaN        NaN
top     King       NaN        NaN        NaN
freq       1       NaN        NaN        NaN
mean     NaN  3.500000  58.333333  55.000000
std      NaN  1.870829  19.407902  18.708287
min      NaN  1.000000  30.000000  30.000000
25%      NaN  2.250000  45.000000  42.500000
50%      NaN  3.500000  65.000000  55.000000
75%      NaN  4.750000  70.000000  67.500000
max      NaN  6.000000  80.000000  80.000000

exclude=[np.number]

print(my_data.describe(exclude=[np.number]))
Output
        NAME
count      6
unique     6
top     King
freq       1

exclude=[object]

Use the built-in object type here as well; np.object was removed in NumPy 1.24.
print(my_data.describe(exclude=[object]))
Output
             ID       MATH    ENGLISH
count  6.000000   6.000000   6.000000
mean   3.500000  58.333333  55.000000
std    1.870829  19.407902  18.708287
min    1.000000  30.000000  30.000000
25%    2.250000  45.000000  42.500000
50%    3.500000  65.000000  55.000000
75%    4.750000  70.000000  67.500000
max    6.000000  80.000000  80.000000
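The include and exclude options select columns by datatype in the same way as select_dtypes(). The sketch below assumes the same my_data DataFrame and uses the string alias 'number' in place of np.number, so NumPy does not need to be imported; the names numeric_a, numeric_b and text_only are only for illustration.

```python
import pandas as pd

my_dict = {'NAME': ['Ravi', 'Raju', 'Alex', 'Ron', 'King', 'Jack'],
           'ID': [1, 2, 3, 4, 5, 6],
           'MATH': [80, 40, 70, 70, 60, 30],
           'ENGLISH': [80, 70, 40, 50, 60, 30]}
my_data = pd.DataFrame(data=my_dict)

# Filtering inside describe() ...
numeric_a = my_data.describe(include=['number'])

# ... compared with filtering the columns first and then describing
numeric_b = my_data.select_dtypes(include=['number']).describe()
print(numeric_a.equals(numeric_b))

# Excluding the numeric columns leaves only NAME
text_only = my_data.describe(exclude=['number'])
print(list(text_only.columns))
```

Filtering first with select_dtypes() can be handy when the same subset of columns is needed for other steps as well.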