pivot_table in Pandas to summerize DataFrame data

Pivot tables are summarized data where data is grouped for giving meaning full insights.

pivot_table(data, values=None, index=None, columns=None,
 aggfunc='mean', fill_value=None, margins=False, dropna=True,
 margins_name='All', observed=False)

data: DataFrame
values: Columns to aggregate
index: columns or grouper
columns: columns or grouper
aggfunc: defalut numpy.mean , functions
fill_value: replace missing value
margins : bool, default False, subtotal grand total
dropna: bool , default True , remove columns if all are NaN.
observed bool, default false
This is our sample DataFrame and we created by using sales.csv file or use this

import pandas as pd 
my_dict={'sale_id':[1,2,3,4,5,6,7,8,9],
         'c_id':[2,2,1,4,2,3,2,3,2],
         'p_id':[3,4,3,2,3,3,2,2,3],
	 'product':['Monitor','CPU','Monitor','RAM','Monitor','Monitor','RAM','RAM','Monitor'],
         'qty':[2,1,3,2,3,2,3,2,2],
         'store':['ABC','DEF','ABC','DEF','ABC','DEF','ABC','DEF','ABC']}
sales = pd.DataFrame(data=my_dict)
print(sales)

sale_id	c_id	p_id	product	qty	store
1	2	3	Monitor	2	ABC
2	2	4	CPU	1	DEF
3	1	3	Monitor	3	ABC
4	4	2	RAM	2	DEF
5	2	3	Monitor	3	ABC
6	3	3	Monitor	2	DEF
7	2	2	RAM	3	ABC
8	3	2	RAM	2	DEF
9	2	3	Monitor	2	ABC

Using this DataFrame we will use the method pivot_table()
Here we are getting the average value of quantity ( qty ) against each product. By using aggfunc we can get sum, max, min value. Here we used one single column product as index.

pvt=pd.pivot_table(sales,values='qty',index=['product'])

	qty
product
CPU	1.000000
Monitor	2.400000
RAM	2.333333

We can use more than one column as index. Here product and store are used.

pvt=pd.pivot_table(sales,values='qty',index=['product','store'])

		qty
product	store
CPU	DEF	1.0
Monitor	ABC	2.5
Monitor	DEF	2.0
RAM	ABC	3.0
RAM	DEF	2.0

aggfunc

default value for aggfunc is mean ( np.mean) , we can specify others line sum , max , min etc or use numpy function ( import numpy before using numpy functions). Here we are using sum to get the sum of quanity ( qty ) column.

pvt=pd.pivot_table(sales,values='qty',
    index=['product','store'],aggfunc=sum)

		qty
product	store
CPU	DEF	1
Monitor	ABC	10
Monitor	DEF	2
RAM	ABC	3
RAM	DEF	4

margins

We can display total or subtotal by using margins=True, by default it is False.

pvt=pd.pivot_table(sales,values='qty',
    index=['product','store'],aggfunc=sum,margins=True)

		qty
product	store
CPU	DEF	1
Monitor	ABC	10
Monitor	DEF	2
RAM	ABC	3
RAM	DEF	4
All		20

columns

pvt=pd.pivot_table(sales,index=['product'],columns=['store'],
values=['qty'])

	qty
store	ABC	DEF
product
CPU	NaN	1.0
Monitor	2.5	2.0
RAM	3.0	2.0

Difference between pivot_table and pivot

pivot_table() : Index column pair need not be unique and it can be used with aggragrate function list sum , average etc
pivot() :If the index column pair has multiple values then it will generate value error. We can't use aggragrate functions.
Pandas pivot() Pandas DataFrame
contains() to display and delete row based on Conditions

Numpy arrays Python & MySQL Python- Tutorials

Subscribe to our YouTube Channel here