Pivot are summarized data without any aggregate functions used.
pivot(index=None, columns=None, values=None)
index: str or object or a list of str, optional
columns: list or Column to use to make new frame’s columns.
values Column(s) to use for populating new frame’s values (optional ).
This is our sample DataFrame ( created by using sales.csv file after removing duplicate column/ rows )
import pandas as pd
my_dict={'sale_id':[1,2,4,6,7],
'c_id':[2,2,4,3,2],
'p_id':[3,4,2,3,2],
'product':['Monitor','CPU','RAM','Monitor','RAM'],
'qty':[2,1,2,2,3],
'store':['ABC','DEF','DEF','DEF','ABC']}
sales = pd.DataFrame(data=my_dict)
print(sales)
| sale_id | c_id | p_id | product | qty | store |
0 | 1 | 2 | 3 | Monitor | 2 | ABC |
1 | 2 | 2 | 4 | CPU | 1 | DEF |
2 | 4 | 4 | 2 | RAM | 2 | DEF |
3 | 6 | 3 | 3 | Monitor | 2 | DEF |
4 | 7 | 2 | 2 | RAM | 3 | ABC |
Using this DataFrame we will use the method pivot()
pvt=pd.pivot(sales,index=['product'],columns=['store'])
| sale_id | c_id | p_id | qty |
store | ABC | DEF | ABC | DEF | ABC | DEF | ABC | DEF |
product | | | | | | | | |
CPU | NaN | 2.0 | NaN | 2.0 | NaN | 4.0 | NaN | 1.0 |
Monitor | 1.0 | 6.0 | 2.0 | 3.0 | 3.0 | 3.0 | 2.0 | 2.0 |
RAM | 7.0 | 4.0 | 2.0 | 4.0 | 2.0 | 2.0 | 3.0 | 2.0 |
values
pvt=sales.pivot(index=['product'],columns=['store'],values='qty')
store | ABC | DEF |
product | | |
CPU | NaN | 1.0 |
Monitor | 2.0 | 2.0 |
RAM | 3.0 | 2.0 |
Difference between pivot_table and pivot
pivot_table() : Index column pair need not be unique and it can be used with aggragrate function list sum , average etc
pivot() :If the index column pair has multiple values then it will generate value error. We can't use aggragrate functions.
« Pandas
« pivot_table()
« Pandas DataFrame
contains() to display and delete row based on Conditions »
← Subscribe to our YouTube Channel here