customer.csv : Two columns, customer id and customer name products.csv : Three columns ,p_id( product id) , product ( name) and price sales.csv Six columns , sale_id, c_id ( customer id ), p_id (product_id), qty ( quantity sold) ,store ( name )
Sample data is kept inside the csv files, click the csv file name to download.
3. List of quantity and total sales against each product
You can un-comment the print commands and check the intermediate results. We used merge to join two DataFrames and to get Product details ( price of the product ).
We added one more column total_sales by multiplying total sales with price.
import pandas as pd
sales=pd.read_csv("sales.csv")
#print(sales)
# using groupby get the list of products and its sum sold
my_sale=sales.groupby(['product','p_id', 'store'])[['qty']].sum()
#print(my_sale)
product=pd.read_csv("products.csv")
#print(product)
my_sum=pd.merge(my_sale,product,how='left',on='p_id')
#print(my_sum)
#We added one more column total_sales by multiplying total sales with price.
my_sum['total_sale']=my_sum['qty']*my_sum['price']
print(my_sum)
4. List of quantity sold against each product and against each store.
import pandas as pd
sales=pd.read_csv("sales.csv")
print(sales.groupby(['product','p_id','store'])[['qty']].sum())
Output
qty
product p_id store
CPU 4 DEF 1
Monitor 3 ABC 10
DEF 2
RAM 2 ABC 3
DEF 4
5. List of quantity sold against each Store with total turnover of the store.
We will first find out the total price ( or turnover ) against each sale by multiplying quantity sold with price of each unit.
To get price of each unit ( which is in product dataframe ) we have to merge sales with product DataFrame using product id ( p_id ).
Once we get the total turnover against each sale, we will use groupby method to find out total sales ( turnover ) and quantity against each store.
import pandas as pd
sales=pd.read_csv("sales.csv")
#print(sales)
product=pd.read_csv("products.csv")
my_sum=pd.merge(sales,product,how='left',on=['p_id'])
my_sum['sales_total']=my_sum['qty']*my_sum['price']
print(my_sum.groupby(['store'])[['qty','sales_total']].sum())
import pandas as pd
products=pd.read_csv("products.csv")
sales=pd.read_csv("sales.csv")
my_data=pd.merge(sales,products,on='p_id',how='right')
#print(my_data['sale_id'].isna())
my_data=my_data[my_data['sale_id'].isnull()] # products which are not sold
print(my_data)
#print(my_data.loc[:,'product_y']) # to display only produts column
Output
sale_id c_id p_id product_x qty store product_y price
9 NaN NaN 1 NaN NaN NaN Hard Disk 80
10 NaN NaN 5 NaN NaN NaN Keyboard 20
11 NaN NaN 6 NaN NaN NaN Mouse 10
12 NaN NaN 7 NaN NaN NaN Motherboard 50
13 NaN NaN 8 NaN NaN NaN Power supply 20
7.List of customers who have not purchased any product.
import pandas as pd
sales=pd.read_csv("sales.csv")
customer=pd.read_csv("customer.csv")
my_data=pd.merge(sales,customer,on='c_id',how='right')
my_data=my_data[my_data['sale_id'].isnull()] # products which are not sold
#print(my_data)
print(my_data.loc[:,'Customer']) # to display customers who has not purchased
Output
9 King
10 Ronn
11 Jem
12 Tom
Name: Customer, dtype: object