Pandas DataFrame Exercise 1-1


Part I : Creating and grouping data

Create a student mark list with two subjects for 10 students (variable n). Marks in each subject can vary from 0 to 100; use random numbers to generate them. Your DataFrame should have two subject columns, Math and Eng, and one more column, student ID, that starts from 1 and continues for all students.
Add one more column, my_result, based on the mark in the subject Math. The result ( my_result ) is to be arranged in these ranges: 0 to 39 Fail, 40 to 49 Third, 50 to 74 Second, 75 to 100 First.

Note that a student who gets 40 has passed (Third), and a student who gets 75 is placed in First division. The same rule applies to the other groups: the lower limit of each range belongs to that range.
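The boundary rule above can be checked on a few hand-picked marks before working with random data. This is a minimal sketch, assuming pd.cut with right=False (left-inclusive bins) and a last bin edge of 101 so that a mark of 100 still lands in First; the variable names marks and result are only for illustration.

```python
import pandas as pd

# Boundary marks chosen to test each edge of the grading scheme
marks = pd.Series([0, 39, 40, 49, 50, 74, 75, 100])

labels = ['Fail', 'Third', 'Second', 'First']
# right=False makes every bin include its left edge and exclude its right edge:
# [0, 40), [40, 50), [50, 75), [75, 101)
# The last edge is 101 (an assumption of this sketch) so that 100 is included.
result = pd.cut(marks, bins=[0, 40, 50, 75, 101], labels=labels, right=False)

print(pd.DataFrame({'mark': marks, 'result': result}))
```

With these bins, 40 falls in Third and 75 falls in First, as the note requires.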

Part II : Data visualization

Based on the marks in the Math subject, plot a scatter graph to show the distribution of marks of the students.
Create one Pie chart showing the result of the whole class distributed in bins. Similarly, create one Bar chart showing the result.

Solution

import numpy as np
import pandas as pd
n=5 # Number of students , increase this number 
my_id=np.arange(1,n+1) # student id from 1 to n 

# upper limit of randint is exclusive, so 101 is used to allow a mark of 100
my_math=np.random.randint(0,101,size=n) # 0 to 100 random mark
my_english=np.random.randint(0,101,size=n)

my_pd=pd.DataFrame(data=[my_id,my_math,my_english]).T # transpose so each student is one row

my_pd.columns=['ID','MATH','ENG'] # naming the columns of the DataFrame
#print(my_pd.to_string(index=False))

my_labels=['Fail','Third','Second','First'] # labels  
# right=False : each bin includes its left edge, so 40 is Third and 75 is First
# last edge is 101 so that a mark of 100 is placed in First
my_pd['my_result'] = pd.cut(x=my_pd['MATH'],
                            bins=[0,40, 50, 75, 101],
                            labels=my_labels,right=False) 
print(my_pd)
Output ( the output will change on each run as we are using random numbers as marks )
   ID  MATH  ENG my_result
0   1    98   54     First
1   2    25   29      Fail
2   3    72   25    Second
3   4    26    7      Fail
4   5    78   79     First
Increase the number of students, for example by setting n=25, to get a better distribution of data.
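Because the marks are random, the table changes on every run. If a repeatable table is wanted while testing with a larger n, one option is a seeded generator. The sketch below is an assumption and not part of the original solution: it uses np.random.default_rng with an arbitrary seed of 42, builds the DataFrame from a dictionary instead of transposing, and uses the name df for illustration.

```python
import numpy as np
import pandas as pd

n = 25                            # number of students
rng = np.random.default_rng(42)   # seed 42 is an arbitrary choice for illustration

df = pd.DataFrame({
    'ID': np.arange(1, n + 1),
    # upper bound of integers() is exclusive, so 101 allows a mark of 100
    'MATH': rng.integers(0, 101, size=n),
    'ENG': rng.integers(0, 101, size=n),
})

# Same grading rule as the solution: left-inclusive bins, 100 placed in First
df['my_result'] = pd.cut(df['MATH'], bins=[0, 40, 50, 75, 101],
                         labels=['Fail', 'Third', 'Second', 'First'],
                         right=False)
print(df.head())
```

Running the script again with the same seed gives the same marks; removing the seed restores the changing output.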

Data visualization

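To the above code we will add the lines for plotting the graphs.
import matplotlib.pyplot as plt # pandas uses matplotlib for plotting
my_pd.plot.scatter(title='Math Vs ID',x='ID',y='MATH')

# number of students in each result group, observed=False keeps empty groups as 0
my_data=my_pd.groupby(['my_result'],observed=False)[['ID']].count()
print(my_data)
my_data.plot.pie(title="Result",y='ID',figsize=(4,4))
my_data.plot.bar(title="Result",y='ID',figsize=(4,4))
plt.show() # required when running as a script, not required in a notebook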
Figures: Scatter Plot of Marks in Pandas, Pie Plot of Marks in Pandas, Bar Plot of Marks in Pandas
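The pie and bar charts are drawn from the number of students in each result group. The sketch below checks those counts on a small fixed set of marks (hypothetical data, chosen so the counts can be verified by hand) and compares the groupby count with value_counts, which is another way to get the same numbers. The names df, by_group and by_value are for illustration only.

```python
import pandas as pd

# Fixed marks (hypothetical data) so the counts are easy to verify by hand
df = pd.DataFrame({'ID': range(1, 9),
                   'MATH': [98, 25, 72, 26, 78, 40, 75, 100]})
labels = ['Fail', 'Third', 'Second', 'First']
df['my_result'] = pd.cut(df['MATH'], bins=[0, 40, 50, 75, 101],
                         labels=labels, right=False)

# Count of students per group; observed=False keeps empty groups as 0
by_group = df.groupby('my_result', observed=False)['ID'].count()

# value_counts on the result column gives the same numbers
by_value = df['my_result'].value_counts()

print(by_group)
```

Either Series can then be plotted with .plot.pie() or .plot.bar().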
©2000-2020 plus2net.com All rights reserved worldwide