Exercise : Pandas and Date query

Pandas date & time Pandas

Download log_report.xlsx file

  1. How many records are there for the year 2019
  2. How many records are there for the year 2019 and Month March
  3. Show the records of log_id 28928
  4. When was the last inspection done for the equipment id 3320
  5. How many records are there between April 2019 and Jan 2020
  6. How many days back inspection was completed for eqpt_id 3320 from today.
  7. Display all the records available for last 30 days ( starting from today )
  8. List all records having done ( st_date) on Wednesday ( Weekday =2 )
  9. Display year and number of records in that year.
  10. Display month and number of records in that month.
  11. Display Year and Month with number of records in that year month

How many records are there for the year 2019

You can read on how to use read_excel and create one DataFrame.
import pandas as pd 
my_data = pd.read_excel('D:\log_report.xlsx') # Change your path
print(my_data)
This will display all records from the excel file. Now using this code we will develop further code.

Now let us findout the number of records for the year 2019
#print(my_data[my_data['st_date'].dt.year==2019].count())
print(my_data[my_data['st_date'].dt.year==2019]['log_id'].count())
Output
731
You can un-comment the other line and check how all columns are listed.
We can use set_index() to index date column and then apply the queries.

How many records are there for the year 2019 and Month March

List of records .
print(my_data.set_index('st_date')['2020-03'])
Now the number of records.
print(my_data.set_index('st_date')['2020-03'].count())

Show the records of log_id 28928

print(my_data.loc[my_data['log_id']==28928])
Output
log_id  eqpt_id   dept    st_date  activity_type
0   28928     1720  dept6 2019-01-02  Normalization

When was the last inspection done for the equipment id 3320

import pandas as pd 
my_data = pd.read_excel('D:\log_report.xlsx') # Change your path
my_data['st_date'] = pd.to_datetime(my_data['st_date'])

max_date=my_data.loc[my_data['eqpt_id']==3320]['st_date'].max()
#print(max_date)
df=my_data.loc[my_data['st_date']==max_date]
print(df[df['eqpt_id']==3320])
Output
     log_id  eqpt_id   dept    st_date activity_type
774   40276     3320  dept4 2020-01-27        Others

How many records are there between April 2019 and Jan 2020

import pandas as pd 
my_data = pd.read_excel('D:\log_report.xlsx') # Change your path
#print(my_data)
my_data['st_date'] = pd.to_datetime(my_data['st_date'])

#print(my_data.set_index('st_date')['2019-04':'2020-01']) # total list
print(my_data.set_index('st_date')['2019-04':'2020-01'].count()) # Total Number 
Output
log_id           681
eqpt_id          681
dept             681
activity_type    681

How many days back inspection was completed for eqpt_id 3320 from today.

import pandas as pd 
my_data = pd.read_excel('D:\log_report.xlsx') # Change your path
my_data['st_date'] = pd.to_datetime(my_data['st_date'])

max_date=my_data.loc[my_data['eqpt_id']==3320]['st_date'].max()
#print(max_date)
today=pd.to_datetime('now')
print( "Days difference = ", ( today - max_date))
Output
Days difference =  76 days 07:38:35.699501

Display all the records available for last 30 days ( starting from today )

You can use today’s date by using to_datetime() or any other date for calculating last 30 days records. In the sample Excel file data upto March 2020 is there, so using today’s day after April 2020 will not return any records so use any date before April 2020 to get the matching records.

import pandas as pd 
import datetime
my_data = pd.read_excel('D:\log_report.xlsx') # Change your path
my_data['st_date'] = pd.to_datetime(my_data['st_date'])
#today=pd.to_datetime('now')
today=pd.to_datetime('2020-04-12') # change this date YYYY-MM-DD
new_day=today-datetime.timedelta(30)
print(my_data[(my_data['st_date']>new_day)])
Output
     log_id  eqpt_id   dept    st_date activity_type
867   41698     1506  dept3 2020-03-16     Breakdown
868   41813     1527  dept3 2020-03-18     Breakdown
869   41868     1656  dept3 2020-03-18     Breakdown

List all records having done ( st_date) on Wednesday ( Weekday =2 )

Monday is weekday = 0 , Tuesday =1, Sunday is Weekday = 6
print(my_data[my_data['st_date'].dt.weekday==2])
To check the weekday we can add one more column saying weekday. We can add our format by using strftime
my_data['weekday']=my_data['st_date'].dt.strftime('%A')
Read more on groupby() here

Display year and number of records in that year.

import pandas as pd 
my_data = pd.read_excel('D:\log_report.xlsx') # Change your path
print(my_data.groupby(my_data['st_date'].dt.year).count())

Display month and number of records in that month.

import pandas as pd 
my_data = pd.read_excel('D:\log_report.xlsx') # Change your path
print(my_data.groupby(my_data['st_date'].dt.month).count())

Display Year and Month with number of records in that year month

import pandas as pd 
my_data = pd.read_excel('D:\log_report.xlsx') # Change your path
print(my_data.groupby([my_data['st_date'].dt.year,my_data['st_date'].dt.month]).count())
Pandas Pandas date & time Basic Exercise using datetime 3-2


plus2net.com



Post your comments , suggestion , error , requirements etc here




We use cookies to improve your browsing experience. . Learn more
HTML MySQL PHP JavaScript ASP Photoshop Articles FORUM . Contact us
©2000-2020 plus2net.com All rights reserved worldwide Privacy Policy Disclaimer