timedelta64()

Pandas date & time Pandas Numpy

Date and time calculations using Numpy timedelta64.
Different units are used with timedelta64 for calculations, the list of units are given at the end of this tutorial.
Let us create DataFrame with two datetime columns to calculate the difference.
import pandas as pd 
my_dict={'NAME':['Ravi','Raju','Alex'],
         'dt_start':['1/1/2020','2/1/2020','5/1/2020'],
         'dt_end':['6/15/2022','7/22/2022','11/15/2023']
}
my_data = pd.DataFrame(data=my_dict)
my_data['dt_start'] = pd.to_datetime(my_data['dt_start'])
my_data['dt_end'] = pd.to_datetime(my_data['dt_end'])
print(my_data)
Output
   NAME   dt_start     dt_end
0  Ravi 2020-01-01 2022-06-15
1  Raju 2020-02-01 2022-07-22
2  Alex 2020-05-01 2023-11-15
The column dt_start and dt_end is changed to datetime type.

Difference in Days

We will add one column diff_days to our dataframe and add difference in days to it.
my_data['diff_days']=my_data['dt_end']-my_data['dt_start']
print(my_data)
Output
   NAME   dt_start     dt_end diff_days
0  Ravi 2020-01-01 2022-06-15  896 days
1  Raju 2020-02-01 2022-07-22  902 days
2  Alex 2020-05-01 2023-11-15 1293 days
Here our column diff_days is not an integer field, so we can use any comparison to get days higher than x ( > 50 days ) . For this we have to convert diff_days to integer ( or float if NaN is present ) by dividing with np.timedelta64(1, 'D')

To use Numpy, first import it by adding this line. import numpy as np
my_data['diff_days'] = my_data['diff_days']/ np.timedelta64(1, 'D')
Read how np.timedelta64() is used in our Exercise3-4 to convert date to float value

Now we can filter based on number of days .
my_data=my_data[my_data['diff_days']>900]
Let us findout difference in months. We will add Numpy library as we are going to use timedelta64 now.
import pandas as pd 
import numpy as np
my_dict={'NAME':['Ravi','Raju','Alex'],
         'dt_start':['1/1/2020','2/1/2020','5/1/2020'],
         'dt_end':['6/15/2022','7/22/2022','11/15/2023']
}
my_data = pd.DataFrame(data=my_dict)
my_data['dt_start'] = pd.to_datetime(my_data['dt_start'])
my_data['dt_end'] = pd.to_datetime(my_data['dt_end'])

my_data['diff_days']=my_data['dt_end']-my_data['dt_start']
my_data['diff_months']=my_data['diff_days']/np.timedelta64(1, 'M')
#my_data['diff_months']=my_data['diff_months'].astype(int)#to integer 

print(my_data)
We divided the difference with np.timedelta64(1,'M') to findout the difference in months.
Output , you can display the diff_months as integer by un commenting the line above.
   NAME   dt_start     dt_end diff_days  diff_months
0  Ravi 2020-01-01 2022-06-15  896 days    29.437976
1  Raju 2020-02-01 2022-07-22  902 days    29.635105
2  Alex 2020-05-01 2023-11-15 1293 days    42.481365

Difference in Years

We will use np.timedelta64(1,'Y') to find out difference in Years. Here is the change in code.
my_data['diff_years']=my_data['diff_days']/np.timedelta64(1, 'Y')
Output is here
   NAME   dt_start     dt_end diff_days  diff_years
0  Ravi 2020-01-01 2022-06-15  896 days    2.453165
1  Raju 2020-02-01 2022-07-22  902 days    2.469592
2  Alex 2020-05-01 2023-11-15 1293 days    3.540114

Difference in Weeks

We will use np.timedelta64(1,'W') to find out difference in Weeks . Here is the change in code.
my_data['diff_weeks']=my_data['diff_days']/np.timedelta64(1, 'W')
Output
   NAME   dt_start     dt_end diff_days  diff_weeks
0  Ravi 2020-01-01 2022-06-15  896 days  128.000000
1  Raju 2020-02-01 2022-07-22  902 days  128.857143
2  Alex 2020-05-01 2023-11-15 1293 days  184.714286
Here also we can display difference in weeks by changing the data to integer by adding the line like this.
my_data['diff_weeks']=my_data['diff_days']/np.timedelta64(1, 'W')
my_data['diff_weeks']=my_data['diff_weeks'].astype(int) #  to integer 
Now output will change like this.
   NAME   dt_start     dt_end diff_days  diff_weeks
0  Ravi 2020-01-01 2022-06-15  896 days         128
1  Raju 2020-02-01 2022-07-22  902 days         128
2  Alex 2020-05-01 2023-11-15 1293 days         184

Difference in Time

Difference in Hours we can get by using np.timedelta64(1,'h'). Here is the code
my_data['diff_hours']=my_data['diff_days']/np.timedelta64(1, 'h')
Output
   NAME   dt_start     dt_end diff_days  diff_hours
0  Ravi 2020-01-01 2022-06-15  896 days     21504.0
1  Raju 2020-02-01 2022-07-22  902 days     21648.0
2  Alex 2020-05-01 2023-11-15 1293 days     31032.0
Similarly you can findout difference in minutes, seconds etc. Here is a table to use the date and time units
TimeDate
hHour
mMinute
sSecond
msMilliSecond
usMicroSecond
nsNanoSecond
psPicoSecond
fsFemtoSecond
asAttoSecond
DDay
MMonth
YYear
WWeek
Pandas date& time date_range() period_range() strftime()


plus2net.com



Post your comments , suggestion , error , requirements etc here




We use cookies to improve your browsing experience. . Learn more
HTML MySQL PHP JavaScript ASP Photoshop Articles FORUM . Contact us
©2000-2020 plus2net.com All rights reserved worldwide Privacy Policy Disclaimer