Python Pandas Data Analysis
dtypes : Returns the data type (dtype) of each column of a DataFrame.
Here is a DataFrame whose columns cover 7 commonly used dtypes.
import pandas as pd
td = pd.Series([pd.Timedelta(days=i) for i in range(6)])
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
'ID':[1,2,3,4,5,6],
'MATH':[80,40,70,70,70,30],
'Avg_mark':[45.5,48.09,50.12,55.1,50.6,55.6],
'dt_start':['1/1/2020','2/1/2020','5/1/2020','11/7/2020',
'15/8/2020','31/12/2020'],
'Exam':[True,False,True,True,False,False],
'dt':td,
'grade':['a', 'c', 'b', 'b','b','c']}
my_data = pd.DataFrame(data=my_dict)
my_data['grade']=my_data['grade'].astype('category')
# the dates are written as day/month/year, so parse them day first
my_data['dt_start'] = pd.to_datetime(my_data['dt_start'], dayfirst=True)
print(my_data.dtypes)
Output
NAME                 object
ID                    int64
MATH                  int64
Avg_mark            float64
dt_start     datetime64[ns]
Exam                   bool
dt          timedelta64[ns]
grade              category
dtype: object
We can also get the dtype of a particular column.
print(my_data['Avg_mark'].dtypes)
Output
float64
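Since dtypes on a DataFrame returns a pandas Series (column names as the index, dtypes as the values), it can be indexed and filtered like any other Series. The short sketch below illustrates this with a small hypothetical DataFrame named df, which is not part of the example above.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'MATH': [80, 40, 70],
    'Avg_mark': [45.5, 48.09, 50.12],
    'Exam': [True, False, True],
})

col_types = df.dtypes        # Series: index = column names, values = dtypes
id_type = col_types['ID']    # look up the dtype of one column by name

# keep only the columns whose dtype is int64
int_cols = col_types[col_types == 'int64'].index.tolist()

print(id_type)
print(int_cols)
```

Filtering the Series this way is one option; select_dtypes() is another when the matching columns themselves are needed.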
Different data types (dtypes)
dtype           | Uses                                  | Code
int64           | Integer                               | 'ID':[1,2,3,4,5,6]
float64         | Decimal, float                        | 'Avg_mark':[45.5,48.09,50.12,55.1,50.6,55.6]
datetime64[ns]  | Date & time (after pd.to_datetime())  | 'dt_start':['1/1/2020','2/1/2020','5/1/2020' ... ]
object          | String or mixed                       | 'NAME':['Ravi','Raju','Alex','Ron','King','Jack']
bool            | Boolean                               | 'Exam':[True,False,True,True,False,False]
timedelta64[ns] | Timedelta                             | pd.Series([pd.Timedelta(days=i) for i in range(6)])
category        | Categorical                           | pd.Series(["a", "b", "c", "a"], dtype="category")
Using the correct data type is important because how the data is handled depends on it. If the values are stored as int64, then 2 + 2 gives 4. If the same values are stored as strings in an object column, '2' + '2' is joined as text and gives '22'.
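The difference can be seen in a small sketch. The DataFrames nums and text and their column names a and b are hypothetical, chosen only for this illustration; dtype=object is passed explicitly so the strings are stored in object columns.

```python
import pandas as pd

# Same digits, stored once as integers and once as strings
nums = pd.DataFrame({'a': [2], 'b': [2]})                    # int64 columns
text = pd.DataFrame({'a': ['2'], 'b': ['2']}, dtype=object)  # object columns

num_sum = (nums['a'] + nums['b']).iloc[0]    # numeric addition
text_sum = (text['a'] + text['b']).iloc[0]   # string concatenation

print(nums['a'].dtype, num_sum)
print(text['a'].dtype, text_sum)
```

Here num_sum is 4 while text_sum is the string '22', even though the same + operator is used in both lines.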
You can change the data type of a column by using astype() .
The conversion succeeds only if every value is valid for the new data type. Otherwise we have to clean the data before using astype() .
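As a minimal sketch of both cases, the example below converts one Series that matches the new type and one that does not. The Series names and values are hypothetical, and using pd.to_numeric() with errors='coerce' followed by dropna() is only one possible way to clean the data, not the only one.

```python
import pandas as pd

# Every value is a valid integer string, so astype() succeeds
clean = pd.Series(['10', '20', '30'], dtype=object)
clean_int = clean.astype('int64')

# 'twenty' cannot be read as an integer, so astype() raises ValueError
dirty = pd.Series(['10', 'twenty', '30'], dtype=object)
failed = False
try:
    dirty.astype('int64')
except ValueError as err:
    failed = True
    print('astype failed:', err)

# One way to clean first: turn invalid entries into NaN, drop them, then convert
cleaned = pd.to_numeric(dirty, errors='coerce').dropna().astype('int64')

print(clean_int.tolist())
print(cleaned.tolist())
```

Dropping the invalid rows is a design choice for this sketch; depending on the data, replacing them with a default value may be more appropriate.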