Python Pandas
Pandas DataFrame column listing and adding new columns using insert, dropping columns and renaming
We can create a DataFrame from a dictionary, or by reading an Excel or CSV file. Here we will display the column names of a DataFrame.
By using columns
import pandas as pd
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
'ID':[1,2,3,4,5,6],
'MATH':[80,40,70,70,60,30],
'ENGLISH':[80,70,40,50,60,30]}
df = pd.DataFrame(data=my_dict)
print(df.columns) # Object
Output
Index(['NAME', 'ID', 'MATH', 'ENGLISH'], dtype='object')
Getting a particular column name by position
print(df.columns[2]) # MATH
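The lookup also works in the other direction. The sketch below assumes a shortened copy of the sample data and uses get_loc() to find the position of a label, plus the in operator to check whether a column exists.

```python
import pandas as pd

# Shortened copy of the sample data (three rows only, for illustration)
df = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                   'ID': [1, 2, 3],
                   'MATH': [80, 40, 70],
                   'ENGLISH': [80, 70, 40]})

third = df.columns[2]               # label stored at position 2
pos = df.columns.get_loc('MATH')    # position of the label 'MATH'
has_math = 'MATH' in df.columns     # membership test on the column labels

print(third, pos, has_math)
```

Positions start from 0, so position 2 is the third column.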
By using keys()
print(df.keys())
Output
Index(['NAME', 'ID', 'MATH', 'ENGLISH'], dtype='object')
By using a for loop
for cols in df.columns:
    print(cols)
Output
NAME
ID
MATH
ENGLISH
To get the data from each column of the DataFrame
for cols in df.columns:
    print(df[cols])
Calling a function with each column name as a parameter to display the data.
def my_fun(cols): # function to receive column name as parameter
    print(df[cols])
for cols in df.columns:
    my_fun(cols)
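When we need the name and the data together, items() is one possible alternative to indexing inside the loop. This is a small sketch on a shortened copy of the sample data; the names list is only there to record what the loop visits.

```python
import pandas as pd

# Shortened copy of the sample data (three rows only, for illustration)
df = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                   'ID': [1, 2, 3],
                   'MATH': [80, 40, 70],
                   'ENGLISH': [80, 70, 40]})

names = []
for name, col in df.items():   # each step yields (column label, column as Series)
    names.append(name)
    print(name, len(col))
```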
By using list
print(list(df.columns))
Output
['NAME', 'ID', 'MATH', 'ENGLISH']
By using tolist()
print(df.columns.values.tolist())
Output
['NAME', 'ID', 'MATH', 'ENGLISH']
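The different ways of getting a plain Python list give the same result. The sketch below compares them on a shortened copy of the sample data; it also uses list(df), which works because iterating a DataFrame yields its column labels.

```python
import pandas as pd

# Shortened copy of the sample data (two rows only, for illustration)
df = pd.DataFrame({'NAME': ['Ravi', 'Raju'],
                   'ID': [1, 2],
                   'MATH': [80, 40],
                   'ENGLISH': [80, 70]})

a = list(df.columns)             # list() on the Index
b = df.columns.tolist()          # tolist() directly on the Index
c = df.columns.values.tolist()   # through the underlying array
d = list(df)                     # iterating the DataFrame gives the labels

print(a == b == c == d)
```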
Create empty DataFrame with only column names
df = pd.DataFrame(columns=['A','B','C','D','E','F','G'])
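We can check that such a DataFrame has columns but no rows. The sketch below uses three hypothetical column names and then adds a first row by label with loc.

```python
import pandas as pd

df_empty = pd.DataFrame(columns=['A', 'B', 'C'])

shape_before = df_empty.shape   # (rows, columns) before adding data
is_empty = df_empty.empty       # True while there are no rows

df_empty.loc[0] = [1, 2, 3]     # add one row with index label 0
shape_after = df_empty.shape

print(shape_before, is_empty, shape_after)
```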
Data type of columns
Here the dt_start column holds strings. We can convert it to date and time by using the to_datetime() function. Uncomment the conversion line below and compare the output of df.dtypes before and after the conversion.
import pandas as pd
my_dict={'NAME':['Ravi','Raju','Alex'],
'dt_start':['1-1-2020','2-1-2020','5-1-2020']
}
df = pd.DataFrame(data=my_dict)
#df['dt_start'] = pd.to_datetime(df['dt_start']) # converts to datetime data
print(df.dtypes)
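The conversion can be checked in code as well. The sketch below assumes the strings are in day-month-year order and passes format='%d-%m-%Y' to make that explicit; the source data does not say which order is intended, so adjust the format if the dates are month first.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

df = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                   'dt_start': ['1-1-2020', '2-1-2020', '5-1-2020']})

was_datetime = is_datetime64_any_dtype(df['dt_start'])   # strings at this point

# Assumption: day-month-year order
df['dt_start'] = pd.to_datetime(df['dt_start'], format='%d-%m-%Y')

is_datetime = is_datetime64_any_dtype(df['dt_start'])    # datetime after conversion
days = df['dt_start'].dt.day.tolist()                     # day part of each date

print(df.dtypes)
print(days)
```

After the conversion the .dt accessor becomes available, which is not possible on the string column.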
Adding column
Using a list. The number of elements must match the number of rows in the DataFrame, otherwise we get an error such as ValueError: Length of values (5) does not match length of index (6)
l1=['Four','Three','Five','Six','Two','Three']
df['my_class']=l1
The code above adds the column at the end of the DataFrame. To place the column at a specific position we can use insert(), here at position 2 (the third column).
l1=['Four','Three','Five','Six','Two','Three']
df.insert(2,'my_class',l1,True)
print(df.columns.values.tolist())
Output
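['NAME', 'ID', 'my_class', 'MATH', 'ENGLISH']
Here also the list must match the number of rows. The Boolean option at the end is allow_duplicates: when it is True, insert() can add a column whose name already exists. The default value is False, and then a duplicate name raises ValueError.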
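Both rules of insert() can be tried on a small DataFrame. This is a sketch on a shortened copy of the sample data; the errors list is only a helper to collect the messages.

```python
import pandas as pd

# Shortened copy of the sample data (three rows only, for illustration)
df = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                   'ID': [1, 2, 3],
                   'MATH': [80, 40, 70]})

errors = []

# Too few values: insert() raises ValueError and df is not changed
try:
    df.insert(2, 'my_class', ['Four', 'Three'])
except ValueError as err:
    errors.append(str(err))

# Correct length: the column is added at position 2
df.insert(2, 'my_class', ['Four', 'Three', 'Five'])

# Same name again with the default allow_duplicates=False: ValueError
try:
    df.insert(0, 'my_class', ['A', 'B', 'C'])
except ValueError as err:
    errors.append(str(err))

# Same name again with allow_duplicates=True: accepted
df.insert(0, 'my_class', ['A', 'B', 'C'], allow_duplicates=True)

print(errors)
print(df.columns.tolist())
```

Duplicate column names make df['my_class'] return more than one column, so it is usually better to keep the default.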
We can use assign() to add a new column. It returns a new DataFrame and the original DataFrame remains the same (no change). The column is also added at the end.
l1=['Four','Three','Five','Six','Two','Three']
df2=df.assign(my_class=l1)
print(df2.columns.values.tolist())
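assign() also accepts a value calculated from existing columns. In the sketch below TOTAL is a hypothetical column name, added to a shortened copy of the sample data to show that the original DataFrame is not changed.

```python
import pandas as pd

# Shortened copy of the sample data (three rows only, for illustration)
df = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                   'MATH': [80, 40, 70],
                   'ENGLISH': [80, 70, 40]})

# TOTAL is a hypothetical column name used only for this example
df2 = df.assign(TOTAL=df['MATH'] + df['ENGLISH'])

print(df.columns.tolist())    # original columns, no TOTAL
print(df2.columns.tolist())   # TOTAL added at the end
```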
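We can take the values of the new column from the keys (unique) of a dictionary. Assigning the dictionary itself is version dependent, since recent pandas versions align a dictionary with the row index instead of using its keys, so we convert the keys to a list first.
d1={'Four':'Ravi','Three':'Raju','Five':'Alex','Six':'Ron',
'Two':'King','Eight':'Jack'}
df['my_class']=list(d1) # keys of d1 in insertion order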
print(df.columns.values.tolist())
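We can use a single value for all the rows of the new column. All the methods above accept it.
The lines below add the my_class column with the single value 'Four' for all rows. They are alternatives, so use only one of them: insert() raises ValueError if my_class already exists.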
df['my_class']='Four' # adding column at the end
df.insert(2,'my_class','Four') # at 2nd position
df2=df.assign(my_class='Four') # new DataFrame with added column
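To compare the three methods side by side we can apply each one to its own copy. This is a sketch on a shortened copy of the sample data; base, df_a, df_b and df_c are names used only here.

```python
import pandas as pd

# Shortened copy of the sample data (three rows only, for illustration)
base = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                     'ID': [1, 2, 3]})

df_a = base.copy()
df_a['my_class'] = 'Four'             # column added at the end

df_b = base.copy()
df_b.insert(1, 'my_class', 'Four')    # column added at position 1

df_c = base.assign(my_class='Four')   # new DataFrame, base is untouched

print(df_a.columns.tolist())
print(df_b.columns.tolist())
print(df_c.columns.tolist())
```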
How to add column with increasing value?
df = df.reset_index()
df = df.rename(columns={"index":"New_ID"})
df['New_ID'] = df.index + 1000 # starting from 1000
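Here reset_index() moves the old index into a column named index and creates a new sequential index starting from 0. We rename that column to New_ID and then overwrite it with the new index plus 1000.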
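The effect is easier to see when the old index is not already 0, 1, 2. The sketch below assumes a small DataFrame with index labels 10, 20, 30 and follows the same three steps.

```python
import pandas as pd

# Hypothetical data with a non-default index
df = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex']}, index=[10, 20, 30])

df = df.reset_index()                         # old index becomes the column 'index'
cols_after_reset = df.columns.tolist()

df = df.rename(columns={'index': 'New_ID'})   # give that column a better name
df['New_ID'] = df.index + 1000                # replace 10, 20, 30 with 1000, 1001, 1002

print(cols_after_reset)
print(df)
```

If the old index values are not needed at all, reset_index(drop=True) discards them instead of keeping them as a column.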
Delete a column
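Using drop(). Here 'Page Value' is the name of the column to delete, axis=1 tells drop() to look for the label among the columns, and inplace=True changes the DataFrame itself instead of returning a new one.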
df.drop(labels='Page Value',axis=1,inplace=True)
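drop() can also return a new DataFrame and remove several columns at once. The sketch below uses a shortened copy of the sample data; AGE is a hypothetical column that does not exist, used to show errors='ignore'.

```python
import pandas as pd

# Shortened copy of the sample data (three rows only, for illustration)
df = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                   'ID': [1, 2, 3],
                   'MATH': [80, 40, 70],
                   'ENGLISH': [80, 70, 40]})

# columns= is equivalent to labels= with axis=1; without inplace a new DataFrame is returned
df2 = df.drop(columns=['MATH', 'ENGLISH'])

# 'AGE' is not a column; errors='ignore' skips it instead of raising KeyError
df3 = df.drop(columns='AGE', errors='ignore')

print(df2.columns.tolist())
print(df3.columns.tolist())
```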
Changing column names
Assigning new names. The list must have one name for each existing column, in the same sequence as the existing columns.
df.columns = ['Page','p_view','u_view','avg']
Updating single column name
df = df.rename(columns={"my_class":"my_class4"})
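rename() can change several names in one call. The sketch below assumes a one-row copy of the sample data; MATHS, STUDENT_ID and PHYSICS are names chosen only for this example.

```python
import pandas as pd

# One-row copy of the sample data (for illustration)
df = pd.DataFrame({'NAME': ['Ravi'], 'ID': [1], 'MATH': [80]})

# Several columns at once; columns not listed keep their names
df = df.rename(columns={'MATH': 'MATHS', 'ID': 'STUDENT_ID'})

# A key that matches no column ('PHYSICS') is ignored by default
df = df.rename(columns={'PHYSICS': 'PHY'})

# A function is applied to every column name
df_lower = df.rename(columns=str.lower)

print(df.columns.tolist())
print(df_lower.columns.tolist())
```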