DataFrame.set_index()

Set one or more columns of a pandas DataFrame as its index.

options

keys : a single column label or a list of column labels to use as the index.
drop : bool, default True. Remove the column(s) from the DataFrame after they are used as the index.
append : bool, default False. Whether to add the column(s) to the existing index instead of replacing it.
inplace : bool, default False. If True, modify the existing DataFrame instead of returning a new one.
verify_integrity : bool, default False. If True, check the new index for duplicates.

keys

import pandas as pd 
my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],
         'MATH':[80,40,70,70,70,30],
         'ENGLISH':[80,70,40,50,60,30]}
my_data = pd.DataFrame(data=my_dict)
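# set_index() returns a new DataFrame. The result is not stored here,
# so my_data itself is unchanged, as the output shows.
my_data.set_index('NAME')
print(my_data)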
Output
   NAME  ID  MATH  ENGLISH
0  Ravi   1    80       80
1  Raju   2    40       70
2  Alex   3    70       40
3   Ron   4    70       50
4  King   5    70       60
5  Jack   6    30       30
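
The keys option also accepts a list of columns, which the example above does not show. Below is a minimal sketch using the same sample data; the variable names by_name and by_id_name are only illustrative. It stores the returned DataFrame and then builds a two-level index from ID and NAME.

```python
import pandas as pd

my_dict = {'NAME': ['Ravi', 'Raju', 'Alex', 'Ron', 'King', 'Jack'],
           'ID': [1, 2, 3, 4, 5, 6],
           'MATH': [80, 40, 70, 70, 70, 30],
           'ENGLISH': [80, 70, 40, 50, 60, 30]}
my_data = pd.DataFrame(data=my_dict)

# set_index() returns a new DataFrame, so keep the result in a variable.
by_name = my_data.set_index('NAME')
print(by_name.index.name)                    # NAME

# A list of columns gives a MultiIndex with one level per column.
by_id_name = my_data.set_index(['ID', 'NAME'])
print(list(by_id_name.index.names))          # ['ID', 'NAME']

# Rows are then addressed by a tuple holding one label per level.
print(by_id_name.loc[(3, 'Alex'), 'MATH'])   # 70
```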

inplace

With inplace=True the existing DataFrame is modified directly instead of a new DataFrame being returned.
my_data.set_index('NAME',inplace=True)
print(my_data)
Output
      ID  MATH  ENGLISH
NAME                   
Ravi   1    80       80
Raju   2    40       70
Alex   3    70       40
Ron    4    70       50
King   5    70       60
Jack   6    30       30
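
One detail worth checking is the return value. The sketch below uses a shortened version of the sample data, and the names result and other are only illustrative. It shows that the call returns None when inplace=True, and that reassigning the result gives the same outcome.

```python
import pandas as pd

my_data = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                        'ID': [1, 2, 3]})

# With inplace=True the DataFrame is changed and the call returns None.
result = my_data.set_index('NAME', inplace=True)
print(result)                  # None
print(my_data.index.name)      # NAME

# The same effect without inplace: assign the returned DataFrame back.
other = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                      'ID': [1, 2, 3]})
other = other.set_index('NAME')
print(other.equals(my_data))   # True
```

Because the in-place call returns None, writing my_data = my_data.set_index('NAME', inplace=True) would replace the DataFrame with None.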

drop

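By default (drop=True) the column is removed from the DataFrame once it becomes the index. With drop=False it is kept as a regular column as well. This example starts again from the original my_data, where NAME is still a column.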
my_data_mod=my_data.set_index('NAME',drop=False)
print(my_data_mod)
Output
      NAME  ID  MATH  ENGLISH
NAME                         
Ravi  Ravi   1    80       80
Raju  Raju   2    40       70
Alex  Alex   3    70       40
Ron    Ron   4    70       50
King  King   5    70       60
Jack  Jack   6    30       30
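
To compare the two settings side by side, here is a small sketch on a shortened version of the sample data. The names kept and dropped are only illustrative.

```python
import pandas as pd

my_data = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                        'ID': [1, 2, 3],
                        'MATH': [80, 40, 70]})

# drop=False keeps NAME as a column in addition to using it as the index.
kept = my_data.set_index('NAME', drop=False)
print(list(kept.columns))      # ['NAME', 'ID', 'MATH']

# drop=True (the default) removes NAME from the columns.
dropped = my_data.set_index('NAME')
print(list(dropped.columns))   # ['ID', 'MATH']
```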

verify_integrity

Here the DataFrame has been changed so that the NAME column contains a duplicate value ('Ron' appears twice). If we now set verify_integrity=True, set_index() raises a ValueError like this:
ValueError: Index has duplicate keys: Index(['Ron'], dtype='object', name='NAME')
With verify_integrity=False (the default) the check is skipped, so no error is raised and the duplicate labels are kept in the index.
my_data_mod=my_data.set_index('NAME',verify_integrity=False)
print(my_data_mod)
Output
      ID  MATH  ENGLISH
NAME                   
Ravi   1    80       80
Raju   2    40       70
Alex   3    70       40
Ron    4    70       50
King   5    70       60
Ron    6    30       30
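
The code that triggers the error is not shown above, so here is a minimal sketch of it. It assumes the last name in the sample data was changed to 'Ron' (matching the output above), and the variable error_message is only illustrative.

```python
import pandas as pd

# Sample data with a duplicate: 'Ron' appears twice in NAME.
my_data = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex', 'Ron', 'King', 'Ron'],
                        'ID': [1, 2, 3, 4, 5, 6],
                        'MATH': [80, 40, 70, 70, 70, 30],
                        'ENGLISH': [80, 70, 40, 50, 60, 30]})

error_message = None
try:
    # verify_integrity=True rejects an index with repeated labels.
    my_data.set_index('NAME', verify_integrity=True)
except ValueError as err:
    error_message = str(err)
    print(error_message)

# The duplicates can also be checked for before setting the index.
print(my_data['NAME'].duplicated().any())   # True
```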

append

The default value is False. With append=True the NAME column is added to the existing index as a second level instead of replacing it.
my_data_mod=my_data.set_index('NAME',append=True)
print(my_data_mod)
Output
        ID  MATH  ENGLISH
  NAME                   
0 Ravi   1    80       80
1 Raju   2    40       70
2 Alex   3    70       40
3 Ron    4    70       50
4 King   5    70       60
5 Jack   6    30       30
With append=False the existing index is replaced by NAME.
my_data_mod=my_data.set_index('NAME',append=False)
print(my_data_mod)
Output
      ID  MATH  ENGLISH
NAME                   
Ravi   1    80       80
Raju   2    40       70
Alex   3    70       40
Ron    4    70       50
King   5    70       60
Jack   6    30       30
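
The difference between the two results is the number of index levels. A short sketch on a shortened version of the sample data, with the illustrative names appended and replaced:

```python
import pandas as pd

my_data = pd.DataFrame({'NAME': ['Ravi', 'Raju', 'Alex'],
                        'ID': [1, 2, 3]})

# append=True keeps the original 0..n-1 index and adds NAME as a second level.
appended = my_data.set_index('NAME', append=True)
print(appended.index.nlevels)            # 2
print(list(appended.index.names))        # [None, 'NAME']

# A row is then addressed by both levels.
print(appended.loc[(1, 'Raju'), 'ID'])   # 2

# append=False (the default) replaces the original index with NAME.
replaced = my_data.set_index('NAME', append=False)
print(replaced.index.nlevels)            # 1
```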

Using set_index in DateTime columns

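Once a datetime column is set as the index, rows can be selected with partial date strings. Here st_date is our datetime column, and this returns all records from March 2020. Use .loc for the selection, because recent pandas versions no longer accept a single date string in plain [] for selecting rows.
print(my_data.set_index('st_date').loc['2020-03'])
Similarly, we can get all records between two periods like this.
print(my_data.set_index('st_date').loc['2019-03':'2019-04'])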
More examples using a date column are available in Exercise3.
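
The DataFrame with the st_date column is not shown here, so the sketch below builds a small hypothetical one. The dates, the MARK column and the variable names are assumptions for illustration only. It selects by partial date string through .loc.

```python
import pandas as pd

# Hypothetical data: st_date is a datetime column, MARK is made up.
my_data = pd.DataFrame({
    'st_date': pd.to_datetime(['2019-03-15', '2019-04-20', '2019-05-01',
                               '2020-03-05', '2020-03-28']),
    'MARK': [60, 70, 80, 90, 50]})

by_date = my_data.set_index('st_date')

# All rows from March 2020.
march_2020 = by_date.loc['2020-03']
print(len(march_2020))     # 2

# All rows from March 2019 through the end of April 2019.
spring_2019 = by_date.loc['2019-03':'2019-04']
print(len(spring_2019))    # 2
```

With partial strings both ends of the slice are inclusive, so '2019-04' covers the whole of April.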



plus2net.com


