Python Pandas DataFrame loc to get rows based on column labels

Column name based position location using loc

Pandas loc to get rows based on column labels using callback functions from DataFrame

Difference between iloc and loc

loc is label based so we have to use column names.
iloc is position based ( integer ).

We used DataFrame of 35 rows by reading from a CSV file.

For better understanding we can use bigger DataFrame with more rows. Copy and use or download the CSV or Excel file to create the sample DataFrame with 35 rows and 5 columns.
Student ( Sample ) DataFrame

Value at position

First rows with two columns

df.loc[0,['class','name']] # first row with two columns class and name

Output

class        Four
name     John Deo
Name: 0, dtype: object

2nd and 3rd rows with two columns

df.loc[[2,3],['class','name']] # 2nd and 3rd row with two columns class and name

Output

	class	name
2	Three	Arnold
3	Four	Krish Star

2nd and 3rd rows with all columns

df.loc[[2,3],:] # 2nd and 3rd row with two columns class and name

Output

	id	name		class	mark	gender
2	3	Arnold		Three	55	male
3	4	Krish Star	Four	60	female

Matching data

While reading csv file we will add index to name column and then search.

import pandas as pd 
df=pd.read_csv('student.csv',index_col='name')
df.loc['Tade Row']

Output

id         15
class    Four
mark       88
gender      male
Name: Tade Row, dtype: object

Matching two names

df.loc[['Tade Row','John Deo']] # printing row with two name matching

Output

		id	class	mark	gender
name				
Tade Row	15	Four	88	male
John Deo	1	Four	75	female

All matching records

import pandas as pd 
df=pd.read_csv('test.csv',index_col='class')
df.loc['Four'] # printing all rows with class='Four'

Output

	id	name		mark	gender
class				
Four	1	John Deo	75	female
Four	4	Krish Star	60	female
Four	5	John Mike	60	female
Four	6	Alex John	55	male
Four	10	Big John	55	female
Four	15	Tade Row	88	male
Four	16	Gimmy	88	male
Four	21	Babby John	69	female
Four	31	Marry Toeey	88	male

Rows between ranges

All the records between two id

import pandas as pd 
df=pd.read_csv('student.csv',index_col='id')
df.loc[4:6] # printing rows between two ids

Output

	name		class	mark	gender
id				
4	Krish Star	Four	60	female
5	John Mike	Four	60	female
6	Alex John	Four	55	male

In above code we have two unique ids and rows between these two ids will be displayed. If data are not unique then there will be an error. Try be using two class names, you will get error "Cannot get left slice bound for non-unique label: 'Five'"

Between two names ( column data)

import pandas as pd 
df=pd.read_csv('test.csv',index_col='name')
#print(df)
df.loc['Ronald':'Bigy']  # Records between two names

Here are records between name 'Ronald' and 'Bigy'

	id	class	mark	gender
name				
Ronald	11	Six	89	female
Recky	12	Six	94	female
Kty	13	Seven	88	female
Bigy	14	Seven	88	female

We can search name column after declaring as index column.

df= pd.read_csv('student.csv',index_col='name') # DataFrame 
print(df.loc['Arnold','mark':]) # Mark of Arnold ( name )
print(df.loc[:,'mark']) # All marks ( with name )
print(df.loc['Arnold',:]) # All columns of Arnold

Using Callable function

print(df.loc[lambda x: x.name == 'Arnold']) 
print(df.loc[lambda x: list(x.mark > 90)]) # mark more than 90
print(df.loc[lambda x: list(x.gender == 'male'),:])
print(df.loc[lambda x: list(x.name.str.len()>10),:]) # name is not index col
print(df.loc[lambda x: list(x.index.str.len()>10),:]) # name is index col

Record of matching string

import pandas as pd 
df=pd.read_csv('student.csv')
print(df.loc[df['name']=='Alex John']) # Search for name

Output


   id       name class  mark   gender
5   6  Alex John  Four    55  male

Partial matching string

In above code we have filtered records with exact matching. Sometime we have to search for partial matching.
Let us find out rows where part of the column string matching with search string.

import pandas as pd 
df=pd.read_csv("student.csv")
df[df.name.str.contains('De',case=False)]

Output is here

	id	name		class	mark	gender
0	1	John Deo	Four	75	female
14	15	Tade Row	Four	88	male

endswith : string ending with

print(df.loc[df['name'].str.endswith('John')])

Output

    id        name class  mark     gender
5    6   Alex John  Four    55    male
9   10    Big John  Four    55  female
20  21  Babby John  Four    69  female

Not ending with

print(df.loc[~df['name'].str.endswith('John')])

startswith : string starting with

import pandas as pd 
df=pd.read_csv("student.csv")
df[df.name.str.startswith('Jo')]

Output

	id	name		class	mark	gender
0	1	John Deo	Four	75	female
4	5	John Mike	Four	60	female

Not starting with string

print(df[~df.name.str.startswith('Jo')])

Any Matching with more than one string

We can match multiple strings where any one can match ( OR )

import pandas as pd 
df=pd.read_csv("student.csv")
df[df.name.str.contains('|'.join(['Jo','Al']))]

name column can have Jo or Al, output is here

	id	name		class	mark	gender
0	1	John Deo	Four	75	female
4	5	John Mike	Four	60	female
5	6	Alex John	Four	55	male
6	7	My John Rob	Fifth	78	male
9	10	Big John	Four	55	female
20	21	Babby John	Four	69	female

Matching more than one string

All search strings must be present within the string ( AND )

import pandas as pd 
df=pd.read_csv("student.csv")
df[df.name.str.contains('Jo') & df.name.str.contains('Al')]

Both strings Jo and Al must be present within name column.

	id	name		class	mark	gender
5	6	Alex John	Four	55	male

Using OR condition |

df2 = df[df.name.str.contains("Jo") | df.name.str.contains("Al")]

Matching two columns with OR. Here one is string and other column is integer. Either name contains Mike OR id == 6 .
don't forget to wrap the sub-statements with ()

df2 = df[((df.name.str.contains("Mike")) | (df["id"] == 6))]

Combine conditions

str1 = df["id"] == 5 # id column value 
str2 = df.name.str.contains("Al", case=False) # name column value matching Al
str3 = df["mark"] < 50 # mark column value less than 50
df2 = df[str1 | str2 | str3] # combine all conditions using | operator

Not matching

All rows where searched string is NOT present ( not matching )

df[~df.name.str.contains('Jo')]

When any one of the string is not present.

df[~df.name.str.contains('|'.join(['Jo','Al']))]

Getting rows from end

Last record

import pandas as pd 
df=pd.read_csv('student.csv')
print(df.tail(1)) # Last  records

Output

    id        name class  mark     gender
34  35  Rows Noump   Six    88  female

Last n records ( five records from end )

print(df.tail(5)) # Last five records

Output

    id         name  class  mark     gender
30  31  Marry Toeey   Four    88    male
31  32    Binn Rott  Seven    90  female
32  33    Kenn Rein    Six    96  female
33  34     Gain Toe  Seven    69    male
34  35   Rows Noump    Six    88  female

Adding list to DataFrame

We will create a list and then add the list ( new_list ) at the end of the DataFrame. the Function len() will return the number of rows, the index position of last row is 1 less than the number of rows. So the new row will be added at the end of the DataFrame. We will use loc to add the list.

new_list=[36,'new name','Four',77,'male']
df.loc[len(df)]=new_list
print(df)

Replacing data based on condition using loc

Make the mark column value = 100 for students who got mark>=90.)

df.loc[df["mark"] >=90, "mark"] = 100

Adding multiple conditions ( this will replace with mark = 100 for class Six with more than or equal to 90 mark.

condition=(df['mark']>=90) & (df['class']=='Six')
df.loc[condition, "mark"] = 100

Instead of updating mark, change the class value to 'Seven' for class ='Six' and mark >=90

condition=(df['mark']>=90) & (df['class']=='Six')
df.loc[condition, "class"] = 'Seven'

df.loc[df['name']=='Rows Noump','name']='New Name'

Note that here the new list or data is added to the DataFrame only, not to the csv file at source.
Replace data based multiple condition like CASE THEN ( SQL ) by using np.where
at mask contains() to display and delete row based on Conditions

Pandas Pandas DataFrame iloc - rows and columns by integers

Numpy arrays Python & MySQL Python- Tutorials

Subscribe to our YouTube Channel here