replace(): replace all or some occurrences of a matching string in a DataFrame

Replacing matching string in all columns
Replacing matching number in all columns
Replacing (to_replace) matching List in all columns
Using dictionary as to_replace to match in all columns
Using regular expression pattern matching
Replacing matching in a Series


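Create a sample DataFrame. Each example below starts again from this original DataFrame, so the replacements do not accumulate from one example to the next.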

import pandas as pd
my_dict={
  'id':[1,2,3,4,5,4,2],
  'name':['John','Max','Arnold','Krish','John','Krish','Max'],
  'class1':['Four','Three','Three','Four','Four','Four','Three'],
  'mark':[75,85,55,60,60,60,85],
  'gender':['female','male','male','female','female','female','male']
}
df = pd.DataFrame(data=my_dict)

String to replace in all columns

df=df.replace('Max','Jim') # replace in all columns

Output

   id    name class1  mark  gender
0   1    John   Four    75  female
1   2     Jim  Three    85    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5    John   Four    60  female
5   4   Krish   Four    60  female
6   2     Jim  Three    85    male
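replace() returns a new DataFrame and leaves the original untouched, which is why the examples assign the result back to df. A minimal sketch of this, using a small hypothetical frame named people that is not part of the sample data above:

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
people = pd.DataFrame({'name': ['Max', 'John'], 'mark': [85, 75]})

# replace() builds and returns a new DataFrame
renamed = people.replace('Max', 'Jim')

print(people['name'].tolist())   # the original still holds 'Max'
print(renamed['name'].tolist())  # the copy holds 'Jim' instead
```

If the result is not assigned to a variable, the replacement is lost.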

Number to replace in all columns

df=df.replace(85,100) # replace in all columns

Output

   id    name class1  mark  gender
0   1    John   Four    75  female
1   2     Max  Three   100    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5    John   Four    60  female
5   4   Krish   Four    60  female
6   2     Max  Three   100    male

Using List to replace matching string and number

df=df.replace(['John',85],['Jim',100])

Output

   id    name class1  mark  gender
0   1     Jim   Four    75  female
1   2     Max  Three   100    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5     Jim   Four    60  female
5   4   Krish   Four    60  female
6   2     Max  Three   100    male

Using dictionary

df=df.replace({'John':'Jim',85:100}) # using dictionary

Output

   id    name class1  mark  gender
0   1     Jim   Four    75  female
1   2     Max  Three   100    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5     Jim   Four    60  female
5   4   Krish   Four    60  female
6   2     Max  Three   100    male

Replace a list of matching number with one value

df=df.replace([75,85,60],100) # Matching list with one

Output

   id    name class1  mark  gender
0   1    John   Four   100  female
1   2     Max  Three   100    male
2   3  Arnold  Three    55    male
3   4   Krish   Four   100  female
4   5    John   Four   100  female
5   4   Krish   Four   100  female
6   2     Max  Three   100    male

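Replace column-specific matching strings with one value

Here the dictionary keys are column names: 'John' is matched only in the name column and 'Four' only in the class1 column, and both are replaced by 'Jim'.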

df=df.replace({'name':'John','class1':'Four'},'Jim')

Output

   id    name class1  mark  gender
0   1     Jim    Jim    75  female
1   2     Max  Three    85    male
2   3  Arnold  Three    55    male
3   4   Krish    Jim    60  female
4   5     Jim    Jim    60  female
5   4   Krish    Jim    60  female
6   2     Max  Three    85    male
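When each column needs its own replacement value, a nested dictionary can be used instead. The sketch below assumes a cut-down version of the sample data; the outer keys are column names and each inner dictionary maps an old value to its new value.

```python
import pandas as pd

# Cut-down version of the sample data, for illustration only
df = pd.DataFrame({
    'name': ['John', 'Max', 'John'],
    'class1': ['Four', 'Three', 'Four'],
})

# Outer keys: column names. Inner dictionaries: old value -> new value.
result = df.replace({'name': {'John': 'Jim'}, 'class1': {'Four': 'Ten'}})
print(result)
```

With this form no separate value argument is passed, because every replacement is already given inside the dictionary.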

We can replace a string in one specific column by using str.replace() on that column.

df['class1']=df['class1'].str.replace('Three','Ten')

Output

   id    name class1  mark  gender
0   1    John   Four    75  female
1   2     Max    Ten    85    male
2   3  Arnold    Ten    55    male
3   4   Krish   Four    60  female
4   5    John   Four    60  female
5   4   Krish   Four    60  female
6   2     Max    Ten    85    male
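Note that str.replace() works on the text inside each value, while replace() on a Series compares against the whole value unless a regular expression is requested. A short sketch of the difference, using two hypothetical values:

```python
import pandas as pd

# Hypothetical values chosen to show the difference
s = pd.Series(['Three', 'Thirty Three'])

whole = s.replace('Three', 'Ten')        # only a cell equal to 'Three' changes
partial = s.str.replace('Three', 'Ten')  # 'Three' is replaced inside longer text too

print(whole.tolist())
print(partial.tolist())
```

In the sample data every class1 value is a single word, so both approaches would give the same result there.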

Using regular expression

Replace a leading A or F in all columns

df=df.replace(regex='^[AF]',value='*')

Output

   id    name class1  mark  gender
0   1    John   *our    75  female
1   2     Max  Three    85    male
2   3  *rnold  Three    55    male
3   4   Krish   *our    60  female
4   5    John   *our    60  female
5   4   Krish   *our    60  female
6   2     Max  Three    85    male

Replace values that start with M and are exactly three characters long

df=df.replace(regex={r'^M..$':'foo'})

Output

   id    name class1  mark  gender
0   1    John   Four    75  female
1   2     foo  Three    85    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5    John   Four    60  female
5   4   Krish   Four    60  female
6   2     foo  Three    85    male

Replace the characters hn at the end of a value

df=df.replace(regex={r'hn$':'foo'})

Output

   id    name class1  mark  gender
0   1   Jofoo   Four    75  female
1   2     Max  Three    85    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5   Jofoo   Four    60  female
5   4   Krish   Four    60  female
6   2     Max  Three    85    male
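The regular expressions above are tried against every column. To limit a pattern to one column, the column name can be given as a dictionary key in both to_replace and value, together with regex=True. A minimal sketch, assuming a hypothetical nick column added only to show that other columns are left alone:

```python
import pandas as pd

# 'nick' is a hypothetical column; it also has a value ending in 'hn'
df = pd.DataFrame({'name': ['John', 'Max'], 'nick': ['Kahn', 'Max']})

# The pattern hn$ is applied to the 'name' column only
result = df.replace({'name': r'hn$'}, {'name': 'foo'}, regex=True)
print(result)
```

Kahn in the nick column keeps its ending because that column is not listed in the dictionaries.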

Series

str.replace() replaces each occurrence of a pattern or regular expression in every string of a Series or Index.
Returns Series or Index

Replacing string

Every @ is replaced by #

import pandas as pd 
my_dict={'email':['Ravi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.replace('@','#'))

Output

0    Ravi#example.com
1    Raju#example.com
2    Alex#example.com

Case insensitive search and replace

With the option case=False the search ignores upper and lower case, so 'ravi' also matches 'Ravi'.

import pandas as pd 
my_dict={'email':['Ravi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.replace('ravi','Ronn',case=False))

Output ( Ravi is replaced by Ronn )

0    Ronn@example.com
1    Raju@example.com
2    Alex@example.com

Using Regular expression

import pandas as pd 
my_dict={'email':['Ra2vi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
# regex=True is required in pandas 2.0 and later, where the default is regex=False
print(df.email.str.replace('^[AC]','*',regex=True))

Output ( a leading A or C is replaced with * , so the A in Alex is replaced )

0    Ra2vi@example.com
1     Raju@example.com
2     *lex@example.com
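Whether the pattern is read as plain text or as a regular expression is controlled by the regex parameter of str.replace(). Passing it explicitly keeps the behaviour the same across pandas versions. The sketch below uses two hypothetical addresses to show how the same pattern '.com' behaves in each mode:

```python
import pandas as pd

# Hypothetical addresses, for illustration only
s = pd.Series(['ravi@example.com', 'ravi@telecom'])

# regex=False: '.com' is plain text, so the dot must be a real dot
literal = s.str.replace('.com', '.org', regex=False)

# regex=True: '.' matches any character, so 'ecom' in 'telecom' matches too
pattern = s.str.replace('.com', '.org', regex=True)

print(literal.tolist())
print(pattern.tolist())
```

To match a real dot in regex mode, escape it as r'\.com'.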

Let us replace only digits

import pandas as pd 
my_dict={'email':['Ra2vi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.replace('[0-9]','*',regex=True))

Output

0    Ra*vi@example.com
1     Raju@example.com
2     Alex@example.com

Let us replace every a or b character

import pandas as pd 
my_dict={'email':['Ra2vi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
# Inside [ ] the | is a literal character, so the class is written as [ab]
print(df.email.str.replace('[ab]','*',regex=True))

Output

0    R*2vi@ex*mple.com
1     R*ju@ex*mple.com
2     Alex@ex*mple.com

Number of replacements

The code above replaced all the occurrences. With n=1 only the first match in each string is replaced.

import pandas as pd 
my_dict={'email':['Ra2vi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.replace('[ab]','*',n=1,regex=True))

Output

0    R*2vi@example.com
1     R*ju@example.com
2     Alex@ex*mple.com
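The limit n is counted separately for each string, starting from the left, and it also applies to plain-text patterns. A small sketch with two hypothetical words:

```python
import pandas as pd

# Hypothetical values, for illustration only
s = pd.Series(['banana', 'data'])

# At most two replacements per string, counted from the left
first_two = s.str.replace('a', '*', n=2, regex=False)
print(first_two.tolist())
```

The third a in banana is kept because the limit of two was already reached for that string.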
