str.replace()

  1. Replacing a matching string in all columns
  2. Replacing a matching number in all columns
  3. Replacing a list of matching values (to_replace) in all columns
  4. Using a dictionary as to_replace to match in all columns
  5. Using regular expression pattern matching
  6. Replacing matches in a Series



Create the sample DataFrame. Each DataFrame example below starts again from this original data.
import pandas as pd
my_dict={
  'id':[1,2,3,4,5,4,2],
  'name':['John','Max','Arnold','Krish','John','Krish','Max'],
  'class1':['Four','Three','Three','Four','Four','Four','Three'],
  'mark':[75,85,55,60,60,60,85],
  'gender':['female','male','male','female','female','female','male']
}
df = pd.DataFrame(data=my_dict)

String to replace in all columns

df=df.replace('Max','Jim') # replace in all columns
Output
   id    name class1  mark  gender
0   1    John   Four    75  female
1   2     Jim  Three    85    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5    John   Four    60  female
5   4   Krish   Four    60  female
6   2     Jim  Three    85    male
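replace() returns a new DataFrame and leaves the original untouched unless the result is assigned back. A minimal sketch of that behaviour follows; the names sample and replaced are illustrative and not part of the tutorial data.

```python
import pandas as pd

# Small illustrative frame (hypothetical, shorter than the tutorial data)
sample = pd.DataFrame({'name': ['John', 'Max'], 'mark': [75, 85]})

# replace() builds a new DataFrame; `sample` itself is not modified
replaced = sample.replace('Max', 'Jim')

print(sample['name'].tolist())    # ['John', 'Max']
print(replaced['name'].tolist())  # ['John', 'Jim']
```

Writing df=df.replace(...) as in the examples here simply stores the new DataFrame under the old name.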

Number to replace in all columns

df=df.replace(85,100) # replace in all columns
Output
   id    name class1  mark  gender
0   1    John   Four    75  female
1   2     Max  Three   100    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5    John   Four    60  female
5   4   Krish   Four    60  female
6   2     Max  Three   100    male

Using List to replace matching string and number

df=df.replace(['John',85],['Jim',100])
Output
   id    name class1  mark  gender
0   1     Jim   Four    75  female
1   2     Max  Three   100    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5     Jim   Four    60  female
5   4   Krish   Four    60  female
6   2     Max  Three   100    male

Using dictionary

df=df.replace({'John':'Jim',85:100}) # using dictionary
Output
   id    name class1  mark  gender
0   1     Jim   Four    75  female
1   2     Max  Three   100    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5     Jim   Four    60  female
5   4   Krish   Four    60  female
6   2     Max  Three   100    male

Replace a list of matching number with one value

df=df.replace([75,85,60],100) # every value in the list is replaced by 100
Output
   id    name class1  mark  gender
0   1    John   Four   100  female
1   2     Max  Three   100    male
2   3  Arnold  Three    55    male
3   4   Krish   Four   100  female
4   5    John   Four   100  female
5   4   Krish   Four   100  female
6   2     Max  Three   100    male

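Replace column-specific matching values with one value

Here the dictionary keys are column names: 'John' is matched only in the name column and 'Four' only in the class1 column.
df=df.replace({'name':'John','class1':'Four'},'Jim')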
Output
   id    name class1  mark  gender
0   1     Jim    Jim    75  female
1   2     Max  Three    85    male
2   3  Arnold  Three    55    male
3   4   Krish    Jim    60  female
4   5     Jim    Jim    60  female
5   4   Krish    Jim    60  female
6   2     Max  Three    85    male
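When each column needs its own replacement value, a nested dictionary can be used instead. The sketch below assumes a small made-up frame (the row with 'Four' in the name column and the new value 'Fourth' are hypothetical) to show that a value is only touched in the column it is listed under.

```python
import pandas as pd

# Hypothetical data: 'Four' appears in both columns on purpose
df = pd.DataFrame({
    'name': ['John', 'Max', 'Four'],
    'class1': ['Four', 'Three', 'Four'],
})

# Outer keys are column names, inner dictionaries map old value -> new value
result = df.replace({'name': {'John': 'Jim'}, 'class1': {'Four': 'Fourth'}})

print(result['name'].tolist())    # ['Jim', 'Max', 'Four']
print(result['class1'].tolist())  # ['Fourth', 'Three', 'Fourth']
```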

We can also replace a string in one specific column.

df['class1']=df['class1'].str.replace('Three','Ten')
Output
   id    name class1  mark  gender
0   1    John   Four    75  female
1   2     Max    Ten    85    male
2   3  Arnold    Ten    55    male
3   4   Krish   Four    60  female
4   5    John   Four    60  female
5   4   Krish   Four    60  female
6   2     Max    Ten    85    male
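Series.replace() and Series.str.replace() are easy to mix up. The first matches whole values, the second replaces substrings inside each string. A short sketch, assuming an illustrative Series (the value 'Maxwell' is not in the sample data):

```python
import pandas as pd

names = pd.Series(['John', 'Max', 'Maxwell'])  # 'Maxwell' is a made-up value

# Series.replace: only cells that are exactly 'Max' are changed
whole = names.replace('Max', 'Jim')

# Series.str.replace: every occurrence of 'Max' inside a string is changed
part = names.str.replace('Max', 'Jim')

print(whole.tolist())  # ['John', 'Jim', 'Maxwell']
print(part.tolist())   # ['John', 'Jim', 'Jimwell']
```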

Using regular expression

Replace a leading A or F in all columns
df=df.replace(regex='^[AF]',value='*')
Output
   id    name class1  mark  gender
0   1    John   *our    75  female
1   2     Max  Three    85    male
2   3  *rnold  Three    55    male
3   4   Krish   *our    60  female
4   5    John   *our    60  female
5   4   Krish   *our    60  female
6   2     Max  Three    85    male
Replace values that start with M and are exactly three characters long
df=df.replace(regex={r'^M..$':'foo'})
Output
   id    name class1  mark  gender
0   1    John   Four    75  female
1   2     foo  Three    85    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5    John   Four    60  female
5   4   Krish   Four    60  female
6   2     foo  Three    85    male
Replace the two characters hn at the end of a value
df=df.replace(regex={r'hn$':'foo'})
Output
   id    name class1  mark  gender
0   1   Jofoo   Four    75  female
1   2     Max  Three    85    male
2   3  Arnold  Three    55    male
3   4   Krish   Four    60  female
4   5   Jofoo   Four    60  female
5   4   Krish   Four    60  female
6   2     Max  Three    85    male
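The regex examples above scan every column. To limit a pattern to one column, the column name can be given as a dictionary key together with regex=True. A minimal sketch, assuming a two-row extract of the sample data:

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Arnold'], 'class1': ['Four', 'Three']})

# The pattern is applied to the name column only, so 'Four' in class1 is kept
result = df.replace({'name': r'^[AF]'}, {'name': '*'}, regex=True)

print(result['name'].tolist())    # ['John', '*rnold']
print(result['class1'].tolist())  # ['Four', 'Three']
```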

Series

Replaces each occurrence of a pattern in every string of the Series or Index.
Returns a Series or Index.

Replacing string

All @ are replaced by #
import pandas as pd 
my_dict={'email':['Ravi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.replace('@','#'))
Output
0    Ravi#example.com
1    Raju#example.com
2    Alex#example.com
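The character @ has no special meaning in a regular expression, so the result above is the same whether the pattern is read literally or as a regex. For patterns that contain regex metacharacters such as the dot, the regex parameter decides the outcome (its default changed to False in pandas 2.0). A small sketch with an illustrative one-row Series:

```python
import pandas as pd

emails = pd.Series(['Ravi@example.com'])

# regex=False: 'e.' is the literal two-character text "e."
literal = emails.str.replace('e.', '-', regex=False)

# regex=True: 'e.' means "an e followed by any one character"
as_regex = emails.str.replace('e.', '-', regex=True)

print(literal[0])   # Ravi@exampl-com
print(as_regex[0])  # Ravi@-ampl-com
```

Passing regex explicitly keeps the code independent of the installed pandas version.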

Case insensitive search and replace

With the option case=False the search and replace ignores upper and lower case.
import pandas as pd 
my_dict={'email':['Ravi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.replace('ravi','Ronn',case=False))
Output ( Ravi is replaced by Ronn )
0    Ronn@example.com
1    Raju@example.com
2    Alex@example.com
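The same result can be reached with a regular expression flag instead of case=False. This is a sketch of that alternative, using the flags parameter with re.IGNORECASE from the standard re module:

```python
import re
import pandas as pd

emails = pd.Series(['Ravi@example.com', 'Raju@example.com'])

# flags are passed on to the re module, so regex=True is set explicitly
result = emails.str.replace('ravi', 'Ronn', flags=re.IGNORECASE, regex=True)

print(result.tolist())  # ['Ronn@example.com', 'Raju@example.com']
```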

Using Regular expression

import pandas as pd 
my_dict={'email':['Ra2vi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
# regex=True is needed from pandas 2.0 onward, where the default became regex=False
print(df.email.str.replace('^[AC]','*',regex=True))
Output ( a leading A or C is replaced with * , so the A of Alex is replaced )
0    Ra2vi@example.com
1     Raju@example.com
2     *lex@example.com
Let us replace only digits
import pandas as pd 
my_dict={'email':['Ra2vi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.replace('[0-9]','*',regex=True))
Output
0    Ra*vi@example.com
1     Raju@example.com
2     Alex@example.com
Let us replace a or b chars
import pandas as pd 
my_dict={'email':['Ra2vi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.replace('[ab]','*',regex=True))
Output
0    R*2vi@ex*mple.com
1     R*ju@ex*mple.com
2     Alex@ex*mple.com
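A note on character classes: inside square brackets the pipe is an ordinary character, not an "or". The sketch below uses a made-up string that contains a pipe to show the difference between the class with and without it.

```python
import pandas as pd

s = pd.Series(['a|b'])  # illustrative value containing a literal pipe

# '[a|b]' matches a, the pipe itself, or b
with_pipe = s.str.replace('[a|b]', '*', regex=True)

# '[ab]' matches only a or b, the pipe is left alone
without_pipe = s.str.replace('[ab]', '*', regex=True)

# Outside a character class the pipe does mean "or"
alternation = s.str.replace('a|b', '*', regex=True)

print(with_pipe[0])     # ***
print(without_pipe[0])  # *|*
print(alternation[0])   # *|*
```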

Number of replacements

The code above replaced all occurrences. Now let us limit it to one replacement per string by using n=1.
import pandas as pd
my_dict={'email':['Ra2vi@example.com','Raju@example.com','Alex@example.com']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.replace('[ab]','*',n=1,regex=True))
Output
0    R*2vi@example.com
1     R*ju@example.com
2     Alex@ex*mple.com
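The limit n is counted separately for each string, starting from the left. A short sketch with an illustrative word (not from the sample data) that compares n=2 with the default of replacing everything:

```python
import pandas as pd

words = pd.Series(['banana'])  # made-up value with three matches

# n=2: only the first two matches from the left are replaced
first_two = words.str.replace('a', '*', n=2, regex=False)

# default n=-1: all matches are replaced
all_matches = words.str.replace('a', '*', regex=False)

print(first_two[0])    # b*n*na
print(all_matches[0])  # b*n*n*
```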