We can set the maximum number of splits with the n parameter. By default ( n=-1 ) the string is split at every occurrence of the delimiter. The sample data below contains several delimiters, so the effect of n=1 is visible: only the first dot is used.
import numpy as np
import pandas as pd
my_dict={'email':['id.Ravi@example.co.in','id.Raju@example.co.in',np.nan,'id.Alex@example.co.in']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.split('.',expand=True,n=1))
Output
     0                   1
0   id  Ravi@example.co.in
1   id  Raju@example.co.in
2  NaN                 NaN
3   id  Alex@example.co.in
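To see how n changes the number of columns, here is a minimal sketch that compares n=2 with the default. The variable names emails, two_splits and all_splits are only illustrative.

```python
import numpy as np
import pandas as pd

emails = pd.Series(['id.Ravi@example.co.in', np.nan])

# n=2: at most two splits, so at most three columns
two_splits = emails.str.split('.', n=2, expand=True)

# default n=-1: split at every dot
all_splits = emails.str.split('.', expand=True)

print(two_splits.shape)             # (2, 3)
print(all_splits.shape)             # (2, 4)
print(two_splits.iloc[0].tolist())  # ['id', 'Ravi@example', 'co.in']
```

The row holding NaN stays NaN in every column, whatever value of n is used.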
rsplit()
We can split the string starting from the right side (the end) by using rsplit(). With n=1 only the last dot is used as the delimiter.
import numpy as np
import pandas as pd
my_dict={'email':['id.Ravi@example.co.in','id.Raju@example.co.in',np.nan,'id.Alex@example.co.in']}
df = pd.DataFrame(data=my_dict)
print(df.email.str.rsplit('.',expand=True,n=1))
Output
                    0    1
0  id.Ravi@example.co   in
1  id.Raju@example.co   in
2                 NaN  NaN
3  id.Alex@example.co   in
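The difference between the two methods is easiest to see side by side. The sketch below uses the same kind of email data. The names left, right and last_part are only illustrative.

```python
import pandas as pd

emails = pd.Series(['id.Ravi@example.co.in', 'id.Alex@example.co.in'])

left = emails.str.split('.', n=1, expand=True)    # cut at the first dot
right = emails.str.rsplit('.', n=1, expand=True)  # cut at the last dot

# Without expand each row holds a list; .str[-1] picks its last element
last_part = emails.str.rsplit('.', n=1).str[-1]

print(left.iloc[0].tolist())   # ['id', 'Ravi@example.co.in']
print(right.iloc[0].tolist())  # ['id.Ravi@example.co', 'in']
print(last_part.tolist())      # ['in', 'in']
```

When n is left at its default, split() and rsplit() return the same parts. The direction only matters once the number of splits is limited.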
Uses of split()
A common requirement is to separate the directory and the file name in a path. Suppose the data holds some addresses ( URLs ) in a page column. Let us collect the directory name and the file name from each of them.
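One possible way to do this is a single split from the right, so that everything before the last / becomes the directory. The URLs below are hypothetical sample values and the column names directory and file are assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'page': ['python/pandas/split.php',
                            'python/index.php',
                            'sql/select.php']})

# One split from the right: the part after the last '/' is the file name
df[['directory', 'file']] = df['page'].str.rsplit('/', n=1, expand=True)

print(df)
```

Because n=1 limits the split to the last delimiter, the result always has two columns, however deep the path is.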
When the paths have several levels we can split at every delimiter and name the resulting columns automatically. This helps when we are not sure how many columns the split will return: sometimes two, sometimes more.
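# split each path at every '/'; expand=True returns one column per part
df3 = df['page'].str.split('/', expand=True)
# rename the integer column labels 0, 1, 2 ... to page_id1, page_id2, page_id3 ...
df3.columns = ['page_id{}'.format(x+1) for x in df3.columns]
# attach the new columns to the original DataFrame
df = df.join(df3)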
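Here is a self-contained run of the same steps. It is a sketch only: the two paths are hypothetical sample values, chosen so that the rows have a different number of levels.

```python
import pandas as pd

# Hypothetical sample data with two and three path levels
df = pd.DataFrame({'page': ['python/pandas/split.php', 'python/index.php']})

df3 = df['page'].str.split('/', expand=True)
df3.columns = ['page_id{}'.format(x + 1) for x in df3.columns]
df = df.join(df3)

print(df.columns.tolist())  # ['page', 'page_id1', 'page_id2', 'page_id3']
print(df['page_id3'].isna().tolist())  # [False, True]
```

The number of new columns follows the longest path, and shorter paths are filled with missing values in the extra columns.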