str.contains() searches each value of a string column (Series) for a substring or regular expression pattern, with options that control how the match is made.
It returns a boolean Series.
We will read data from an Excel file (student.xlsx) with read_excel() to create a DataFrame.
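As a minimal sketch of the boolean Series that comes back, the example below uses a few made-up names in place of student.xlsx (the sample values are assumptions for illustration only).

```python
import pandas as pd

# Hypothetical sample names standing in for the name column of student.xlsx
names = pd.Series(['Alex John', 'Ronald', 'Big John'], name='name')

# One True/False value per row; matching is case sensitive by default
mask = names.str.contains('Al')
print(mask.tolist())  # [True, False, False]
```

The mask can be placed inside square brackets, as in names[mask], to keep only the matching rows.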
import pandas as pd
my_data = pd.read_excel('student.xlsx')
print(my_data)
This prints all the rows.
We will use contains() to get only the rows that have Al in the name column. The option case=False makes the match case insensitive, which is why Ronald (containing al) is also returned. To make the match case sensitive, set case=True, which is the default.
import pandas as pd
my_data = pd.read_excel('student.xlsx')
my_data=my_data[my_data['name'].str.contains('Al',case=False)]
print(my_data)
Output
    id       name class  mark     sex
5    6  Alex John  Four    55    male
10  11     Ronald   Six    89  female
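To see the effect of the case option side by side, here is a small sketch using an in-memory DataFrame. The three rows are hypothetical sample data, not the full contents of student.xlsx.

```python
import pandas as pd

# Hypothetical rows standing in for student.xlsx
df = pd.DataFrame({'id': [6, 11, 3],
                   'name': ['Alex John', 'Ronald', 'Big John']})

# case=False: 'Al', 'al', 'AL' and 'aL' all count as a match
insensitive = df[df['name'].str.contains('Al', case=False)]
# case=True: only the exact letters 'Al' match
sensitive = df[df['name'].str.contains('Al', case=True)]

print(insensitive['name'].tolist())  # ['Alex John', 'Ronald']
print(sensitive['name'].tolist())    # ['Alex John']
```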
regex=True | False
We can use regular expression pattern matching by setting the option regex=True, which is the default. With regex=False the pattern is treated as a literal string.
We will collect the rows where the name column starts with A or B.
import pandas as pd
my_data = pd.read_excel('student.xlsx')
my_data=my_data[my_data['name'].str.contains('^[AB]',case=True,regex=True)]
print(my_data)
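The difference between regex=True and regex=False can be sketched with the same pattern applied both ways. The names below are made up for illustration, and the last one deliberately contains the pattern text itself.

```python
import pandas as pd

# Hypothetical names; the last value contains the characters ^[AB] literally
names = pd.Series(['Alex John', 'Big John', 'Ronald', '^[AB] literal'])

# regex=True: ^[AB] means "starts with A or B"
as_regex = names.str.contains('^[AB]', regex=True)
# regex=False: the five characters ^[AB] are searched for as plain text
as_literal = names.str.contains('^[AB]', regex=False)

print(as_regex.tolist())    # [True, True, False, False]
print(as_literal.tolist())  # [False, False, False, True]
```

Setting regex=False is useful when the search text contains characters such as . or [ that would otherwise be read as regular expression syntax.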
Name column ending with d
my_data=my_data[my_data['name'].str.contains('d$',regex=True)]
Name column ending with a or d
my_data=my_data[my_data['name'].str.contains('[ad]$',regex=True)]
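A short sketch of the two ending patterns, again on made-up names rather than the real file, shows which rows each one keeps.

```python
import pandas as pd

# Hypothetical names used only to illustrate the $ anchor
names = pd.Series(['Ronald', 'Linda', 'Alex John'])

# $ anchors the match to the end of the string
ends_with_d = names[names.str.contains('d$', regex=True)]
# [ad]$ accepts either a or d as the last character
ends_with_a_or_d = names[names.str.contains('[ad]$', regex=True)]

print(ends_with_d.tolist())       # ['Ronald']
print(ends_with_a_or_d.tolist())  # ['Ronald', 'Linda']
```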
Name column not having Al
my_data=my_data[my_data['name'].str.contains('^((?!Al).)*$',case=True,regex=True)]
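Or negate the result with ~. Note that case=False here also drops names containing al in any letter case (such as Ronald); use case=True to get the same rows as the pattern above.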
my_data=my_data[~my_data['name'].str.contains('Al',case=False)]
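The ~ operator flips every True and False in the mask, so the case option still decides which rows are excluded. A minimal sketch on hypothetical names:

```python
import pandas as pd

# Hypothetical names standing in for the name column
names = pd.Series(['Alex John', 'Ronald', 'Big John'])

# Case sensitive: only names containing exactly 'Al' are removed
without_al_exact = names[~names.str.contains('Al', case=True)]
# Case insensitive: 'Ronald' is removed too because it contains 'al'
without_al_any_case = names[~names.str.contains('Al', case=False)]

print(without_al_exact.tolist())     # ['Ronald', 'Big John']
print(without_al_any_case.tolist())  # ['Big John']
```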