keep | Optional. 'first' (default): all duplicates are marked True except the first occurrence. 'last': all duplicates are marked True except the last occurrence. False: all duplicates are marked True.
|
import pandas as pd
my_data=pd.Series(['One','Two','Three','Two','Four'])
my_data.duplicated()
Output (the default value of keep is 'first', so this is the same as keep='first')
0 False
1 False
2 False
3 True
4 False
dtype: bool
With keep='last', duplicate values are marked True except the last occurrence.
import pandas as pd
my_data=pd.Series(['One','Two','Three','Two','Four'])
my_data.duplicated(keep='last')
Output
0 False
1 True
2 False
3 False
4 False
dtype: bool
With keep=False, every value that occurs more than once is marked True, including its first occurrence.
import pandas as pd
my_data=pd.Series(['One','Two','Three','Two','Four'])
my_data.duplicated(keep=False)
Output
0 False
1 True
2 False
3 True
4 False
dtype: bool
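To compare the three keep settings at a glance, the results can be placed side by side. This is a minimal sketch using the same my_data Series; the DataFrame and its column names (value, first, last, all) are chosen here only for illustration.

```python
import pandas as pd

my_data = pd.Series(['One', 'Two', 'Three', 'Two', 'Four'])

# One column per keep option; the column names are illustrative only
comparison = pd.DataFrame({
    'value': my_data,
    'first': my_data.duplicated(keep='first'),  # default behaviour
    'last': my_data.duplicated(keep='last'),
    'all': my_data.duplicated(keep=False),
})
print(comparison)
```

Only rows 1 and 3 (the two occurrences of 'Two') differ between the columns, which makes the effect of each option easy to see.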
The result can be used as a boolean mask to display only the duplicate values.
print(my_data[my_data.duplicated()])
Output
3 Two
dtype: object
Inverting the mask with ~ keeps only the first occurrence of each value.
print(my_data[~my_data.duplicated()])
Output
0 One
1 Two
2 Three
4 Four
dtype: object
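The same filtering is also available through drop_duplicates(), which accepts the same keep argument. A short sketch, assuming the my_data Series used above; the names by_mask and by_drop are introduced here only for the comparison.

```python
import pandas as pd

my_data = pd.Series(['One', 'Two', 'Three', 'Two', 'Four'])

# Inverting the duplicated() mask keeps the first occurrence of each value
by_mask = my_data[~my_data.duplicated()]

# drop_duplicates() removes the repeated values directly (keep='first' by default)
by_drop = my_data.drop_duplicates()

# Both approaches keep the same values and the same index labels
print(by_mask.equals(by_drop))
```

The mask form is handy when the True / False flags are needed for something else as well; drop_duplicates() is shorter when only the cleaned Series is wanted.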
We can also use unique() to list the distinct values in order of first appearance.
print(my_data.unique())
Output
['One' 'Two' 'Three' 'Four']
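When only the counts matter, the flags and the distinct values can be reduced to numbers. A minimal sketch on the same my_data Series; the variable names duplicate_count and distinct_count are illustrative.

```python
import pandas as pd

my_data = pd.Series(['One', 'Two', 'Three', 'Two', 'Four'])

# True counts as 1, so sum() gives the number of values flagged as duplicates
duplicate_count = int(my_data.duplicated().sum())

# nunique() counts the distinct values, matching the length of unique()
distinct_count = my_data.nunique()

print(duplicate_count, distinct_count)
```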