keep | Optional. 'first' (default): all duplicates are dropped except the first occurrence. 'last': all duplicates are dropped except the last occurrence. False: all duplicates are dropped.
import pandas as pd
my_data=pd.Series(['One','Two','Three','Two','Four'])
df=my_data.drop_duplicates()
print(df)
Output (the default value of keep is 'first')
0 One
1 Two
2 Three
4 Four
dtype: object
With keep='last', duplicate values are deleted except the last one
df=my_data.drop_duplicates(keep='last')
print(df)
Output
0 One
2 Three
3 Two
4 Four
dtype: object
With keep=False, all duplicated values are deleted, including their first occurrence
df=my_data.drop_duplicates(keep=False)
print(df)
Output
0 One
2 Three
4 Four
dtype: object
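The three keep options can be compared side by side. The following is a minimal sketch, assuming the same sample Series as above; the names mask, kept_index and matches are only illustrative. It also uses duplicated(), which returns a boolean mask with the same keep parameter, to show that dropping duplicates is equivalent to filtering out the rows that duplicated() marks True.

```python
import pandas as pd

my_data = pd.Series(['One', 'Two', 'Three', 'Two', 'Four'])

kept_index = {}  # index labels that survive for each keep option
matches = {}     # whether filtering by the mask gives the same result

for keep in ('first', 'last', False):
    # duplicated() marks the rows that drop_duplicates() would remove
    mask = my_data.duplicated(keep=keep)
    via_mask = my_data[~mask]
    via_drop = my_data.drop_duplicates(keep=keep)

    kept_index[keep] = list(via_drop.index)
    matches[keep] = via_mask.equals(via_drop)
    print(keep, kept_index[keep], matches[keep])
```

Note that the surviving rows keep their original index labels, which is why the outputs above show gaps such as 0, 2, 4.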
By default inplace=False, so our main Series my_data is not altered when we use drop_duplicates(). That is why the code above stores the output of drop_duplicates() in another variable, df. By using inplace=True we can modify our main Series my_data directly.
import pandas as pd
my_data=pd.Series(['One','Two','Three','Two','Four'])
my_data.drop_duplicates(keep='last',inplace=True)
print(my_data)
Output
0 One
2 Three
3 Two
4 Four
dtype: object
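One detail worth keeping in mind with inplace=True is that the method returns None, so its result should not be assigned back to a variable. The sketch below, which assumes the same sample Series, shows this and then renumbers the index with reset_index(drop=True); the names result and renumbered are only illustrative.

```python
import pandas as pd

my_data = pd.Series(['One', 'Two', 'Three', 'Two', 'Four'])

# With inplace=True the Series is changed directly and None is returned
result = my_data.drop_duplicates(keep='last', inplace=True)
print(result)
print(list(my_data))

# The original index labels are kept, so there is a gap after dropping.
# reset_index(drop=True) gives a fresh 0..n-1 index without keeping the old one.
renumbered = my_data.reset_index(drop=True)
print(list(renumbered.index))
```

If you prefer not to modify the original object, the usual alternative is to reassign: my_data = my_data.drop_duplicates(keep='last').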