Cleaning Data

Data collected from different sources is often not in a form or format that our applications can use directly. Pandas provides several methods for correcting such data.

thresh : Use the thresh option of dropna() to remove rows and columns
dropna : Delete rows or columns containing NaN
df.drop_duplicates : Delete duplicate rows from a DataFrame
duplicated : Identify duplicate rows in a DataFrame
Series.duplicated : Identify duplicate values in a Series
Series.drop_duplicates : Delete duplicate data from a Series
replace : Replace data
notnull : Check for data that is not null or NaN
fillna : Fill NA/NaN values
dtypes : Pandas data types
select_dtypes : Subset of a DataFrame based on data type
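
The methods listed above can be tried together on a small DataFrame. The following is a minimal sketch; the column names and values are hypothetical sample data chosen for illustration and do not come from the tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption for illustration only).
df = pd.DataFrame({
    "name": ["Alex", "Ravi", "Ravi", None, "Mona"],
    "mark": [80.0, 65.0, 65.0, np.nan, np.nan],
    "class": ["Four", "Five", "Five", None, "Four"],
})

# notnull(): True where a value is present, False for None/NaN.
present = df["mark"].notnull()

# dropna(): drop every row that contains at least one missing value.
no_nan = df.dropna()

# thresh=2: keep only rows that have at least 2 non-missing values.
at_least_two = df.dropna(thresh=2)

# fillna(): fill missing marks with 0, leaving other columns untouched.
filled = df.fillna({"mark": 0})

# duplicated(): flag rows that repeat an earlier row.
dup_mask = df.duplicated()

# drop_duplicates(): keep the first occurrence of each row.
unique_rows = df.drop_duplicates()

# replace(): swap values in one column using a mapping.
class_codes = df["class"].replace({"Four": "4", "Five": "5"})

# dtypes and select_dtypes(): inspect types, then keep numeric columns only.
column_types = df.dtypes
numeric_only = df.select_dtypes(include="number")

print(no_nan)
print(at_least_two)
print(unique_rows)
```

Each call returns a new object, so the original df stays unchanged unless the result is assigned back to it.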
Removing comma from a string column
df['p_view']=df['p_view'].apply(lambda x: x.replace(',',''))
Converting to integer data type
df['p_view'] = df['p_view'].astype('int')
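
The two steps above can also be written with the vectorized string accessor, which skips missing values instead of raising an error as the lambda would. This is a sketch under assumptions: the sample values in p_view and the extra Series named raw are hypothetical.

```python
import pandas as pd

# Hypothetical sample values for the p_view column.
df = pd.DataFrame({"p_view": ["1,234", "56", "7,890,123"]})

# Remove the commas as literal text (regex=False), then convert to integer.
df["p_view"] = df["p_view"].str.replace(",", "", regex=False).astype("int")

# When some entries are missing or not numeric, astype('int') raises an error.
# pd.to_numeric with errors="coerce" turns such entries into NaN instead.
raw = pd.Series(["1,234", "n/a", None])
cleaned = pd.to_numeric(raw.str.replace(",", "", regex=False), errors="coerce")

print(df["p_view"].tolist())
print(cleaned)
```

Because NaN is a floating point value, the coerced Series has a float data type rather than an integer one.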
A regex that keeps only ASCII characters (code points 0 to 127) and removes everything else from each string field in the DataFrame.
df.replace({r'[^\x00-\x7F]+':''}, regex=True, inplace=True)
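
A small sketch of what this replacement does, using hypothetical sample data. Here the result is assigned to a new DataFrame instead of using inplace=True, so the original can be compared with the cleaned copy.

```python
import pandas as pd

# Hypothetical sample data containing non-ASCII characters.
df = pd.DataFrame({
    "city": ["Zürich", "Delhi"],
    "note": ["café ☕", "ok"],
    "count": [1, 2],
})

# Remove every run of characters outside the ASCII range \x00-\x7F.
# Only string values are matched; the numeric column is left as it is.
cleaned = df.replace({r"[^\x00-\x7F]+": ""}, regex=True)

print(cleaned)
```

Note that the characters are deleted, not transliterated, so "Zürich" becomes "Zrich" and not "Zurich".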

©2000-2024 plus2net.com All rights reserved worldwide