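Pickling is the process of serializing a Python object into a binary format; un-pickling is the reverse process of de-serializing that binary data back into a Python object. In Python both are done with the pickle module.
Serialization and de-serialization are therefore also known as pickling and un-pickling. The same process is sometimes called marshalling or flattening.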
Purpose of Pickling
By pickling an object we can store it for later use or transport it to another Python program. The result is binary data, usually saved as a file, which we can keep or share with others.
Disadvantages
Pickle is not a universal format for storing data. Unlike JSON, CSV, XML or similar formats, a pickled file can't be read by other languages. It works within Python only, so we can open and use these files in a Python environment only.
Pickling does not create a human readable format, and the binary file can carry malicious code that runs while the file is un-pickled. So we must open pickled files from trusted sources only.
A pickle written with a newer protocol version may not load in an older Python version that does not support that protocol.
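As a rough illustration of the version issue, the sketch below pickles the same object with an explicitly chosen protocol. The protocol argument, pickle.HIGHEST_PROTOCOL and pickle.dumps() belong to the standard pickle module; the sample data and variable names are made up for this example.

```python
import pickle

data = {'ID': [1, 2, 3]}  # hypothetical sample data

# Protocol 2 is an older protocol that older Python versions can also read
old_style = pickle.dumps(data, protocol=2)

# The newest protocol this interpreter supports; older interpreters may reject it
newest = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# From protocol 2 onwards the second byte of the pickle stores the protocol number
print(old_style[1], newest[1])

# The running interpreter can read both
print(pickle.loads(old_style) == pickle.loads(newest))
```

If the file has to be read by an older Python installation, choosing a lower protocol when writing is one possible way to stay compatible.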
What can be pickled
Here is a list of objects which can be pickled (and un-pickled).
None and the Booleans True and False
Strings, bytes and byte arrays
Integers, floating-point numbers and complex numbers
Tuples, lists, sets and dictionaries that contain only picklable objects
Functions defined at the top level of a module (using def, not lambda)
Built-in functions
Classes defined at the top level of a module
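One way to check the list above is to try pickling a few objects and see which ones fail. This is only a sketch: is_picklable() is a hypothetical helper written for this example, not part of the pickle module, and it uses pickle.dumps(), which returns the pickled bytes instead of writing them to a file.

```python
import pickle

def is_picklable(obj):
    """Hypothetical helper: return True if pickle.dumps() accepts obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

print(is_picklable(None))                       # simple constant
print(is_picklable({'a': [1, 2.5, 3 + 4j]}))    # dictionary of picklable items
print(is_picklable(len))                        # built-in function
print(is_picklable(lambda x: x))                # lambda: has no importable name
```

Functions and classes are pickled by reference to their name, which is why a lambda (which has no name that can be imported) is rejected.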
Pickle versus JSON
JSON (JavaScript Object Notation) is a standard text format for data exchange between different processes. Let us see the differences between JSON and Pickle.
Format
JSON is a human readable text format. Pickle is a binary format, so it is not human readable.
Standardization
JSON is an accepted format across many platforms and languages. Pickle is specific to Python and other languages may not support it, so it can't be used as a standard data exchange format across platforms.
Security
Reading JSON formatted data does not in itself create any vulnerability. Pickled binary data, however, may contain malicious code that is executed during un-pickling, so we must trust the source before un-pickling.
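The harmless sketch below hints at why un-pickling untrusted data is risky. The class Demo is hypothetical and written only for this illustration; its __reduce__ method tells pickle which callable to run while loading. Here that callable is the built-in len(), but a malicious file could name a dangerous function instead.

```python
import pickle

class Demo:
    # __reduce__ returns (callable, arguments); pickle.loads() will call it
    def __reduce__(self):
        return (len, ("abc",))

payload = pickle.dumps(Demo())

# Loading does not rebuild a Demo object, it calls len("abc")
result = pickle.loads(payload)
print(result)  # 3
```

The loaded value is whatever the stored callable returns, which shows that a pickle file can make Python call functions chosen by whoever created the file.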
Because JSON is a language-independent format, it supports only a small set of basic data types and cannot store Python-specific classes directly. Pickle supports a wide range of Python objects, so it can be handy for Python-specific uses.
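A small comparison may make the difference in supported types clearer. The sample dictionary below is made up for this sketch; it holds a set and a complex number, two Python types that the standard json module does not serialize by default.

```python
import json
import pickle

data = {'scores': {30, 40, 50}, 'z': 2 + 3j}  # hypothetical sample data

# json.dumps() raises TypeError for the set (and for the complex number)
try:
    json.dumps(data)
    json_ok = True
except TypeError:
    json_ok = False

# pickle handles both types and returns an equal object
restored = pickle.loads(pickle.dumps(data))

print(json_ok)
print(restored == data)
```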
pickle.load() & pickle.dump()
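pickle.load(file) : Reads the pickled binary data from the open file object (opened in binary read mode) and returns the un-pickled Python object.
pickle.dump(py_obj,file) : Pickles the object py_obj and writes the binary data to the open file object (opened in binary write mode).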
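Both functions accept any binary file-like object, so the pair can be tried without creating a file on disk. The sketch below uses io.BytesIO from the standard library as an in-memory stand-in for a file; the dictionary marks and the other variable names are assumptions made for this example.

```python
import io
import pickle

marks = {'NAME': ['Ravi', 'Raju'], 'MATH': [30, 40]}  # hypothetical sample data

buffer = io.BytesIO()         # in-memory stand-in for a file opened with 'wb'
pickle.dump(marks, buffer)    # pickle marks and write the bytes to buffer
buffer.seek(0)                # move back to the start before reading
marks_copy = pickle.load(buffer)  # read the bytes and rebuild the object

print(marks_copy == marks)    # same content
print(marks_copy is marks)    # but a separate object
```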
How to pickle
Here are two examples. The first shows how to pickle a simple dictionary and the second shows how to un-pickle it.
import pickle

my_dict = {
    'NAME': ['Ravi', 'Raju', 'Alex'],
    'ID': [1, 2, 3],
    'MATH': [30, 40, 50],
    'ENGLISH': [20, 30, 40]
}
# file object fob is created in binary write mode; the with block closes it
with open('my_pickle', 'wb') as fob:
    pickle.dump(my_dict, fob)  # generate the pickle and write it to the file
In the above code we created one file object (fob) to handle the file my_pickle (without any extension). To read the dictionary back, we open the same file in binary read mode.
import pickle

# open the pickled file in binary read mode; the with block closes it
with open('my_pickle', 'rb') as fob:
    my_dict1 = pickle.load(fob)  # reading the pickle
print(my_dict1)