read_sql
to execute query and store the details in Pandas DataFrame.
sql | Query or name of the database table to collect data to DataFrame |
con | Database connection string |
params | default = None, Parameters to be passed along with the Query |
index_col | Column to be used as index in DataFrame |
coerce_float | bool, Converts non-string, non-numeric to float |
parse_date | default=None, List of columns to be parse as Date |
columns | List of columns to return, by default all columns are available. This option is to be used when in place of SQL table name is used. ( for reading only ) |
chunksize | Number of rows to be included on each Chunk, iterator is returned. |
import mysql.connector
import pandas as pd
my_conn = mysql.connector.connect(
host="localhost",
user="userid",
passwd="******",
database="my_tutorial"
)
####### end of connection ####
my_data = pd.read_sql("SELECT * FROM student WHERE class='four'",my_conn)
print(my_data)
Output is here
id name class mark sex
0 1 John Deo Four 75 female
1 4 Krish Star Four 60 female
2 5 John Mike Four 60 female
3 6 Alex John Four 55 male
4 10 Big John Four 55 female
5 15 Tade Row Four 88 male
6 16 Gimmy Four 88 male
7 21 Babby John Four 69 female
8 31 Marry Toeey Four 88 male
To get only specified columns
print(my_data[['name','class']])
( Query can be changed to retrieve only required columns )
import pandas as pd
from sqlalchemy import create_engine
my_conn = create_engine("mysql+mysqldb://usrid:password@localhost/my_db")
my_data = pd.read_sql("SELECT * FROM student WHERE class='four'",my_conn)
print(my_data)
query="SELECT * FROM student WHERE class=%s"
my_data = pd.read_sql(query,my_conn,params=('Five',))
Note that params takes list or tuple or dictionary. In the above code we used one tuple ( note the comma at the end of the tupple).
query="SELECT * FROM student WHERE class=%s"
my_data = pd.read_sql(query,my_conn,params={'Four'})
Using List
query="SELECT * FROM student WHERE class=%s"
my_data = pd.read_sql(query,my_conn,params=['Four'])
Using more parameters with SQL in ( read more on WHERE query )
query="SELECT * FROM student WHERE class=%s and id in(%s,%s)"
my_data = pd.read_sql(query,my_conn,params=['Four',6,10] )
print(my_data)
q="SELECT * FROM `table_name` WHERE page LIKE '%%abcd%%'"
df=pd.read_sql(q,mysql_conn)
query="SELECT * FROM student WHERE class='Five'"
my_data=pd.read_sql(query,my_conn,index_col='id')
print(my_data)
Output ( DataFrame uses the id column as index column )
name class mark gender
id
7 My John Rob Five 78 male
8 Asruid Five 85 male
18 Honny Five 75 male
my_data=pd.read_sql('student',my_conn,columns=['id','name'])
print(my_data)
Output ( sample rows only, there are more records )
id name
0 1 John Deo
1 2 Max Ruin
2 3 Arnold
3 4 Krish Star
-----
-----
to get all columns
my_data = pd.read_sql("student",my_conn)
print(my_data.columns)
To print all column names by looping
my_data = pd.read_sql("student",my_conn)
print(my_data.columns) # output
for i in my_data.columns:
print(i)
query="SELECT * FROM student "
my_data = pd.read_sql(query,my_conn,chunksize=3 )
print(next(my_data))
print("--End of first set of records ---")
print(next(my_data))
Output is here
id name class mark sex
0 1 John Deo Four 75 female
1 2 Max Ruin Three 85 male
2 3 Arnold Three 55 male
--End of first set of records ---
id name class mark sex
0 4 Krish Star Four 60 female
1 5 John Mike Four 60 female
2 6 Alex John Four 55 male
query="CREATE TABLE IF NOT EXISTS student5 (`id` bigint(20) NOT NULL,\
`name` text,`class` text,`mark` bigint(20) DEFAULT NULL,gender text,\
PRIMARY KEY (`id`),UNIQUE KEY `id` (`id`)) ENGINE=InnoDB DEFAULT
CHARSET=latin1;"
pd.read_sql(query,my_conn)
We can run multiple queries
query = query + query2
pd.read_sql(query,my_conn)
In above sample codes you can see the basic connection and execution remain same and only by changing query we can get data in different ways and types.
Some of the sample queries are listed here .
#To get all records of a particular class
query="SELECT * FROM student WHERE class='Five'"
#To get highest mark among all records
query="SELECT max(mark) FROM student"
#To get average mark among all records
query="SELECT avg(mark) FROM student"
#To get highest mark of each class
query="SELECT class,count(*) as no FROM student GROUP BY class"
#To get average mark of each class
query="SELECT class,avg(*) as avg FROM student GROUP BY class"
#To get minimum mark of each class
query="SELECT class,min(*) as min FROM student GROUP BY class"
#To get record details of highest class
query="SELECT * FROM student WHERE class='Five' ORDER BY mark DESC LIMIT 0,1"
#To get second highest record of any class.
query="SELECT * FROM student WHERE class='Five' ORDER BY mark DESC LIMIT 1,1"
You can read more on various SQL in our SQL Exercise section.