Name of the SQL table to be read.
import sqlite3
import pandas as pd
# Create an in-memory SQLite database and sample data
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
# Create a table
cursor.execute('CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL, department TEXT)')
cursor.executemany('INSERT INTO employees (name, salary, department) VALUES (?, ?, ?)', [('Alice', 75000.5, 'HR'), ('Bob', 80000.0, 'Engineering'), ('Chloe', 85000.0, 'Marketing')])
conn.commit()
# Use pandas.read_sql_query to fetch data
query = "SELECT id, name, salary FROM employees WHERE department = 'Engineering'"
df = pd.read_sql_query(sql=query, con=conn)
conn.close()
print(df)
'''
Output:
id name salary
0 2 Bob 80000.0
'''
SQLAlchemy connection engine or database connection.
import sqlite3
import pandas as pd
# Create an in-memory SQLite database and sample data
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
# Create a table
cursor.execute('CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL, department TEXT)')
cursor.executemany('INSERT INTO employees (name, salary, department) VALUES (?, ?, ?)', [('Alice', 75000.5, 'HR'), ('Bob', 80000.0, 'Engineering'), ('Chloe', 85000.0, 'Marketing')])
conn.commit()
# Use pandas.read_sql_query to fetch data
query = "SELECT id, name, salary FROM employees WHERE department = 'HR'"
df = pd.read_sql_query(sql=query, con=conn)
conn.close()
print(df)
'''
Output:
id name salary
0 1 Alice 75000.5
'''
Column(s) to set as the DataFrame index.
import sqlite3
import pandas as pd
# Create an in-memory SQLite database and sample data
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
# Create a table
cursor.execute('CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL, department TEXT)')
cursor.executemany('INSERT INTO employees (name, salary, department) VALUES (?, ?, ?)', [('Alice', 75000.5, 'HR'), ('Bob', 80000.0, 'Engineering'), ('Chloe', 85000.0, 'Marketing')])
conn.commit()
# Use pandas.read_sql_query to fetch data
query = "SELECT id, name, salary FROM employees WHERE department = 'Marketing'"
df = pd.read_sql_query(sql=query, con=conn, index_col="id")
conn.close()
print(df)
'''
Output:
name salary
id
3 Chloe 85000.0
'''
It is used to control whether non-numeric data that can be interpreted as numeric values should be converted to float type when reading data from the SQL table into a DataFrame.
Columns to parse as dates.
import sqlite3
import pandas as pd
# Create an in-memory SQLite database and sample data
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
# Create a table
cursor.execute('CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL, date TEXT)')
cursor.executemany('INSERT INTO employees (name, salary, date) VALUES (?, ?, ?)', [('Alice', 75000.5, "2003-10-02"), ('Bob', 80000.0, '2003-04-16'), ('Chloe', 85000.0, '2023-12-29')])
conn.commit()
# Use pandas.read_sql_query to fetch data
df = pd.read_sql(sql="SELECT id, name, salary, date FROM employees", con=conn, parse_dates=["date"])
conn.close()
print(df['date'])
'''
Output:
0 2003-10-02
1 2003-04-16
2 2023-12-29
Name: date, dtype: datetime64[ns]
'''
It allows you to safely pass values to a SQL query as parameters.
import sqlite3
import pandas as pd
# Create an in-memory SQLite database and sample data
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE employees (id INTEGER, name TEXT, salary REAL)')
cursor.executemany('INSERT INTO employees (id, name, salary) VALUES (?, ?, ?)', [(1, 'Alice', 75000.5), (2, 'Bob', 80000.0), (3, 'Chloe', 85000.0)])
conn.commit()
# Query with parameters using ?
query = "SELECT id, name, salary FROM employees WHERE salary > ?"
df = pd.read_sql_query(sql=query, con=conn, params=[78000.0])
conn.close()
print(df)
'''
Output:
id name salary
0 2 Bob 80000.0
1 3 Chloe 85000.0
'''
Number of rows per chunk for iteration.
import sqlite3
import pandas as pd
# Create an in-memory SQLite database and sample data
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE emp (id INTEGER, name TEXT, salary REAL)')
cursor.executemany('INSERT INTO emp (id, name, salary) VALUES (?, ?, ?)', [(1, 'Alice', 75000.5), (2, 'Bob', 80000.0), (3, 'Chloe', 85000.0)])
conn.commit()
query = "SELECT * FROM emp"
df = pd.read_sql_query(sql=query, con=conn, chunksize=2)
print(next(df))
conn.close()
'''
Output:
id name salary
0 1 Alice 75000.5
1 2 Bob 80000.0
'''
The connection must always be closed after iterating through all the chunks from the DataFrame Iterator.
Dictionary specifying data types for specific columns.
import sqlite3
import pandas as pd
# Create an in-memory SQLite database and sample data
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE emp (id INTEGER, name TEXT, salary REAL)')
cursor.executemany('INSERT INTO emp (id, name, salary) VALUES (?, ?, ?)', [(1, 'Alice', 75000.5), (2, 'Bob', 80000.0), (3, 'Chloe', 85000.0)])
conn.commit()
query = "SELECT * FROM emp"
df = pd.read_sql_query(sql=query, con=conn, dtype={'id': 'float16'})
conn.close()
print(df)
'''
Output:
id name salary
0 1.0 Alice 75000.5
1 2.0 Bob 80000.0
2 3.0 Chloe 85000.0
'''
int8: 8-bit signed integer (range: -128 to 127).
int16: 16-bit signed integer (range: -32,768 to 32,767).
int32: 32-bit signed integer (range: -2,147,483,648 to 2,147,483,647).
int64: 64-bit signed integer (large integer range).
uint8: 8-bit unsigned integer (range: 0 to 255).
uint16: 16-bit unsigned integer (range: 0 to 65,535).
uint32: 32-bit unsigned integer (range: 0 to 4,294,967,295).
uint64: 64-bit unsigned integer (large positive integer range).
float16: Half precision floating-point (16-bit, for low-precision computations).
float32: Single precision floating-point (32-bit).
float64: Double precision floating-point (64-bit, the default float in NumPy).
float128: Extended precision floating-point (128-bit, availability depends on system).
complex64: Complex number represented by two 32-bit floats (for real and imaginary parts).
complex128: Complex number represented by two 64-bit floats (default complex dtype).
complex256: Complex number represented by two 128-bit floats (system-dependent).
bool: Boolean type, can be either True or False (stored as 1-bit but takes up a full byte).
str: Fixed-length Unicode string, specified by S + length (e.g., S10 for a 10-character string).
unicode: Fixed-length Unicode string with support for multiple characters (uses U).
object: Allows storing any Python object, including mixed types, strings, or other arrays. Useful for heterogeneous data but slower than native NumPy types.
datetime64: Stores dates and times with varying precisions (e.g., Y, M, D, h, m, s, ms, us, ns, ps, fs, as). Example: datetime64('2003-10-02')
timedelta64: Represents time durations with units (same units as datetime64).
The dtype_backend parameter is new in Pandas 2.0 which is used to specify the backend for handling the types of data when reading a file.