It is built on top of the pd.read_table() method.
Path or file-like object containing the fixed-width file.
List of tuples giving the half-open (start, end) positions of each column; the default 'infer' detects the boundaries automatically from the whitespace pattern of the first rows.
import pandas as pd
# Create a fixed-width dataset
data = "123Alice   25\n456Bob     30\n789Charlie 35"
file_path = "file2.fwf"
with open(file_path, "w") as f:
    f.write(data)
# Specify column start and end positions
colspecs = [(0, 3), (3, 10), (10, 13)]
df1 = pd.read_fwf(file_path, colspecs='infer')
df2 = pd.read_fwf(file_path, colspecs=colspecs)
print(f"Infer:\n{df1}\nList of tuples:\n{df2}")
'''
Output:
Infer:
     123Alice  25
0      456Bob  30
1  789Charlie  35
List of tuples:
   123    Alice  25
0  456      Bob  30
1  789  Charlie  35
'''
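According to the pandas documentation, a colspecs tuple may also use None to mean "from the start of the line" or "to the end of the line". A small sketch (the file name here is illustrative):

```python
import pandas as pd

# Create a fixed-width dataset
data = "123Alice   25\n456Bob     30\n789Charlie 35"
file_path = "file2b.fwf"  # hypothetical file name
with open(file_path, "w") as f:
    f.write(data)

# (None, 3) runs from the start of the line; (10, None) runs to its end
df = pd.read_fwf(file_path, colspecs=[(None, 3), (3, 10), (10, None)])
print(df)
```

This parses the same three columns as the explicit [(0, 3), (3, 10), (10, 13)] specification above.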
The widths parameter is an alternative to colspecs: a list of field widths, one per column.
import pandas as pd
# Create a fixed-width dataset
data = "123Alice   25\n456Bob     30\n789Charlie 35"
file_path = "file3.fwf"
with open(file_path, "w") as f:
    f.write(data)
# Specify the width of each column (3 + 7 + 3 characters)
widths = [3, 7, 3]
df = pd.read_fwf(file_path, widths=widths)
print(df)
'''
Output:
   123    Alice  25
0  456      Bob  30
1  789  Charlie  35
'''
Number of rows used to infer column widths when colspecs='infer' (default 100).
import pandas as pd
# Create a fixed-width dataset
data = "123 Alice   25\n456 Bob     30\n789 Charlie 35"
file_path = "file4.fwf"
with open(file_path, "w") as f:
    f.write(data)
# Infer column widths from only the first 2 rows; the wider
# "Charlie" in the third row is then truncated
df = pd.read_fwf(file_path, colspecs="infer", infer_nrows=2)
print(df)
'''
Output:
   123  Alice  25
0  456    Bob  30
1  789  Charl  35
'''
The dtype_backend parameter, new in Pandas 2.0, specifies which backend ("numpy_nullable" or "pyarrow") is used for the resulting DataFrame's data types when reading a file.
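A short sketch of dtype_backend (requires Pandas 2.0 or later); the file name and data below are illustrative:

```python
import pandas as pd

# Create a fixed-width dataset
data = "ID  Name    Age\n123 Alice    25\n456 Bob      30"
file_path = "file5.fwf"  # hypothetical file name
with open(file_path, "w") as f:
    f.write(data)

# "numpy_nullable" returns pandas' nullable extension dtypes
# (e.g. Int64 instead of int64) rather than plain NumPy dtypes
df = pd.read_fwf(file_path, dtype_backend="numpy_nullable")
print(df.dtypes)
```

With the default backend, the numeric columns would come back as plain int64; the nullable backend lets them hold missing values without being upcast to float.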
The iterator parameter controls the return type: when True, read_fwf() returns a TextFileReader that yields the file in chunks; when False (the default), it returns the full DataFrame.
import pandas as pd
# Create a fixed-width dataset
data = "ID  Name    Age\n123 Alice    25\n456 Bob      30\n789 Charlie  35"
file_path = "file6.fwf"
with open(file_path, "w") as f:
    f.write(data)
# iterator=False (the default) returns the whole DataFrame at once
df = pd.read_fwf(file_path, iterator=False)
print(df)
'''
Output:
    ID     Name  Age
0  123    Alice   25
1  456      Bob   30
2  789  Charlie   35
'''
It enables memory-mapped file reading, which uses the operating system's virtual memory to map the file's contents directly into memory, which can improve performance for very large files.
import pandas as pd
# Create a fixed-width dataset
data = "ID  Name    Age\n123 Alice    25\n456 Bob      30\n789 Charlie  35"
file_path = "file6.fwf"
with open(file_path, "w") as f:
    f.write(data)
# Memory-map the file while reading it
df = pd.read_fwf(file_path, memory_map=True)
print(df)
'''
Output:
    ID     Name  Age
0  123    Alice   25
1  456      Bob   30
2  789  Charlie   35
'''
Number of rows per chunk when using an iterator.
import pandas as pd
# Create a fixed-width dataset
data = "ID  Name    Age\n123 Alice    25\n456 Bob      30\n789 Charlie  35"
file_path = "file6.fwf"
with open(file_path, "w") as f:
    f.write(data)
# Use an iterator to read the file in chunks of 2 rows
df_iterator = pd.read_fwf(file_path, iterator=True, chunksize=2)
print(next(df_iterator))
'''
Output:
    ID   Name  Age
0  123  Alice   25
1  456    Bob   30
'''
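Rather than calling next() by hand, the reader can be looped over to process every chunk and then recombine them. A sketch (the file name is illustrative):

```python
import pandas as pd

# Create a fixed-width dataset
data = "ID  Name    Age\n123 Alice    25\n456 Bob      30\n789 Charlie  35"
file_path = "file7.fwf"  # hypothetical file name
with open(file_path, "w") as f:
    f.write(data)

# Passing chunksize alone already returns a TextFileReader;
# iterating over it yields one DataFrame per chunk
chunks = []
with pd.read_fwf(file_path, chunksize=2) as reader:
    for chunk in reader:
        # process each 2-row chunk here, then keep it
        chunks.append(chunk)

# Recombine the processed chunks into a single DataFrame
df = pd.concat(chunks, ignore_index=True)
print(df)
```

This pattern keeps peak memory bounded by the chunk size, which is the usual reason to read a large fixed-width file iteratively.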
Additional keyword arguments (**kwds) are forwarded to the underlying pd.read_table() method, so options such as names, skiprows, or dtype work with read_fwf() as well.
import pandas as pd
# Create a fixed-width dataset without a header row
data = "123 Alice   25\n456 Bob     30\n789 Charlie 35"
file_path = "file8.txt"
with open(file_path, "w") as f:
    f.write(data)
# names is forwarded to pd.read_table() and labels the columns
df = pd.read_fwf(file_path, names=["ID", "Name", "Age"])
print(df)
'''
Output:
    ID     Name  Age
0  123    Alice   25
1  456      Bob   30
2  789  Charlie   35
'''
'''