How to create a pivot table in pandas

Creating a pivot table in pandas reshapes long-form records into a summary table with row groups, column groups, and aggregated values. It is useful when transactions, measurements, or logs need the same cross-tab view that a spreadsheet pivot table would show.

The pd.pivot_table() function takes a source DataFrame, chooses one or more fields for the row index, one or more fields for the output columns, and aggregates a value column. Duplicate row and column combinations are expected, so the chosen aggfunc determines how records collapse into one cell.

Use a numeric aggregation such as sum for totals and mean for averages, then fill missing row and column combinations only when a downstream report needs visible zeros. A separate grouped-source check should compare at least one pivot cell and the grand total before the table feeds a report or chart.

Steps to create a pandas pivot table:

Save the pivot-table script.

pivot_table_demo.py

import pandas as pd
 
sales = pd.DataFrame(
    {
        "region": ["East", "East", "East", "West", "West", "West", "North", "North"],
        "quarter": ["Q1", "Q1", "Q2", "Q1", "Q2", "Q2", "Q1", "Q3"],
        "category": [
            "hardware",
            "software",
            "hardware",
            "hardware",
            "hardware",
            "software",
            "software",
            "hardware",
        ],
        "revenue": [1200, 800, 950, 1400, 1100, 650, 500, 700],
    }
)
 
pivot = pd.pivot_table(
    sales,
    values="revenue",
    index="region",
    columns="quarter",
    aggfunc="sum",
    fill_value=0,
    margins=True,
    margins_name="Total",
)
 
print(pivot.to_string())
print()
print("West Q2 revenue:", pivot.loc["West", "Q2"])
print("grand total:", pivot.loc["Total", "Total"])
 
expected = (
    sales.groupby(["region", "quarter"])["revenue"]
    .sum()
    .unstack(fill_value=0)
)
 
assert pivot.loc["West", "Q2"] == expected.loc["West", "Q2"]
assert pivot.loc["Total", "Total"] == sales["revenue"].sum()
 
print("verification: pivot totals match grouped source data")

pivot_table() is the right function when more than one source row can land in the same row and column cell. Use pivot() only when each row and column combination is already unique and no aggregation is needed.

Run the script.

$ python3 pivot_table_demo.py
quarter    Q1    Q2   Q3  Total
region                         
East     2000   950    0   2950
North     500     0  700   1200
West     1400  1750    0   3150
Total    3900  2700  700   7300
 
West Q2 revenue: 1750
grand total: 7300
verification: pivot totals match grouped source data

Read the row and column axes in the printed table.

index=“region” creates the output rows, columns=“quarter” creates the output columns, and values=“revenue” selects the numbers to aggregate.
Keep aggfunc=“sum” when the pivot cells should be totals.

The West and Q2 cell shows 1750 because two source rows match that row and column pair, with revenues 1100 and 650.
Keep fill_value=0 only when blank pivot cells should display as zero.

fill_value replaces missing cells in the pivot-table result after aggregation. It does not repair missing values in the source DataFrame.
Keep margins=True when the table needs row and column totals.

margins_name=“Total” names the summary row and column, and the totals use the same aggfunc as the main pivot cells.
Reuse the grouped-source check with the real grouping fields.

The two assert lines compare one pivot cell and the grand total with values computed from the source rows. Add more checks for report-critical cells before exporting or charting the table.

Author: Mohd Shakir Zakaria
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.