Creating a pivot table in pandas reshapes long-form records into a summary table with row groups, column groups, and aggregated values. It is useful when transactions, measurements, or logs need the same cross-tab view that a spreadsheet pivot table would show.
The pd.pivot_table() function takes a source DataFrame, chooses one or more fields for the row index, one or more fields for the output columns, and aggregates a value column. Duplicate row and column combinations are expected, so the chosen aggfunc determines how records collapse into one cell.
Use a numeric aggregation such as sum for totals and mean for averages, then fill missing row and column combinations only when a downstream report needs visible zeros. A separate grouped-source check should compare at least one pivot cell and the grand total before the table feeds a report or chart.
Related: How to read CSV files with pandas
Steps to create a pandas pivot table:
- Save the pivot-table script.
- pivot_table_demo.py
import pandas as pd sales = pd.DataFrame( { "region": ["East", "East", "East", "West", "West", "West", "North", "North"], "quarter": ["Q1", "Q1", "Q2", "Q1", "Q2", "Q2", "Q1", "Q3"], "category": [ "hardware", "software", "hardware", "hardware", "hardware", "software", "software", "hardware", ], "revenue": [1200, 800, 950, 1400, 1100, 650, 500, 700], } ) pivot = pd.pivot_table( sales, values="revenue", index="region", columns="quarter", aggfunc="sum", fill_value=0, margins=True, margins_name="Total", ) print(pivot.to_string()) print() print("West Q2 revenue:", pivot.loc["West", "Q2"]) print("grand total:", pivot.loc["Total", "Total"]) expected = ( sales.groupby(["region", "quarter"])["revenue"] .sum() .unstack(fill_value=0) ) assert pivot.loc["West", "Q2"] == expected.loc["West", "Q2"] assert pivot.loc["Total", "Total"] == sales["revenue"].sum() print("verification: pivot totals match grouped source data")
pivot_table() is the right function when more than one source row can land in the same row and column cell. Use pivot() only when each row and column combination is already unique and no aggregation is needed.
- Run the script.
$ python3 pivot_table_demo.py quarter Q1 Q2 Q3 Total region East 2000 950 0 2950 North 500 0 700 1200 West 1400 1750 0 3150 Total 3900 2700 700 7300 West Q2 revenue: 1750 grand total: 7300 verification: pivot totals match grouped source data
- Read the row and column axes in the printed table.
index=“region” creates the output rows, columns=“quarter” creates the output columns, and values=“revenue” selects the numbers to aggregate.
- Keep aggfunc=“sum” when the pivot cells should be totals.
The West and Q2 cell shows 1750 because two source rows match that row and column pair, with revenues 1100 and 650.
- Keep fill_value=0 only when blank pivot cells should display as zero.
fill_value replaces missing cells in the pivot-table result after aggregation. It does not repair missing values in the source DataFrame.
- Keep margins=True when the table needs row and column totals.
margins_name=“Total” names the summary row and column, and the totals use the same aggfunc as the main pivot cells.
- Reuse the grouped-source check with the real grouping fields.
The two assert lines compare one pivot cell and the grand total with values computed from the source rows. Add more checks for report-critical cells before exporting or charting the table.
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.