Resampling time series data in pandas converts timestamped rows into regular time buckets, such as daily sales totals from event-level orders. It is useful when logs, sensor readings, or transactions need to align with reporting periods before charting or modeling.
A timestamp column can stay as normal data when on="sold_at" tells DataFrame.resample() which datetime values define the buckets. The aggregation then works like a time-based groupby, with one reducer per output column.
Daily resampling is a downsampling case because several rows collapse into one row per day. Check the bucket labels, the frequency on the result index, and at least one source-to-result total before a resampled table feeds a report or plot.
Related: How to plot a pandas DataFrame
Related: How to read CSV files with pandas
Steps to resample time series data in pandas:
- Save the resampling script.
resample_sales.py
import pandas as pd

# Event-level orders with string timestamps.
sales = pd.DataFrame(
    {
        "sold_at": [
            "2026-06-17 08:15",
            "2026-06-17 11:40",
            "2026-06-18 09:05",
            "2026-06-18 13:10",
            "2026-06-18 18:30",
            "2026-06-19 10:20",
        ],
        "region": ["east", "east", "east", "west", "west", "west"],
        "revenue": [120, 95, 180, 245, 400, 160],
    }
)
# Convert the strings to datetime values so resample() can bucket them.
sales["sold_at"] = pd.to_datetime(sales["sold_at"])

# One row per day: total revenue and number of orders.
daily = (
    sales.resample("D", on="sold_at", label="left", closed="left")
    .agg(revenue=("revenue", "sum"), orders=("revenue", "size"))
)
daily.index.name = "sales_day"

print("Daily revenue buckets")
print(daily.to_string())
print()
print("index name:", daily.index.name)
print("index frequency:", daily.index.freqstr)
print("2026-06-18 revenue:", int(daily.loc["2026-06-18", "revenue"]))
print("2026-06-18 orders:", int(daily.loc["2026-06-18", "orders"]))
print("source revenue:", int(sales["revenue"].sum()))
print("resampled revenue:", int(daily["revenue"].sum()))

# Fail loudly if the buckets do not match the source rows.
assert daily.index.freqstr == "D"
assert int(daily.loc["2026-06-18", "revenue"]) == 825
assert int(daily.loc["2026-06-18", "orders"]) == 3
assert int(daily["revenue"].sum()) == int(sales["revenue"].sum())
print("verification: daily buckets match source revenue totals")
pd.to_datetime() makes the timestamp column datetime-like before resample() uses it for the bucket index.
- Run the script.
$ python3 resample_sales.py
Daily revenue buckets
            revenue  orders
sales_day
2026-06-17      215       2
2026-06-18      825       3
2026-06-19      160       1

index name: sales_day
index frequency: D
2026-06-18 revenue: 825
2026-06-18 orders: 3
source revenue: 1200
resampled revenue: 1200
verification: daily buckets match source revenue totals
- Keep on="sold_at" when the timestamp values should remain a column.
DataFrame.resample() needs a datetime-like index, level, or on column. Use set_index("sold_at") first when downstream code should treat timestamps as the row labels.
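The two ways of pointing resample() at the timestamps can be compared side by side. The following is a minimal sketch using a small made-up sales frame (the rows are illustrative, not taken from the script above); it assumes both forms should produce the same daily totals and checks that with Series.equals().

```python
import pandas as pd

# Hypothetical sample rows, used only for illustration.
sales = pd.DataFrame(
    {
        "sold_at": pd.to_datetime(
            ["2026-06-17 08:15", "2026-06-17 11:40", "2026-06-18 09:05"]
        ),
        "revenue": [120, 95, 180],
    }
)

# Option 1: timestamps stay a normal column; on= names the bucket source.
by_column = sales.resample("D", on="sold_at")["revenue"].sum()

# Option 2: timestamps become the row labels before resampling.
by_index = sales.set_index("sold_at").resample("D")["revenue"].sum()

print(by_column)
print("same result:", by_column.equals(by_index))
print("sold_at still a column:", "sold_at" in sales.columns)
```

With on="sold_at" the source frame is left unchanged, which is convenient when later steps still filter or join on that column.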
Related: How to set an index in pandas
- Set the target bucket size in the resample("D") rule.
"D" creates daily buckets. Use a smaller fixed frequency such as "h" for hourly buckets or a period-end offset such as "ME" for month-end reporting.
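To see how the rule changes the bucket size, the sketch below resamples a small made-up frame hourly and by month end. It passes pd.offsets.Hour() and pd.offsets.MonthEnd() objects rather than alias strings, on the assumption that they correspond to the hourly and month-end rules while avoiding differences in which alias spellings an installed pandas version accepts. The data and variable names are illustrative only.

```python
import pandas as pd

# Hypothetical sample rows spanning two months.
sales = pd.DataFrame(
    {
        "sold_at": pd.to_datetime(
            [
                "2026-06-17 08:15",
                "2026-06-17 08:40",
                "2026-06-17 11:40",
                "2026-07-02 09:05",
            ]
        ),
        "revenue": [120, 95, 180, 245],
    }
)

# Hourly buckets for the first day only, to keep the output short.
first_day = sales[sales["sold_at"] < pd.Timestamp("2026-06-18")]
hourly = first_day.resample(pd.offsets.Hour(), on="sold_at")["revenue"].sum()

# Month-end buckets over all rows; each label is the last day of its month.
monthly = sales.resample(pd.offsets.MonthEnd(), on="sold_at")["revenue"].sum()

print(hourly)
print(monthly)
```

Hours with no orders still appear in the hourly result, with a sum of 0, because resampling produces a regular grid between the first and last bucket.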
- Keep label="left" and closed="left" when each output label should mark the start of its bucket.
Weekly ("W") and period-end frequencies such as "ME" default to label="right" and closed="right", so each bucket is labeled with its end date. Set label and closed explicitly when report labels must not shift a later value into an earlier-looking bucket.
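The effect of the edge settings is easiest to see on weekly buckets. The sketch below is an illustration with two made-up orders, one on a Wednesday and one on the following Monday; it compares the weekly defaults with explicit label="left" and closed="left". The data and variable names are assumptions for the example.

```python
import pandas as pd

# Hypothetical orders: Wednesday 2026-06-17 and Monday 2026-06-22.
sales = pd.DataFrame(
    {
        "sold_at": pd.to_datetime(["2026-06-17 08:15", "2026-06-22 09:05"]),
        "revenue": [120, 180],
    }
)

# Weekly defaults: each bucket is labeled with the Sunday that ends the week.
default_weekly = sales.resample("W", on="sold_at")["revenue"].sum()

# Explicit left edges: each bucket is labeled with the Sunday that starts it.
left_weekly = sales.resample(
    "W", on="sold_at", label="left", closed="left"
)["revenue"].sum()

print("default labels:", [ts.date().isoformat() for ts in default_weekly.index])
print("left labels:   ", [ts.date().isoformat() for ts in left_weekly.index])
```

The same date can appear as a label in both results while referring to different weeks, which is why the edge settings are worth stating explicitly in report code.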
- Choose named aggregations for each output column.
The script sums revenue and counts rows with size. Replace those reducers with mean, max, min, or another aggregation that matches the metric being reported.
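As a sketch of swapping reducers, the example below keeps the named-aggregation form but reports the mean, maximum, and minimum order value per day. The sample rows and the output column names (avg_revenue, max_revenue, min_revenue) are hypothetical choices for illustration.

```python
import pandas as pd

# Hypothetical sample rows.
sales = pd.DataFrame(
    {
        "sold_at": pd.to_datetime(
            ["2026-06-17 08:15", "2026-06-17 11:40", "2026-06-18 09:05"]
        ),
        "revenue": [120, 95, 180],
    }
)

# Each keyword is an output column: (source column, reducer name).
daily_stats = sales.resample("D", on="sold_at").agg(
    avg_revenue=("revenue", "mean"),
    max_revenue=("revenue", "max"),
    min_revenue=("revenue", "min"),
)

print(daily_stats.to_string())
```

Naming each output column keeps the result readable when several reducers run against the same source column.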
- Verify the resampled table before using it downstream.
The assertions check the daily frequency, one known daily total, one order count, and total revenue preservation between the source rows and the resampled output.
- Remove the sample script after confirming the resampling behavior.
$ rm resample_sales.py
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.