Reshaping a pandas DataFrame from wide to long turns repeated measurement columns into row values. Analysts use that shape before grouping by period, plotting categories, or joining measurements to lookup tables.
DataFrame.melt() keeps identifier columns such as region and product while unpivoting selected measurement columns such as q1_sales, q2_sales, and q3_sales. The output contains one row for each identifier and measured column, with separate fields for the source column name and the measured value.
Name the output fields with var_name and value_name so the long table matches the downstream data model. If the original index carries information, move it into a column before melting or set ignore_index=False and confirm that repeated index labels will not confuse later joins.
Related: How to create a pandas DataFrame
Related: How to aggregate data with pandas groupby
Related: How to create a pivot table in pandas
Steps to reshape a pandas DataFrame from wide to long with melt:
- Create a Python script named melt_sales.py.
- melt_sales.py
import pandas as pd wide = pd.DataFrame( { "region": ["North", "South"], "product": ["Widget", "Widget"], "q1_sales": [120, 90], "q2_sales": [135, 104], "q3_sales": [148, 110], } ) id_columns = ["region", "product"] value_columns = ["q1_sales", "q2_sales", "q3_sales"] long = wide.melt( id_vars=id_columns, value_vars=value_columns, var_name="quarter", value_name="sales_usd", ) expected_rows = len(wide) * len(value_columns) print(f"pandas {pd.__version__}") print() print("WIDE_DATA") print(wide.to_string(index=False)) print() print("LONG_DATA") print(long.to_string(index=False)) print() print("VERIFY_SHAPE") print(f"wide rows: {len(wide)}") print(f"value columns: {len(value_columns)}") print(f"expected long rows: {expected_rows}") print(f"actual long rows: {len(long)}") print(f"columns: {', '.join(long.columns)}") print() print("SALES_BY_QUARTER") print(long.groupby("quarter", sort=False)["sales_usd"].sum().to_string())
id_columns stay fixed, while value_columns become rows. Keep value_name different from any existing column label.
- Run the script and confirm the long-form rows.
$ python3 melt_sales.py pandas 3.0.3 WIDE_DATA region product q1_sales q2_sales q3_sales North Widget 120 135 148 South Widget 90 104 110 LONG_DATA region product quarter sales_usd North Widget q1_sales 120 South Widget q1_sales 90 North Widget q2_sales 135 South Widget q2_sales 104 North Widget q3_sales 148 South Widget q3_sales 110 VERIFY_SHAPE wide rows: 2 value columns: 3 expected long rows: 6 actual long rows: 6 columns: region, product, quarter, sales_usd SALES_BY_QUARTER quarter q1_sales 210 q2_sales 239 q3_sales 258
- Set the identifier columns with id_vars.
id_columns = ["region", "product"] long = wide.melt( id_vars=id_columns, value_vars=value_columns, var_name="quarter", value_name="sales_usd", )
Identifier columns are copied into each long-form row. Use customer, product, timestamp, or other keys that explain each measurement.
- Set the measurement columns with value_vars.
value_columns = ["q1_sales", "q2_sales", "q3_sales"]
If value_vars is omitted, pandas melts every column not listed in id_vars.
- Name the source and value columns.
var_name="quarter", value_name="sales_usd"
value_name cannot match an existing column label.
- Verify the row count and column names.
expected_rows = len(wide) * len(value_columns) print(f"expected long rows: {expected_rows}") print(f"actual long rows: {len(long)}") print(f"columns: {', '.join(long.columns)}")
The long table should have one row for every original row and selected measurement column.
- Group the melted values as a smoke test.
print(long.groupby("quarter", sort=False)["sales_usd"].sum().to_string())
- Remove the sample script after adapting the pattern.
$ rm melt_sales.py
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.