Reading and writing JSON with pandas is useful when Python data work has to exchange records with APIs, log exports, or small handoff files. A DataFrame can load record-shaped JSON, inspect the columns and values, and write the result back to a file that another program can read.
JSON Lines keeps one JSON object on each non-blank line, which maps directly to DataFrame rows without wrapping the whole file in one large array. In pandas, read_json(…, lines=True) reads that shape, while to_json(…, orient=“records”, lines=True) writes the same row-per-line layout.
Identifiers such as order numbers, account numbers, and ticket IDs should be read as strings when exact text matters. Date columns are clearer when listed explicitly in convert_dates, and date_format=“iso” keeps exported timestamps readable during the next import.
Steps to read and write JSON records with pandas:
- Create a JSON Lines input file named orders.jsonl.
- orders.jsonl
{"order_id":"A100","customer":"Ada","total":125.5,"ordered_at":"2026-06-01T09:30:00Z"} {"order_id":"A101","customer":"Lin","total":88.0,"ordered_at":"2026-06-01T10:15:00Z"} {"order_id":"A102","customer":"Mira","total":142.25,"ordered_at":"2026-06-02T14:05:00Z"}
Use one complete JSON object per non-blank line. When checking an external file first, choose NDJSON / JSON Lines in the validator.
Tool: JSON Validator - Create a Python script named json_roundtrip.py.
- json_roundtrip.py
from pathlib import Path import pandas as pd source = Path("orders.jsonl") output = Path("orders-out.jsonl") orders = pd.read_json( source, lines=True, dtype={"order_id": "string"}, convert_dates=["ordered_at"], ) orders.to_json( output, orient="records", lines=True, date_format="iso", ) round_trip = pd.read_json( output, lines=True, dtype={"order_id": "string"}, convert_dates=["ordered_at"], ) print(orders.loc[:, ["order_id", "customer", "total"]]) print() print(orders.dtypes.reindex(["order_id", "customer", "total", "ordered_at"])) print(f"written rows: {len(round_trip)}") print(f"same order IDs: {round_trip['order_id'].tolist() == orders['order_id'].tolist()}")
lines=True tells pandas to treat each line as a record. orient=“records” is required when to_json() writes line-delimited JSON.
- Run the script.
$ python3 json_roundtrip.py order_id customer total 0 A100 Ada 125.50 1 A101 Lin 88.00 2 A102 Mira 142.25 order_id string customer str total float64 ordered_at datetime64[us, UTC] dtype: object written rows: 3 same order IDs: True
- Check the written JSON Lines file before handing it to another tool.
$ cat orders-out.jsonl {"order_id":"A100","customer":"Ada","total":125.5,"ordered_at":"2026-06-01T09:30:00.000Z"} {"order_id":"A101","customer":"Lin","total":88.0,"ordered_at":"2026-06-01T10:15:00.000Z"} {"order_id":"A102","customer":"Mira","total":142.25,"ordered_at":"2026-06-02T14:05:00.000Z"}The timestamp values are ISO strings because the write step uses date_format=“iso”.
- Remove the sample files when they were created only for testing.
$ rm orders.jsonl orders-out.jsonl json_roundtrip.py
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.