Creating a pandas DataFrame from in-memory Python data gives rows and columns labels before the data is filtered, joined, cleaned, or exported. The DataFrame constructor fits data that already exists in Python as dictionaries, lists, arrays, or similar objects rather than data read from a file.

The pd.DataFrame() constructor accepts dictionaries, lists of records, arrays, and existing DataFrames. For row-oriented data, a list of dictionaries keeps each input row readable; the columns argument fixes the output column order, and the index argument supplies row labels when the default numeric index is not meaningful.

A finished check should show the expected shape, column order, row labels, inferred dtypes, and at least one selected row. In pandas 3, string columns are inferred as str by default, so dtype checks can differ from older examples that showed object for text columns.
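Because the reported dtype for text depends on the pandas version, a check that compares against a literal dtype name can break across versions. The sketch below shows one possible version-tolerant check using is_string_dtype from pandas.api.types; the two-row sample frame is hypothetical and is not the order data used in the steps.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

# Hypothetical two-row sample, only for illustrating the dtype check.
sample = pd.DataFrame(
    {"customer": ["Ada Lovelace", "Lin Chen"], "total_usd": [149.50, 89.00]}
)

# is_string_dtype accepts both object-backed text columns (older pandas)
# and the str dtype inferred by default in pandas 3.
customer_is_text = is_string_dtype(sample["customer"])
total_is_text = is_string_dtype(sample["total_usd"])

print(sample["customer"].dtype)  # name differs between pandas versions
print(customer_is_text, total_is_text)
```

Checking the kind of dtype rather than its printed name keeps the same verification usable while older and newer pandas installations coexist.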

Steps to create a pandas DataFrame:

  1. Save a short DataFrame creation script.
    create_dataframe.py
    import pandas as pd
     
    records = [
        {
            "order_id": 1001,
            "customer": "Ada Lovelace",
            "region": "EMEA",
            "total_usd": 149.50,
            "paid": True,
        },
        {
            "order_id": 1002,
            "customer": "Lin Chen",
            "region": "APAC",
            "total_usd": 89.00,
            "paid": False,
        },
        {
            "order_id": 1003,
            "customer": "Maya Patel",
            "region": "AMER",
            "total_usd": 212.00,
            "paid": True,
        },
    ]
     
    column_order = ["order_id", "customer", "region", "total_usd", "paid"]
    row_labels = ["order-1001", "order-1002", "order-1003"]
     
    df = pd.DataFrame(records, columns=column_order, index=row_labels)
     
    print(f"pandas {pd.__version__}")
    print()
     
    print("DATAFRAME")
    print(df)
    print()
     
    print("VERIFY_SHAPE")
    print(df.shape)
    print()
     
    print("VERIFY_COLUMNS")
    print(df.columns.tolist())
    print()
     
    print("VERIFY_INDEX")
    print(df.index.tolist())
    print()
     
    print("VERIFY_DTYPES")
    print(df.dtypes)
    print()
     
    print("VERIFY_ROW")
    print(df.loc["order-1002", ["customer", "total_usd"]])

    Use a list of dictionaries when each input item represents one row. If the source is column-oriented, pass a dictionary of equal-length lists instead.
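For comparison, here is a minimal sketch of the column-oriented form mentioned above, using the same order data rearranged by column. The names columns_data and df_from_columns are illustrative and do not appear in the script.

```python
import pandas as pd

# Same order data as the script, arranged by column instead of by row.
columns_data = {
    "order_id": [1001, 1002, 1003],
    "customer": ["Ada Lovelace", "Lin Chen", "Maya Patel"],
    "region": ["EMEA", "APAC", "AMER"],
    "total_usd": [149.50, 89.00, 212.00],
    "paid": [True, False, True],
}
row_labels = ["order-1001", "order-1002", "order-1003"]

# Column order follows the dictionary's key order; every list must have
# the same length as the index.
df_from_columns = pd.DataFrame(columns_data, index=row_labels)

print(df_from_columns.shape)
print(df_from_columns.columns.tolist())
```

Both forms should produce the same table; the choice mostly depends on how the source data is already arranged.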

  2. Run the script and confirm the table and verification output.
    $ python3 create_dataframe.py
    pandas 3.0.3
    
    DATAFRAME
                order_id      customer region  total_usd   paid
    order-1001      1001  Ada Lovelace   EMEA      149.5   True
    order-1002      1002      Lin Chen   APAC       89.0  False
    order-1003      1003    Maya Patel   AMER      212.0   True
    
    VERIFY_SHAPE
    (3, 5)
    
    VERIFY_COLUMNS
    ['order_id', 'customer', 'region', 'total_usd', 'paid']
    
    VERIFY_INDEX
    ['order-1001', 'order-1002', 'order-1003']
    
    VERIFY_DTYPES
    order_id       int64
    customer         str
    region           str
    total_usd    float64
    paid            bool
    dtype: object
    
    VERIFY_ROW
    customer     Lin Chen
    total_usd        89.0
    Name: order-1002, dtype: object

  3. Create the DataFrame from the record list.
    df = pd.DataFrame(records, columns=column_order, index=row_labels)

    The columns argument keeps the output order explicit. The index argument replaces the default RangeIndex with labels that can be used with .loc.
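The effect of both arguments is easier to see on records that do not line up perfectly. The sketch below uses hypothetical records (one has no region key, and both carry an extra note key) to show the default index and how columns selects and orders fields.

```python
import pandas as pd

# Hypothetical records: the second has no "region" key, and "note" is an
# extra key that is not wanted in the output.
records = [
    {"order_id": 1001, "region": "EMEA", "note": "gift"},
    {"order_id": 1002, "note": "rush"},
]

# Without index, pandas assigns a RangeIndex starting at 0.
default_df = pd.DataFrame(records)
print(type(default_df.index).__name__)

# columns keeps only the listed keys, in the listed order. "note" is
# dropped and the missing "region" value becomes a missing-value marker.
ordered_df = pd.DataFrame(
    records,
    columns=["region", "order_id"],
    index=["order-1001", "order-1002"],
)
print(ordered_df)
```

A key listed in columns but absent from a record yields a missing value rather than an error, so a misspelled column name can produce an empty column without any warning.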

  4. Verify the row and column count.
    print(df.shape)

    The shape tuple is (rows, columns), so (3, 5) means three records and five fields.

  5. Verify the column and row labels.
    print(df.columns.tolist())
    print(df.index.tolist())

  6. Check inferred dtypes before analysis continues.
    print(df.dtypes)

    Use .astype() after construction when a column needs a specific dtype before calculations, joins, or exports.
    Related: How to convert data types in pandas
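As a rough illustration of the .astype() note, the sketch below converts two columns of a smaller, hypothetical frame. The target dtypes (int32 and category) are illustrative choices, not requirements of the order data.

```python
import pandas as pd

# Smaller hypothetical frame with the same kind of fields as the script.
df = pd.DataFrame(
    {
        "order_id": [1001, 1002],
        "region": ["EMEA", "APAC"],
        "total_usd": [149.50, 89.00],
    },
    index=["order-1001", "order-1002"],
)

# astype accepts a mapping of column name to target dtype and returns a
# new DataFrame; columns not listed keep their inferred dtype.
converted = df.astype({"order_id": "int32", "region": "category"})

print(converted.dtypes)
```

Because astype returns a new object, the result has to be assigned to a name before the converted dtypes are available downstream.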

  7. Select one known row to confirm representative values.
    print(df.loc["order-1002", ["customer", "total_usd"]])

    .loc selects by row label and column label, which makes it a direct check that the custom index and field names match the intended data.
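The printed checks in steps 4 through 7 can also be written as assertions so that a script stops as soon as the DataFrame differs from what was intended. This is one possible arrangement, using the same records in compact form; the variable lin_total is an illustrative name.

```python
import pandas as pd

records = [
    {"order_id": 1001, "customer": "Ada Lovelace", "region": "EMEA", "total_usd": 149.50, "paid": True},
    {"order_id": 1002, "customer": "Lin Chen", "region": "APAC", "total_usd": 89.00, "paid": False},
    {"order_id": 1003, "customer": "Maya Patel", "region": "AMER", "total_usd": 212.00, "paid": True},
]
column_order = ["order_id", "customer", "region", "total_usd", "paid"]
row_labels = ["order-1001", "order-1002", "order-1003"]

df = pd.DataFrame(records, columns=column_order, index=row_labels)

# Scalar lookup: one row label and one column label return a single value.
lin_total = df.loc["order-1002", "total_usd"]

# Each assertion mirrors one of the printed verification steps.
assert df.shape == (3, 5)
assert df.columns.tolist() == column_order
assert df.index.tolist() == row_labels
assert lin_total == 89.0

print("all checks passed")
```

Assertions suit automated runs, while the printed output from the steps remains more useful when inspecting the data by eye.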