Converting data types in pandas turns imported text columns into dtypes that calculations, joins, date operations, and exports can rely on. Use explicit conversions after reading CSV, Excel, database, or API data when identifiers, numbers, dates, booleans, and repeated labels should not depend on automatic inference.

Use astype() when values already match the target dtype. Use to_numeric() and to_datetime() when source strings need parsing and invalid values should raise an error or become missing values for review. convert_dtypes() is useful after broader cleanup because it moves compatible columns to pandas nullable extension dtypes that support pd.NA.

Keep failed conversions visible before replacing the original columns in production code. A column that converts cleanly should show the intended dtype, while bad numeric text becomes <NA> after nullable integer conversion and bad date text becomes NaT when parsing is coerced.

Steps to convert pandas DataFrame column data types:

  1. Save a dtype conversion check script.
    # dtype_conversion_check.py
    import pandas as pd
     
     
    df = pd.DataFrame(
        {
            "order_id": ["1001", "1002", "BAD", "1004"],
            "ordered_at": ["2026-06-01", "2026-06-02", "not a date", "2026-06-04"],
            "quantity": ["2", "5", "", "3"],
            "paid": ["true", "false", "true", "false"],
            "region": ["EMEA", "APAC", "EMEA", "AMER"],
        }
    )
     
    print(f"pandas {pd.__version__}")
    print()
     
    print("source dtypes")
    print(df.dtypes)
    print()
     
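    # Work on a copy so the original text columns stay available for review.
    converted = df.copy()
    # Parse numeric text; unparseable values become missing, then cast to nullable Int64.
    converted["order_id"] = pd.to_numeric(converted["order_id"], errors="coerce").astype("Int64")
    converted["quantity"] = pd.to_numeric(converted["quantity"], errors="coerce").astype("Int64")
    # Parse date text; unparseable values become NaT.
    converted["ordered_at"] = pd.to_datetime(converted["ordered_at"], errors="coerce")
    # Map the allowed labels; anything else becomes missing before the nullable boolean cast.
    converted["paid"] = converted["paid"].map({"true": True, "false": False}).astype("boolean")
    converted["region"] = converted["region"].astype("category")
     
    # Move any remaining compatible columns to pandas nullable dtypes.
    converted = converted.convert_dtypes()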
     
    print("converted dtypes")
    print(converted.dtypes)
    print()
     
    print("converted values")
    print(converted.to_string(index=False))
    print()
     
    # Keep rows where any parsed column is missing after conversion.
    failed = converted[
        converted.filter(["order_id", "quantity", "ordered_at", "paid"]).isna().any(axis=1)
    ]
     
    print("rows with failed conversions")
    print(failed.to_string(index=False))

    Replace the source df with the DataFrame already loaded in the working script. Keep the original data available until the converted dtypes and failed-value checks match the expected source rules.

  2. Run the check script.
    $ python3 dtype_conversion_check.py
    pandas 3.0.3
    
    source dtypes
    order_id      str
    ordered_at    str
    quantity      str
    paid          str
    region        str
    dtype: object
    
    converted dtypes
    order_id               Int64
    ordered_at    datetime64[us]
    quantity               Int64
    paid                 boolean
    region              category
    dtype: object
    
    converted values
     order_id ordered_at  quantity  paid region
         1001 2026-06-01         2  True   EMEA
         1002 2026-06-02         5 False   APAC
         <NA>        NaT      <NA>  True   EMEA
         1004 2026-06-04         3 False   AMER
    
    rows with failed conversions
     order_id ordered_at  quantity  paid region
         <NA>        NaT      <NA>  True   EMEA

    pandas 3.x infers text columns as str in this source DataFrame. Projects on pandas 2.x or earlier may show object for the same input, but the explicit conversions still target the same output dtypes.

  3. Convert already-valid columns with an astype() mapping.
    df = df.astype(
        {
            "customer_id": "string",
            "region": "category",
        }
    )

    astype() raises when values cannot be cast to the requested dtype. Avoid errors="ignore" during validation because it can leave a column on its original dtype without making the failed conversion obvious.
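
    As a minimal sketch of that raising behavior, the example below casts a hypothetical order_id column that contains one bad value. The sample data and the cast_failed flag are illustrative assumptions, not part of the check script.

```python
import pandas as pd

# Hypothetical sample: one value cannot be cast to an integer.
raw = pd.Series(["1001", "1002", "BAD"], name="order_id")

try:
    raw.astype("int64")
    cast_failed = False
except (ValueError, TypeError) as exc:
    # astype() does not skip bad values; the whole cast fails.
    cast_failed = True
    print(f"astype failed: {exc}")

# A column whose values all match the target dtype casts cleanly.
clean = pd.Series(["1001", "1002"], name="order_id").astype("int64")
print(clean.dtype)
```

    When a column is expected to contain bad values, parsing with to_numeric() or to_datetime() is usually the better fit, because the failures can be collected instead of stopping the script.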

  4. Convert integer-like text with to_numeric() before assigning a nullable integer dtype.
    df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce").astype("Int64")

    Use Int64 when missing values may remain after parsing. Use Float64 instead when decimal values are valid and should be preserved.
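
    One way to keep failed conversions visible is to compare the parsed result against the raw text before overwriting the column. The sketch below assumes a hypothetical quantity column with a blank and a typo; the variable names are illustrative.

```python
import pandas as pd

# Hypothetical raw column: integer-like text, a blank, and a typo.
raw = pd.Series(["2", "5", "", "3x"], name="quantity")

parsed = pd.to_numeric(raw, errors="coerce")

# Source values that were present but did not parse to a number.
failed_values = raw[parsed.isna() & raw.notna()]
print(failed_values.tolist())

# Nullable integer dtype keeps the gaps as <NA>.
quantity = parsed.astype("Int64")
print(quantity.isna().sum())

# Decimal text: Float64 keeps the fractional part.
price = pd.to_numeric(pd.Series(["2.5", "3", ""]), errors="coerce").astype("Float64")
print(price.dtype)
```

    Note that a blank string counts as a failed value here. Whether blanks are acceptable gaps or errors depends on the source rules.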

  5. Convert date text with to_datetime().
    df["ordered_at"] = pd.to_datetime(df["ordered_at"], errors="coerce")

    Invalid or out-of-range date strings become NaT with errors="coerce". Use errors="raise" when the import should stop on the first bad date.
    Related: How to parse datetimes in pandas
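
    The two error modes can be compared side by side, as in the sketch below. The explicit format="%Y-%m-%d" argument is an assumption that the source dates are ISO formatted, and the sample values are hypothetical.

```python
import pandas as pd

raw = pd.Series(["2026-06-01", "2026-06-02", "not a date"], name="ordered_at")

# Coerce: bad text becomes NaT and the remaining rows still parse.
coerced = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")
print(coerced.isna().sum())

# Raise: an unparseable value stops the conversion with an exception.
try:
    pd.to_datetime(raw, format="%Y-%m-%d", errors="raise")
    raised = False
except ValueError as exc:
    raised = True
    print(f"to_datetime failed: {exc}")
```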

  6. Convert controlled text labels to nullable booleans.
    df["paid"] = df["paid"].map({"true": True, "false": False}).astype("boolean")

    Map the allowed labels before casting so unexpected labels become missing and can be counted.
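
    A rough sketch of counting the unexpected labels follows. It assumes that case and surrounding whitespace should be ignored, which may not match every source; the sample values and the allowed mapping are hypothetical.

```python
import pandas as pd

raw = pd.Series(["true", "FALSE", " true ", "yes", None], name="paid")

allowed = {"true": True, "false": False}

# Normalize case and whitespace before mapping (an assumption about the source rules).
normalized = raw.str.strip().str.lower()
paid = normalized.map(allowed).astype("boolean")

# Labels that were present in the source but are not in the allowed mapping.
unexpected = raw[paid.isna() & raw.notna()]
print(unexpected.value_counts())
```

    Rows that were already missing in the source are excluded from the unexpected count, so the count reflects only labels the mapping did not recognize.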

  7. Convert repeated labels to category when the label set should be compact or ordered.
    df["region"] = df["region"].astype("category")

    category is best for repeated labels such as regions, teams, states, priorities, or status names.
    Related: How to convert columns to categorical data in pandas
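
    When the labels have a natural order, a declared pd.CategoricalDtype can carry it. The sketch below uses a hypothetical priority column, and the category list is an assumption for illustration.

```python
import pandas as pd

# Hypothetical priority labels; "urgent" is not in the declared category list.
raw = pd.Series(["low", "high", "medium", "urgent", "low"], name="priority")

priority_dtype = pd.CategoricalDtype(categories=["low", "medium", "high"], ordered=True)
priority = raw.astype(priority_dtype)

# Values outside the declared categories become missing.
print(priority.isna().sum())

# Ordered categories support comparisons against a label.
at_least_medium = priority >= "medium"
print(at_least_medium.tolist())
```

    Declaring the categories up front also makes stray labels show up as missing values, which fits the same review step used for numeric and date columns.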

  8. Run convert_dtypes() after parsing columns that need pandas nullable dtypes.
    df = df.convert_dtypes()

    convert_dtypes() converts compatible columns to nullable pandas dtypes, but it does not replace explicit parsing for arbitrary date strings, currency text, or custom boolean labels.
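
    The sketch below illustrates that boundary on a hypothetical DataFrame: a whole-number float column with a gap moves to a nullable integer dtype, while date text and currency text stay text because convert_dtypes() does not parse them.

```python
import pandas as pd

df = pd.DataFrame(
    {
        # Whole-number floats with a gap.
        "quantity": [2.0, None, 3.0],
        # Date text that still needs to_datetime().
        "ordered_at": ["2026-06-01", "2026-06-02", None],
        # Currency text that still needs custom cleanup.
        "amount": ["$10.00", "$5.50", None],
    }
)

converted = df.convert_dtypes()
print(converted.dtypes)
```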

  9. Verify the converted dtypes and missing-value counts.
    print(df.dtypes)
    print(df.filter(["order_id", "quantity", "ordered_at", "paid"]).isna().sum())

    The dtype output should match the intended target columns. Any nonzero missing-value count after coercion needs review before joins, calculations, or exports use the converted data.
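
    The same verification can be expressed as data rather than read from printed output. This is a minimal sketch under assumed expectations: the expected_dtypes mapping and the required list are hypothetical and should reflect the real source rules.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "quantity": pd.array([2, None, 3], dtype="Int64"),
        "paid": pd.array([True, False, None], dtype="boolean"),
        "region": pd.Series(["EMEA", "APAC", "EMEA"], dtype="category"),
    }
)

# Hypothetical expectations for this sample.
expected_dtypes = {"quantity": "Int64", "paid": "boolean", "region": "category"}
required = ["quantity"]  # columns that must not contain missing values

# Columns whose dtype name differs from the expected one.
dtype_mismatches = {
    col: str(df[col].dtype)
    for col, want in expected_dtypes.items()
    if str(df[col].dtype) != want
}

# Required columns that still contain missing values, with their counts.
missing_counts = df[required].isna().sum()
missing_required = missing_counts[missing_counts > 0].to_dict()

print(dtype_mismatches)
print(missing_required)
```

    An empty dtype_mismatches and an empty missing_required would indicate that the conversion met these assumed rules; here the quantity gap is reported for review.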

  10. Remove the temporary check script after the project code covers the same conversion checks.
    $ rm dtype_conversion_check.py