How to convert data types in pandas

Converting data types in pandas turns imported text columns into dtypes that calculations, joins, date operations, and exports can rely on. Convert explicitly after reading CSV, Excel, database, or API data so that identifiers, numbers, dates, booleans, and repeated labels do not depend on automatic inference.

Use astype() when values already match the target dtype. Use to_numeric() and to_datetime() when source strings need parsing and invalid values should raise an error or become missing values for review. convert_dtypes() is useful after broader cleanup because it moves compatible columns to pandas nullable extension dtypes that support pd.NA.

Keep failed conversions visible before replacing the original columns in production code. A column that converts cleanly should show the intended dtype, while bad numeric text becomes <NA> after nullable integer conversion and bad date text becomes NaT when parsing is coerced.
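One way to keep failures visible is to parse into a separate column and list the raw values that did not survive. The sketch below is illustrative only: the quantity_parsed column name and the sample values are assumptions, not part of any project schema.

```python
import pandas as pd

# Hypothetical sample: one bad value ("five") and one value missing at the source.
df = pd.DataFrame({"quantity": ["2", "five", None, "3"]})

# Parse into a separate Series so the raw text stays available for review.
parsed = pd.to_numeric(df["quantity"], errors="coerce").astype("Int64")

# A failed conversion is a value that was present in the source but is missing after parsing.
failed_mask = parsed.isna() & df["quantity"].notna()

df["quantity_parsed"] = parsed  # assumed column name, chosen for this example
failed_values = df.loc[failed_mask, "quantity"].tolist()
print(failed_values)
```

Comparing against notna() on the raw column separates values that were already missing from values that failed to parse.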

Steps to convert pandas DataFrame column data types:

Save a dtype conversion check script.

dtype_conversion_check.py

import pandas as pd
 
 
df = pd.DataFrame(
    {
        "order_id": ["1001", "1002", "BAD", "1004"],
        "ordered_at": ["2026-06-01", "2026-06-02", "not a date", "2026-06-04"],
        "quantity": ["2", "5", "", "3"],
        "paid": ["true", "false", "true", "false"],
        "region": ["EMEA", "APAC", "EMEA", "AMER"],
    }
)
 
print(f"pandas {pd.__version__}")
print()
 
print("source dtypes")
print(df.dtypes)
print()
 
converted = df.copy()
converted["order_id"] = pd.to_numeric(converted["order_id"], errors="coerce").astype("Int64")
converted["quantity"] = pd.to_numeric(converted["quantity"], errors="coerce").astype("Int64")
converted["ordered_at"] = pd.to_datetime(converted["ordered_at"], errors="coerce")
converted["paid"] = converted["paid"].map({"true": True, "false": False}).astype("boolean")
converted["region"] = converted["region"].astype("category")
 
converted = converted.convert_dtypes()
 
print("converted dtypes")
print(converted.dtypes)
print()
 
print("converted values")
print(converted.to_string(index=False))
print()
 
failed = converted[
    converted.filter(["order_id", "quantity", "ordered_at", "paid"]).isna().any(axis=1)
]
 
print("rows with failed conversions")
print(failed.to_string(index=False))

Replace the source df with the DataFrame already loaded in the working script. Keep the original data available until the converted dtypes and failed-value checks match the expected source rules.

Run the check script.

$ python3 dtype_conversion_check.py
pandas 3.0.3

source dtypes
order_id      str
ordered_at    str
quantity      str
paid          str
region        str
dtype: object

converted dtypes
order_id               Int64
ordered_at    datetime64[us]
quantity               Int64
paid                 boolean
region              category
dtype: object

converted values
 order_id ordered_at  quantity  paid region
     1001 2026-06-01         2  True   EMEA
     1002 2026-06-02         5 False   APAC
     <NA>        NaT      <NA>  True   EMEA
     1004 2026-06-04         3 False   AMER

rows with failed conversions
 order_id ordered_at  quantity  paid region
     <NA>        NaT      <NA>  True   EMEA

pandas 3.x infers text columns as str in this source DataFrame. pandas 2.x and earlier show object for the same input, and older releases may also report datetime64[ns] rather than datetime64[us], but the explicit conversions otherwise target the same output dtypes.
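When a script has to run on both older and newer pandas releases, a version-independent check for text columns can be sketched with pd.api.types.is_string_dtype(). The sample column below is an assumption used only for illustration.

```python
import pandas as pd

# Hypothetical text column; the reported dtype name depends on the pandas version.
df = pd.DataFrame({"region": ["EMEA", "APAC", "EMEA"]})

# is_string_dtype() reports text columns without comparing against "str" or "object".
is_text = pd.api.types.is_string_dtype(df["region"])
print(df["region"].dtype, is_text)
```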

Convert already-valid columns with an astype() mapping.
```
df = df.astype(
    {
        "customer_id": "string",
        "region": "category",
    }
)
```
astype() raises when values cannot be cast to the requested dtype. Avoid errors="ignore" during validation because it can leave a column on the original dtype without making the failed conversion obvious.
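A minimal sketch of that strict behavior, using a made-up units column that contains one value astype() cannot cast:

```python
import pandas as pd

# Hypothetical data: customer_id casts cleanly, units contains a non-numeric value.
df = pd.DataFrame({"customer_id": [101, 102], "units": ["4", "x"]})

# A clean cast succeeds and changes the dtype.
df = df.astype({"customer_id": "string"})

# A cast with an invalid value raises instead of silently keeping the old dtype.
try:
    df.astype({"units": "int64"})
    cast_failed = False
except (ValueError, TypeError) as exc:
    cast_failed = True
    print(f"astype failed: {type(exc).__name__}")

print(df.dtypes)
```

Because the failing call is not assigned back, df keeps the original units column for inspection.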
Convert integer-like text with to_numeric() before assigning a nullable integer dtype.
```
df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce").astype("Int64")
```
Use Int64 when missing values may remain after parsing. Use Float64 instead when decimal values are valid and should be preserved.
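The choice between Int64 and Float64 can be sketched as a small helper. The function name to_nullable_number() is hypothetical, and the rule it applies (use Int64 only when every parsed value is a whole number) is an assumption about what the data should allow.

```python
import pandas as pd


def to_nullable_number(series: pd.Series) -> pd.Series:
    """Parse text to a nullable numeric dtype (hypothetical helper)."""
    parsed = pd.to_numeric(series, errors="coerce")
    present = parsed.dropna()
    # Int64 is only safe when no parsed value carries a fractional part.
    whole = bool((present % 1 == 0).all())
    return parsed.astype("Int64" if whole else "Float64")


quantities = to_nullable_number(pd.Series(["2", "5", "", "3"]))
prices = to_nullable_number(pd.Series(["1", "2.5", "n/a"]))

print(quantities.dtype, prices.dtype)
```

Casting decimal values straight to Int64 raises rather than truncating, so the helper picks the dtype before casting.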
Convert date text with to_datetime().
```
df["ordered_at"] = pd.to_datetime(df["ordered_at"], errors="coerce")
```
Invalid or out-of-range date strings become NaT with errors="coerce". Use errors="raise" when the import should stop on the first bad date.
Related: How to parse datetimes in pandas
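The two error modes can be compared side by side. This sketch assumes the source uses ISO-style YYYY-MM-DD text and passes an explicit format for that reason; adjust or drop the format argument for other inputs.

```python
import pandas as pd

# Hypothetical date text with one invalid entry.
raw = pd.Series(["2026-06-01", "not a date", "2026-06-04"])

# errors="raise" stops on the bad value.
try:
    pd.to_datetime(raw, format="%Y-%m-%d", errors="raise")
    stopped = False
except ValueError:
    stopped = True

# errors="coerce" keeps going and marks the bad value as NaT.
coerced = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

print(stopped)
print(coerced.isna().sum())
```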
Convert controlled text labels to nullable booleans.
```
df["paid"] = df["paid"].map({"true": True, "false": False}).astype("boolean")
```
Map the allowed labels before casting so unexpected labels become missing and can be counted.
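Counting the unexpected labels might look like the sketch below. The sample values, including the uppercase "TRUE" and the "yes" label, are assumptions chosen to show labels that fall outside the mapping.

```python
import pandas as pd

# Hypothetical labels: two allowed, two unexpected, one missing at the source.
raw = pd.Series(["true", "false", "TRUE", "yes", None])
allowed = {"true": True, "false": False}

paid = raw.map(allowed).astype("boolean")

# Unexpected labels were present in the source but did not map to a boolean.
unexpected = raw[paid.isna() & raw.notna()]
print(unexpected.value_counts())
```

If case or surrounding whitespace should not matter, normalizing with raw.str.strip().str.lower() before the map is one option.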
Convert repeated labels to category when storage should be compact or the labels have a meaningful order.
```
df["region"] = df["region"].astype("category")
```
category is best for repeated labels such as regions, teams, states, priorities, or status names.
Related: How to convert columns to categorical data in pandas
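For labels with a meaningful order, such as priorities, an ordered categorical can be sketched with pd.CategoricalDtype. The priority labels and their order here are assumptions for illustration.

```python
import pandas as pd

# Hypothetical priority labels.
priority = pd.Series(["low", "high", "medium", "low"])

# Declare the allowed labels and their order explicitly.
priority_dtype = pd.CategoricalDtype(categories=["low", "medium", "high"], ordered=True)
ordered = priority.astype(priority_dtype)

# Ordered categories support comparisons and min/max.
at_least_medium = (ordered >= "medium").tolist()
print(ordered.max(), at_least_medium)
```

Labels that are not listed in the declared categories become missing values after the cast, so they can be counted the same way as other failed conversions.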
Run convert_dtypes() after parsing columns that need pandas nullable dtypes.
```
df = df.convert_dtypes()
```
convert_dtypes() converts compatible columns to nullable pandas dtypes, but it does not replace explicit parsing for arbitrary date strings, currency text, or custom boolean labels.
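A small sketch of that limit, using made-up units, price, and ordered_at columns: convert_dtypes() moves the whole-number floats to Int64 but leaves currency and date text as text.

```python
import numpy as np
import pandas as pd

# Hypothetical columns: whole-number floats with a gap, currency text, date text.
df = pd.DataFrame(
    {
        "units": [1.0, 2.0, np.nan],
        "price": ["$1.50", "$2.00", "$3.25"],
        "ordered_at": ["2026-06-01", "2026-06-02", "2026-06-03"],
    }
)

converted = df.convert_dtypes()
print(converted.dtypes)

# Currency text still needs explicit cleanup and parsing.
price = pd.to_numeric(df["price"].str.replace("$", "", regex=False))
print(price.tolist())
```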
Verify the converted dtypes and missing-value counts.
```
print(df.dtypes)
print(df.filter(["order_id", "quantity", "ordered_at", "paid"]).isna().sum())
```
The dtype output should match the intended target columns. Any nonzero missing-value count after coercion needs review before joins, calculations, or exports use the converted data.
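The same check can be written as assertions that a project script or test could reuse. The dtype_mismatches() helper and the expected mapping below are hypothetical. Datetime columns are left out of the mapping because their reported resolution can differ between pandas versions.

```python
import pandas as pd


def dtype_mismatches(df: pd.DataFrame, expected: dict) -> dict:
    """Return {column: actual dtype name} for columns that miss the expected dtype."""
    return {
        col: str(df[col].dtype)
        for col, want in expected.items()
        if str(df[col].dtype) != want
    }


# Hypothetical converted data with one missing value in two of the columns.
df = pd.DataFrame(
    {
        "quantity": pd.array([2, None, 3], dtype="Int64"),
        "paid": pd.array([True, False, None], dtype="boolean"),
        "region": pd.Categorical(["EMEA", "APAC", "EMEA"]),
    }
)
expected = {"quantity": "Int64", "paid": "boolean", "region": "category"}

mismatches = dtype_mismatches(df, expected)
missing = df[list(expected)].isna().sum()

print(mismatches)
print(missing)
```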
Remove the temporary check script after the project code covers the same conversion checks.
```
$ rm dtype_conversion_check.py
```

Author: Mohd Shakir Zakaria