Converting data types in pandas turns imported text columns into dtypes that calculations, joins, date operations, and exports can rely on. Use explicit conversions after reading CSV, Excel, database, or API data when identifiers, numbers, dates, booleans, and repeated labels should not depend on automatic inference.
Use astype() when values already match the target dtype. Use to_numeric() and to_datetime() when source strings need parsing and invalid values should raise an error or become missing values for review. convert_dtypes() is useful after broader cleanup because it moves compatible columns to pandas nullable extension dtypes that support pd.NA.
Keep failed conversions visible before replacing the original columns in production code. A column that converts cleanly should show the intended dtype, while bad numeric text becomes <NA> after nullable integer conversion and bad date text becomes NaT when parsing is coerced.
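A minimal sketch of keeping failures visible, assuming a hypothetical one-column frame: the parsed result is held in a separate variable, and any value that was present in the source but is missing after parsing is listed for review.

```python
import pandas as pd

# Hypothetical sample data; the original column is left untouched.
df = pd.DataFrame({"order_id": ["1001", "1002", "BAD", "1004"]})

# Parse into a separate Series instead of overwriting the source column.
parsed = pd.to_numeric(df["order_id"], errors="coerce").astype("Int64")

# A failed conversion is a value that existed before parsing but is missing after it.
failed_mask = parsed.isna() & df["order_id"].notna()
failed_values = df.loc[failed_mask, "order_id"].tolist()

print(failed_values)  # ['BAD']
```

Because the source column still holds the raw text, the failing values can be reported or corrected before the parsed column replaces it.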
Related: How to read CSV files with pandas
Related: How to reduce pandas DataFrame memory usage
import pandas as pd

df = pd.DataFrame(
    {
        "order_id": ["1001", "1002", "BAD", "1004"],
        "ordered_at": ["2026-06-01", "2026-06-02", "not a date", "2026-06-04"],
        "quantity": ["2", "5", "", "3"],
        "paid": ["true", "false", "true", "false"],
        "region": ["EMEA", "APAC", "EMEA", "AMER"],
    }
)

print(f"pandas {pd.__version__}")
print()
print("source dtypes")
print(df.dtypes)
print()

converted = df.copy()
converted["order_id"] = pd.to_numeric(converted["order_id"], errors="coerce").astype("Int64")
converted["quantity"] = pd.to_numeric(converted["quantity"], errors="coerce").astype("Int64")
converted["ordered_at"] = pd.to_datetime(converted["ordered_at"], errors="coerce")
converted["paid"] = converted["paid"].map({"true": True, "false": False}).astype("boolean")
converted["region"] = converted["region"].astype("category")
converted = converted.convert_dtypes()

print("converted dtypes")
print(converted.dtypes)
print()
print("converted values")
print(converted.to_string(index=False))
print()

failed = converted[
    converted.filter(["order_id", "quantity", "ordered_at", "paid"]).isna().any(axis=1)
]
print("rows with failed conversions")
print(failed.to_string(index=False))
Replace the source df with the DataFrame already loaded in the working script. Keep the original data available until the converted dtypes and failed-value checks match the expected source rules.
$ python3 dtype_conversion_check.py
pandas 3.0.3

source dtypes
order_id      str
ordered_at    str
quantity      str
paid          str
region        str
dtype: object

converted dtypes
order_id               Int64
ordered_at    datetime64[us]
quantity               Int64
paid                 boolean
region              category
dtype: object

converted values
 order_id ordered_at  quantity  paid region
     1001 2026-06-01         2  True   EMEA
     1002 2026-06-02         5 False   APAC
     <NA>        NaT      <NA>  True   EMEA
     1004 2026-06-04         3 False   AMER

rows with failed conversions
 order_id ordered_at  quantity  paid region
     <NA>        NaT      <NA>  True   EMEA
pandas 3.x infers text columns as str in this source DataFrame. Older projects may show object for the same input, but the explicit conversions still target the same output dtypes.
df = df.astype(
    {
        "customer_id": "string",
        "region": "category",
    }
)
astype() raises when values cannot be cast to the requested dtype. Avoid errors="ignore" during validation because it returns the column on its original dtype when the cast fails, so the failed conversion is easy to miss.
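The raising behavior can be sketched as follows, assuming a hypothetical Series of identifier strings with one bad value. The exact exception message varies by pandas version, so the sketch only records that the cast failed.

```python
import pandas as pd

# Hypothetical identifiers; "BAD" cannot be cast to an integer.
ids = pd.Series(["1001", "1002", "BAD"])

try:
    ids.astype("int64")
    outcome = "converted"
except (ValueError, TypeError) as exc:
    # astype() stops on the bad value instead of silently skipping it.
    outcome = "failed"
    print(f"astype failed: {exc}")

print(outcome)  # failed
```

Letting the exception surface during validation makes the bad value a visible event rather than a column that quietly kept its text dtype.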
df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce").astype("Int64")
Use Int64 when missing values may remain after parsing. Use Float64 instead when decimal values are valid and should be preserved.
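A short sketch of choosing between the two, assuming a hypothetical amounts column that mixes whole numbers, a decimal, and an empty string. It checks whether every parsed value is whole before deciding that an Int64 cast is appropriate.

```python
import pandas as pd

# Hypothetical source text: one decimal value and one empty string.
amounts = pd.Series(["2", "5.5", "", "3"])
parsed = pd.to_numeric(amounts, errors="coerce")

# Float64 keeps the decimal and still represents the missing value as <NA>.
as_float = parsed.astype("Float64")

# Int64 only fits when every non-missing parsed value is a whole number.
non_missing = parsed.dropna()
is_whole = bool((non_missing % 1 == 0).all())

print(as_float.tolist())
print("safe for Int64:", is_whole)  # safe for Int64: False
```

When the check reports False, casting to Int64 would either raise or lose information, so Float64 is the safer target for that column.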
df["ordered_at"] = pd.to_datetime(df["ordered_at"], errors="coerce")
Invalid or out-of-range date strings become NaT with errors="coerce". Use errors="raise" (the default) when the import should stop on the first bad date.
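The two error modes can be compared side by side. This sketch assumes a hypothetical Series with one unparseable entry; it lists the offending source text under coercion and confirms that the strict mode stops.

```python
import pandas as pd

# Hypothetical date strings with one invalid entry.
dates = pd.Series(["2026-06-01", "not a date", "2026-06-04"])

# Coerce: bad text becomes NaT, and the original strings show what failed.
coerced = pd.to_datetime(dates, errors="coerce")
bad_rows = dates[coerced.isna()].tolist()
print(bad_rows)  # ['not a date']

# Raise: parsing stops with an exception on the bad value.
try:
    pd.to_datetime(dates, errors="raise")
    stopped = False
except ValueError:
    stopped = True

print("import stopped:", stopped)  # import stopped: True
```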
Related: How to parse datetimes in pandas
df["paid"] = df["paid"].map({"true": True, "false": False}).astype("boolean")
Map the allowed labels before casting so unexpected labels become missing and can be counted.
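Counting the unexpected labels might look like the sketch below, which assumes a hypothetical paid column containing a stray "yes". Values that were present in the source but missing after mapping are the labels the mapping did not allow.

```python
import pandas as pd

# Hypothetical labels; "yes" is not in the allowed mapping.
paid = pd.Series(["true", "false", "yes", "true"])
mapped = paid.map({"true": True, "false": False}).astype("boolean")

# Labels that existed in the source but were not mapped to True or False.
unexpected = paid[mapped.isna() & paid.notna()]

print(unexpected.value_counts())
```

The value counts show which labels need a mapping rule or a source fix before the boolean column is used.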
df["region"] = df["region"].astype("category")
category is best for repeated labels such as regions, teams, states, priorities, or status names.
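To illustrate why repetition matters, this sketch builds a hypothetical column of three region labels repeated 1,000 times and compares its memory footprint as text and as category. Exact byte counts depend on the pandas version and string storage, so only the comparison is shown.

```python
import pandas as pd

# Hypothetical column: three distinct labels repeated many times.
regions = pd.Series(["EMEA", "APAC", "AMER"] * 1000)
as_category = regions.astype("category")

text_bytes = regions.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

# Each distinct label is stored once; rows hold small integer codes.
print(list(as_category.cat.categories))  # ['AMER', 'APAC', 'EMEA']
print("category is smaller:", category_bytes < text_bytes)
```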
Related: How to convert columns to categorical data in pandas
df = df.convert_dtypes()
convert_dtypes() converts compatible columns to nullable pandas dtypes, but it does not replace explicit parsing for arbitrary date strings, currency text, or custom boolean labels.
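The limit can be seen with a hypothetical frame that holds integer units, currency text, and date strings. In this sketch convert_dtypes() moves the integer column to Int64 but leaves the other two as text, so they are parsed explicitly afterwards.

```python
import pandas as pd

# Hypothetical columns: real integers, currency text, and date strings.
df = pd.DataFrame(
    {
        "units": [1, 2, 3],
        "price": ["$10.50", "$20.00", "$7.25"],
        "ordered_at": ["2026-06-01", "2026-06-02", "2026-06-03"],
    }
)

# convert_dtypes() only switches to nullable dtypes; it does not parse text.
inferred = df.convert_dtypes()
print(inferred.dtypes)

# Explicit parsing is still needed for the currency and date columns.
parsed = df.copy()
parsed["price"] = pd.to_numeric(
    parsed["price"].str.replace("$", "", regex=False), errors="coerce"
)
parsed["ordered_at"] = pd.to_datetime(parsed["ordered_at"], errors="coerce")
print(parsed.dtypes)
```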
print(df.dtypes)
print(df.filter(["order_id", "quantity", "ordered_at", "paid"]).isna().sum())
The dtype output should match the intended target columns. Any nonzero missing-value count after coercion needs review before joins, calculations, or exports use the converted data.
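The same review can be turned into a programmatic check. This is a sketch only: the expected mapping and the sample frame are hypothetical and would be replaced by the rules for the real source.

```python
import pandas as pd

# Hypothetical converted data with one missing quantity.
df = pd.DataFrame(
    {
        "quantity": pd.array([2, 5, None, 3], dtype="Int64"),
        "paid": pd.array([True, False, True, False], dtype="boolean"),
    }
)

# Assumed target dtypes for this example.
expected = {"quantity": "Int64", "paid": "boolean"}

# Columns whose actual dtype differs from the intended one.
mismatched = {
    col: str(df[col].dtype)
    for col, want in expected.items()
    if str(df[col].dtype) != want
}

# Columns that still contain missing values after coercion.
missing = df[list(expected)].isna().sum()
needs_review = missing[missing > 0]

print("dtype mismatches:", mismatched)  # dtype mismatches: {}
print(needs_review)
```

An empty mismatch dictionary and an empty needs_review Series would indicate that the converted data matches the stated rules.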
$ rm dtype_conversion_check.py