How to migrate pandas code for Copy-on-Write

Migrating pandas code for Copy-on-Write means replacing updates that depend on a view changing its parent DataFrame with assignments that name the object that should change. In pandas 3.0, subsets and returned Series objects behave as copies at the user API level, so chained assignments and column-view mutations no longer update the original data.

Copy-on-Write still lets pandas share memory internally until a write occurs, but user code should treat derived DataFrame and Series objects as independent results. A migration pass should focus on writes through chained brackets, writes through a selected column, and external mutations of arrays returned from pandas objects.
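
The independence of derived objects can be seen in a minimal sketch. It assumes pandas 2.0 or newer; the version guard is an assumption of this example, used to switch Copy-on-Write on for pandas 2.x and to leave the option alone on pandas 3.0. The frame contents are illustrative.

```python
import pandas as pd

# Assumption: pandas 2.x needs Copy-on-Write switched on explicitly.
# pandas 3.0 always uses it, so the option is not touched there.
if int(pd.__version__.split(".")[0]) < 3:
    pd.set_option("mode.copy_on_write", True)

df = pd.DataFrame({"name": ["Ada", "Lin", "Bo"], "score": [95, 88, 91]})

# The derived Series may share memory with df until something writes to it.
score = df["score"]

# The write triggers a copy, so only the derived Series changes.
score.iloc[0] = 10

parent_first = int(df.loc[0, "score"])
derived_first = int(score.iloc[0])
print(f"parent: {parent_first}, derived: {derived_first}")
```

Under Copy-on-Write the parent keeps its original first score while the derived Series holds the new value, which is why writes meant for the parent have to name the parent.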

Run the migration against tests, notebooks, or small fixtures that exercise the old mutating paths. Migrated code should show the intended parent update, leave unrelated subsets unchanged, and use explicit copies when NumPy code needs a writable array.
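
One way to check both the intended update and the untouched rows is to snapshot the frame before the write and compare the unaffected part afterwards. The following is a sketch with illustrative data, not a required test layout; the names `before` and `mask` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ada", "Lin", "Bo"], "score": [95, 88, 91]})

# Snapshot taken before the update, used only for comparison.
before = df.copy()

# Explicit parent update through a single loc assignment.
mask = df["name"].eq("Lin")
df.loc[mask, "score"] = 100

# The intended row changed.
assert df.loc[mask, "score"].tolist() == [100]

# Rows outside the mask are identical to the snapshot.
pd.testing.assert_frame_equal(df.loc[~mask], before.loc[~mask])
```

`pd.testing.assert_frame_equal` raises an `AssertionError` with a diff when the unrelated rows differ, which makes an accidental write easier to locate than a bare equality check.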

Steps to migrate pandas code for Copy-on-Write:

  1. Enable Copy-on-Write warning mode during the final pandas 2.x test run.
    import pandas as pd
     
    pd.options.mode.copy_on_write = "warn"

    Warning mode is available from pandas 2.2 and is intentionally noisy, so expect some warnings for code that needs no change. In pandas 3.0, Copy-on-Write is the only behavior and pd.options.mode.copy_on_write no longer changes how assignments work.

  2. Replace chained assignment with one loc assignment on the parent DataFrame.
    # Before
    df["score"][df["name"].eq("Lin")] = 100
     
    # After
    df.loc[df["name"].eq("Lin"), "score"] = 100

    A single loc assignment updates the parent DataFrame directly instead of writing through a temporary Series.

  3. Replace selected-column inplace methods with reassignment to the same column.
    # Before
    df["score"].replace(88, 100, inplace=True)
     
    # After
    df["score"] = df["score"].replace(88, 100)

    inplace=True on df["score"] targets a selected Series, not the parent DataFrame, so pandas 3.0 warns and leaves the parent value unchanged.

  4. Move shared Series mutations back to the parent object.
    # Before
    score = df["score"]
    score.iloc[0] = 10
     
    # After
    df.loc[df.index[0], "score"] = 10

    Keep a separate Series only when the parent DataFrame should remain unchanged.

  5. Copy NumPy arrays before mutating them outside pandas.
    # Before
    arr = df["score"].to_numpy()
    arr[0] = 10
     
    # After
    arr = df["score"].to_numpy().copy()
    arr[0] = 10

    When the NumPy calculation should update the DataFrame, assign the finished array back through df.loc[:, "score"] = arr.

  6. Create a focused migration check fixture from the rewritten patterns.
    cow_migration_check.py
    import warnings
     
    import pandas as pd
     
     
    def frame():
        return pd.DataFrame(
            {
                "name": ["Ada", "Lin", "Bo"],
                "score": [95, 88, 91],
            }
        )
     
     
    def score_for(df, name):
        row = df["name"].eq(name)
        return int(df.loc[row, "score"].iloc[0])
     
     
    print(f"pandas {pd.__version__}")
     
    df = frame()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        df["score"][df["name"].eq("Lin")] = 100
     
    warning_name = caught[0].category.__name__ if caught else "none"
    print(f"chained assignment warning: {warning_name}")
    print(f"Lin after chained assignment: {score_for(df, 'Lin')}")
     
    df = frame()
    df.loc[df["name"].eq("Lin"), "score"] = 100
    print(f"Lin after loc rewrite: {score_for(df, 'Lin')}")
     
    df = frame()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        df["score"].replace(88, 100, inplace=True)
     
    warning_name = caught[0].category.__name__ if caught else "none"
    print(f"inplace column warning: {warning_name}")
    print(f"Lin after inplace replace: {score_for(df, 'Lin')}")
     
    df["score"] = df["score"].replace(88, 100)
    print(f"Lin after column reassignment: {score_for(df, 'Lin')}")
     
    df = frame()
    score_view = df["score"]
    score_view.iloc[0] = 10
    print(f"Ada parent score after Series mutation: {score_for(df, 'Ada')}")
     
    df = frame()
    arr = df["score"].to_numpy()
    try:
        arr[0] = 10
    except ValueError as exc:
        print(f"NumPy view write: {exc}")
     
    arr = df["score"].to_numpy().copy()
    arr[0] = 10
    print(f"copied NumPy array first value: {int(arr[0])}")
    print(f"Ada parent score after array copy: {score_for(df, 'Ada')}")

    Use project column names and fixtures when the migrated code depends on specific indexes, missing values, or extension dtypes.

  7. Run the migration check fixture.
    $ python3 cow_migration_check.py
    pandas 3.0.3
    chained assignment warning: ChainedAssignmentError
    Lin after chained assignment: 88
    Lin after loc rewrite: 100
    inplace column warning: ChainedAssignmentError
    Lin after inplace replace: 88
    Lin after column reassignment: 100
    Ada parent score after Series mutation: 95
    NumPy view write: assignment destination is read-only
    copied NumPy array first value: 10
    Ada parent score after array copy: 95

  8. Remove the temporary migration check fixture after the migrated project tests cover the same cases.
    $ rm cow_migration_check.py
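
The rewritten patterns from steps 2, 3, and 5 could be kept as permanent project tests along these lines. This is a sketch only: the function names follow the pytest naming convention but are hypothetical, `make_frame` is an illustrative fixture, and the last test includes the write-back through df.loc[:, "score"] mentioned in step 5.

```python
import pandas as pd


def make_frame():
    # Hypothetical fixture mirroring the check script; swap in project data.
    return pd.DataFrame({"name": ["Ada", "Lin", "Bo"], "score": [95, 88, 91]})


def test_loc_assignment_updates_parent():
    df = make_frame()
    df.loc[df["name"].eq("Lin"), "score"] = 100
    assert df["score"].tolist() == [95, 100, 91]


def test_column_reassignment_updates_parent():
    df = make_frame()
    df["score"] = df["score"].replace(88, 100)
    assert df["score"].tolist() == [95, 100, 91]


def test_numpy_copy_then_write_back():
    df = make_frame()
    # The explicit copy is writable and detached from the DataFrame.
    arr = df["score"].to_numpy().copy()
    arr[0] = 10
    # The parent is untouched until the array is assigned back.
    assert df["score"].tolist() == [95, 88, 91]
    df.loc[:, "score"] = arr
    assert df["score"].tolist() == [10, 88, 91]
```

These tests use only the migrated forms, so they are expected to pass both before and after the upgrade to pandas 3.0.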