Filtering rows in a pandas DataFrame keeps records that meet a condition before analysis, reporting, or export. Boolean masks work well for thresholds, membership checks, status fields, and text matches because each row receives a True or False decision.

Use .loc when the filter also selects output columns, because the expression before the comma chooses rows and the expression after the comma chooses columns. A mask built from a column comparison must align with the DataFrame rows, and pandas preserves the original index labels in the filtered result.

Combine conditions with & and | inside parentheses rather than Python and or or, which cannot reduce a Series of booleans to one truth value. Create a .copy() of the filtered result before changing it independently from the source DataFrame.

Steps to filter pandas DataFrame rows:

  1. Save a short row-filtering script.
    filter_rows.py
    import pandas as pd
     
    orders = pd.DataFrame(
        {
            "order_id": [1001, 1002, 1003, 1004, 1005, 1006],
            "customer": ["Ada", "Lin", "Maya", "Omar", "Nia", "Kai"],
            "region": ["EMEA", "APAC", "AMER", "EMEA", "APAC", "AMER"],
            "category": ["hardware", "software", "hardware", "service", "software", "hardware"],
            "qty": [3, 12, 7, 2, 15, 9],
            "total_usd": [150.0, 240.0, 875.0, 95.0, 360.0, 420.0],
            "status": ["paid", "paid", "open", "paid", "open", "paid"],
        }
    )
     
    print(f"pandas {pd.__version__}")
    print()
     
    print("BASE")
    print(orders.to_string(index=False))
    print()
     
    high_value = orders.loc[
        orders["total_usd"] >= 300,
        ["order_id", "customer", "region", "total_usd"],
    ]
     
    print("FILTER_TOTAL")
    print(high_value.to_string(index=False))
    print()
     
    regions = ["EMEA", "APAC"]
    priority_mask = (orders["region"].isin(regions)) & (orders["qty"] >= 10)
    priority = orders.loc[
        priority_mask,
        ["order_id", "customer", "region", "qty", "total_usd"],
    ]
     
    print("FILTER_REGION_QTY")
    print(priority.to_string(index=False))
    print()
     
    paid_hardware = orders.loc[
        (orders["status"] == "paid") & (orders["category"] == "hardware"),
        ["order_id", "customer", "category", "status", "total_usd"],
    ]
     
    print("FILTER_STATUS_CATEGORY")
    print(paid_hardware.to_string(index=False))
    print()
     
    customer_match = orders.loc[
        orders["customer"].str.contains("a", case=False, regex=False, na=False),
        ["order_id", "customer", "region"],
    ]
     
    print("FILTER_CUSTOMER_TEXT")
    print(customer_match.to_string(index=False))
    print()
     
    print("VERIFY")
    print(f"priority rows: {len(priority)}")
    print(f"priority index: {priority.index.tolist()}")
    print(f"all priority qty >= 10: {priority['qty'].ge(10).all()}")
    print(f"source rows unchanged: {len(orders)}")

    Replace the sample orders object with the DataFrame already loaded in the working script.

  2. Run the script and confirm that each section returns only matching rows.
    $ python3 filter_rows.py
    pandas 3.0.3
    
    BASE
     order_id customer region category  qty  total_usd status
         1001      Ada   EMEA hardware    3      150.0   paid
         1002      Lin   APAC software   12      240.0   paid
         1003     Maya   AMER hardware    7      875.0   open
         1004     Omar   EMEA  service    2       95.0   paid
         1005      Nia   APAC software   15      360.0   open
         1006      Kai   AMER hardware    9      420.0   paid
    
    FILTER_TOTAL
     order_id customer region  total_usd
         1003     Maya   AMER      875.0
         1005      Nia   APAC      360.0
         1006      Kai   AMER      420.0
    
    FILTER_REGION_QTY
     order_id customer region  qty  total_usd
         1002      Lin   APAC   12      240.0
         1005      Nia   APAC   15      360.0
    
    FILTER_STATUS_CATEGORY
     order_id customer category status  total_usd
         1001      Ada hardware   paid      150.0
         1006      Kai hardware   paid      420.0
    
    FILTER_CUSTOMER_TEXT
     order_id customer region
         1001      Ada   EMEA
         1003     Maya   AMER
         1004     Omar   EMEA
         1005      Nia   APAC
         1006      Kai   AMER
    
    VERIFY
    priority rows: 2
    priority index: [1, 4]
    all priority qty >= 10: True
    source rows unchanged: 6
  3. Filter rows with a numeric threshold.
    high_value = orders.loc[
        orders["total_usd"] >= 300,
        ["order_id", "customer", "region", "total_usd"],
    ]

    The row condition comes before the comma, and the selected output columns come after it.
    Related: How to select DataFrame rows and columns with loc and iloc in pandas

  4. Match rows whose value appears in an allowed list.
    regions = ["EMEA", "APAC"]
    priority_mask = (orders["region"].isin(regions)) & (orders["qty"] >= 10)
    priority = orders.loc[
        priority_mask,
        ["order_id", "customer", "region", "qty", "total_usd"],
    ]

    Use parentheses around each comparison when combining masks. Python and and or raise an ambiguous truth-value error with pandas Series objects.

  5. Filter rows by exact status and category values.
    paid_hardware = orders.loc[
        (orders["status"] == "paid") & (orders["category"] == "hardware"),
        ["order_id", "customer", "category", "status", "total_usd"],
    ]
  6. Filter text rows with a string condition.
    customer_match = orders.loc[
        orders["customer"].str.contains("a", case=False, regex=False, na=False),
        ["order_id", "customer", "region"],
    ]

    na=False keeps missing text values from matching the filter.

  7. Verify the filtered row count, preserved index labels, and source row count.
    print(f"priority rows: {len(priority)}")
    print(f"priority index: {priority.index.tolist()}")
    print(f"all priority qty >= 10: {priority['qty'].ge(10).all()}")
    print(f"source rows unchanged: {len(orders)}")
  8. Copy a filtered result before changing it independently.
    priority_review = priority.copy()
    priority_review["review_required"] = True

    .copy() makes the filtered DataFrame a separate working object before later assignment.
    Related: How to migrate pandas code for Copy-on-Write

  9. Remove the sample script after confirming the filter behavior.
    $ rm filter_rows.py