How to segment foreground with GrabCut in OpenCV

Foreground segmentation splits an image into pixels that belong to the subject and pixels that can be ignored. OpenCV's GrabCut helps when the subject can be roughly boxed but color thresholding alone cannot separate it from nearby background.

The Python script calls cv.grabCut with GC_INIT_WITH_RECT, so the initial labels come from a single rectangle. Pixels outside the rectangle are labeled as definite background, while pixels inside start as probable foreground and are refined over the iterations into probable foreground or probable background.
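The four label values can be illustrated without running GrabCut at all. The sketch below uses a small hand-written label array (hypothetical, not real GrabCut output) and the integer values behind cv.GC_BGD, cv.GC_FGD, cv.GC_PR_BGD, and cv.GC_PR_FGD to show how definite and probable foreground collapse into one binary mask.

```python
import numpy as np

# Integer values behind cv.GC_BGD, cv.GC_FGD, cv.GC_PR_BGD and cv.GC_PR_FGD.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

# Hypothetical 4x4 label mask, hand-written for illustration only.
# The outer ring is definite background (outside the rectangle);
# the inner pixels were refined into probable foreground or background.
labels = np.array(
    [
        [0, 0, 0, 0],
        [0, 3, 3, 2],
        [0, 3, 2, 2],
        [0, 0, 0, 0],
    ],
    dtype=np.uint8,
)

# Definite and probable foreground both count as foreground.
binary = np.where(
    (labels == GC_FGD) | (labels == GC_PR_FGD), 255, 0
).astype(np.uint8)

# Tally how many pixels carry each label.
names = {"GC_BGD": GC_BGD, "GC_FGD": GC_FGD, "GC_PR_BGD": GC_PR_BGD, "GC_PR_FGD": GC_PR_FGD}
counts = {name: int(np.count_nonzero(labels == value)) for name, value in names.items()}

print(counts)
print(binary)
```

With rectangle-only initialization no pixel is marked GC_FGD, so the binary mask here comes entirely from the probable-foreground label.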

Choose a rectangle with a small background margin around the target object and avoid clipping the subject. A nonzero foreground-pixel count plus written mask and cutout files confirms the script ran; visual inspection still matters when the object has weak contrast, hairline edges, or shadows.
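One way to get that margin is to start from a tight bounding box and grow it by a few pixels on each side while keeping it inside the image. The helper below is a minimal sketch; pad_rect, the 10-pixel default margin, and the tight box in the example are illustrative assumptions, not part of the script.

```python
def pad_rect(x, y, width, height, image_width, image_height, margin=10):
    """Grow a tight x,y,width,height box by `margin` pixels per side,
    clamping the result so it stays inside the image."""
    left = max(x - margin, 0)
    top = max(y - margin, 0)
    right = min(x + width + margin, image_width)
    bottom = min(y + height + margin, image_height)
    return left, top, right - left, bottom - top


# Hypothetical tight box around the subject in a 720x480 image.
rect = pad_rect(405, 75, 190, 190, 720, 480, margin=10)

# Format the result the way the --rect option expects it.
print(",".join(str(value) for value in rect))
```

The printed string can be passed directly as the --rect value. Clamping matters near the image border, because the script rejects rectangles that extend past the image.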

Steps to segment foreground with OpenCV GrabCut:

Open a Python environment with OpenCV and NumPy available, then place the source image at input/scene.png.

Use any image path that cv.imread can read. The --rect value in the run command is x,y,width,height in pixels, with x and y measured from the image's top-left corner.

Save the GrabCut script as segment_grabcut.py.

segment_grabcut.py

from argparse import ArgumentParser
from pathlib import Path
 
import cv2 as cv
import numpy as np
 
 
def parse_rect(value):
    try:
        x, y, width, height = [int(part) for part in value.split(",")]
    except ValueError as exc:
        raise SystemExit("Use --rect as x,y,width,height.") from exc
    if x < 0 or y < 0 or width <= 0 or height <= 0:
        raise SystemExit("Rectangle width and height must be positive; x and y must be zero or greater.")
    return x, y, width, height
 
 
parser = ArgumentParser(description="Segment foreground with OpenCV GrabCut.")
parser.add_argument("image", help="Input image path.")
parser.add_argument("cutout", help="Output PNG path for the foreground cutout.")
parser.add_argument("--rect", required=True, help="Foreground rectangle as x,y,width,height.")
parser.add_argument("--iterations", type=int, default=5, help="GrabCut iteration count.")
parser.add_argument("--mask-output", default="foreground-mask.png", help="Output PNG path for the binary mask.")
args = parser.parse_args()
 
image = cv.imread(args.image, cv.IMREAD_COLOR)
if image is None:
    raise SystemExit(f"Could not read image: {args.image}")
 
height, width = image.shape[:2]
x, y, rect_width, rect_height = parse_rect(args.rect)
if x + rect_width > width or y + rect_height > height:
    raise SystemExit(f"Rectangle {args.rect} extends beyond the {width}x{height} image.")
 
mask = np.zeros((height, width), np.uint8)
background_model = np.zeros((1, 65), np.float64)
foreground_model = np.zeros((1, 65), np.float64)
 
cv.grabCut(
    image,
    mask,
    (x, y, rect_width, rect_height),
    background_model,
    foreground_model,
    args.iterations,
    cv.GC_INIT_WITH_RECT,
)
 
foreground_mask = np.where(
    (mask == cv.GC_FGD) | (mask == cv.GC_PR_FGD),
    255,
    0,
).astype("uint8")
 
cutout = cv.cvtColor(image, cv.COLOR_BGR2BGRA)
cutout[:, :, 3] = foreground_mask
 
mask_path = Path(args.mask_output)
cutout_path = Path(args.cutout)
mask_path.parent.mkdir(parents=True, exist_ok=True)
cutout_path.parent.mkdir(parents=True, exist_ok=True)
 
if not cv.imwrite(str(mask_path), foreground_mask):
    raise SystemExit(f"Could not write mask: {mask_path}")
if not cv.imwrite(str(cutout_path), cutout):
    raise SystemExit(f"Could not write cutout: {cutout_path}")
 
foreground_pixels = int(np.count_nonzero(foreground_mask))
coverage = foreground_pixels / foreground_mask.size * 100
 
print(f"image={args.image} shape={width}x{height}")
print(f"rect={x},{y},{rect_width},{rect_height} iterations={args.iterations}")
print(f"foreground_pixels={foreground_pixels} coverage={coverage:.2f}%")
print(f"wrote_mask={mask_path}")
print(f"wrote_cutout={cutout_path}")

Run GrabCut with a rectangle that fully contains the foreground subject.

$ python3 segment_grabcut.py input/scene.png output/foreground.png --rect 395,65,210,210 --mask-output output/foreground-mask.png
image=input/scene.png shape=720x480
rect=395,65,210,210 iterations=5
foreground_pixels=22701 coverage=6.57%
wrote_mask=output/foreground-mask.png
wrote_cutout=output/foreground.png

A rectangle that cuts into the subject can force those clipped pixels into the background model. Widen the rectangle before increasing the iteration count.
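A rough way to spot a clipped subject is to check whether foreground pixels sit on the outermost rows or columns of the rectangle. The sketch below is a heuristic under that assumption, not something the script does; edges_touched and the synthetic mask are hypothetical names and data for illustration.

```python
import numpy as np


def edges_touched(foreground_mask, rect):
    """Report which sides of the rectangle have foreground pixels on
    their outermost row or column (a hint that the subject is clipped)."""
    x, y, width, height = rect
    region = foreground_mask[y:y + height, x:x + width] > 0
    return {
        "top": bool(region[0, :].any()),
        "bottom": bool(region[-1, :].any()),
        "left": bool(region[:, 0].any()),
        "right": bool(region[:, -1].any()),
    }


# Synthetic 10x10 mask: the foreground reaches the bottom row of the rectangle.
mask = np.zeros((10, 10), np.uint8)
mask[4:8, 3:6] = 255
touched = edges_touched(mask, (2, 2, 6, 6))
print(touched)
```

A side reported as True is a candidate for widening. The check can also fire when the subject merely ends at the rectangle edge, so treat it as a prompt for visual inspection.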

Verify that the output mask and the cutout's alpha channel contain the same number of foreground pixels.

$ python3 - <<'PY'
import cv2 as cv

mask = cv.imread("output/foreground-mask.png", cv.IMREAD_GRAYSCALE)
cutout = cv.imread("output/foreground.png", cv.IMREAD_UNCHANGED)

print(f"mask_shape={mask.shape} nonzero={cv.countNonZero(mask)}")
print(f"cutout_channels={cutout.shape[2]} alpha_nonzero={cv.countNonZero(cutout[:, :, 3])}")
PY
mask_shape=(480, 720) nonzero=22701
cutout_channels=4 alpha_nonzero=22701

Matching counts are consistent with the transparent cutout having been built from the same foreground mask.
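Equal counts alone do not prove that the same pixels are set, so a stricter check compares the two arrays position by position. The sketch below assumes a hypothetical helper named masks_match and uses small synthetic arrays in place of the files on disk.

```python
import numpy as np


def masks_match(mask, alpha):
    """Return True when both arrays mark exactly the same pixels as nonzero."""
    if mask.shape != alpha.shape:
        return False
    return bool(np.array_equal(mask > 0, alpha > 0))


# Synthetic stand-ins for the mask file and the cutout's alpha channel.
mask = np.zeros((4, 4), np.uint8)
mask[1, 1] = mask[1, 2] = 255

alpha_same = mask.copy()

# Same number of foreground pixels, but one row lower.
alpha_shifted = np.zeros((4, 4), np.uint8)
alpha_shifted[2, 1] = alpha_shifted[2, 2] = 255

print(masks_match(mask, alpha_same))
print(masks_match(mask, alpha_shifted))
```

For the real outputs, pass the array read with cv.IMREAD_GRAYSCALE as mask and cutout[:, :, 3] from the cv.IMREAD_UNCHANGED read as alpha.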