Python | Foreground Extraction in an Image Using the GrabCut Algorithm
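Let us discuss an efficient way to separate the foreground of an image from its background. The idea is to find the foreground and remove the background.
Foreground extraction is any technique that isolates an image's foreground for further processing such as object recognition or tracking. The algorithm used here is GrabCut. The user draws a rectangle around the main object, choosing its coordinates so that the rectangle fully encloses the foreground. The resulting segmentation is not always perfect: some foreground regions may be marked as background and vice versa. Such errors can be corrected manually by marking the affected pixels and running the algorithm again. The effect is similar to a green screen in film production. The algorithm works as follows: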
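- The user selects a region of interest (ROI) around the object to segment. Everything outside the ROI is treated as background and turned black. The pixels inside the ROI are initially unknown.
- The foreground and background are then modeled with Gaussian Mixture Models (GMMs). Starting from the user-provided data, the GMMs learn the color distributions, label the unknown pixels, and cluster each pixel according to its color statistics.
- A graph is built from this pixel distribution. Each pixel is a node, and two additional nodes are added: the Source node and the Sink node. Every foreground pixel is connected to the Source node and every background pixel to the Sink node. The weight of an edge connecting a pixel to the Source or Sink node is defined by the probability that the pixel belongs to the foreground or the background.
- Edges between neighboring pixels are weighted by color similarity: if there is a large difference in pixel color, the edge between them gets a low weight. A min-cut algorithm then splits the graph into two parts, separating the Source node from the Sink node so that the cost function, the sum of the weights of all cut edges, is minimized.
- After the cut, pixels connected to the Source node are labeled as foreground and pixels connected to the Sink node as background. The process is repeated for the number of iterations specified by the user, which yields the extracted foreground.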
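The neighbor-edge weighting described in the steps above can be sketched with NumPy alone. This is only an illustration, assuming the smoothness term of the original GrabCut formulation, `gamma * exp(-beta * ||color difference||^2)` with `gamma = 50` and `beta` derived from the mean squared color difference. The function name `horizontal_edge_weights` is hypothetical, and the code is not OpenCV's internal implementation.

```python
import numpy as np

def horizontal_edge_weights(image, gamma=50.0):
    """Weights of the edges between each pixel and its right-hand neighbor.

    Assumed form: gamma * exp(-beta * squared_color_difference), with
    beta = 1 / (2 * mean squared difference). Similar colors give a
    weight close to gamma; very different colors give a much lower weight.
    """
    # Convert to float first so that uint8 subtraction cannot wrap around.
    img = image.astype(np.float64)
    diff = img[:, 1:, :] - img[:, :-1, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    mean_sq = sq_dist.mean()
    beta = 0.0 if mean_sq == 0 else 1.0 / (2.0 * mean_sq)
    return gamma * np.exp(-beta * sq_dist)

# Toy 1x4 image: two dark pixels followed by two bright pixels.
toy = np.array([[[10, 10, 10], [12, 12, 12],
                 [200, 200, 200], [202, 202, 202]]], dtype=np.uint8)
weights = horizontal_edge_weights(toy)
print(weights)
```

In this toy image the middle edge, which crosses from dark to bright, receives the smallest weight, so it is the cheapest place for the cut to pass.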
The function used here is cv2.grabCut().
Syntax: cv2.grabCut(image, mask, rectangle, backgroundModel, foregroundModel, iterationCount[, mode])
Parameters:
- image: Input 8-bit 3-channel image.
- mask: Input/output 8-bit single-channel mask. The mask is initialized by the function when mode is set to GC_INIT_WITH_RECT. Its elements may have one of the following values:
  - GC_BGD defines an obvious background pixel.
  - GC_FGD defines an obvious foreground (object) pixel.
  - GC_PR_BGD defines a possible background pixel.
  - GC_PR_FGD defines a possible foreground pixel.
- rectangle: It is the region of interest containing a segmented object. The pixels outside of the ROI are marked as obvious background. The parameter is only used when mode==GC_INIT_WITH_RECT.
- backgroundModel: Temporary array for the background model.
- foregroundModel: Temporary array for the foreground model.
- iterationCount: Number of iterations the algorithm should make before returning the result. Note that the result can be refined with further calls with mode==GC_INIT_WITH_MASK or mode==GC_EVAL.
- mode: Defines the operation mode. It can be one of the following:
  - GC_INIT_WITH_RECT: The function initializes the state and the mask using the provided rectangle. After that it runs iterationCount iterations of the algorithm.
  - GC_INIT_WITH_MASK: The function initializes the state using the provided mask. Note that GC_INIT_WITH_RECT and GC_INIT_WITH_MASK can be combined. Then, all the pixels outside of the ROI are automatically initialized with GC_BGD.
  - GC_EVAL: The value means that the algorithm should just resume.
Below is the implementation:
Python3
# Python program to illustrate
# foreground extraction using
# GrabCut algorithm
# organize imports
import numpy as np
import cv2
from matplotlib import pyplot as plt
# load the input image from the given path;
# cv2.imread returns None if the file cannot be read
image = cv2.imread('image.jpg')
if image is None:
    raise FileNotFoundError('could not read image.jpg')
# create a mask with the same height and width
# as the loaded image, filled with zeros,
# with data type uint8
mask = np.zeros(image.shape[:2], np.uint8)
# allocate the background and foreground models
# that grabCut uses internally: each is a NumPy
# array of 1 row and 65 columns filled with zeros,
# with data type np.float64
backgroundModel = np.zeros((1, 65), np.float64)
foregroundModel = np.zeros((1, 65), np.float64)
# define the Region of Interest (ROI)
# as the coordinates of the rectangle,
# entered as
# (startingPoint_x, startingPoint_y, width, height)
# these values were chosen for this input image
# and must be adjusted for other images
rectangle = (20, 100, 150, 150)
# apply the GrabCut algorithm with the
# parameters above, number of iterations = 3
# cv2.GC_INIT_WITH_RECT is used because the
# mask is initialized from the rectangle
cv2.grabCut(image, mask, rectangle,
            backgroundModel, foregroundModel,
            3, cv2.GC_INIT_WITH_RECT)
# the updated mask labels every pixel with
# one of four values:
# 0 (GC_BGD) and 2 (GC_PR_BGD) are background,
# 1 (GC_FGD) and 3 (GC_PR_FGD) are foreground
# build a binary mask: background pixels
# become 0 and foreground pixels become 1,
# with data type uint8
mask2 = np.where((mask == 2)|(mask == 0), 0, 1).astype('uint8')
# The final mask is multiplied with
# the input image to give the segmented image.
image = image * mask2[:, :, np.newaxis]
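# output segmented image with colorbar
# cv2.imread loads images in BGR order, while
# matplotlib expects RGB, so convert before display
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.colorbar()
plt.show()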
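As a variation on the final masking step, the same binary mask could serve as an alpha channel so that the background becomes transparent instead of black. The helper `cutout_with_alpha` below is hypothetical and uses NumPy only. The four constants mirror the values of the corresponding cv2 flags (0 to 3), and the tiny arrays are made-up test data.

```python
import numpy as np

# Same numeric values as cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, cv2.GC_PR_FGD.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

def cutout_with_alpha(image_bgr, mask):
    """Return a 4-channel BGRA image whose alpha is 255 on the foreground
    (GC_FGD or GC_PR_FGD) and 0 on the background."""
    is_foreground = np.isin(mask, (GC_FGD, GC_PR_FGD))
    alpha = np.where(is_foreground, 255, 0).astype(np.uint8)
    return np.dstack((image_bgr, alpha))

# Made-up 2x2 image and a mask containing each of the four labels.
demo_image = np.full((2, 2, 3), 128, dtype=np.uint8)
demo_mask = np.array([[GC_BGD, GC_FGD],
                      [GC_PR_BGD, GC_PR_FGD]], dtype=np.uint8)
result = cutout_with_alpha(demo_image, demo_mask)
print(result[..., 3])
```

A 4-channel array like this can be written with cv2.imwrite to a PNG file, which keeps the transparency.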
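Input image:
Output:
Here the input image is 500x281 pixels, and the rectangle coordinates were chosen accordingly. The output shows how the object on the left side of the image is kept as foreground while the background is removed.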
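Because the rectangle must match the image size, it can help to check that it lies inside the image before calling cv2.grabCut(). The helper `clamp_rect` below is a hypothetical pure-Python sketch. It assumes that 500x281 means a width of 500 and a height of 281, and that the rectangle uses the (x, y, width, height) convention from the program above.

```python
def clamp_rect(rect, img_width, img_height):
    """Clip an (x, y, width, height) rectangle to the image bounds,
    keeping it at least one pixel wide and one pixel high."""
    x, y, w, h = rect
    x0 = max(0, min(x, img_width - 1))
    y0 = max(0, min(y, img_height - 1))
    x1 = max(x0 + 1, min(x + w, img_width))
    y1 = max(y0 + 1, min(y + h, img_height))
    return (x0, y0, x1 - x0, y1 - y0)

# The rectangle used above already fits inside a 500x281 image.
print(clamp_rect((20, 100, 150, 150), 500, 281))   # (20, 100, 150, 150)

# A rectangle that runs past the right and bottom edges is clipped.
print(clamp_rect((400, 200, 150, 150), 500, 281))  # (400, 200, 100, 81)
```

With a loaded image, the width and height would come from `image.shape[1]` and `image.shape[0]`, since NumPy stores the height first.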