Python OpenCV: Object Tracking Using Homography

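In this article, we track an object in a video using a reference image that is already given. The same approach can also locate the object in a still image. Before using homography for object tracking, let's cover some basics.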

What is homography?

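Homography is a transformation that maps the points in one image to the corresponding points in another image. A homography is represented by a 3×3 matrix.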
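As a sketch of what that matrix looks like and how it acts on a point, the block below uses the entry names h_00 through h_22 and the point names (x, y) and (x', y'), which are notation chosen here for illustration. The point is written in homogeneous coordinates, and the result is defined only up to a scale factor, which is why the final coordinates are divided by the third component.

```latex
H =
\begin{bmatrix}
h_{00} & h_{01} & h_{02} \\
h_{10} & h_{11} & h_{12} \\
h_{20} & h_{21} & h_{22}
\end{bmatrix},
\qquad
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
\sim
H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}

x' = \frac{h_{00}x + h_{01}y + h_{02}}{h_{20}x + h_{21}y + h_{22}},
\qquad
y' = \frac{h_{10}x + h_{11}y + h_{12}}{h_{20}x + h_{21}y + h_{22}}
```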

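If two points do not lie in the same plane, we have to use two homographies. Likewise, for n planes we have to use n homographies. The more homographies we have, the more carefully we need to handle all of them. This is why we use feature matching.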

Importing the image data: we will read the following image:

The image above is the cover of a book, stored as "img.jpg".

Python

# importing the required libraries
import cv2
import numpy as np
 
# reading image in grayscale
img = cv2.imread("img.jpg", cv2.IMREAD_GRAYSCALE)
 
# initializing web cam
cap = cv2.VideoCapture(0)

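Feature matching: feature matching means finding corresponding features in two similar datasets based on a search distance. Here we use the SIFT algorithm together with a FLANN-based matcher.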

Python

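# creating the SIFT detector
# (OpenCV 4.4+ exposes SIFT in the main module; older contrib builds
# use cv2.xfeatures2d.SIFT_create() instead)
sift = cv2.SIFT_create()
 
# find the keypoints and descriptors of the reference image with SIFT
kp_image, desc_image = sift.detectAndCompute(img, None)
 
# FLANN index parameters: algorithm 1 selects the KD-tree index,
# which is the index type that takes the trees parameter
index_params = dict(algorithm=1, trees=5)
search_params = dict()
 
# creating the FLANN-based matcher
flann = cv2.FlannBasedMatcher(index_params, search_params)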
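To make "matching by search distance" concrete, here is a minimal pure-Python sketch that finds the two nearest descriptors for each query descriptor. It uses an exhaustive brute-force search rather than FLANN's approximate index, and the helper name `two_nearest` and the tiny 3-value descriptors are made up for illustration (real SIFT descriptors have 128 values).

```python
import math


def two_nearest(query, train):
    """For each query descriptor, return its two closest train descriptors
    as (distance, train_index) pairs, nearest first."""
    results = []
    for q in query:
        # Euclidean distance from q to every train descriptor, sorted ascending
        scored = sorted((math.dist(q, t), j) for j, t in enumerate(train))
        results.append(scored[:2])
    return results


# Hypothetical descriptors, shortened to 3 values each for readability
query_desc = [(0.0, 0.0, 1.0), (5.0, 5.0, 5.0)]
train_desc = [(0.0, 0.1, 1.0), (4.0, 5.0, 5.0), (9.0, 9.0, 9.0)]

pairs = two_nearest(query_desc, train_desc)
for i, (best, second) in enumerate(pairs):
    print(f"query {i}: best={best}, second={second}")
```

Each entry of `pairs` plays the same role as the two candidates that a k=2 nearest-neighbour query returns for one descriptor: the best candidate and the runner-up it is compared against.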

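Now we also have to convert each captured video frame to grayscale and, using a suitable matcher, match the points from the image to the frame.

We may run into problems when drawing the matches, because there will be a large number of points on both planes. To handle this, we should keep only some of the points, and to get the more accurate ones we can vary the distance threshold.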

Python

# reading the frame
_, frame = cap.read()
 
# converting the frame into grayscale
grayframe = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
 
# find the keypoints and descriptors of the frame with SIFT
kp_grayframe, desc_grayframe = sift.detectAndCompute(grayframe, None)
 
# finding the two nearest matches for each descriptor
# (k-nearest neighbours with k=2)
matches = flann.knnMatch(desc_image, desc_grayframe, k=2)
 
# initialize list to keep track of only good points
good_points = []
 
for m, n in matches:
    # keep a match only when its best distance is clearly
    # smaller than the second-best distance (ratio test)
    if m.distance < 0.6 * n.distance:
        good_points.append(m)
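The distance threshold in the loop above is the factor 0.6. The following sketch, which uses made-up (best, second-best) distance pairs and a hypothetical helper named `ratio_test`, illustrates how changing that factor changes which matches survive. It is only meant to show the effect of the threshold, not to suggest a particular value.

```python
def ratio_test(distance_pairs, ratio=0.6):
    """Return the indices of pairs whose best distance is smaller
    than ratio times the second-best distance."""
    return [
        i
        for i, (best, second) in enumerate(distance_pairs)
        if best < ratio * second
    ]


# Hypothetical (best, second-best) descriptor distances for five keypoints
distance_pairs = [(10.0, 100.0), (50.0, 60.0), (30.0, 45.0), (5.0, 9.0), (70.0, 71.0)]

strict = ratio_test(distance_pairs, ratio=0.6)
loose = ratio_test(distance_pairs, ratio=0.8)

print("kept with 0.6:", strict)
print("kept with 0.8:", loose)
```

A smaller factor keeps fewer but more distinctive matches, while a larger factor keeps more matches at the cost of more ambiguous ones.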

Homography: to detect the homography of the object, we have to obtain the homography matrix, which we do with the function findHomography().

Python

# coordinates of the matched keypoints in the
# reference image (query side of each match)
query_pts = np.float32([kp_image[m.queryIdx].pt
                        for m in good_points]).reshape(-1, 1, 2)
 
# coordinates of the matched keypoints in the
# video frame (train side of each match)
train_pts = np.float32([kp_grayframe[m.trainIdx].pt
                        for m in good_points]).reshape(-1, 1, 2)
 
# finding the perspective transformation between the two planes;
# RANSAC rejects outliers, 5.0 is the reprojection error threshold
matrix, mask = cv2.findHomography(query_pts, train_pts, cv2.RANSAC, 5.0)
 
# mask marks the inlier matches with 1; ravel() flattens it
# before it is converted to a plain list
matches_mask = mask.ravel().tolist()
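A homography has eight degrees of freedom, so at least four point correspondences are needed before it can be estimated. The sketch below shows one possible way to guard that step and to summarize the inlier mask. The helper names `enough_matches` and `inlier_ratio` and the sample mask are hypothetical and are not part of the original code.

```python
MIN_MATCH_COUNT = 4  # minimum number of correspondences for a homography


def enough_matches(good_points, min_count=MIN_MATCH_COUNT):
    """True when there are enough good matches to attempt the estimate."""
    return len(good_points) >= min_count


def inlier_ratio(matches_mask):
    """Fraction of matches flagged as inliers (mask value 1)."""
    if not matches_mask:
        return 0.0
    return sum(matches_mask) / len(matches_mask)


# Hypothetical flattened mask: 1 marks an inlier, 0 an outlier
sample_mask = [1, 1, 0, 1, 0, 1, 1, 1]

print(enough_matches(sample_mask))  # the mask has one entry per good match
print(inlier_ratio(sample_mask))
```

In a tracking loop, a frame with too few good matches could simply be shown without an outline instead of being passed to the homography step.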

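Everything is in place so far, but when the object is turned or moved to a different orientation, the computer can no longer locate it. To handle this, we have to use a perspective transform. For example, humans see nearby objects as larger than distant ones, because the viewpoint is changing. This is called a perspective transform.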

Python

# height and width of the reference image
h, w = img.shape
 
# the four corners of the reference image
pts = np.float32([[0, 0], [0, h], [w, h], [w, 0]]).reshape(-1, 1, 2)
 
# projecting the corners into the frame with the homography
dst = cv2.perspectiveTransform(pts, matrix)
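To illustrate the arithmetic behind projecting the corners, here is a small pure-Python sketch that maps points through a 3×3 matrix using homogeneous coordinates. The helper `apply_homography`, the two example matrices and the image size are invented for illustration. In the tutorial the matrix comes from findHomography() and the mapping is done by perspectiveTransform().

```python
def apply_homography(matrix, points):
    """Map (x, y) points through a 3x3 matrix, dividing by the third
    homogeneous component to get back to image coordinates."""
    mapped = []
    for x, y in points:
        xs = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        ys = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        scale = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
        mapped.append((xs / scale, ys / scale))
    return mapped


# Hypothetical matrix: scale by 2, then shift by (10, 20)
H_affine = [[2.0, 0.0, 10.0],
            [0.0, 2.0, 20.0],
            [0.0, 0.0, 1.0]]

# Hypothetical matrix with a perspective term in the last row
H_persp = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.5, 0.0, 1.0]]

height, width = 3, 4  # hypothetical image size
corners = [(0, 0), (0, height), (width, height), (width, 0)]

projected = apply_homography(H_affine, corners)
shrunk = apply_homography(H_persp, [(2, 2)])

print(projected)
print(shrunk)
```

The second matrix shows the perspective effect: the last row makes the divisor depend on x, so points further along x are scaled down more strongly.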

Finally, let's look at the output.

Python

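# drawing the projected outline of the object on the frame
homography = cv2.polylines(frame, [np.int32(dst)], True, (255, 0, 0), 3)
 
# showing the final output with homography
cv2.imshow("Homography", homography)
 
# imshow needs a following waitKey call to actually display the window
cv2.waitKey(1)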

Output: