How do I find the corner coordinates of a (rectangular) mask from the Mask R-CNN mask matrix (a boolean matrix)?

2024-05-19 • Q&A

In my project, I am trying to detect shop signs in my dataset. I am using Mask R-CNN. The images are 512x512.

results = model.detect([image],verbose=1)
r = results[0]
masked = r['masks']
rois = r['rois']

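After running the code above, 'rois' gives me the bounding-box coordinates of each shop sign (for example [40, 52, 79, 249]). r['masks'] gives me a boolean matrix that represents every mask in the image: a pixel is True if it lies inside the mask region and False otherwise. If the model detects 7 shop signs in the image (that is, 7 masks), r['masks'] has shape 512x512x7, and each channel represents a different mask.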
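To make the layout concrete, here is a minimal sketch of splitting an (H, W, N) boolean mask stack into per-instance masks. The `masks` array is synthetic stand-in data built for illustration, not real model output, and the rectangle positions are arbitrary.

```python
import numpy as np

# Synthetic stand-in for r['masks']: a 512x512 image with 2 detected instances.
masks = np.zeros((512, 512, 2), dtype=bool)
masks[40:79, 52:249, 0] = True     # instance 0: rows 40..78, columns 52..248
masks[300:350, 100:200, 1] = True  # instance 1: rows 300..349, columns 100..199

# Move the instance axis to the front so each item is one (H, W) mask.
per_instance = np.moveaxis(masks, -1, 0)

# Number of True pixels in each mask.
areas = [int(m.sum()) for m in per_instance]
```

Iterating over `per_instance` then yields one 512x512 boolean mask per detected sign.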

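I have to process each mask separately, so I split the channels; take the first one as an example. I then collected the coordinates of the True pixels in that mask array.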

array = masked[:,:,0]

true_points = []
for i in range(512):
    for j in range(512):
        if array[i][j] == True:
            true_points.append([j,i])

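So my question is: how can I get the coordinates of the mask's corners (that is, the shop sign's corners) from this boolean matrix? Most shop signs are rectangular, but they may be rotated. I have the bounding-box coordinates, but they are not accurate when the shop sign is rotated. I also have the coordinates of the True points. Can you suggest an algorithm for finding the corner True values?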

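If you know the rotation angle, just rotate the bbox corners, for example by using cv2.warpAffine on the corner points. If you don't, you can do something like this: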
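For the known-angle case, the following is a rough sketch that rotates the four box corners about the box center with a plain NumPy rotation matrix (a stand-in for the OpenCV call mentioned above, not the same function). The helper name `rotate_box_corners`, the [y1, x1, y2, x2] box order, and the 90 degree angle are assumptions made for illustration.

```python
import numpy as np

def rotate_box_corners(y1, x1, y2, x2, angle_deg):
    """Rotate the corners of an axis-aligned box about its center.

    Returns a (4, 2) array of (x, y) points. The rotation is
    counter-clockwise in standard math axes, which looks clockwise
    on screen because the image y axis points down.
    """
    corners = np.array([[x1, y1], [x2, y1], [x2, y2], [x1, y2]], dtype=float)
    center = corners.mean(axis=0)
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    # Points are row vectors, so multiply by the transpose.
    return (corners - center) @ rot.T + center

# Example box taken from the question, rotated by an assumed 90 degrees.
rotated = rotate_box_corners(40, 52, 79, 249, 90.0)
```

The rotated points can fall outside the image, so clip them before indexing into pixel data.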

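import numpy as np

H, W = array.shape
# Column of the first True pixel in each row; empty rows get the sentinel W+1
left_edges = np.where(array.any(axis=1), array.argmax(axis=1), W + 1)
flip_lr = array[:, ::-1]  # mirror left-right; slicing works directly on a boolean array
# One past the column of the last True pixel in each row; empty rows give -1
right_edges = W - np.where(flip_lr.any(axis=1), flip_lr.argmax(axis=1), W + 1)
# Row of the first True pixel in each column; empty columns get the sentinel H+1
top_edges = np.where(array.any(axis=0), array.argmax(axis=0), H + 1)
flip_ud = array[::-1, :]  # mirror top-bottom
# One past the row of the last True pixel in each column; empty columns give -1
bottom_edges = H - np.where(flip_ud.any(axis=0), flip_ud.argmax(axis=0), H + 1)
leftmost = left_edges.min()
rightmost = right_edges.max()
topmost = top_edges.min()
bottommost = bottom_edges.max()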

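The corners of your bbox are then (leftmost, topmost) and (rightmost, bottommost). By the way, if you find yourself looping over pixels, you should know that there is almost always a vectorized numpy operation that does it faster.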
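As an illustration of that last point, here is a small sketch of a vectorized replacement for the nested loop in the question, using np.nonzero. The `array` below is synthetic test data, and the extremes are inclusive pixel indices (the last True column and row, not one past them).

```python
import numpy as np

# Synthetic stand-in for one mask channel.
array = np.zeros((512, 512), dtype=bool)
array[40:79, 52:249] = True

# Row and column indices of every True pixel, in row-major order.
ys, xs = np.nonzero(array)

# (x, y) pairs, matching the [j, i] order used in the question's loop.
true_points = np.column_stack([xs, ys])

# Axis-aligned bounding box from the extremes (inclusive indices).
leftmost, rightmost = xs.min(), xs.max()
topmost, bottommost = ys.min(), ys.max()
```

For an all-False mask `xs` and `ys` are empty and `min()` raises, so check `array.any()` first.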

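A perspective transform can be used for this problem:

1. Find the corner points of the shop sign from the detected mask (the src points).
2. Choose the corner points of the desired rectangle (the dst points).
3. Use cv2.getPerspectiveTransform and cv2.warpPerspective to generate the new image.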

For corner detection, we can use cv2.findContours and cv2.approxPolyDP.

cv2.findContours finds the contours in a binary image.

contours,_ = cv2.findContours(r['masks'][:,:,0].astype(np.uint8),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)

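Then use cv2.approxPolyDP to approximate a quadrilateral from the contour:

The key parameter of cv2.approxPolyDP is epsilon (the approximation accuracy). The function below searches for an epsilon at which the approximation has exactly four points.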

def Contour2Quadrangle(contour):
    def getApprox(contour,alpha):
        epsilon = alpha * cv2.arcLength(contour,True)
        approx = cv2.approxPolyDP(contour,epsilon,True)
        return approx

    # find appropriate epsilon
    def getQuadrangle(contour):
        alpha = 0.1
        beta = 2 # larger than 1
        approx = getApprox(contour,alpha)
        if len(approx) < 4:
            while len(approx) < 4:
                alpha = alpha / beta
                approx = getApprox(contour,alpha)  
            alpha_lower = alpha
            alpha_upper = alpha * beta
        elif len(approx) > 4:
            while len(approx) > 4:
                alpha = alpha * beta
                approx = getApprox(contour,alpha)  
            alpha_lower = alpha / beta
            alpha_upper = alpha
        if len(approx) == 4:
            return approx
        alpha_middle = (alpha_lower * alpha_upper ) ** 0.5
        approx_middle = getApprox(contour,alpha_middle)
        while len(approx_middle) != 4:
            if len(approx_middle) < 4:
                alpha_upper = alpha_middle
                approx_upper = approx_middle
            if len(approx_middle) > 4:
                alpha_lower = alpha_middle
                approx_lower = approx_middle
            alpha_middle = ( alpha_lower * alpha_upper ) ** 0.5
            approx_middle = getApprox(contour,alpha_middle)
        return approx_middle

    def getQuadrangleWithRegularOrder(contour):
        approx = getQuadrangle(contour)
        hashable_approx = [tuple(a[0]) for a in approx]
        sorted_by_axis0 = sorted(hashable_approx,key=lambda x: x[0])
        sorted_by_axis1 = sorted(hashable_approx,key=lambda x: x[1])
        topleft_set = set(sorted_by_axis0[:2]) & set(sorted_by_axis1[:2])
        assert len(topleft_set) == 1
        topleft = topleft_set.pop()
        topleft_idx = hashable_approx.index(topleft)
        approx_with_regular_order = [ approx[(topleft_idx + i) % 4] for i in range(4) ]
        return approx_with_regular_order

    return getQuadrangleWithRegularOrder(contour)

Finally, we generate the new image with the desired destination coordinates.

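import matplotlib.pyplot as plt

contour = max(contours, key=cv2.contourArea)  # keep the largest contour
corner_points = Contour2Quadrangle(contour)
src = np.float32(list(map(lambda x: x[0], corner_points)))
# Output size; 400x200 is assumed from the dst coordinates
rect_img_w, rect_img_h = 400, 200
# Destination corners: top-left, bottom-left, bottom-right, top-right
dst = np.float32([[0, 0], [0, rect_img_h], [rect_img_w, rect_img_h], [rect_img_w, 0]])

M = cv2.getPerspectiveTransform(src, dst)
transformed = cv2.warpPerspective(img, M, (rect_img_w, rect_img_h))  # img is the input image
plt.imshow(transformed)  # check the results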
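As a dependency-free cross-check on the detected corners, one common heuristic (not the method used in the answer above) picks the mask pixels that minimize or maximize x + y and x - y. This is only a sketch: the helper name `corners_from_mask` and the test mask are made up for illustration, and the heuristic assumes a roughly rectangular mask rotated by well under 45 degrees.

```python
import numpy as np

def corners_from_mask(mask):
    """Estimate the four corners of a roughly rectangular boolean mask.

    Top-left minimizes x + y, bottom-right maximizes it;
    top-right maximizes x - y, bottom-left minimizes it.
    Returns (x, y) points in the order TL, TR, BR, BL.
    """
    ys, xs = np.nonzero(mask)
    s = xs + ys
    d = xs - ys
    picks = [s.argmin(), d.argmax(), s.argmax(), d.argmin()]
    return np.array([[xs[i], ys[i]] for i in picks])

# Synthetic axis-aligned rectangle: rows 20..39, columns 10..69.
mask = np.zeros((100, 100), dtype=bool)
mask[20:40, 10:70] = True
corners = corners_from_mask(mask)
```

Near 45 degrees of rotation the sums and differences no longer single out the corners, so the contour-based approach is the safer choice there.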


aheiheiliu answered: How do I find the corner coordinates of a (rectangular) mask from the Mask R-CNN mask matrix (a boolean matrix)?
