NMS (Python Code Explained)

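Iterate over all classes and, for each class, extract the information of its boxes: positions and class probabilities. NMS is then applied only among predicted boxes of the same class.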

# NMS (second filtering pass) -> bbox_pred, scores, cls_inds
keep = np.zeros(len(bbox_pred), dtype=int)  # keep mask, shape (n,). 1 = keep, 0 = drop. (np.int was removed in NumPy 1.24; use int)
for i in range(self.num_classes):  # process one class at a time
    inds = np.where(cls_inds == i)[0]  # indices of the boxes predicted as class i
    if len(inds) == 0:  # no box was predicted as class i
        continue
    c_bboxes = bbox_pred[inds]  # positions of the class-i boxes, (num_bbox_i_cls, 4)
    c_scores = scores[inds]  # class probabilities of the class-i boxes, (num_bbox_i_cls,)
    c_keep = self.nms(c_bboxes, c_scores)  # inputs: (num_bbox_i_cls, 4), (num_bbox_i_cls,). Returns the indices of the class-i boxes to keep
    keep[inds[c_keep]] = 1  # c_keep indexes into c_bboxes; inds[c_keep] maps it back to positions in bbox_pred

keep = np.where(keep > 0)[0]  # positions of all boxes that survived NMS
bbox_pred = bbox_pred[keep]
scores = scores[keep]
cls_inds = cls_inds[keep]
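
The least obvious step above is `keep[inds[c_keep]] = 1`, which converts indices local to one class back into positions in the full array. The following is a minimal sketch on toy data; the values of `cls_inds` and `c_keep` are made up for illustration and do not come from a real detector.

```python
import numpy as np

# Hypothetical toy data: five boxes assigned to two classes.
cls_inds = np.array([0, 1, 0, 1, 0])

# Global positions of the class-0 boxes.
inds = np.where(cls_inds == 0)[0]            # array([0, 2, 4])

# Pretend NMS kept the 1st and 3rd class-0 boxes (local indices 0 and 2).
c_keep = [0, 2]

keep = np.zeros(len(cls_inds), dtype=int)
keep[inds[c_keep]] = 1                       # local 0 -> global 0, local 2 -> global 4

print(keep.tolist())                         # [1, 0, 0, 0, 1]
```

Because `c_keep` only refers to positions inside `c_bboxes`, indexing `inds` with it is what recovers the positions in `bbox_pred`.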

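Purpose: filter the predicted boxes of the same class.

Information used for filtering: position and class probability.

Method:

(1) Sort the predicted boxes by class probability in descending order;
(2) Add the box M with the highest probability to the final set of predictions;
(3) Compute the intersection over union (IoU) between M and each remaining box, and keep only the boxes whose IoU with M does not exceed the threshold;
(4) Repeat from (2) until no candidate boxes remain.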
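
Step (3) depends on the IoU of two boxes. As a small worked sketch, assuming boxes in `[xmin, ymin, xmax, ymax]` format (the helper name `iou` is hypothetical and not part of the original code), it can be computed for a single pair like this:

```python
def iou(box_a, box_b):
    """IoU of two boxes given as [xmin, ymin, xmax, ymax]."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at 0 so disjoint boxes get zero area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


# Two 10x10 boxes overlapping in a 5x5 square: 25 / (100 + 100 - 25).
print(round(iou([0, 0, 10, 10], [5, 5, 15, 15]), 4))    # 0.1429
# Disjoint boxes have IoU 0.
print(iou([0, 0, 10, 10], [20, 20, 30, 30]))            # 0.0
```

With a threshold of 0.5, for example, neither of these pairs would be suppressed, since both IoU values fall below it.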

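def nms(self, bboxes, scores):
    """
    Pure Python NMS baseline.
    :param bboxes: (m,4). bboxes[0] = [xmin, ymin, xmax, ymax].
                    Positions of all predicted boxes belonging to one class.
    :param scores: (m,). Scores of all predicted boxes belonging to one class.
    :return:
        keep: indices of all predicted boxes to keep.
    """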
    x1 = bboxes[:, 0]  # xmin, (m,). x of the top-left corner of every box
    y1 = bboxes[:, 1]  # ymin, (m,). y of the top-left corner
    x2 = bboxes[:, 2]  # xmax, (m,). x of the bottom-right corner of every box
    y2 = bboxes[:, 3]  # ymax, (m,). y of the bottom-right corner

    areas = (x2 - x1) * (y2 - y1)  # area of every predicted box
    order = scores.argsort()[::-1]  # indices that sort the scores in descending order
    # order[k] is the position in the original scores array of the k-th highest score
    keep = []  # store the final bounding boxes
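    while order.size > 0:
        i = order[0]  # the index of the bbox with highest confidence
        keep.append(i)  # save it to keep: the original index of M, the highest-scoring remaining box
        # Intersect M with every remaining box: the element-wise maximum of the top-left
        # coordinates gives the intersection's top-left corner, and the element-wise
        # minimum of the bottom-right coordinates gives its bottom-right corner.
        xx1 = np.maximum(x1[i], x1[order[1:]])  # (num_bbox-1, ) left edge of each intersection
        yy1 = np.maximum(y1[i], y1[order[1:]])  # (num_bbox-1, ) top edge of each intersection
        xx2 = np.minimum(x2[i], x2[order[1:]])  # right edge of each intersection (drawing a sketch helps here)
        yy2 = np.minimum(y2[i], y2[order[1:]])  # bottom edge of each intersection

        w = np.maximum(1e-28, xx2 - xx1)  # (num_bbox-1, ) intersection widths, clamped to a tiny positive value so disjoint boxes never get a negative width
        h = np.maximum(1e-28, yy2 - yy1)  # intersection heights, clamped the same way
        inter = w * h  # intersection areas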

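        # IoU = intersection / (area of M + area of the other box - intersection), computed between M and every remaining box
        ovr = inter / (areas[i] + areas[order[1:]] - inter)  # (num_bbox-1, )
        # keep all the bounding boxes whose overlap does not exceed the threshold
        inds = np.where(ovr <= self.nms_thresh)[0]  # keep boxes with a small IoU with M. A large IoU means the box probably predicts the same object as M, so it is suppressed.
        order = order[inds + 1]  # original indices of the surviving boxes. inds indexes into ovr, which skips M (order[0]), so add 1 to index into order.
        # Loop again: the first remaining box (highest score) is kept and compared against the rest.

    return keep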
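
To try the procedure outside the detector class, here is a minimal standalone sketch of the same greedy loop. The function name `nms_standalone`, the `thresh` parameter (standing in for `self.nms_thresh`) and the toy boxes are assumptions made for illustration. Unlike the method above, it clamps the intersection size at `0.0` rather than `1e-28`.

```python
import numpy as np


def nms_standalone(bboxes, scores, thresh=0.5):
    """Greedy NMS sketch. bboxes: (m, 4) as [xmin, ymin, xmax, ymax]; scores: (m,)."""
    x1, y1, x2, y2 = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]        # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]                      # highest-scoring remaining box M
        keep.append(int(i))
        rest = order[1:]
        # Intersection of M with every remaining box.
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]))
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]))
        inter = w * h
        ovr = inter / (areas[i] + areas[rest] - inter)
        # Survivors are the boxes whose IoU with M does not exceed the threshold.
        order = rest[ovr <= thresh]
    return keep


# Toy data: boxes 0 and 1 overlap heavily (IoU = 81/119), box 2 is far away.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms_standalone(boxes, scores, thresh=0.5))   # [0, 2]
```

Box 1 is suppressed because its IoU with box 0 is about 0.68, which is above the threshold; box 2 does not overlap box 0 at all and is kept in the next iteration.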