Progressive Co-Attention Network for Fine-grained Visual Classification

1. Motivation

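Fine-grained visual classification aims to recognize images that belong to multiple sub-categories within the same category. It is a challenging task because highly confusable categories differ only in inherently subtle ways. Most existing methods take a single image as input, which may limit the model's ability to recognize contrastive clues across different images. In this paper, we propose an effective method called the progressive co-attention network (PCA-Net) to address this problem. Specifically, we compute channel-wise similarity by encouraging interaction between the feature channels of an image pair from the same category, so as to capture the common discriminative features. Because complementary information is also crucial for recognition, we erase the prominent regions enhanced by the channel interaction to force the network to focus on other discriminative regions. The proposed model achieves competitive results on three fine-grained visual classification benchmark datasets: CUB-200-2011, Stanford Cars, and FGVC Aircraft.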
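The two ideas in the abstract (channel interaction within a same-class pair, then erasing the most strongly activated regions) can be illustrated with a small NumPy sketch. This is not the paper's exact formulation: the cosine similarity, the softmax weighting, the residual sum, and the threshold `ratio` are all assumptions chosen for illustration, and the function names `channel_co_attention` and `erase_salient` are hypothetical. The feature maps are assumed to be non-negative (for example, taken after a ReLU).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_co_attention(f1, f2):
    """Let the channels of two same-class feature maps interact.

    f1, f2: arrays of shape (C, H, W).
    Returns the two enhanced maps and the (C, C) channel similarity matrix.
    Assumption: cosine similarity + softmax + residual sum (illustrative only).
    """
    c, h, w = f1.shape
    x1 = f1.reshape(c, -1)  # (C, H*W)
    x2 = f2.reshape(c, -1)
    # L2-normalise each channel so the dot product is a cosine similarity.
    x1n = x1 / (np.linalg.norm(x1, axis=1, keepdims=True) + 1e-8)
    x2n = x2 / (np.linalg.norm(x2, axis=1, keepdims=True) + 1e-8)
    sim = x1n @ x2n.T                      # (C, C) channel-to-channel similarity
    w12 = softmax(sim, axis=1)             # channels of image 1 attend to image 2
    w21 = softmax(sim.T, axis=1)           # channels of image 2 attend to image 1
    out1 = x1 + w12 @ x2                   # residual enhancement of image 1
    out2 = x2 + w21 @ x1                   # residual enhancement of image 2
    return out1.reshape(c, h, w), out2.reshape(c, h, w), sim

def erase_salient(f, ratio=0.5):
    """Zero out spatial positions whose mean activation is >= ratio * peak.

    Assumption: a simple channel-mean attention map and a hard threshold.
    """
    attn = f.mean(axis=0)                              # (H, W) attention map
    mask = (attn < ratio * attn.max()).astype(f.dtype)  # 0 where salient
    return f * mask, mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = np.abs(rng.normal(size=(8, 7, 7)))
    b = np.abs(rng.normal(size=(8, 7, 7)))
    ea, eb, sim = channel_co_attention(a, b)
    erased, mask = erase_salient(ea)
    print(sim.shape, erased.shape, int(mask.sum()))
```

In a training loop, the erased map would be passed to a second classification branch so that the network has to rely on regions other than the most salient one.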

Two images from the same category are fed into the network together as a pair.
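One simple way to build such pairs is to give every image a randomly chosen partner with the same label. The sketch below is only an illustration of that idea; the helper name `sample_same_class_pairs` and the random-partner strategy are assumptions, not details taken from the paper.

```python
import random
from collections import defaultdict

def sample_same_class_pairs(labels, seed=0):
    """Return (anchor_index, partner_index) pairs with identical labels.

    labels: sequence of class ids, one per image.
    Images whose class has no other member are skipped.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    pairs = []
    for idx, y in enumerate(labels):
        # Candidate partners: every other image of the same class.
        candidates = [j for j in by_class[y] if j != idx]
        if candidates:
            pairs.append((idx, rng.choice(candidates)))
    return pairs

if __name__ == "__main__":
    labels = [0, 0, 1, 1, 1, 2]
    for i, j in sample_same_class_pairs(labels):
        print(i, j, labels[i], labels[j])
```

The image with label 2 in the usage example has no same-class partner, so it does not appear as an anchor.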

2. Datasets

3. Network architecture

Ablation study of the three components: CA, AE, and center loss

(CUB-200-2011 dataset, ResNet-50 backbone)

4. Qualitative results

D. Visualizations. To further evaluate the effectiveness of our method, we apply Grad-CAM [33] to visualize images from the CUB-200-2011 dataset. Grad-CAM is formed by a weighted summation of feature maps and shows how important each region is to the classification. We compare the visualization results of our method with those of the base model (ResNet-50), as shown in Figure 3. The base model learns only the most prominent region of the image, such as the bird’s beak. Our method learns richer and more discriminative features, including the wings and claws. This is because our method distributes attention across regions, which makes the prediction more comprehensive: it focuses on the salient features and also captures subtle, fine-grained ones.
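The "weighted summation of feature maps" can be written out in a few lines. The sketch below follows the standard Grad-CAM recipe (channel weights are the spatially averaged gradients of the class score, followed by a ReLU), operating on arrays that are assumed to have been extracted from the network beforehand. The function name `grad_cam` and the final normalization to [0, 1] are illustrative choices.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a Grad-CAM heat map from precomputed arrays.

    activations: (C, H, W) feature maps of the chosen convolutional layer.
    gradients:   (C, H, W) gradients of the class score w.r.t. those maps.
    Returns an (H, W) map scaled to [0, 1] (all zeros if nothing is positive).
    """
    # One weight per channel: global average of its gradient.
    weights = gradients.mean(axis=(1, 2))                 # (C,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)      # (H, W)
    # Keep only regions that push the class score up.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

if __name__ == "__main__":
    acts = np.array([[[1.0, 0.0], [0.0, 0.0]],
                     [[0.0, 0.0], [0.0, 2.0]]])
    grads = np.stack([np.ones((2, 2)), -np.ones((2, 2))])
    print(grad_cam(acts, grads))
```

In the usage example, the second channel has a negative weight, so its activation is suppressed by the ReLU and only the top-left position remains highlighted.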

5. Experimental results (three datasets), compared with other networks such as API-Net, CIN, and MAMC.
