【ICCV 2015】Active Object Localization with Deep Reinforcement Learning


Contents

  • Active Object Localization with Deep Reinforcement Learning
  • What it does
  • How it works
  • Actions
  • State
  • Reward
  • Results



Active Object Localization with Deep Reinforcement Learning

https://arxiv.org/pdf/1511.06015.pdf

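What it does

The paper applies reinforcement learning to object localization in images. When we look at a picture on a phone, we zoom in, swipe across the screen, and so on until the object we care about is in view. The paper trains an agent to perform similar operations.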

How it works

Object localization is modeled as a Markov decision process, and the agent learns a localization policy with the DQN algorithm.
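To make the DQN idea concrete, here is a minimal sketch of the two pieces every Q-learning agent needs: epsilon-greedy action selection over the Q-values, and the one-step target the Q-network is regressed toward. The function names and the default discount `gamma=0.9` are assumptions for illustration; the summary above does not state the paper's hyperparameters.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def td_target(reward, next_q_values, terminal, gamma=0.9):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    gamma=0.9 is a hypothetical default, not a value taken from the text.
    """
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)
```

With nine actions, `q_values` would be a list of nine numbers produced by the Q-network for the current state.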

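Actions

There are nine actions in total:

(1) Eight actions transform the bounding box:

These operations are shown in Figure 2 and fall into four subsets: moving the box along the horizontal axis, moving it along the vertical axis, changing its scale, and changing its aspect ratio.

[Figure 2: the box transformation actions.]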

A box is represented by the pixel coordinates of its two corners: $b = [x_1, y_1, x_2, y_2]$. The amounts by which an action changes the box, $\alpha_w$ and $\alpha_h$, are relative to the box's current size:

$$\alpha_w = \alpha \, (x_2 - x_1), \qquad \alpha_h = \alpha \, (y_2 - y_1)$$

where $\alpha \in [0, 1]$, set to 0.2 in the experiments.

For example:

Moving the box horizontally to the right adds $\alpha_w$ to both $x_1$ and $x_2$;

Decreasing the aspect ratio subtracts $\alpha_w$ from $x_1$ and adds $\alpha_w$ to $x_2$.

Note that the origin of the image plane is at the top-left corner.
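The two examples above can be sketched as a small lookup table of coordinate changes. Only `right` and `decrease_aspect` follow the text directly; the action names themselves, and the `left`, `up`, and `down` entries (written as plain mirrors of the described move), are assumptions for illustration. The scale actions are left out because the text does not spell them out.

```python
ALPHA = 0.2  # step size relative to the box size, as set in the experiments

# Change applied to (x1, y1, x2, y2), in units of alpha_w (x) and alpha_h (y).
DELTAS = {
    "right": (1, 0, 1, 0),             # described in the text
    "decrease_aspect": (-1, 0, 1, 0),  # described in the text
    "left": (-1, 0, -1, 0),            # assumed mirror of "right"
    "down": (0, 1, 0, 1),              # assumed; y grows downward (origin top-left)
    "up": (0, -1, 0, -1),              # assumed
}

def apply_action(box, action, alpha=ALPHA):
    """Return the transformed box; box is (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = box
    aw = alpha * (x2 - x1)  # alpha_w
    ah = alpha * (y2 - y1)  # alpha_h
    dx1, dy1, dx2, dy2 = DELTAS[action]
    return (x1 + dx1 * aw, y1 + dy1 * ah, x2 + dx2 * aw, y2 + dy2 * ah)
```

Because the step is proportional to the current width and height, the same action moves a large box further than a small one.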

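(2) One action ends the search for an object:

The trigger does not transform the box; it signals that an object has been localized. Once it is executed, the current search terminates and the box is reset to its initial position to start a new search. The trigger also modifies the image: it marks the region found by the last search with a black cross. This keeps an object that has already been localized from being localized again, which is how multiple objects are handled.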
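The marking step can be sketched roughly as follows. The image is assumed to be a 2D grayscale grid (a list of rows), the box is assumed to have integer coordinates inside the image, and the exact shape and thickness of the cross are guesses for illustration; the text only says that a black cross is drawn over the region.

```python
def mark_cross(image, box, thickness=1):
    """Blacken a plus-shaped mark centered in the box, in place.

    image: list of rows of pixel intensities (hypothetical representation).
    box: (x1, y1, x2, y2) with integer coordinates inside the image.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    for y in range(y1, y2):
        for x in range(x1, x2):
            # Keep pixels on the central column or the central row of the box.
            if abs(x - cx) < thickness or abs(y - cy) < thickness:
                image[y][x] = 0
    return image
```

After the mark is drawn, features extracted from that region no longer look like the object, so a new search started from the initial box is less likely to end on it again.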

Some examples:

[Figure: example search sequences.]

State

The state is a tuple $(o, h)$, where $o$ is the feature vector of the region enclosed by the box and $h$ is a vector of the actions taken so far.

The region inside the box is expanded to include 16 pixels of context around the original box, warped to 224×224 to match the network's input size, and fed to a pretrained CNN, which outputs the 4096-dimensional feature vector $o$.

The history vector $h$ holds the last 10 actions, each encoded as a one-hot vector. With nine actions, this means $h \in \mathbb{R}^{90}$.
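A minimal sketch of how such a state vector could be assembled is shown below. The function names are hypothetical, the CNN feature is stood in for by a plain list of 4096 numbers, and the slot ordering of the history (oldest first, unused slots left at zero) is an assumption the text does not specify.

```python
NUM_ACTIONS = 9     # eight transformations plus the trigger
HISTORY_LEN = 10    # number of past actions kept in h
FEATURE_DIM = 4096  # size of the CNN feature vector o

def encode_history(past_actions):
    """Flatten one-hot codes of the last HISTORY_LEN action indices (90 values)."""
    h = [0.0] * (HISTORY_LEN * NUM_ACTIONS)
    for slot, action in enumerate(past_actions[-HISTORY_LEN:]):
        h[slot * NUM_ACTIONS + action] = 1.0
    return h

def build_state(o, past_actions):
    """Concatenate the region feature o with the history vector h."""
    if len(o) != FEATURE_DIM:
        raise ValueError("expected a 4096-dimensional feature vector")
    return list(o) + encode_history(past_actions)
```

Under these assumptions the Q-network input has 4096 + 90 = 4186 dimensions.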

Reward

Let $b$ be the current box and $g$ the ground-truth box of the target. The IoU of $b$ and $g$ is:

$$\mathrm{IoU}(b, g) = \frac{\mathrm{area}(b \cap g)}{\mathrm{area}(b \cup g)}$$
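For axis-aligned boxes in the $(x_1, y_1, x_2, y_2)$ corner format, the IoU can be computed as in the sketch below. The function name and the choice to return 0 for degenerate boxes are illustrative assumptions.

```python
def iou(b, g):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(b[0], g[0]), max(b[1], g[1])
    ix2, iy2 = min(b[2], g[2]), min(b[3], g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    area_g = (g[2] - g[0]) * (g[3] - g[1])
    union = area_b + area_g - inter
    return inter / union if union > 0 else 0.0
```

For instance, two 10×10 boxes offset horizontally by 5 pixels share an area of 50 out of a union of 150, giving an IoU of 1/3.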

Reward for the eight transformation actions

When the agent takes action $a$ and moves from state $s$ to state $s'$, the box changes from $b$ to $b'$. The agent receives the reward:

$$R_a(s, s') = \mathrm{sign}\big(\mathrm{IoU}(b', g) - \mathrm{IoU}(b, g)\big) \tag{2}$$

Intuitively, equation (2) says that the reward is $+1$ if the IoU increases from state $s$ to state $s'$, and $-1$ otherwise.
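As a sketch, the transformation reward only needs the IoU before and after the action. The function name is hypothetical, and an unchanged IoU is mapped to $-1$ here, following the "otherwise" wording above rather than a strict sign function.

```python
def move_reward(iou_before, iou_after):
    """+1 if the action improved the IoU with the target, -1 otherwise."""
    return 1.0 if iou_after > iou_before else -1.0
```

Since the reward depends only on whether the overlap improved, not on by how much, small and large improvements are rewarded equally.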

Reward for the trigger action

$$R_\omega(s, s') = \begin{cases} +\eta & \text{if } \mathrm{IoU}(b, g) \ge \tau \\ -\eta & \text{otherwise} \end{cases}$$

where $\omega$ is the trigger (terminating) action, $\eta$ is the trigger reward, set to 3.0 in the experiments, and $\tau$ is a threshold: the minimum IoU at which the detected region counts as a true positive (TP). In the experiments $\tau$ is set to 0.6.
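The trigger reward can be sketched in a few lines, using the values of $\eta$ and $\tau$ given above. The function and constant names are assumptions for illustration.

```python
ETA = 3.0  # trigger reward, as set in the experiments
TAU = 0.6  # minimum IoU for the triggered box to count as a true positive

def trigger_reward(iou_now, eta=ETA, tau=TAU):
    """+eta if the box overlaps the target well enough when triggered, else -eta."""
    return eta if iou_now >= tau else -eta
```

For example, triggering at an IoU of 0.7 yields +3.0, while triggering at 0.4 yields -3.0, so stopping too early costs more than a single poor move.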

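Results

All attended regions (AAR): every region the agent processes is scored.

Terminal regions (TR): only the regions where the agent used the trigger to indicate the presence of an object are considered.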

Precision evaluation

[Image: precision results.]

Recall evaluation

[Image: recall results.]

Figure 5 plots the distribution of the number of steps needed to correctly detect an object.

[Figure 5: distribution of the number of steps per correct detection.]

Qualitative evaluation

[Image: qualitative examples.]

Failure cases

[Image: failure examples.]

Sensitivity analysis

The characteristics evaluated are occlusion (occ), truncation (trn), size, aspect ratio (asp), object viewpoint (view), and visibility of parts (parts).

[Image: sensitivity analysis results.]
