夕颜合欢落

Loss is its own Reward: Self-Supervision for Reinforcement Learning

2022-07-18

The authors use quantities such as the action, reward, and state as labels for supervised training.
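
As a rough illustration of that idea, the sketch below turns logged transitions into three labeled datasets whose targets are the reward, the action, and the next state. All names here (`Transition`, `reward_label`, `build_supervised_sets`) are hypothetical, and the three-way reward binning and the input/label pairings are assumptions made for this example. They are not taken from the note and are not necessarily the paper's exact formulation.

```python
from typing import List, NamedTuple, Tuple

State = Tuple[float, ...]


class Transition(NamedTuple):
    """One logged environment step (hypothetical container for this sketch)."""
    state: State
    action: int
    reward: float
    next_state: State


def reward_label(reward: float) -> int:
    """Bin a scalar reward into a class: 0 = negative, 1 = zero, 2 = positive.

    The binning scheme is an assumption made for illustration.
    """
    if reward < 0:
        return 0
    if reward > 0:
        return 2
    return 1


def build_supervised_sets(transitions: List[Transition]):
    """Build (input, label) pairs where the label comes from the RL data itself."""
    # Reward as label: predict the reward class from the current state.
    reward_set = [(t.state, reward_label(t.reward)) for t in transitions]
    # Action as label: predict which action led from state to next_state.
    action_set = [((t.state, t.next_state), t.action) for t in transitions]
    # State as label: predict the next state from the state and action.
    state_set = [((t.state, t.action), t.next_state) for t in transitions]
    return reward_set, action_set, state_set


# A tiny hand-written trajectory, purely illustrative.
trajectory = [
    Transition((0.0, 0.0), 1, 0.0, (0.1, 0.0)),
    Transition((0.1, 0.0), 0, -1.0, (0.0, 0.0)),
    Transition((0.0, 0.0), 1, 1.0, (0.1, 0.1)),
]

reward_set, action_set, state_set = build_supervised_sets(trajectory)
```

Each resulting list can be fed to an ordinary supervised learner. No human annotation is needed, because every label is already produced by interacting with the environment.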

 

黄世宇 (Shiyu Huang)'s personal page: https://huangshiyu13.github.io/


