Qisen Yang's Homepage
Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning
The commonly used action matching principle may lead to irrelevant or misplaced feature attribution when different DNNs’ outputs lead to the same rewards or different rewards result from the same outputs.
Qisen Yang, Huanqian Wang, Mukun Tong, Wenjie Shi, Gao Huang, Shiji Song
Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance
Offline reinforcement learning (RL) optimizes the policy on a previously collected dataset without any interactions with the …
Qisen Yang, Shenzhi Wang, Qihang Zhang, Gao Huang, Shiji Song