Estimation of Reward Function Maximizing Learning Efficiency in Inverse Reinforcement Learning (逆強化学習における学習効率を最大化する報酬関数の推定)
Category: Journal (individual paper)
Group: [C] Electronics, Information and Systems Society
Publication date: 2018/06/01
Title (English): Estimation of reward function maximizing learning efficiency in inverse reinforcement learning
Authors: 北里 勇樹 (Faculty of Engineering, Chiba University), 荒井 幸代 (Faculty of Engineering, Chiba University)
Authors (English): Yuki Kitazato (Graduate School & Faculty of Engineering, Chiba University), Sachiyo Arai (Graduate School & Faculty of Engineering, Chiba University)
Keywords: inverse reinforcement learning (逆強化学習), learning efficiency (学習効率)
Abstract (English): Inverse Reinforcement Learning (IRL) is a promising framework for estimating a reward function from the given behavior of an expert. However, the IRL problem is ill-posed in that several reward functions can reproduce the expert's behavior. Previous IRL studies have focused only on how well the expert's original behavior is reproduced when selecting the most appropriate reward function. This evaluation measure alone does not seem sufficient to narrow down the candidate reward functions. To select the most appropriate one from the alternative reward functions, we introduce another objective function into the existing IRL algorithms of Ng et al. Specifically, we use learning efficiency as an additional objective function, so that RL converges faster, by introducing a Genetic Algorithm. Consequently, our proposed IRL algorithm is guaranteed to output a reward function by which the agent acquires a policy that is both effective and optimal. We show the effectiveness of our approach by comparing the performance of the proposed method with that of previous algorithms.
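Journal: IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C) Vol.138 No.6 (2018), Special Issue: New Developments in Clinical and Nursing Care Monitoring Research
Pages: 720-727
Manuscript type: Paper / Japanese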
Link to electronic version: https://www.jstage.jst.go.jp/article/ieejeiss/138/6/138_720/_article/-char/ja/
