Estimation of reward function maximizing learning efficiency in inverse reinforcement learning

Category: Journal (individual paper)

Group: [C] Electronics, Information and Systems Society

Publication date: 2018/06/01

Title (English): Estimation of reward function maximizing learning efficiency in inverse reinforcement learning

Authors: Yuki Kitazato (Faculty of Engineering, Chiba University), Sachiyo Arai (Faculty of Engineering, Chiba University)

Authors (English): Yuki Kitazato (Graduate School & Faculty of Engineering, Chiba University), Sachiyo Arai (Graduate School & Faculty of Engineering, Chiba University)

Keywords: inverse reinforcement learning, learning efficiency

Abstract (English): Inverse Reinforcement Learning (IRL) is a promising framework for estimating a reward function from given expert behavior. However, the IRL problem is ill-posed in that several reward functions can reproduce the expert's behavior. Previous IRL studies have selected the most appropriate reward function solely by how well it reproduces the expert's original behavior. This evaluation measure does not seem sufficient to narrow down the candidate reward functions. To select the most appropriate one among the alternatives, we introduce another objective function into the existing IRL algorithms of Ng et al. Specifically, we add learning efficiency as an objective, using a Genetic Algorithm to find reward functions under which RL converges faster. Consequently, the proposed IRL algorithm is guaranteed to output a reward function with which the agent acquires a policy that is both effective and optimal. We show the effectiveness of our approach by comparing the performance of the proposed method with that of previous algorithms.

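Journal: IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C), Vol.138 No.6 (2018), Special Issue: New Developments in Clinical and Nursing Care Monitoring Research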

Pages in journal: 720-727

Manuscript type: Paper / Japanese

Link to electronic version: https://www.jstage.jst.go.jp/article/ieejeiss/138/6/138_720/_article/-char/ja/

Sales type: Printed booklet (regular price 770 yen / member price 550 yen)

Book size: A4

Number of pages: 8
