Adversarial Inverse Reinforcement Learning to Estimate Policies from Multiple Experts (複数のエキスパートから方策推定を行う敵対的逆強化学習)
Category: Journal (individual paper)
Group: [C] Electronics, Information and Systems Society
Publication date: 2021/12/01
Title (English): Adversarial Inverse Reinforcement Learning to Estimate Policies from Multiple Experts
Authors: Kodai Yamashita (山下 廣大; Graduate School of Engineering Science, Yokohama National University), Tomoki Hamagami (濱上 知樹; Faculty of Engineering, Yokohama National University)
Keywords: inverse reinforcement learning, imitation learning, adversarial inverse reinforcement learning
Abstract (English): Inverse reinforcement learning is applied to complex control tasks by learning from experts. However, because the learning results depend on the expert, it cannot imitate policies that the experts did not provide when multiple optimal policies exist for the same goal, or when the environment differs from the one used in training. These problems can be addressed by providing multiple experts and representing their features in a latent space. The proposed method extends information maximizing generative adversarial imitation learning with adversarial inverse reinforcement learning to handle such environments. Experiments show that the proposed method can not only imitate multiple experts but also estimate policies that were not given.
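Journal: IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C), Vol. 141, No. 12 (2021). Special Issue I: Tokai-Section Joint Conference on Electrical, Electronics, Information, and Related Engineering; Special Issue II: Outstanding Papers from Technical Meetings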
Pages: 1405-1410
Manuscript type: Paper / Japanese
Link to electronic version: https://www.jstage.jst.go.jp/article/ieejeiss/141/12/141_1405/_article/-char/ja/
