A Rationally Oriented Forgettable Profit Sharing (合理的な忘却型Profit Sharing強化学習法)
Category: Journal (individual paper)
Group: [C] Electronics, Information and Systems Division
Publication date: 2012/03/01
Title (English): A Rationally Oriented Forgettable Profit Sharing
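Authors: 幸若 完壮 (Graduate School of Information Science and Technology, Hokkaido University), 渡辺 浩太 (Graduate School of Information Science and Technology, Hokkaido University), 五十嵐 一 (Graduate School of Information Science and Technology, Hokkaido University)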
Authors (English): Sadamori Koujaku (Graduate School of Information Science and Technology, Hokkaido University), Kota Watanabe (Graduate School of Information Science and Technology, Hokkaido University), Hajime Igarashi (Graduate School of Information Science and Technology, Hokkaido University)
Keywords: Reinforcement Learning, Profit Sharing, Miyazaki Rational Theorem
Abstract (English): This paper proposes the Rationally oriented Forgettable Profit Sharing (RFPS) method for reinforcement learning. Although Profit Sharing (PS) performs well in real environments, its learning is often slow in long-term tasks because it is difficult to determine an adequate discount rate that satisfies the Miyazaki rational theorem. Several rationality-relaxed PS methods work well for such tasks; however, these methods may result in many irrational loops. The proposed method satisfies rationality by forgetting the reinforced irrational loops. It can be easily combined with ordinary PS methods and performs well in long-term tasks. Simulation results show that the proposed method learns more efficiently than conventional PS methods.
Journal: IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C), Vol.132 No.3 (2012), Special issue: Energy Harvesting and Wireless Power Transmission
Pages: 448-454
Manuscript type: Paper / Japanese
Link to electronic version: https://www.jstage.jst.go.jp/article/ieejeiss/132/3/132_3_448/_article/-char/ja/
