Reinforcement Learning for Environment with Cyclic Reward Depending on the Time (報酬が周期的に変化する環境のための強化学習)
Category: Journal paper (individual article)
Group: [C] Electronics, Information and Systems Society
Publication date: 2014/09/01
Title (English): Reinforcement Learning for Environment with Cyclic Reward Depending on the Time
Authors: Takeshi Shibuya (澁谷 長史) (Faculty of Engineering, Information and Systems, University of Tsukuba), Seiji Yasunobu (安信 誠二) (Faculty of Engineering, Information and Systems, University of Tsukuba)
Keywords: reinforcement learning, environment with cyclic reward depending on the time, superposing sinusoidal action-value function, cyclic action-value function
Abstract (English): This paper proposes a new reinforcement learning method for constructing agents in environments whose reward changes cyclically with time. The proposed method consists of two parts: (a) a cyclic action-value function obtained by superposing sinusoidal action-value functions in phasor representation, and (b) an algorithm that uses it. Reinforcement learning is a widely used framework for developing agents that can decide on suitable actions. However, it enables the agent to learn suitable actions only in stationary environments. In contrast to conventional methods, the proposed reinforcement learning method can be applied to learning in environments with cyclic reward depending on the time. Experimental results show that the proposed method performs much better than conventional methods.
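Journal: IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C) Vol.134 No.9 (2014). Special Issue I: New Developments in Adaptation, Learning, Identification, and Modeling in Control System Design. Special Issue II: Intelligent Systems
Pages in journal: 1325-1332
Manuscript type: Paper / Japanese
Link to electronic version: https://www.jstage.jst.go.jp/article/ieejeiss/134/9/134_1325/_article/-char/ja/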
