Proposal of the Expected Failure Probability Propagation Algorithm (EFPA) and Verification of Its Effectiveness in Multi-agent Environments
Category: Transactions (individual paper)
Group: [C] Electronics, Information and Systems Society
Publication date: 2016/03/01
Title (English): Proposal of a Propagation Algorithm of the Expected Failure Probability and the Effectiveness on Multi-agent Environments
Authors: 村岡 宏紀 / Hiroki Muraoka (Engine Engineering Div., Hino Motors, Ltd.), 宮崎 和光 / Kazuteru Miyazaki (Research Department, National Institution for Academic Degrees and University Evaluation), 小林 博明 / Hiroaki Kobayashi (Department of Mechanical Engineering Informatics, Meiji University)
Keywords: Exploitation-oriented Learning, Reinforcement Learning, Multi-agent Learning, Concurrent Learning Problem
Abstract (English): The Improved Penalty Avoiding Rational Policy Making algorithm (IPARP) is known to learn policies from a reward and a penalty. IPARP aims to identify penalty rules, that is, rules that have a high probability of receiving a penalty. Although IPARP is effective in many cases, it needs many trial-and-error searches due to memory constraints. In this paper, we propose a method called the Expected Failure Probability Algorithm (EFPA) to speed it up. In addition, we extend EFPA to multi-agent environments. In multi-agent learning, it is important to avoid the concurrent learning problem, which occurs when multiple agents learn simultaneously. We also propose a method to avoid this problem and confirm its effectiveness by numerical experiments.
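Journal: IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C), Vol. 136, No. 3 (2016), Special Issue: System Innovation Opened Up by Machine Learning
Pages: 273-281
Manuscript type: Paper / Japanese
Link to electronic version: https://www.jstage.jst.go.jp/article/ieejeiss/136/3/136_273/_article/-char/ja/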
