{"product_id":"ieej-rc13608-025","title":"エージェントの行動履歴を活用したQ-learningアルゴリズムの提案","description":"\u003cp\u003e\u003cstrong\u003eCategory: \u003c\/strong\u003eJournal (individual paper)\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eGroup: \u003c\/strong\u003e[C] Electronics, Information and Systems Society\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003ePublication date: \u003c\/strong\u003e2016\/08\/01\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eTitle (English): \u003c\/strong\u003eImproving Q-learning by Using the Agent's Action History\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eAuthors: \u003c\/strong\u003e齋藤　雅矩（神奈川大学大学院 工学研究科 経営工学専攻），瀬古沢　照治（神奈川大学 工学部 情報システム創成学科）\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eAuthors (English): \u003c\/strong\u003eMasanori Saito (Graduate School of Engineering, Kanagawa University), Teruji Sekozawa (Dept. Information Systems Creation, Faculty of Engineering, Kanagawa University)\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eKeywords: \u003c\/strong\u003e機械学習，強化学習，Q-learning，行動履歴，行動選択，タブーサーチ　　machine learning, reinforcement learning, Q-learning, action history, action selection, tabu search\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eAbstract (English): \u003c\/strong\u003eQ-learning learns the optimal policy by updating the state-action value function (Q-value) through trial-and-error search so as to maximize the expected reward. However, a major issue is its slow learning speed. We therefore add a technique in which the agent memorizes environmental information and uses it to update the Q-values of many states. Updating the Q-values of many states gives the agent more information and thereby reduces learning time. Furthermore, by incorporating the stored environmental information into the action selection method so that the agent avoids failure behavior, such as actions that cause learning to stagnate, we improve the learning speed in the initial stage of learning. In addition, we design a new action area value function in order to search many more states from the initial stage of learning. 
Finally, numerical examples on a maze problem demonstrate the usefulness of the proposed method.\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eJournal issue: \u003c\/strong\u003e\u003ca href=\"\/products\/ieej-rc13608\"\u003eIEEJ Transactions on Electronics, Information and Systems, Vol.136 No.8 (2016), Special Issue I: Perceptual Information Technology in Cooperation with the Intelligent Mechatronics Field; Special Issue II: International Conference ICESS 2015\u003c\/a\u003e\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003ePages in issue: \u003c\/strong\u003e1209-1217\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eManuscript type: \u003c\/strong\u003ePaper \/ Japanese\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eLink to electronic edition: \u003c\/strong\u003e\u003ca target=\"_blank\" href=\"https:\/\/www.jstage.jst.go.jp\/article\/ieejeiss\/136\/8\/136_1209\/_article\/-char\/ja\/\"\u003ehttps:\/\/www.jstage.jst.go.jp\/article\/ieejeiss\/136\/8\/136_1209\/_article\/-char\/ja\/\u003c\/a\u003e\u003c\/p\u003e","brand":"IEEJ-P10","offers":[{"title":"Printed booklet (regular price 770 yen\/member price 550 yen) \/ A4 \/ 9","offer_id":46349953237231,"sku":"IEEJ-RC13608-025-PRT","price":770.0,"currency_code":"JPY","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0718\/9512\/2159\/files\/IEEJ-RC13608_72ffcb54-5c1d-4898-9ed1-8f9f7bc80a80.png?v=1743166092","url":"https:\/\/ieej.bookpark.ne.jp\/products\/ieej-rc13608-025","provider":"電気学会 電子図書館","version":"1.0","type":"link"}