REINFORCED LEARNING METHOD, REINFORCED LEARNING DEVICE, AND REINFORCED LEARNING PROGRAM FOR EFFICIENT LEARNING

Document Type and Number:

Japanese Patent JP2020166795

Kind Code:

A

Abstract:

PROBLEM TO BE SOLVED: To provide a reinforcement learning method that learns representations specialized for states in which an attention situation, such as an environment reset, occurs, so that learning can be performed efficiently.

SOLUTION: A reinforcement learning method optimizes the behavior policy of an agent from the results learned by a learning device that learns based on states observed from environmental data. The method includes determining whether a state in which a preset attention situation has occurred is observed while learning the environmental data of one episode. When such a state is observed, a feature extractor (first learning device) performs representation learning using two pieces of environmental data: the environmental data of the state in which the attention situation occurred and the environmental data of the immediately preceding time step. A state classifier (second learning device) learns the difference between the resulting pieces of feature data, and the parameters of the first learning device and the second learning device are updated based on the estimated output data and the real data.

SELECTED DRAWING: Figure 3
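The update flow described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation; it only mirrors the stated control flow under several assumptions that do not come from the source: the feature extractor is a plain linear map `W`, the state classifier is a logistic regression (`v`, `b`) on the difference between the two feature vectors, the "real data" is a hypothetical binary label attached to each attention event, and the episode is a hypothetical list of `(state, attention_flag, target)` tuples.

```python
import math


def extract(W, state):
    # Feature extractor (first learning device); a linear map is assumed here.
    return [sum(w * x for w, x in zip(row, state)) for row in W]


def classify(params, prev_state, state):
    # State classifier (second learning device); logistic regression on the
    # difference between the two feature vectors (an assumed model choice).
    f_prev = extract(params["W"], prev_state)
    f_now = extract(params["W"], state)
    diff = [a - c for a, c in zip(f_now, f_prev)]
    z = sum(vi * di for vi, di in zip(params["v"], diff)) + params["b"]
    return 1.0 / (1.0 + math.exp(-z)), diff


def train_on_episode(episode, params, lr=0.1):
    """Run one pass over an episode and return the number of updates made.

    episode: list of (state, attention_flag, target) tuples (hypothetical format).
    Parameters are updated only at steps where the attention flag is set.
    """
    updates = 0
    for t in range(1, len(episode)):
        state, attention, target = episode[t]
        if not attention:
            continue  # no attention situation observed at this step
        prev_state = episode[t - 1][0]  # environment data one time step earlier
        p, diff = classify(params, prev_state, state)
        err = p - target  # gradient of the cross-entropy loss w.r.t. the logit
        delta = [a - c for a, c in zip(state, prev_state)]
        # Update the feature extractor first, using the current classifier weights.
        for i, row in enumerate(params["W"]):
            for j in range(len(row)):
                row[j] -= lr * err * params["v"][i] * delta[j]
        # Then update the classifier, using the feature difference computed above.
        params["v"] = [vi - lr * err * di for vi, di in zip(params["v"], diff)]
        params["b"] -= lr * err
        updates += 1
    return updates


# Toy usage: two attention events with opposite state changes and opposite labels.
params = {"W": [[0.1, -0.1], [0.05, 0.2]], "v": [0.1, 0.1], "b": 0.0}
episode = [
    ([0.0, 0.0], False, None),
    ([1.0, 0.0], True, 1.0),
    ([1.0, 0.0], False, None),
    ([0.0, 0.0], True, 0.0),
]
for _ in range(200):
    train_on_episode(episode, params)
```

In this sketch both learners are trained jointly from the same prediction error, which is one simple way to read "parameters of the first learning device and the second learning device are updated based on estimated output data and real data"; the actual loss and network structure are not given in the abstract.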

Inventors:

MATSUBARA TAKASHI
UEHARA KUNIAKI
ZENG XIAO
NOMOTO YOICHI

Application Number:

JP2019069533A

Publication Date:

October 08, 2020

Filing Date:

March 31, 2019

Assignee:

UNIV KOBE
EQUOS RES CO LTD

International Classes:

G06N20/00; G06N3/04

Attorney, Agent or Firm:

Global intellectual property corporation
