Title:
REINFORCEMENT LEARNING DEVICE, REINFORCEMENT LEARNING METHOD, AND RECORDING MEDIUM
Document Type and Number:
WIPO Patent Application WO/2024/047829
Kind Code:
A1
Abstract:
In order to provide reinforcement learning that is suitable when using a single estimation amount, a reinforcement learning device (1) comprises: a setting unit (11) that sets an initial value of a risk-corrected reward; a calculation unit (12) that calculates a correction coefficient; an update unit (13) that updates the risk-corrected reward; an estimation unit (14) that estimates the risk-corrected reward by means of the process of calculating the correction coefficient and the process of updating the risk-corrected reward; and a learning unit (15) that performs learning based on the estimated risk-corrected reward.

Inventors:
SATO NATSUHIKO (JP)
YOSHIDA HIROSHI (JP)
Application Number:
PCT/JP2022/032905
Publication Date:
March 07, 2024
Filing Date:
September 01, 2022
Assignee:
NEC CORP (JP)
International Classes:
G06N20/00
Foreign References:
US20210232970A1, 2021-07-29
JP2014130520A, 2014-07-10
Other References:
IDO GREENBERG; YINLAM CHOW; MOHAMMAD GHAVAMZADEH; SHIE MANNOR: "Efficient Risk-Averse Reinforcement Learning", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 10 May 2022 (2022-05-10), XP091224319
Attorney, Agent or Firm:
HARAKENZO WORLD PATENT & TRADEMARK (JP)