REINFORCEMENT LEARNING DEVICE, REINFORCEMENT LEARNING METHOD, AND RECORDING MEDIUM

Document Type and Number:

WIPO Patent Application WO/2024/047829

Kind Code:

A1

Abstract:

To provide reinforcement learning suited to the use of a single estimation amount, a reinforcement learning device (1) comprises: a setting unit (11) that sets an initial value of a risk-corrected reward; a calculation unit (12) that calculates a correction coefficient; an update unit (13) that updates the risk-corrected reward; an estimation unit (14) that estimates the risk-corrected reward through the process of calculating the correction coefficient and the process of updating the risk-corrected reward; and a learning unit (15) that performs learning on the basis of the estimated risk-corrected reward.
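
The flow of units (11) to (14) can be pictured with a minimal Python sketch. The abstract does not disclose the formulas, so everything concrete here is an assumption made for illustration: the function names, the step-size update rule, the repeated sweeps over a fixed reward sample, and the asymmetric `downside_weight` coefficient are all hypothetical and are not taken from the application.

```python
from typing import Callable, Sequence


def estimate_risk_corrected_reward(
    rewards: Sequence[float],
    initial_value: float,
    correction_coefficient: Callable[[float, float], float],
    step_size: float = 0.1,
    n_sweeps: int = 50,
) -> float:
    """Alternate coefficient calculation and update (hypothetical rule)."""
    estimate = initial_value  # setting unit (11): initial value
    for _ in range(n_sweeps):
        for r in rewards:
            # calculation unit (12): coefficient from reward and current estimate
            c = correction_coefficient(r, estimate)
            # update unit (13): assumed step-size update toward the reward
            estimate += step_size * c * (r - estimate)
    # estimation unit (14): value that a learner, unit (15), would consume
    return estimate


def downside_weight(reward: float, estimate: float) -> float:
    """Hypothetical coefficient: shortfalls count more than gains."""
    return 2.0 if reward < estimate else 0.5
```

With a constant coefficient of 1.0 the estimate settles near the sample mean, while the asymmetric coefficient pulls it toward the low-reward end, which is one simple way a single scalar estimate can carry a risk adjustment.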

Inventors:

SATO NATSUHIKO (JP)
YOSHIDA HIROSHI (JP)

Application Number:

PCT/JP2022/032905

Publication Date:

March 07, 2024

Filing Date:

September 01, 2022

Assignee:

NEC CORP (JP)

International Classes:

G06N20/00

Foreign References:

US20210232970A1	2021-07-29
JP2014130520A	2014-07-10

Other References:

IDO GREENBERG; YINLAM CHOW; MOHAMMAD GHAVAMZADEH; SHIE MANNOR: "Efficient Risk-Averse Reinforcement Learning", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 10 May 2022 (2022-05-10), XP091224319

Attorney, Agent or Firm:

HARAKENZO WORLD PATENT & TRADEMARK (JP)
