Title:
Training machine learning models on a large-scale distributed system using a job server
Document Type and Number:
Japanese Patent JP6894532
Kind Code:
B2
Abstract:
A computer system for training machine learning models includes a job server and a plurality of compute nodes. The job server receives jobs for training machine learning models and allocates these training jobs to groups of one or more compute nodes. The allocation is based on the current requirements of the training jobs and the current status of the compute nodes. The training jobs include updating values for the parameters (e.g., weights and biases) of the machine learning models. Preferably, the compute nodes in the training group communicate the updated values of the parameters among themselves in order to complete the training job.
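
The allocation step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the class names (ComputeNode, TrainingJob, JobServer), the fields (free_gpus, busy, nodes_needed, gpus_per_node), and the first-fit selection rule are all hypothetical, chosen only to show a job server matching a job's current requirements against the current status of the compute nodes.

```python
from dataclasses import dataclass


@dataclass
class ComputeNode:
    # Hypothetical node status: the abstract only says "current status".
    node_id: str
    free_gpus: int
    busy: bool = False


@dataclass
class TrainingJob:
    # Hypothetical job requirements: the abstract only says "current requirements".
    job_id: str
    nodes_needed: int
    gpus_per_node: int


class JobServer:
    """Assigns each training job to a group of one or more compute nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.assignments = {}  # job_id -> list of node_ids in the training group

    def allocate(self, job):
        # Assumed rule: a node is eligible if it is idle and has enough free GPUs.
        eligible = [
            n for n in self.nodes
            if not n.busy and n.free_gpus >= job.gpus_per_node
        ]
        if len(eligible) < job.nodes_needed:
            return None  # requirements cannot be met right now
        group = eligible[: job.nodes_needed]  # first-fit choice, an assumption
        for n in group:
            n.busy = True
        self.assignments[job.job_id] = [n.node_id for n in group]
        return self.assignments[job.job_id]

    def release(self, job_id):
        # Mark the group's nodes idle again once the training job completes.
        released = set(self.assignments.pop(job_id, []))
        for n in self.nodes:
            if n.node_id in released:
                n.busy = False


server = JobServer([
    ComputeNode("n0", free_gpus=4),
    ComputeNode("n1", free_gpus=2),
    ComputeNode("n2", free_gpus=4),
])
first_group = server.allocate(TrainingJob("job-a", nodes_needed=2, gpus_per_node=4))
blocked = server.allocate(TrainingJob("job-b", nodes_needed=2, gpus_per_node=2))
server.release("job-a")
second_group = server.allocate(TrainingJob("job-b", nodes_needed=2, gpus_per_node=2))
```

In this sketch the second job is refused while the first holds two of the three nodes and succeeds once they are released. The exchange of updated parameter values among the nodes of a group is left out, since the abstract does not specify how it is done.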

Inventors:
Shin Chen
Hua Zhou
Dong Yang Wan
Application Number:
JP2019558354A
Publication Date:
June 30, 2021
Filing Date:
April 13, 2018
Assignee:
Midea Group Co., Ltd.
International Classes:
G06N20/00; G06N20/20; G06F9/50
Domestic Patent References:
JP2013228859A
JP2007533034A
Foreign References:
US20130290223
US20150379424
Other References:
Martin Abadi, et al., TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, [online], March 16, 2016, [retrieved December 11, 2020], Internet,
Attorney, Agent or Firm:
Shu Oikawa
Takahashi Fumio
Akiko Hashiguchi
Junichi Kobayashi
Kajii Yoshinori