Title:
DECREASED QUANTIZATION LATENCY
Document Type and Number:
WIPO Patent Application WO/2022/155890
Kind Code:
A1
Abstract:
Systems and techniques are described herein for decreasing quantization latency. In some aspects, a process includes determining a first integer data type of data that at least one layer of a neural network is configured to process, and determining a second integer data type of data received for processing by the neural network. The second integer data type can be different from the first integer data type. The process further includes determining a ratio between a first size of the first integer data type and a second size of the second integer data type, and scaling parameters of the at least one layer of the neural network using a scaling factor corresponding to the ratio. The process further includes quantizing the scaled parameters of the neural network, and inputting the received data to the neural network with the quantized and scaled parameters.
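
The steps in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation, and it rests on several assumptions that the abstract does not state: the "size" of an integer data type is taken to be its number of representable levels (2 to the power of its bit depth), the layer is a single dot product, and quantization is uniform rounding to a fixed step. All function names and values are hypothetical. Under these assumptions, folding the ratio into the layer parameters once gives the same output as converting every input sample to the layer's data type.

```python
# Hypothetical sketch: fold the data-type ratio into the layer parameters once,
# rather than rescaling every incoming sample. Assumes "size" = 2**bit_depth.

def scaling_factor(layer_bits: int, input_bits: int) -> float:
    """Ratio between the sizes of the first (layer) and second (input) types."""
    return float(2 ** layer_bits) / float(2 ** input_bits)

def scale_parameters(weights, factor):
    """Scale the layer parameters by the factor derived from the ratio."""
    return [w * factor for w in weights]

def quantize(weights, step):
    """Uniform quantization (assumed scheme): nearest integer multiple of step."""
    return [round(w / step) for w in weights]

def layer_output(inputs, q_weights, step):
    """Dot product on integer data, dequantized by the step size."""
    return sum(x * q for x, q in zip(inputs, q_weights)) * step

LAYER_BITS = 10        # first integer data type (what the layer expects)
INPUT_BITS = 8         # second integer data type (what is actually received)
STEP = 1.0 / 64.0      # illustrative quantization step
weights = [0.5, -0.25, 0.125]
received = [200, 17, 255]   # 8-bit samples

# Scale, then quantize, then feed the received data in unchanged.
factor = scaling_factor(LAYER_BITS, INPUT_BITS)
q_scaled = quantize(scale_parameters(weights, factor), STEP)
result = layer_output(received, q_scaled, STEP)

# Reference path for comparison: convert each sample to 10-bit instead.
converted = [int(x * factor) for x in received]
reference = layer_output(converted, quantize(weights, STEP), STEP)

print(factor, q_scaled, result, reference)
```

In this sketch the parameter scaling is done once, whereas the reference path touches every input sample, which is one plausible reading of where the latency saving comes from.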

Inventors:
ZHANG WENHAO (US)
LI ZHIGUO (US)
LIN RONGHUI (US)
PANG ZHIPING (US)
Application Number:
PCT/CN2021/073299
Publication Date:
July 28, 2022
Filing Date:
January 22, 2021
Assignee:
QUALCOMM INC (US)
ZHANG WENHAO (CN)
LI ZHIGUO (CN)
LIN RONGHUI (CN)
PANG ZHIPING (CN)
International Classes:
H04N19/124; G06N3/02
Foreign References:
US20200302299A1 2020-09-24
CN111126557A 2020-05-08
US20160328647A1 2016-11-10
Attorney, Agent or Firm:
LIU, SHEN & ASSOCIATES (CN)