PURPOSE: To train a neural network efficiently and at high speed.
CONSTITUTION: For the hierarchical neural network NW, a first learning means 1 computes the mean of the samples in each category as an average vector. It then sets the connection weights between units in the front stage of the hierarchy to the components of a matrix that converts the average vectors into orthonormal basis vectors, which completes learning in the front stage. A second learning means 2 carries out learning in the rear stage using the least squares method or the steepest descent method. In other words, learning in the front stage of the hierarchy is based on the sample averages obtained for the respective categories. Consequently, even if the backpropagation method is used in the rear stage of the hierarchy, for example, learning proceeds without propagating the output-layer error backward to the input layer, so learning is faster than with a conventional method.
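
The two-stage procedure described above can be illustrated with a minimal sketch in Python with NumPy. It is not the patented implementation. It rests on several assumptions that the abstract does not state: the front-stage matrix is taken to be the pseudo-inverse of the matrix of average vectors (one possible matrix that maps each average vector to a standard basis vector), the hidden units are linear, the rear stage is fitted by least squares against one-hot targets, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def train_front_stage(X, y, n_classes):
    """Front stage: derive weights from per-category average vectors."""
    # Columns of M are the average vectors, one per category (shape d x K).
    M = np.stack([X[y == k].mean(axis=0) for k in range(n_classes)], axis=1)
    # Assumption: use the pseudo-inverse so that W @ M is the identity,
    # i.e. each average vector is mapped to a standard basis vector.
    # This holds when the average vectors are linearly independent.
    W = np.linalg.pinv(M)
    return W, M

def train_rear_stage(H, y, n_classes):
    """Rear stage: least-squares fit from hidden outputs to one-hot targets."""
    T = np.eye(n_classes)[y]
    V, *_ = np.linalg.lstsq(H, T, rcond=None)
    return V

def predict(X, W, V):
    # Linear hidden layer (an assumption), then linear output layer.
    return np.argmax((X @ W.T) @ V, axis=1)

# Hypothetical synthetic data: three well-separated categories in 3-D.
rng = np.random.default_rng(0)
centers = np.array([[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]])
y = np.repeat(np.arange(3), 50)
X = centers[y] + 0.3 * rng.standard_normal((150, 3))

W, M = train_front_stage(X, y, 3)   # no error signal reaches this stage
H = X @ W.T                         # front-stage outputs
V = train_rear_stage(H, y, 3)       # only the rear stage is fitted to targets
accuracy = np.mean(predict(X, W, V) == y)
```

In this sketch the front-stage weights are fixed in closed form from the category averages, so only the rear-stage weights depend on the output error, which mirrors the claim that no error needs to be propagated back to the input layer.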