PURPOSE: To obtain the rule for voice decision automatically and to detect voice with a simple configuration, by deciding whether or not a section is voice according to the proportion of its frames that are decided to be voice.
CONSTITUTION: A feature extraction part 1 extracts, at constant time intervals (frames), plural feature quantities that characterize voice from an input signal. A neural network 2, whose coupling coefficients are determined in advance by learning from the feature quantities of many vowels and non-voice data, decides whether each frame is voice or not. A high-power part of the signal is then taken as a voice section candidate, and the voice/non-voice decision for that candidate is made from the proportion of its frames that were decided to be voice. Consequently, the rules and decision conditions for deciding voice from plural feature quantities need not be determined by trial and error, and whether the input signal is voice can be decided accurately with a simple configuration.
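
The section-level decision described above can be sketched as follows. This is only an illustration under stated assumptions: the per-frame voice decisions are taken as given (in the abstract they come from the neural network 2), and the function names, the power threshold, and the rate threshold are hypothetical, since the abstract does not specify them.

```python
from typing import List, Tuple


def candidate_sections(power: List[float],
                       power_threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, of high-power runs.

    Assumption: a "large-power part" is a run of consecutive frames whose
    power is at or above power_threshold.
    """
    sections = []
    start = None
    for i, p in enumerate(power):
        if p >= power_threshold:
            if start is None:
                start = i          # a new candidate begins here
        elif start is not None:
            sections.append((start, i))
            start = None
    if start is not None:          # candidate runs to the end of the signal
        sections.append((start, len(power)))
    return sections


def classify_sections(power: List[float],
                      frame_is_voice: List[bool],
                      power_threshold: float,
                      rate_threshold: float) -> List[Tuple[int, int, bool]]:
    """Label each candidate section as voice when the proportion of its
    frames decided to be voice reaches rate_threshold (hypothetical value)."""
    results = []
    for start, end in candidate_sections(power, power_threshold):
        voiced = sum(1 for v in frame_is_voice[start:end] if v)
        rate = voiced / (end - start)
        results.append((start, end, rate >= rate_threshold))
    return results


# Illustrative data only: 8 frames of power and per-frame decisions.
power = [0.1, 0.9, 0.8, 0.9, 0.1, 0.7, 0.9, 0.1]
frame_is_voice = [False, True, True, False, False, False, False, True]
print(classify_sections(power, frame_is_voice, 0.5, 0.5))
# → [(1, 4, True), (5, 7, False)]
```

In this example the first candidate has two of three frames decided to be voice, so it is accepted, while the second has none and is rejected. The voice frame at index 7 is ignored because it lies outside any high-power candidate.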