PURPOSE: To perform accurate segmentation easily by determining the varying section and the stationary section of a phoneme from the moving average of the varying energy.
CONSTITUTION: Speech data S(n) inputted from an input terminal 1 is analyzed by a glottis analysis part 2 using linear prediction with a maximum prediction order of 10, and its prediction error is analyzed by a vocal cord analysis part 3 using linear prediction with a maximum prediction order of 10, to extract a time series of speech feature vectors X(n). A projection arithmetic part 4 then generates two-dimensional projection vectors Y(n) from them. The projection vectors Y(n) are supplied to a varying energy calculation part 5, which calculates the varying energy e(n) and finds its moving average E(n). This moving average E(n) is inputted to a threshold value circuit, which determines a section where E(n) does not exceed a threshold value Th to be a stationary phoneme section and a section where E(n) exceeds Th to be a nonstationary phoneme section.
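The last three steps (varying energy, moving average, threshold decision) can be illustrated with a minimal sketch. The abstract does not define how e(n) is computed or how long the averaging window is, so the following are assumptions made only for illustration: e(n) is taken as the squared Euclidean distance between consecutive projection vectors Y(n-1) and Y(n), E(n) is a trailing moving average over `window` samples, and the function names, `window`, and `th` values are hypothetical.

```python
def varying_energy(Y):
    """e(n): ASSUMED to be the squared distance between Y(n-1) and Y(n).

    The abstract does not give the formula; e(0) is set to 0.0 here.
    """
    e = [0.0]
    for prev, cur in zip(Y, Y[1:]):
        e.append(sum((c - p) ** 2 for p, c in zip(prev, cur)))
    return e


def moving_average(e, window=3):
    """E(n): trailing moving average of e over at most `window` samples.

    The window length is an assumption; the abstract does not state one.
    """
    E = []
    for n in range(len(e)):
        chunk = e[max(0, n - window + 1):n + 1]
        E.append(sum(chunk) / len(chunk))
    return E


def label_sections(E, th):
    """Stationary where E(n) does not exceed Th, nonstationary otherwise."""
    return ["stationary" if v <= th else "nonstationary" for v in E]


def segment(Y, window=3, th=0.1):
    """Run the three steps on a sequence of 2-D projection vectors Y(n)."""
    return label_sections(moving_average(varying_energy(Y), window), th)


if __name__ == "__main__":
    # Hypothetical projection vectors: steady, then a transition, then steady.
    Y = [(0, 0), (0, 0), (0, 0), (1, 1), (2, 2), (2, 2), (2, 2), (2, 2)]
    for n, label in enumerate(segment(Y, window=2, th=0.5)):
        print(n, label)
```

With these assumed definitions, the frames around the transition in Y(n) are labeled nonstationary and the steady frames on either side are labeled stationary; a longer window widens the nonstationary section.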
TAKIZAWA YUMI
ODA KEISUKE
FUKAZAWA ATSUSHI