PURPOSE: To accurately determine sentence boundaries in continuously spoken speech containing multiple sentences by comparing the fundamental frequencies of voice sections with each other.
CONSTITUTION: A high level section detection part 1 finds pairs of a start frame and an end frame for each section in which the power level of the voice is higher than a first power level. A low level section detection part 2 then searches from each of those start and end frames for a frame whose power is lower than a second power level. Next, a candidate section limiting part 3 limits the candidate sections according to the length of each candidate section and the time intervals between candidate sections. Lastly, a fundamental frequency calculation part 4 calculates the fundamental frequency of each candidate section, and a sentence start/end determination part 5 compares these fundamental frequencies. When the difference in fundamental frequency between one candidate section and the following candidate section exceeds a threshold value, part 5 decides that a sentence boundary lies between them; the candidate sections between boundaries are regarded as one sentence section. Consequently, continuously spoken sentences can be segmented accurately.
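
The processing chain described above can be illustrated with a minimal Python sketch. It is not the claimed implementation; several details are assumptions made only for illustration. The low level search is assumed to run outward from each start and end frame. "Limiting" is assumed to mean merging sections separated by less than `min_gap` frames and discarding sections shorter than `min_len` frames. The fundamental frequency of a section is assumed to be the mean of a precomputed per-frame F0 track over voiced frames (F0 > 0), and the comparison is assumed to use the absolute difference. All function and parameter names are hypothetical.

```python
def high_level_sections(power, high_thresh):
    """Part 1: (start, end) frame pairs where power exceeds high_thresh."""
    sections, start = [], None
    for i, p in enumerate(power):
        if p > high_thresh and start is None:
            start = i
        elif p <= high_thresh and start is not None:
            sections.append((start, i - 1))
            start = None
    if start is not None:
        sections.append((start, len(power) - 1))
    return sections


def extend_to_low_level(power, sections, low_thresh):
    """Part 2 (assumed outward search): widen each section until the
    neighbouring frame falls below low_thresh."""
    out = []
    for s, e in sections:
        while s > 0 and power[s - 1] >= low_thresh:
            s -= 1
        while e < len(power) - 1 and power[e + 1] >= low_thresh:
            e += 1
        out.append((s, e))
    return out


def limit_candidates(sections, min_len, min_gap):
    """Part 3 (assumed rule): merge sections closer than min_gap frames,
    then drop sections shorter than min_len frames."""
    merged = []
    for s, e in sections:
        if merged and s - merged[-1][1] - 1 < min_gap:
            merged[-1] = (merged[-1][0], max(e, merged[-1][1]))
        else:
            merged.append((s, e))
    return [(s, e) for s, e in merged if e - s + 1 >= min_len]


def section_f0(f0_track, section):
    """Part 4 (assumed definition): mean F0 over voiced frames of a section."""
    s, e = section
    voiced = [f for f in f0_track[s:e + 1] if f > 0]
    return sum(voiced) / len(voiced) if voiced else 0.0


def sentence_sections(sections, f0s, f0_thresh):
    """Part 5: place a boundary where the F0 difference between consecutive
    candidate sections exceeds f0_thresh; group the rest into one sentence."""
    if not sections:
        return []
    groups = [[sections[0]]]
    for prev, cur, sec in zip(f0s, f0s[1:], sections[1:]):
        if abs(cur - prev) > f0_thresh:
            groups.append([sec])       # boundary: start a new sentence
        else:
            groups[-1].append(sec)     # same sentence continues
    return [(g[0][0], g[-1][1]) for g in groups]


def segment(power, f0_track, high_thresh, low_thresh, min_len, min_gap, f0_thresh):
    """Run parts 1 to 5 in order and return sentence (start, end) frames."""
    cands = high_level_sections(power, high_thresh)
    cands = extend_to_low_level(power, cands, low_thresh)
    cands = limit_candidates(cands, min_len, min_gap)
    f0s = [section_f0(f0_track, c) for c in cands]
    return sentence_sections(cands, f0s, f0_thresh)


# Toy data: three bursts of speech; the third has a clearly different F0.
power = [0, 0, 5, 9, 9, 5, 0, 0, 5, 9, 9, 5, 0, 0, 0, 0, 5, 9, 9, 5, 0]
f0_track = [0] * 2 + [180] * 4 + [0] * 2 + [150] * 4 + [0] * 4 + [200] * 4 + [0]
result = segment(power, f0_track, high_thresh=8, low_thresh=3,
                 min_len=2, min_gap=1, f0_thresh=40)
print(result)  # [(2, 11), (16, 19)]
```

In this toy run the first two candidate sections differ by 30 Hz, which is below the 40 Hz threshold, so they are grouped into one sentence section; the third differs from the second by 50 Hz, so a boundary is placed before it. The threshold values are arbitrary and would need tuning on real speech.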