Fuzzy Logic Based Segmentation for Myanmar Continuous Speech Recognition System

Yin Win Chit, Dr. Renu

Abstract


Speech recognition is one of the next-generation technologies for human-computer interaction. Automatic Speech Recognition (ASR) is a technology that allows a computer to recognize the words spoken by a person through a telephone, microphone or other device. The stages of a speech recognition system are pre-processing, segmentation of the speech signal, feature extraction and word recognition. Among the many kinds of speech recognition systems, continuous speech recognition is particularly important and popular. This paper proposes time-domain and frequency-domain features, combined with fuzzy knowledge, for the continuous speech segmentation task via a nonlinear speech analysis. Short-time Energy and Zero-crossing Rate are the time-domain features, and Spectral Centroid is the frequency-domain feature; the system calculates them at each point of the speech signal in order to exploit relevant information for generating the significant segments. The Fuzzy Logic technique is used not only to fuzzify the calculated features into three complementary sets, namely low, middle and high, but also to perform a matching phase using a set of fuzzy rules. The outputs of the Fuzzy Logic stage are phonemes, syllables and disyllables of the Myanmar language. The system then recognizes the continuous words of the input speech.
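The three features and the fuzzification step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the triangular membership shapes, and the calibration bounds `lo` and `hi` are assumptions made for illustration, and the abstract does not specify the frame length, the membership functions, or the fuzzy rules.

```python
import cmath
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive DFT.

    The naive DFT is O(n^2); it is used here only to keep the sketch
    free of third-party dependencies.
    """
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        s = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(frame)
        )
        mags.append(abs(s))
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no spectral content
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def fuzzify(value, lo, hi):
    """Map a feature value to low/middle/high memberships that sum to 1.

    lo and hi are hypothetical calibration bounds (assumed, not taken
    from the paper); values outside [lo, hi] are clamped.
    """
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    low = max(0.0, 1.0 - 2.0 * t)
    high = max(0.0, 2.0 * t - 1.0)
    middle = 1.0 - low - high
    return {"low": low, "middle": middle, "high": high}
```

In such a scheme, each frame would yield three fuzzified features, and a rule base (for example, "energy is low and zero-crossing rate is high") would decide whether a segment boundary falls at that frame. The rules themselves are not given in the abstract, so they are left out here.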


Keywords


Time-domain Features; Frequency-domain Features; Fuzzy Logic; Mel Frequency Cepstral Coefficient; Correlation Coefficient.




ASRJETS is published by (GSSRR).