Fuzzy Logic Based Segmentation for Myanmar Continuous Speech Recognition System

Authors

  • Yin Win Chit, Ph.D. Researcher, University of Technology (Yatanarpon Cyber City), Pyin Oo Lwin, Myanmar
  • Dr. Renu, Professor, University of Technology (Yatanarpon Cyber City), Pyin Oo Lwin, Myanmar

Keywords

Time-domain Features, Frequency-domain Features, Fuzzy Logic, Mel Frequency Cepstral Coefficient, Correlation Coefficient.

Abstract

Speech recognition is one of the next-generation technologies for human-computer interaction. Automatic Speech Recognition (ASR) allows a computer to recognize words spoken by a person through a telephone, microphone, or other device. A speech recognition system comprises several stages: pre-processing, segmentation of the speech signal, feature extraction, and word recognition. Among the many kinds of speech recognition systems, continuous speech recognition is among the most important and popular. This paper proposes time-domain and frequency-domain features, combined with fuzzy knowledge, for the continuous speech segmentation task via nonlinear speech analysis. Short-time Energy and Zero-crossing Rate are the time-domain features, and Spectral Centroid is the frequency-domain feature; the system calculates them at each point of the speech signal to extract the information needed to generate significant segments. Fuzzy Logic is used both to fuzzify the calculated features into three complementary sets (low, middle, and high) and to perform a matching phase using a set of fuzzy rules. The outputs of the Fuzzy Logic stage are phonemes, syllables, and disyllables of the Myanmar language. Finally, the system recognizes the continuous words of the input speech.
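
The three features and the low/middle/high fuzzification named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give frame sizes, normalization, or membership functions, so the 25 ms frame, 10 ms hop, min-max normalization, triangular membership shapes, and all function names below are assumptions made for illustration. The fuzzy rule base that maps these memberships to phoneme, syllable, and disyllable boundaries is not described in the abstract and is left out.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame (time-domain feature)."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign differs (time-domain feature)."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def spectral_centroid(frames, sample_rate):
    """Magnitude-weighted mean frequency of each frame in Hz (frequency-domain feature)."""
    window = np.hamming(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    return (mag @ freqs) / (mag.sum(axis=1) + 1e-12)

def fuzzify(values):
    """Membership degrees in (low, middle, high) for each value.

    Assumed shapes: values are min-max normalized to [0, 1]; 'low' and
    'high' are linear ramps and 'middle' is a triangle peaking at 0.5,
    so the three degrees always sum to 1 (complementary sets).
    """
    v = (values - values.min()) / (values.max() - values.min() + 1e-12)
    low = np.clip(1.0 - 2.0 * v, 0.0, 1.0)
    high = np.clip(2.0 * v - 1.0, 0.0, 1.0)
    middle = 1.0 - low - high
    return np.stack([low, middle, high], axis=1)

# Synthetic demo signal (assumption): 0.5 s of silence, then a 200 Hz tone.
sample_rate = 16000
t = np.arange(sample_rate // 2) / sample_rate
x = np.concatenate([np.zeros(sample_rate // 2), 0.5 * np.sin(2 * np.pi * 200 * t)])

frames = frame_signal(x, frame_len=400, hop=160)  # 25 ms frames, 10 ms hop
energy = short_time_energy(frames)
zcr = zero_crossing_rate(frames)
centroid = spectral_centroid(frames, sample_rate)

labels = np.array(["low", "middle", "high"])
energy_labels = labels[np.argmax(fuzzify(energy), axis=1)]
print(energy_labels[0], energy_labels[-1])
```

In this sketch each frame receives a degree of membership in all three sets rather than a single hard label; taking the `argmax` is only a convenience for printing. A rule-based matching phase like the one the abstract mentions would typically combine the membership degrees of all three features instead.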

Published

2017-05-13

How to Cite

Chit, Y. W., & Renu. (2017). Fuzzy Logic Based Segmentation for Myanmar Continuous Speech Recognition System. American Scientific Research Journal for Engineering, Technology, and Sciences, 31(1), 183–190. Retrieved from https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/2912

Section

Articles