Speech recognition using MFCCs

Overflow10 · Feb 16, 2015

Hi, I am a computer engineer. I am implementing software for speech recognition and speaker recognition. My goal is to recognize a single specified word.
I read on the internet that Mel-frequency cepstral coefficients (MFCCs) are very useful for this. Can anyone explain how to use them to achieve my goal? Can anyone recommend a good book about them?

Let me explain what I have implemented and what my problem is.
After segmenting the audio file into frames, I separate the voiced and unvoiced frames with a clustering algorithm (such as k-means) with 2 centroids, based on the frame energy. From the voiced frames I extract 12 MFCCs, which gives a matrix with 12 rows (the MFCCs) and as many columns as there are voiced frames (this number varies). I then average each row, which gives a column vector with 12 rows, where the i-th row is the average of the i-th MFCC over all voiced frames. I feed this vector as input to a classifier.
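The pipeline described above could be sketched roughly as follows. This is a minimal Python illustration, not the MATLAB code from the project. All function names are hypothetical, the MFCC values are assumed to come from a separate extractor, and the MFCC matrix is stored here as one row per frame (the transpose of the 12-row layout described above).

```python
# Sketch only: frame energy -> 2-means voiced/unvoiced split -> mean MFCC vector.
# Function names are hypothetical; MFCCs are assumed to be computed elsewhere.

def frame_energy(frame):
    """Sum of squared samples of one frame."""
    return sum(s * s for s in frame)

def two_means_1d(values, iters=20):
    """1-D k-means with 2 centroids; True marks the high-energy cluster."""
    lo, hi = min(values), max(values)          # initial centroids
    for _ in range(iters):
        mid = (lo + hi) / 2.0                  # decision boundary between centroids
        low = [v for v in values if v <= mid]
        high = [v for v in values if v > mid]
        if not low or not high:                # degenerate case: one cluster is empty
            break
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    mid = (lo + hi) / 2.0
    return [v > mid for v in values]

def mean_mfcc(mfcc_per_frame, voiced):
    """Average each coefficient over the voiced frames only."""
    kept = [c for c, v in zip(mfcc_per_frame, voiced) if v]
    if not kept:
        raise ValueError("no voiced frames found")
    n = len(kept)
    return [sum(col) / n for col in zip(*kept)]

# Toy data: 4 short frames, 2 coefficients per frame instead of 12 for brevity.
frames = [[0.0, 0.01], [0.5, -0.6], [0.7, 0.4], [0.02, 0.0]]
mfcc = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

voiced = two_means_1d([frame_energy(f) for f in frames])
features = mean_mfcc(mfcc, voiced)   # one fixed-length vector per recording
```

Note that averaging over frames gives a fixed-length input regardless of the recording length, but it also discards the order in which the coefficients change over time.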
Now my problem is how to train the classifier.
I trained a classifier to recognize 4 words. The training set consists of 40 samples, 10 per word, and each sample consists of the 12 averaged MFCCs. The system can now detect which of the 4 words a speaker says, but I want to detect one specified word versus all other words.
How can I train the classifier for that?
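One possible way to frame "one specified word versus all other words" is to relabel the existing training set into two classes before training. The sketch below only illustrates that relabeling step in Python. The word names are made up (the actual 4 words are not stated here), and the class weights are just one common way to compensate for having 10 target samples against 30 others.

```python
from collections import Counter

def relabel_one_vs_rest(labels, target):
    """Map multi-class labels to 'target' / 'other'."""
    return ["target" if lab == target else "other" for lab in labels]

def balanced_weights(binary_labels):
    """Weight each class inversely to its frequency: n / (n_classes * count)."""
    counts = Counter(binary_labels)
    n, k = len(binary_labels), len(counts)
    return {lab: n / (k * c) for lab, c in counts.items()}

# Hypothetical labels: 4 words, 10 samples each, as in the training set above.
labels = ["open"] * 10 + ["close"] * 10 + ["start"] * 10 + ["stop"] * 10

binary = relabel_one_vs_rest(labels, "open")
weights = balanced_weights(binary)   # the rarer 'target' class gets the larger weight
```

The relabeled set can then be given to the same kind of classifier as a two-class problem. With only three other words as negative examples, the "other" class probably will not generalize to arbitrary words, so more varied negative samples would likely be needed.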
The classifier I trained is a Multilayer Perceptron implemented in Weka.
The audio signal processing is implemented in MATLAB.
Can anyone suggest how to proceed?
I hope I was clear.
Thanks in advance.
