Mel Filter Bank Processing

Phillip+ · Dec 3, 2012

Hello,

Sorry about my ignorance; I am trying to learn this subject for a final project I am undertaking.

Brief background:

I am developing a Speech Recognition algorithm that identifies whether someone is saying a particular word, in this case "Yes" or "No".

I am computing MFCCs (following this paper: https://arxiv.org/pdf/1003.4083.pdf) and what I have done so far is:

Pre-emphasis
Framing
Hamming Windowing
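
A minimal Python sketch of these three steps, assuming NumPy. The function names, the 0.95 pre-emphasis coefficient, and the frame length and hop used in the usage line are illustrative assumptions, so treat this as a rough outline rather than a reference implementation.

```python
import numpy as np

def preemphasis(signal, alpha=0.95):
    # y[n] = x[n] - alpha * x[n-1]; the first sample is passed through unchanged.
    # alpha = 0.95 is an assumed, commonly used value.
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_signal(signal, frame_len, hop):
    # Split the signal into overlapping frames of frame_len samples,
    # advancing by hop samples each time. Trailing samples that do not
    # fill a whole frame are dropped. Assumes len(signal) >= frame_len.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def window_frames(frames):
    # Multiply every frame (one per row) by a Hamming window of the same length.
    return frames * np.hamming(frames.shape[1])
```

For example, `window_frames(frame_signal(preemphasis(x), 256, 100))` would give one windowed frame per row for a 1-D float array `x`.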

The step I am struggling with is "Step 4". If I take the FFT of each windowed frame in the time domain and multiply it by the Mel filters' frequency response, would this be enough?

I also have a problem with this equation:

For example, what does F represent? Does it represent the FFT of the "Window" or the "Window" in the time-domain?

I hope someone can help. Sorry for my lack of understanding; I am still learning.

shekofteh · Jan 12, 2013

Hi, please see the following link to MATLAB code:

http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html

Specifically, the function "melcepst.m" implements a mel-cepstrum (MFCC) front end as a feature extractor for a speech decoder.

F in your equation is frequency. It covers the range from 0 to Fs in steps of Fs/N, where Fs is the sampling frequency and N is the length of the window.
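
To make the frequency axis concrete, here is a minimal Python sketch (NumPy assumed) that builds it for the FFT bins and applies a triangular mel filter bank to the power spectrum of one windowed frame. The function names, the number of filters, and the use of the common mapping mel = 2595 * log10(1 + f/700) are assumptions for illustration; this is not a port of melcepst.m. Because the frame is real-valued, only the bins from 0 to Fs/2 are kept, since the rest of the spectrum mirrors them.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel mapping (assumed here): mel = 2595 * log10(1 + f / 700)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    # Triangular filters whose edges are equally spaced on the mel scale
    # between 0 Hz and fs / 2.
    n_bins = n_fft // 2 + 1
    freqs = np.arange(n_bins) * fs / n_fft  # bin k sits at k * fs / n_fft Hz
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2))
    fbank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # Triangle: rises from lo to mid, falls from mid to hi, zero elsewhere.
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def mel_energies(windowed_frame, fbank, n_fft):
    # Power spectrum of one windowed frame, weighted and summed per mel filter.
    power = np.abs(np.fft.rfft(windowed_frame, n=n_fft)) ** 2
    return fbank @ power
```

With, say, `fbank = mel_filterbank(20, 256, 8000)`, calling `mel_energies(frame, fbank, 256)` returns 20 filter energies for a 256-sample frame. Taking the log and then a DCT of those energies would be the usual next steps towards MFCCs.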
