speech recognition MATLAB

Status
Not open for further replies.

hansmuller

speech recognition matlab

Dear friends,

For isolated word recognition, I compute MFCCs from the filtered, endpoint-detected voice signal. After computing the MFCCs, I measure the distances between the MFCCs of the input and the stored MFCCs of all words. The word with the minimum distance is taken as the recognized word.

Is there anything wrong or missing in this system? Should I do anything else after creating the MFCC vectors?
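
For readers following along, the minimum-distance decision described above can be sketched as below. This is only an illustration: the word list, the toy feature matrices and the placeholder distance (a Frobenius norm, which is valid only when both matrices have the same size) are all assumptions, not the poster's actual data or distance measure.

```matlab
% Toy templates: one feature matrix (coefficients x frames) per word.
% All names and values are made up for illustration.
words     = {'yes','no','stop','go'};
templates = {[1 2 3; 0 1 0], [3 2 1; 1 0 1], [0 0 0; 2 2 2], [5 5 5; 1 1 1]};
query     = [0.9 2.1 2.8; 0.1 0.9 0.2];   % features of the spoken word

% Placeholder distance: Frobenius norm, valid only for equal-size matrices.
distFcn = @(A,B) norm(A-B,'fro');

% Distance from the query to every stored template.
scores = zeros(1,numel(templates));
for i = 1:numel(templates)
    scores(i) = distFcn(query, templates{i});
end

% The template with the smallest distance gives the recognized word.
[bestDist, bestIdx] = min(scores);
recognized = words{bestIdx};
fprintf('Recognized "%s" (distance %.3f)\n', recognized, bestDist);
```

In a real system the utterances have different numbers of frames, so the placeholder distance would have to be replaced by something that tolerates different lengths.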

Thanks in advance.
 

dtw matlab

Hi.
What you have done is a naive pattern classification approach, but there are several problems associated with it.
For example, take the word 'shot': let your training data be 'shooot' and your testing data be 'shot'. If you window it and take the MFCCs, which I am assuming you have done (and even otherwise it would have problems), then the timing of the frames won't match, and that gives you a mismatch.

These basic pattern recognition approaches usually don't work for speech recognition. Search for dynamic time warping and hidden Markov models (see Rabiner) for recognition.
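
A small sketch of the 'shooot' versus 'shot' timing problem, using made-up one-dimensional sequences in place of real MFCC frames. The naive frame-by-frame comparison reports a mismatch, while a basic dynamic time warping recursion (written inline here, with a padded cost matrix) lines the frames up. The numbers are purely illustrative.

```matlab
ref  = [1 2 3 3 3 4];   % "shooot": the middle sound is held for three frames
test = [1 2 3 4];       % "shot": same sounds, shorter

% Naive comparison: frame k of one word against frame k of the other,
% over the frames both sequences share.
L = min(numel(ref), numel(test));
naiveDist = sum((ref(1:L) - test(1:L)).^2);

% Basic DTW: local squared differences, then accumulated cost.
N = numel(test);  M = numel(ref);
d = zeros(N,M);
for n = 1:N
    for m = 1:M
        d(n,m) = (test(n) - ref(m))^2;
    end
end
D = inf(N+1,M+1);       % padded with Inf so the borders need no special case
D(1,1) = 0;
for n = 1:N
    for m = 1:M
        D(n+1,m+1) = d(n,m) + min([D(n,m+1), D(n,m), D(n+1,m)]);
    end
end
dtwDist = D(end,end);

fprintf('naive = %g, DTW = %g\n', naiveDist, dtwDist);
```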
 
dynamic time warping matlab

Thanks for your answer.

Actually, I am using the unnormalized distance output of a DTW algorithm function in MATLAB:
Code:
function [Dist,D,k,w]=dtw(t,r)
%Dynamic Time Warping Algorithm
%Dist is unnormalized distance between t and r
%D is the accumulated distance matrix
%k is the normalizing factor
%w is the optimal path
%t is the vector you are testing against
%r is the vector you are testing
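[rows,N]=size(t); %t and r are expected to be row vectors
[rows,M]=size(r);
d=zeros(N,M); %local (squared difference) distance matrix, preallocated
for n=1:N
    for m=1:M
        d(n,m)=(t(n)-r(m))^2;
    end
end
%d=(repmat(t(:),1,M)-repmat(r(:)',N,1)).^2; %this replaces the nested for loops from above Thanks Georg Schmitz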

D=zeros(size(d));
D(1,1)=d(1,1);

for n=2:N
    D(n,1)=d(n,1)+D(n-1,1);
end
for m=2:M
    D(1,m)=d(1,m)+D(1,m-1);
end
for n=2:N
    for m=2:M
        D(n,m)=d(n,m)+min([D(n-1,m),D(n-1,m-1),D(n,m-1)]);
    end
end

Dist=D(N,M);
n=N;
m=M;
k=1;
w=[];
w(1,:)=[N,M];
while ((n+m)~=2)
    if (n-1)==0
        m=m-1;
    elseif (m-1)==0
        n=n-1;
    else
        [values,number]=min([D(n-1,m),D(n,m-1),D(n-1,m-1)]);
        switch number
            case 1
                n=n-1;
            case 2
                m=m-1;
            case 3
                n=n-1;
                m=m-1;
        end
    end
    k=k+1;
    w=cat(1,w,[n,m]);
end

I am using the output 'Dist' as the distance between vectors.
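
One point worth illustrating: the function above compares scalar samples, whereas MFCC features have one vector per frame. A hedged sketch of the same recursion with a vector-valued local distance follows. The matrices A and B are toy values, not real MFCCs, and dividing by N+M is just one common normalization chosen here as an assumption; it is not the factor k computed by the function above.

```matlab
% A and B: feature matrices with one column per frame (toy values).
A = [0 1 2 3;       1 1 1 1];          % 4 frames
B = [0 1 1 2 2 3.5; 1 1 1 1 1 1];      % 6 frames, similar shape stretched in time

N = size(A,2);  M = size(B,2);

% Local distance: squared Euclidean distance between frame n of A and frame m of B.
d = zeros(N,M);
for n = 1:N
    for m = 1:M
        diffv  = A(:,n) - B(:,m);
        d(n,m) = diffv' * diffv;
    end
end

% Accumulated distance, same three-neighbour recursion as the scalar version.
D = inf(N+1,M+1);
D(1,1) = 0;
for n = 1:N
    for m = 1:M
        D(n+1,m+1) = d(n,m) + min([D(n,m+1), D(n,m), D(n+1,m)]);
    end
end

Dist     = D(end,end);      % unnormalized distance
normDist = Dist/(N+M);      % assumed normalization so different lengths are comparable
fprintf('Dist = %g, normalized = %g\n', Dist, normDist);
```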

I don't prefer HMMs because my system is not complicated. I am training just four words for now.

Yes, you're right about words that are similar in pronunciation, but I think that is always a problem.
Apart from that problem, is there anything missing or wrong in the system?
 

speech recognition dtw matlab

Hi,

If you are using DTW for just 4 words, your system should work really well.

The pronunciation problem will be taken care of by the DTW system; that is how the algorithm works.

I am sorry, I am not a great fan of MATLAB code and didn't go through your code, so I can't comment on it.

Rakesh
 

mfcc speech recognition matlab

No matter. I just posted the code as an example of a DTW algorithm.

It gives good results on my old desktop computer, but on my laptop it doesn't always give good results. I think the microphone inputs of the two computers differ in noise level. On the laptop I see some glitches in the recording, which negatively affect the filtering and endpoint detection.
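
Since the noise level of the input matters for endpoint detection, here is a rough sketch of an energy-based endpoint detector whose threshold is set relative to the measured noise floor rather than fixed. It is not the detector used in this thread. The synthetic signal, the frame length, the assumption that the first five frames contain no speech, and the factor of 10 are all illustrative choices.

```matlab
fs = 8000;
noiseAmp = 0.01;
lead = noiseAmp * (-1).^(1:2000);          % stand-in for background noise
word = sin(2*pi*440*(0:3999)/fs);          % tone burst standing in for a spoken word
tail = noiseAmp * (-1).^(1:2000);
x = [lead, word, tail];

frameLen = 200;                            % 25 ms at 8 kHz
nFrames  = floor(numel(x)/frameLen);

% Short-time energy of each non-overlapping frame.
energy = zeros(1,nFrames);
for k = 1:nFrames
    seg = x((k-1)*frameLen+1 : k*frameLen);
    energy(k) = sum(seg.^2)/frameLen;
end

% Noise floor estimated from the first frames (assumed to be speech-free).
noiseFloor = mean(energy(1:5));
thresh     = 10*noiseFloor;                % hypothetical factor, to be tuned

% First and last frames above the threshold mark the endpoints.
active      = find(energy > thresh);
startSample = (active(1)-1)*frameLen + 1;
endSample   = active(end)*frameLen;
fprintf('Word spans samples %d to %d\n', startSample, endSample);
```

Because the threshold follows the measured noise floor, the same code adapts to a noisier microphone input, although glitches in the recording would still need separate handling.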
 

Hi, I'm Yasar.


I'm doing a project on a voice recognition system, and I don't have any idea about it.
Please provide me with some links and code that would be useful for me.

Please tell me the algorithm too.

id: yasarbruce@gmail.com
 
Yes, I have also applied the same dynamic time warping technique.
I can record some passwords in a database.
Then my system takes an input speech signal through the microphone, compares it with all saved templates, and returns the one with the minimum global distance.
A problem arises when we speak a word that is not in the database. In that case, the system compares it with all templates and still returns whichever database word has the minimum distance from the spoken word. How can we avoid this problem?
 

Try applying a threshold to the global distance. For example, if the minimum distance value is greater than a threshold value, the word is treated as not being in the database.
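
A minimal sketch of that rejection rule. The word list, the distances and the threshold value are made up; in practice the threshold would have to be tuned on real recordings.

```matlab
words     = {'open','close','start','stop'};
distances = [42.0 17.5 63.2 55.1];   % made-up global distances to each template
rejectThreshold = 25;                % hypothetical value, needs tuning

[minDist, idx] = min(distances);
if minDist > rejectThreshold
    result = '';                     % treat as "not in database"
else
    result = words{idx};
end

if isempty(result)
    disp('Word not in database');
else
    fprintf('Recognized "%s" (distance %.1f)\n', result, minDist);
end
```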
 
Dear hansmuller,
Thanks for the quick reply.
Yes, the threshold idea is really good, and I have been thinking about it for the last 2-3 days.
The problem is this: if, for example, there are 5 words in the database, should we have a separate threshold for each word,
or a single threshold for all 5 passwords?
Moreover, as my system is supposed to be user independent, there can be great variation in the global distance from user to user.
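
One possible way to get a separate threshold per word, offered only as a sketch: collect the distances of a few extra training utterances of each word to that word's template, then set the threshold at the mean plus a multiple of the standard deviation. This scheme, the factor of 2 and all the numbers below are assumptions, not something established in this thread.

```matlab
words = {'open','close','start'};

% Rows: words. Columns: distance of each extra training utterance
% to that word's own template (made-up numbers).
trainDist = [10 12 11 13;
             20 25 22 21;
             15 14 16 15];

mu    = mean(trainDist,2);
sigma = std(trainDist,0,2);
perWordThresh = mu + 2*sigma;        % hypothetical rule: mean + 2 standard deviations

% Decision for a new utterance: pick the closest word, then check
% the distance against that word's own threshold.
testDist = [13.5 40.0 30.0];         % made-up distances to the three templates
[minDist, idx] = min(testDist);
accepted = minDist <= perWordThresh(idx);

if accepted
    fprintf('Accepted as "%s"\n', words{idx});
else
    disp('Rejected: not in database');
end
```

For a user-independent system, training utterances from several speakers could be pooled into each row so that the spread reflects variation between users.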

---------- Post added at 12:10 ---------- Previous post was at 11:58 ----------

you can check out dpfast.m in MATLAB.
 
Re: mfcc speech recognition matlab

You said the code worked on your old computer. Can you please send that DTW code to me?

Does this code work? Please tell me.

function [Dist,D,k,w]=dtw(t,r)
%Dynamic Time Warping Algorithm
%Dist is unnormalized distance between t and r
%D is the accumulated distance matrix
%k is the normalizing factor
%w is the optimal path
%t is the vector you are testing against
%r is the vector you are testing
[rows,N]=size(t);
[rows,M]=size(r);
for n=1:N
    for m=1:M
        d(n,m)=(t(n)-r(m))^2;
    end
end
%d=(repmat(t(:),1,M)-repmat(r(:)',N,1)).^2; %this replaces the nested for loops from above Thanks Georg Schmitz

D=zeros(size(d));
D(1,1)=d(1,1);

for n=2:N
    D(n,1)=d(n,1)+D(n-1,1);
end
for m=2:M
    D(1,m)=d(1,m)+D(1,m-1);
end
for n=2:N
    for m=2:M
        D(n,m)=d(n,m)+min([D(n-1,m),D(n-1,m-1),D(n,m-1)]);
    end
end

Dist=D(N,M);
n=N;
m=M;
k=1;
w=[];
w(1,:)=[N,M];
while ((n+m)~=2)
    if (n-1)==0
        m=m-1;
    elseif (m-1)==0
        n=n-1;
    else
        [values,number]=min([D(n-1,m),D(n,m-1),D(n-1,m-1)]);
        switch number
            case 1
                n=n-1;
            case 2
                m=m-1;
            case 3
                n=n-1;
                m=m-1;
        end
    end
    k=k+1;
    w=cat(1,w,[n,m]);
end
 
Re: mfcc speech recognition matlab

Can you kindly explain what the local distance is that is calculated between corresponding elements of the two sequences?
I mean, this distance is relative to which axis?
Moreover, can you tell me the purpose of the "simmx.m" M-file in dynamic time warping?
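
As a rough illustration of the local distance asked about here: it is not measured along the time axis. Entry (i,j) of the local distance matrix compares the feature values of frame i of one sequence with those of frame j of the other, and the two time axes only index the matrix. The sketch below builds such matrices with Euclidean and Manhattan distances, plus a cosine similarity matrix. That a helper named simmx.m computes a similarity matrix of this kind is an assumption made here, not something confirmed in this thread. The feature values are made up.

```matlab
A = [1 0 2; 0 1 2];    % sequence 1: 3 frames, 2 features per frame
B = [1 2; 0 2];        % sequence 2: 2 frames, 2 features per frame

nA = size(A,2);  nB = size(B,2);
dEuc   = zeros(nA,nB);   % Euclidean local distance
dMan   = zeros(nA,nB);   % Manhattan (city-block) local distance
simCos = zeros(nA,nB);   % cosine similarity (1 = same direction)

for i = 1:nA
    for j = 1:nB
        a = A(:,i);  b = B(:,j);
        dEuc(i,j)   = norm(a - b);
        dMan(i,j)   = sum(abs(a - b));
        simCos(i,j) = (a' * b) / (norm(a) * norm(b));
    end
end

disp(dEuc);
disp(dMan);
disp(simCos);
```

A similarity matrix like simCos would typically be turned into a cost (for example 1 - simCos) before running the DTW recursion on it.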
 

Re: mfcc speech recognition matlab

In MATLAB, the samples are on the x axis only.

The samples are in the form of
t= 0.0001
-0.0010

---------- Post added at 17:07 ---------- Previous post was at 17:01 ----------

Actually, the samples are on the x axis, as below:
t= 0.0089
-0.0032
0.0023
0.0024
This is an example of how the samples are arranged.

I tried Euclidean distance and Manhattan distance as well, but I get noise as output when I play the warping path.
I didn't say anything about the simmx.m file. I found it from another source, but I don't know about it.
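
Two things may be worth checking here, both offered as guesses rather than a diagnosis. First, the dtw function posted in this thread takes its lengths from the second dimension of size(), so samples stored as a column, as shown above, would be seen as a single sample. Second, the path w holds index pairs, not audio, so playing w itself would not sound like speech; the path has to be used to index the signal. The path below is hypothetical, written in the end-to-start order that the posted function produces.

```matlab
t = [0.0089; -0.0032; 0.0023; 0.0024];   % column vector, as in the post

[rowsT, N] = size(t);       % N is 1: a column looks like one sample
tRow = t(:).';              % force row orientation before calling dtw
[rowsT2, N2] = size(tRow);  % N2 is 4

% Hypothetical warping path (pairs [index into t, index into r]),
% listed from the end of the signals back to the start.
w = [4 4; 3 3; 3 2; 2 1; 1 1];
w = flipud(w);              % put it in start-to-end order

% The warped signal is t sampled at the path indices, not the path itself.
tWarped = tRow(w(:,1));
disp(tWarped);
```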
 

Hello, we are making an automatic speaker recognition system using MFCC and DTW.
Can anyone please help us with simple MFCC code? The code given on other sites is too complex.
Thanks in advance.
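
A compact MFCC sketch that needs no toolbox, for orientation only. Every parameter here (8 kHz sampling, 25 ms frames, 10 ms hop, 256-point FFT, 20 mel filters, 12 coefficients, 0.97 pre-emphasis) is an assumed, commonly seen choice, and the test tone merely stands in for recorded speech. It is a simplified version of the usual pipeline, not a reference implementation.

```matlab
fs = 8000;
x  = sin(2*pi*440*(0:fs-1)/fs);            % 1 s test tone standing in for speech

frameLen = 200;  hop = 80;                 % 25 ms frames, 10 ms hop
nfft = 256;  nFilt = 20;  nCeps = 12;

% 1) Pre-emphasis: boost high frequencies.
x = [x(1), x(2:end) - 0.97*x(1:end-1)];

% 2) Framing parameters and a Hamming window built by hand.
nFrames = 1 + floor((numel(x) - frameLen)/hop);
win = 0.54 - 0.46*cos(2*pi*(0:frameLen-1)/(frameLen-1));

% 3) Triangular mel filterbank over the FFT bins 1..nfft/2+1.
hz2mel = @(f) 2595*log10(1 + f/700);
mel2hz = @(m) 700*(10.^(m/2595) - 1);
edges  = mel2hz(linspace(hz2mel(0), hz2mel(fs/2), nFilt+2));
bins   = floor((nfft+1)*edges/fs) + 1;     % 1-based FFT bin of each edge
nBins  = nfft/2 + 1;
fb = zeros(nFilt, nBins);
for k = 1:nFilt
    lo = bins(k);  mid = bins(k+1);  hi = bins(k+2);
    for b = lo:mid                         % rising slope
        if mid > lo, fb(k,b) = (b - lo)/(mid - lo); end
    end
    for b = mid:hi                         % falling slope
        if hi > mid, fb(k,b) = (hi - b)/(hi - mid); end
    end
end

% 4) DCT-II basis (rows 1..nCeps, skipping the 0th coefficient).
n = 0:nFilt-1;
dctMat = zeros(nCeps, nFilt);
for c = 1:nCeps
    dctMat(c,:) = cos(pi*c*(n + 0.5)/nFilt);
end

% 5) Per frame: window, power spectrum, mel energies, log, DCT.
ceps = zeros(nCeps, nFrames);
for i = 1:nFrames
    seg  = x((i-1)*hop + (1:frameLen)) .* win;
    spec = abs(fft(seg, nfft)).^2;
    melE = fb * spec(1:nBins)';
    ceps(:,i) = dctMat * log(melE + eps);  % eps avoids log(0)
end

disp(size(ceps));                          % coefficients x frames
```

The result has one column of coefficients per frame, which is the layout a frame-by-frame DTW comparison would expect.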
 

Re: mfcc speech recognition matlab


This code is not working in MATLAB R2010a. Kindly help us with it.
 
