If you dig a little deeper, you can build this project. I'm describing an idea I got from your post here. As a rule, the frequency content of each person's speech signal is different, so you can follow the steps given below:
1) Collect some audio signals to serve as your database.
2) Take the sample audio signal that is to be compared.
3) Compute the FFT of the sample and of each database signal, and compare the two spectra.
4) Depending on the requirement, define some error level. If any signal in the database matches the sample audio within the error range, you can say that the sample audio is one of the audio signals in the database.
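
The steps above can be sketched roughly like this in Python with NumPy. This is only a minimal sketch, not a finished speaker-recognition system: the function names (`magnitude_spectrum`, `spectral_error`, `find_match`), the FFT length `n_fft`, the threshold `max_error`, and the synthetic sine-wave "speakers" are all assumptions I made up for illustration. It assumes each signal is already loaded as a 1-D array at the same sample rate, and it compares normalized FFT magnitudes so that loudness differences do not count as error.

```python
import numpy as np


def magnitude_spectrum(signal, n_fft=4096):
    """Unit-norm FFT magnitude spectrum (signal is zero-padded or truncated to n_fft)."""
    spectrum = np.abs(np.fft.rfft(signal, n=n_fft))
    norm = np.linalg.norm(spectrum)
    return spectrum / norm if norm > 0 else spectrum


def spectral_error(a, b, n_fft=4096):
    """Euclidean distance between two normalized magnitude spectra (0 = identical)."""
    return float(np.linalg.norm(magnitude_spectrum(a, n_fft) - magnitude_spectrum(b, n_fft)))


def find_match(sample, database, max_error=0.1, n_fft=4096):
    """Return (name, error) of the closest database entry, or (None, error) if above max_error."""
    best_name, best_err = None, float("inf")
    for name, stored in database.items():
        err = spectral_error(sample, stored, n_fft)
        if err < best_err:
            best_name, best_err = name, err
    if best_err <= max_error:
        return best_name, best_err
    return None, best_err


# Step 1: a toy "database" (hypothetical stand-ins for recorded voices).
fs = 8000                      # assumed sample rate in Hz
t = np.arange(4096) / fs
database = {
    "speaker_a": np.sin(2 * np.pi * 120 * t),
    "speaker_b": np.sin(2 * np.pi * 220 * t),
}

# Step 2: the sample to compare (speaker_a plus a little noise).
rng = np.random.default_rng(0)
sample = np.sin(2 * np.pi * 120 * t) + 0.01 * rng.standard_normal(t.size)

# Steps 3 and 4: compare spectra and apply the error level.
match, err = find_match(sample, database)
print(match, err)
```

The error level (`max_error` here) is the part you will have to tune for your own recordings. Real speech varies a lot more between recordings than these test tones do, so expect to need a looser threshold or some averaging over several frames.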