If the sound you want to detect has characteristic tones, those
would be easier to pick out directly than by combing through a
stream of FFT data.
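
If it really does come down to one or two known tones, a Goertzel
filter is one common way to measure energy at a single frequency
without computing a whole FFT. A minimal sketch in Python, assuming
the samples are already in a list; the function name, sample rate
and tone frequency are my own placeholders, not anything from the
original question:

```python
import math

def goertzel_power(samples, sample_rate, tone_hz):
    """Squared DFT magnitude of `samples` at the bin nearest `tone_hz`."""
    n = len(samples)
    k = round(n * tone_hz / sample_rate)   # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = 0.0    # s[n-1]
    s_prev2 = 0.0   # s[n-2]
    for x in samples:
        s = x + coeff * s_prev - s_prev2   # second-order resonator
        s_prev2 = s_prev
        s_prev = s
    # |X[k]|^2 recovered from the final two resonator states
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

# Hypothetical usage: compare energy at the tone against a nearby frequency.
fs = 8000
block = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(800)]
on_tone = goertzel_power(block, fs, 1000.0)
off_tone = goertzel_power(block, fs, 2000.0)
```

The decision is then just a comparison of `on_tone` against a
threshold or against `off_tone`, which is a lot less to sift
through than a full spectrum per block.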
It would also help to know whether the "audio" must be detected in
the presence of high-amplitude noise, whether out-of-audio-band
energy is to be taken as an "indicator of crap" or as a proper detect, etc.
Being more of an analog guy, I'd be thinking about an audio
bandpass filter (to qualify that some of the signal is audio),
maybe a high-pass (to assess non-audio HF as an indicator
of a spurious mic signal) combined in the logic, an ideal
rectifier as the detector, and a comparator for level discrimination.
At its simplest, I think that's maybe a quad op amp and passives,
plus a quad comparator.
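
To make that chain concrete, here is a rough software model of it:
one-pole RC-style filters, a full-wave rectifier with smoothing, and
two threshold comparisons combined in the logic. This is only a
sketch for playing with the idea, not the circuit itself. All corner
frequencies and thresholds are assumptions I picked for illustration,
and it takes the "HF means spurious" reading of the out-of-band
question.

```python
import math

def lowpass(x, fs, fc):
    """One-pole low-pass, the discrete cousin of a single RC section."""
    a = 1.0 - math.exp(-2.0 * math.pi * fc / fs)
    y, out = 0.0, []
    for v in x:
        y += a * (v - y)
        out.append(y)
    return out

def highpass(x, fs, fc):
    """One-pole high-pass: the input minus its low-passed version."""
    return [v - l for v, l in zip(x, lowpass(x, fs, fc))]

def envelope(x, fs, fc=20.0):
    """Ideal full-wave rectifier followed by a smoothing low-pass."""
    return lowpass([abs(v) for v in x], fs, fc)

def detect(x, fs, audio_thresh=0.1, hf_thresh=0.1):
    """True if in-band level is high and HF level is low (assumed logic)."""
    # Audio bandpass: 300 Hz high-pass into 3 kHz low-pass (assumed corners).
    band = lowpass(highpass(x, fs, 300.0), fs, 3000.0)
    # HF branch: everything well above the audio band (assumed 10 kHz corner).
    hf = highpass(x, fs, 10000.0)
    audio_level = envelope(band, fs)[-1]   # level at the end of the block
    hf_level = envelope(hf, fs)[-1]
    # The two "comparators", ANDed together.
    return audio_level > audio_thresh and hf_level < hf_thresh

# Hypothetical usage: 0.2 s of a 1 kHz tone sampled at 48 kHz.
fs = 48000
tone = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(9600)]
hit = detect(tone, fs)
```

If something this small already does the job on representative
recordings, that says a lot about how much DSP is really needed.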
The question is: what do the DSP complexity and effort add
to the desired function? What about the function demands
such an approach?
Or is this a solution in search of a problem?