
Detecting audio (with limited processing power)

Status
Not open for further replies.

Eight

Hello.

This is a DIY project. I am trying to design a sound detector on an ATmega328P (20MHz) microcontroller (no, it's not Arduino). Audio DSP is something I haven't done before, at least not at *this* low a level, so I would like to request some advice to see whether I'm doing things correctly.

I designed a custom prototype PCB with an electret microphone. It is connected to a custom preamplifier and the sensitivity can be set by an on-board potentiometer. The circuit is capable of picking up some pretty faint noises. The opamp output is then fed into an ADC input on the on-board ATmega328P microcontroller. The ADC is configured to produce a stream of unsigned 8-bit samples (uint8_t) at about 39 kSPS. The goal is to process this audio stream and detect sudden bursts of noise. My question is: what would be the proper way to go about designing code for this (I am coding in C)? I can't run an FFT on it in real time due to lack of processing power. I've already implemented something, but I am not quite happy with it.

My current code works like this:
  1. The code initially samples the mean value. This is done by summing up a large number of samples and dividing the sum by the sample count. The value is usually 127 or 128. Maybe I can skip this step and assume it's always going to be 127?
  2. For each new incoming sample the code calculates a delta value. A delta value is basically an offset from the mean: delta = abs(sample - mean). This is to get absolute values of the audio stream.
  3. I have two sliding windows which are basically an average approximation of the last XX delta values. The first window is a signal window and is very short (like an average of the past 5000 samples or so). The other window is a noise window and is an average of a much longer time span (say like 20 seconds worth of audio). Basically this means that the signal value will ascend and decay much faster than the noise value.
  4. The code checks the ratio between the signal and noise. If the signal value exceeds the noise plus some margin (signal > noise * 1.3) then the code assumes a sound is present.
  5. When the signal level slides beneath the noise level + margin (signal < noise * 1.2) then the code assumes the sound is gone.
  6. Finally, the code calculates the duration of the sound and rejects noises shorter than 100ms or so.
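
The six steps above could be sketched in C roughly as follows. This is only a minimal sketch under stated assumptions, not the poster's actual code: the names (detector_t, detector_init, detector_update), the fixed mean of 128, the shift-based exponential averages standing in for the two sliding windows, and all constants are hypothetical. The 1.3 and 1.2 margins are written as integer comparisons to avoid floating point on the AVR.

```c
#include <stdint.h>
#include <stdbool.h>

/* All names and constants here are illustrative assumptions. */
#define MEAN_LEVEL    128u   /* assumed ADC mid-scale (step 1) */
#define SIGNAL_SHIFT  12     /* fast average over roughly 2^12 samples */
#define NOISE_SHIFT   20     /* slow average over roughly 2^20 samples */
#define MIN_DURATION  3900u  /* about 100 ms at about 39 kSPS (step 6) */

typedef struct {
    uint32_t signal_acc;  /* fast average, scaled by 2^SIGNAL_SHIFT */
    uint32_t noise_acc;   /* slow average, scaled by 2^NOISE_SHIFT */
    uint32_t active_len;  /* samples elapsed since the sound started */
    bool     active;      /* true while signal is above the margin */
} detector_t;

/* Seed both averages with an initial guess of the noise level. */
void detector_init(detector_t *d, uint8_t level)
{
    d->signal_acc = (uint32_t)level << SIGNAL_SHIFT;
    d->noise_acc  = (uint32_t)level << NOISE_SHIFT;
    d->active_len = 0;
    d->active     = false;
}

/* Feed one ADC sample. Returns true while a sound that has already
   lasted at least MIN_DURATION samples is still in progress. */
bool detector_update(detector_t *d, uint8_t sample)
{
    /* Step 2: absolute offset from the mean. */
    uint8_t delta = (sample > MEAN_LEVEL) ? (uint8_t)(sample - MEAN_LEVEL)
                                          : (uint8_t)(MEAN_LEVEL - sample);

    /* Step 3: two exponential averages approximating the two windows. */
    d->signal_acc = d->signal_acc - (d->signal_acc >> SIGNAL_SHIFT) + delta;
    d->noise_acc  = d->noise_acc  - (d->noise_acc  >> NOISE_SHIFT)  + delta;

    /* Keep 4 fractional bits so small levels are not rounded away. */
    uint32_t sig   = d->signal_acc >> (SIGNAL_SHIFT - 4);
    uint32_t noise = d->noise_acc  >> (NOISE_SHIFT - 4);

    if (!d->active) {
        /* Step 4: signal > noise * 1.3 starts a sound. */
        if (10u * sig > 13u * noise) {
            d->active = true;
            d->active_len = 0;
        }
    } else if (10u * sig < 12u * noise) {
        /* Step 5: signal < noise * 1.2 ends it (hysteresis). */
        d->active = false;
    } else if (d->active_len < UINT32_MAX) {
        d->active_len++;
    }

    /* Step 6: ignore sounds shorter than MIN_DURATION. */
    return d->active && d->active_len >= MIN_DURATION;
}
```

One thing the sketch makes visible: because the signal average decays slowly, even a short burst keeps the detector active for a while after the burst has ended, so the measured duration depends on the window length as well as on the sound itself.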

The issue with this approach is that it's very difficult to tune. The tuning variables are the signal/noise window decaying speeds and the margins for noise/silence detection. In a quiet room it works decently well, but outside where the noise floor is much higher it has many false positives. As an example, imagine a constant traffic of cars driving on a highway in the distance (not honking), so there is just a continuous noise of tires driving on the asphalt. The code will sometimes trigger on this noise alone even though my ears don't hear any noticeable deviations. When a dog barks somewhere not as far away the code does pick it up correctly, but often misses some birds singing.

Thoughts?
 

It's quite understandable that you can't implement real-time spectrum analysis on an ATmega, but it's not clear which signal parameters you are looking for. "Detect sudden bursts of noise" is rather general. I'd expect that some kind of bandpass filtering in front of the magnitude detector would be useful.
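
One cheap way to get a crude band-pass on an 8-bit micro is to subtract a slow running average from a fast one: the slow average removes DC and low-frequency rumble, the fast one smooths the highest frequencies. The sketch below is only an illustration of that idea, not a designed filter. The names (bandpass_t, bandpass_update) and the shift values are assumptions, and the corner frequencies they give would have to be checked against the actual sample rate.

```c
#include <stdint.h>

/* Hypothetical names and shift values, chosen only for illustration. */
#define FAST_SHIFT 2   /* fast low-pass: smooths the highest frequencies */
#define SLOW_SHIFT 8   /* slow low-pass: tracks DC and low-frequency rumble */

typedef struct {
    uint32_t fast_acc;  /* fast average, scaled by 2^FAST_SHIFT */
    uint32_t slow_acc;  /* slow average, scaled by 2^SLOW_SHIFT */
} bandpass_t;

/* Feed one raw ADC sample; returns the filtered value, centred on 0. */
int16_t bandpass_update(bandpass_t *f, uint8_t sample)
{
    /* Two first-order exponential averages using only shifts and adds. */
    f->fast_acc = f->fast_acc - (f->fast_acc >> FAST_SHIFT) + sample;
    f->slow_acc = f->slow_acc - (f->slow_acc >> SLOW_SHIFT) + sample;

    /* Their difference passes the band between the two corner frequencies. */
    return (int16_t)((int32_t)(f->fast_acc >> FAST_SHIFT)
                   - (int32_t)(f->slow_acc >> SLOW_SHIFT));
}
```

The absolute value of the returned sample could then be fed to the magnitude detector in place of abs(sample - mean), which would also make the separate mean measurement unnecessary.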
 

You can do some limited FFT stuff:

Not real time, but a possibly usable approach. Keep in mind the sample set can be acquired
in real time; it's just that there is a lot of latency until the FFT result, hence "dead time"
periods on the input. But your current approach also has the same problem, e.g. a sample set
has to be acquired.

Not a DSP expert, but can a correlation function (cross correlation or autocorrelation) be used
here? E.g. the traffic must present as having some periodic components, hence could be eliminated,
as autocorrelation would show a nonzero value at nonzero lags? I.e. it's not pure noise...
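
For reference, the quantity being suggested here could be computed on a captured snapshot roughly as below. This is a minimal sketch: the function name autocorr and the choice of zero-centred int8_t samples are assumptions, and whether traffic noise really has usable periodic components is left open in this thread.

```c
#include <stdint.h>

/* Hypothetical helper: raw autocorrelation of n zero-centred samples
   at a single lag. Costs about n multiply-accumulates per lag. */
int32_t autocorr(const int8_t *x, uint16_t n, uint16_t lag)
{
    int32_t sum = 0;
    /* Multiply each sample by the one 'lag' positions later. */
    for (uint16_t i = 0; lag < n && i < (uint16_t)(n - lag); i++) {
        sum += (int32_t)x[i] * x[i + lag];
    }
    return sum;
}
```

For a periodic input the result at a lag of one period stays close to the lag-0 value, while for noise-like input it would be expected to be small compared to lag 0. Comparing autocorr(x, n, lag) against autocorr(x, n, 0) for a few lags is one way to use it.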


Regards, Dana.
 

FvM: Please note that "but it's not clear which signal parameters you are looking for" is the result of "Audio DSP is something I haven't done before". I'm unsure what exactly I should be using here, so I just opted for amplitude because it seems to be the simplest choice. Should I use a square of the amplitude rather than an absolute value (delta) or something else to detect loudness? The thing seems to trigger on lower frequency sounds a lot. Perhaps I should have added a hardware high-pass filter on the mic that filters out the bass.
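
On the absolute value versus square question, a small sketch can show how the two measures differ. The helper names mean_abs and mean_square are hypothetical, and the samples are assumed to be already centred on zero.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: average of |x| over a buffer of centred samples. */
uint32_t mean_abs(const int8_t *x, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += (uint32_t)(x[i] < 0 ? -(int16_t)x[i] : x[i]);
    }
    return n ? sum / n : 0;
}

/* Hypothetical helper: average of x^2 over the same buffer. */
uint32_t mean_square(const int8_t *x, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += (uint32_t)((int16_t)x[i] * (int16_t)x[i]);
    }
    return n ? sum / n : 0;
}
```

A steady buffer {4, -4, 4, -4, 4, -4, 4, -4} and a spiky buffer {32, 0, 0, 0, 0, 0, 0, 0} both have a mean absolute value of 4, but their mean squares are 16 and 128. Squaring weights short, loud peaks more heavily, at the cost of a multiply per sample and a wider accumulator.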

danadakk: Thanks. Although I'm not sure how much processing power I can spare here. I am running this on an 8-bit timer with an overflow interrupt routine so the code has exactly 256 clock cycles to complete before the next sample arrives, and some assembly instructions need multiple clock cycles.
 

The tradeoff is that you take snapshots of samples rather than process
all samples coming into the ADC. So while the algorithm is being
computed you essentially throw away the incoming real-time
samples during that computation time. This causes the latency I
referred to earlier.

[Attached image: 1660124524357.png]


Regards, Dana.
 

Are you making a distinction between "a sudden burst of noise" and "a sudden burst of sound"? What's the difference? Amplitude? Frequency content? Envelope?
 

If your sound-to-detect has characteristic tones, these would
be easier to disposition than combing through a stream of FFT
data.
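
If the sound does have characteristic tones, one commonly used alternative to a full FFT is the Goertzel algorithm, which measures the energy at a single frequency. It is not named in this thread; it is offered here only as one possible technique. In the minimal sketch below, the function name goertzel_power, the Q14 fixed-point coefficient format, and the use of 64-bit intermediates (simple, but slow on an AVR) are all assumptions.

```c
#include <stdint.h>

/* Hypothetical helper: squared magnitude of one DFT bin (Goertzel).
   coeff_q14 is 2*cos(2*pi*k/n) scaled by 2^14, precomputed offline
   for the bin k of interest. x holds n zero-centred samples. */
int64_t goertzel_power(const int8_t *x, uint16_t n, int32_t coeff_q14)
{
    int32_t s1 = 0;  /* state s[i-1] */
    int32_t s2 = 0;  /* state s[i-2] */

    for (uint16_t i = 0; i < n; i++) {
        /* s[i] = x[i] + coeff * s[i-1] - s[i-2] */
        int32_t s0 = x[i] + (int32_t)(((int64_t)coeff_q14 * s1) / 16384) - s2;
        s2 = s1;
        s1 = s0;
    }

    /* |X[k]|^2 = s1^2 + s2^2 - coeff * s1 * s2 */
    return (int64_t)s1 * s1 + (int64_t)s2 * s2
         - (((int64_t)coeff_q14 * s1) / 16384) * s2;
}
```

For example, with a block of 8 samples and the bin at a quarter of the sample rate the coefficient is 0, so the input {0, 10, 0, -10, 0, 10, 0, -10} gives 1600, while a constant input gives 0. Each tone costs one multiply per sample, and the result is only available once per block, so the latency tradeoff mentioned earlier in the thread still applies.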

Also helpful to know if the "audio" must be detected in the
presence of high amplitude noise, if out-of-audio-band is
to be taken as an "indicator of crap" or a proper detect, etc.

Being more of an analog guy I'd be thinking about an audio
bandpass filter (to qualify that some of signal is audio),
maybe a high pass (to assess non-audio HF as indicator
of spurious mic signal) combined in the logic, an ideal
rectifier as detector and a comparator for level discrimination.
That's maybe a quad op amp and passives, and a quad
comparator, at its simplest I think.

A question is, what does the DSP complexity and effort add
to the desired function? What about the function demands
such an approach?

Or is this a solution in search of a problem?
 

