MEMS microphone audio data acquisition

Ethan25 · 2024-09-24T07:14:21+0100

Hi,
I need some advice on how to collect data using a MEMS microphone. Should I collect the data as .wav files and then process them, or should I collect the analog values and process those? I felt that .wav files would be too large, so I thought I could collect the analog values, store them in a CSV file, and then process them in MATLAB.

Which is a better way ?

KlausST · 2024-09-24T07:36:22+0100

Hi,

you give no numbers and no requirements.

We also have no idea what you mean by "processing". Digital or analog? Simple low-pass filtering, or advanced voice recognition?
We don't know what audio bandwidth you are talking about, or whether you need real-time processing.
If you don't need real-time processing, you need to tell us the expected audio quality and the length of audio to be stored.
(Maybe you want to analyze birds chirping in a forest for a whole month. We don't know.)

What is a "too big file size" for you? (If you store to RAM you may be limited to below 1 MByte; if you have a hard disk you may have terabytes available.)

Ethan25 said:
Which is a better way ?

The one that best fits your (to us unknown) requirements.

Klaus

dpaul · 2024-09-24T07:40:51+0100

It depends on your requirements: what you want to do with the data and the quality of samples you want to work with.
I think it might be better to collect the analog values in CSV format and use them later.

betwixt · 2024-09-24T07:44:13+0100

A .wav file is sampled values, and so is a CSV file. Whichever way you are going to process the signal digitally, it has to be converted to numbers. The main difference as far as storage is concerned is that .wav is normally stored in binary format, one fixed-size value per sample (2 bytes for 16-bit data), while .csv is stored as text, so each value needs one character per digit plus one for the comma. A 16-bit sample such as -32768 takes six characters plus the comma, so CSV typically needs several times the space.
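
As a rough illustration of that size difference, the sketch below packs the same samples as raw 16-bit binary and as comma-separated decimal text, then compares the byte counts. The test signal (a 1 kHz sine sampled at 96 kHz) and the function name `storage_sizes` are assumptions made for the example only, not anything from the thread.

```python
import array
import math

def storage_sizes(samples):
    """Return (binary_bytes, csv_bytes) for a list of signed 16-bit samples."""
    binary = array.array("h", samples).tobytes()      # 2 bytes per sample
    csv_text = ",".join(str(s) for s in samples)      # decimal text, comma separated
    return len(binary), len(csv_text.encode("ascii"))

# Hypothetical test signal: 1000 samples of a near full-scale 1 kHz sine at 96 kHz.
samples = [round(32767 * math.sin(2 * math.pi * 1000 * n / 96000)) for n in range(1000)]

binary_bytes, csv_bytes = storage_sizes(samples)
print("binary:", binary_bytes, "bytes")
print("csv:   ", csv_bytes, "bytes")
print("ratio: ", round(csv_bytes / binary_bytes, 2))
```

The exact ratio depends on the signal: quiet recordings have fewer digits per sample, loud ones more.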

Brian.

KlausST · 2024-09-24T07:50:56+0100

dpaul said:
I think it might be better to collect the analog values on csv format

how?
csv works with numbers. And numbers are digital values.

You may store analog values on an audio tape...

Klaus

Ethan25 · 2024-09-24T08:43:41+0100

I'm planning to collect data for about a month, 12 hours every day (I'm not sure how long each audio file should be).
I'm using a 16-bit external ADC with my microphone, sampling at 96 kHz.
Processing: I'm only going to compute the FFT and STFT of the collected data.
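
For readers unfamiliar with the STFT mentioned here, the following is a minimal sketch of the idea: cut the signal into overlapping frames, apply a window, and take the DFT of each frame. It uses only the Python standard library and a naive O(N²) DFT so that it stays self-contained; real processing would use an FFT routine (for example `numpy.fft.rfft`) or MATLAB. The frame length, hop size, and 12 kHz test tone are assumptions chosen for the example.

```python
import cmath
import math

def hann(n_points):
    """Periodic Hann window of length n_points."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_points) for n in range(n_points)]

def dft_magnitudes(frame):
    """Magnitudes of bins 0..N/2 of a naive DFT (illustration only, O(N^2))."""
    n_points = len(frame)
    mags = []
    for k in range(n_points // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_points)
                  for n, x in enumerate(frame))
        mags.append(abs(acc))
    return mags

def stft_magnitudes(samples, frame_len=64, hop=32):
    """Return one magnitude spectrum per windowed, overlapping frame."""
    window = hann(frame_len)
    spectra = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        spectra.append(dft_magnitudes(frame))
    return spectra

FS = 96_000        # sample rate from the thread
TONE_HZ = 12_000   # hypothetical test tone
FRAME_LEN = 64     # bin spacing is FS / FRAME_LEN = 1500 Hz

signal = [math.sin(2 * math.pi * TONE_HZ * n / FS) for n in range(256)]
spec = stft_magnitudes(signal, frame_len=FRAME_LEN, hop=32)

peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print("frames:", len(spec))
print("peak bin:", peak_bin, "=", peak_bin * FS / FRAME_LEN, "Hz")
```

Longer frames give finer frequency resolution but coarser time resolution, which is the main trade-off when choosing the frame length.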

KlausST · 2024-09-24T12:14:32+0100

Hi,

so basically you have digital audio data. Then store it digitally. Forget about analog storage.

You can easily calculate how much data this means, uncompressed.

To reduce this amount of data you would certainly use compression.
But many compression algorithms also reduce quality.
And we don't know what "quality" you are interested in.

You say you do FFT. So you have to understand that an MP3 algorithm (as an example of a compression method) removes a lot of frequency content that a human can't perceive.
An FFT, however, would certainly still detect those frequencies in the uncompressed data, so lossy compression would change your results.

If it were my project, I'd make sure to have a clear requirement for the result I want to gain from the processing.
And I'd focus on this.
So maybe (as an example) you want to track cars and their speed as they pass a bridge. Then I'd omit all audio data that doesn't contain "a car", just to save memory space and reduce processing power (processing time).
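
One simple way to drop uninteresting audio is an energy gate: split the recording into frames and keep only those whose RMS level exceeds a threshold. The sketch below is only an illustration of that idea; the frame length, the threshold, and the synthetic "silence, burst, silence" recording are all assumptions, and a real detector (for cars, birds, or anything else) would need tuning against actual data.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def active_frames(samples, frame_len, threshold):
    """Return (start_index, rms_level) for each non-overlapping frame above threshold."""
    kept = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        level = rms(samples[start:start + frame_len])
        if level > threshold:
            kept.append((start, level))
    return kept

FS = 96_000
FRAME = 960  # 10 ms at 96 kHz (assumed frame length)

# Hypothetical recording: 10 ms of silence, a 10 ms 1 kHz burst, 10 ms of silence.
quiet = [0] * FRAME
burst = [round(10000 * math.sin(2 * math.pi * 1000 * n / FS)) for n in range(FRAME)]
recording = quiet + burst + quiet

kept = active_frames(recording, frame_len=FRAME, threshold=500)
for start, level in kept:
    print("keep frame starting at sample", start, "with RMS", round(level))
```

Only the kept frames would then be written to storage, together with their start indices so that timing can be reconstructed later.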

Klaus

FvM · 2024-09-24T12:37:07+0100

The purpose of the audio analysis hasn't been mentioned yet.

The .wav format supports different coding options with different sample rates and data resolutions. Basic linear PCM coding contains just raw binary data plus a short header, so it's the most compact uncompressed way to store all the information contained in the audio. A popular format is 2-channel, 16-bit, 44.1 kHz (CD audio). Lossless compression can further reduce data size. MATLAB or .csv storage has more redundancy than .wav.
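
To make the "raw data plus a short header" point concrete, here is a small sketch that writes one second of 16-bit, 96 kHz mono linear PCM with Python's standard `wave` module and compares the total size with the raw payload. The in-memory buffer and the 1 kHz test tone are assumptions for the example; in practice the buffer would be a file on disk.

```python
import array
import io
import math
import wave

FS = 96_000  # sample rate from the thread

# One second of a hypothetical 1 kHz test tone as signed 16-bit samples.
samples = array.array(
    "h", (round(20000 * math.sin(2 * math.pi * 1000 * n / FS)) for n in range(FS))
)

buf = io.BytesIO()  # in-memory stand-in for a .wav file on disk
with wave.open(buf, "wb") as wav:
    wav.setnchannels(1)   # mono
    wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
    wav.setframerate(FS)
    wav.writeframes(samples.tobytes())

total_bytes = len(buf.getvalue())
payload_bytes = 2 * len(samples)
print("total:", total_bytes, "payload:", payload_bytes,
      "header:", total_bytes - payload_bytes)

# Read the data back to confirm nothing was lost.
buf.seek(0)
with wave.open(buf, "rb") as wav:
    params = wav.getparams()
    restored = array.array("h", wav.readframes(params.nframes))
print("round trip identical:", restored == samples)
```

MATLAB can read such a file directly with `audioread`, so no CSV intermediate step is needed.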

.wav can also use MP3 or other lossy audio coding methods. I agree with KlausST that MP3-coded data probably misses important information from the original audio data, but it really depends on the analysis purpose.

betwixt · 2024-09-24T23:04:21+0100

In its simplest form, ignoring any encoding or packaging:

16 bits = 2 bytes per value,
2 bytes at a 96 kHz rate = 192,000 bytes (192 kB) per second,
for 12 hours that's 192,000 * 43,200 seconds = 8,294,400,000 bytes (about 8.3 GB, or 7.7 GiB),
for say 30 days that's about 249 GB.
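
This arithmetic is easy to check or adapt to other settings with a few lines of code. The sketch below assumes a single channel and uncompressed storage with no header overhead; the function name `recording_bytes` is made up for the example.

```python
def recording_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Uncompressed PCM size in bytes, ignoring any file header."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds

per_second = recording_bytes(96_000, 16, 1, 1)
per_day = recording_bytes(96_000, 16, 1, 12 * 3600)   # 12 hours of recording per day
per_month = 30 * per_day                              # 30 recording days

print("per second:", per_second, "bytes")
print("per day:   ", per_day, "bytes =", round(per_day / 1e9, 2), "GB")
print("per month: ", per_month, "bytes =", round(per_month / 1e9, 1), "GB")
```

Halving the sample rate or recording fewer hours scales the result linearly, which makes it easy to see what a given storage budget allows.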

Now you can see why we think compression may be useful, but lossy compression sacrifices quality, so how accurate the recordings have to be is a critical factor in advising you.

Brian.
