Hello,
I tried implementing an LPF biquad (which works perfectly in floating point) with a = [1.00000, -1.90887, 0.91129] and b = [6.1150e-04, 1.2230e-03, 6.1150e-04] in fixed-point (integer) arithmetic, using 16-bit coefficients, a 10-bit input signal, and a 32-bit accumulator. I suspect the coefficient resolution is insufficient.
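To show what the coefficients look like after quantization, here is a minimal Python sketch. It assumes Q14 scaling (14 fractional bits), which I picked only because |a1| = 1.90887 needs a range of about +/-2 in a signed 16-bit word; the names FRAC_BITS and quantize are just for illustration.

```python
# Assumption: Q14 format (14 fractional bits) so that a1 = -1.90887
# fits in a signed 16-bit word. Other scalings are possible.
FRAC_BITS = 14
SCALE = 1 << FRAC_BITS  # 16384

a = [1.00000, -1.90887, 0.91129]
b = [6.1150e-04, 1.2230e-03, 6.1150e-04]


def quantize(coeffs):
    """Round each coefficient to the nearest Q14 integer."""
    return [round(c * SCALE) for c in coeffs]


a_q = quantize(a)
b_q = quantize(b)

for name, real, q in zip(
    ["a0", "a1", "a2", "b0", "b1", "b2"], a + b, a_q + b_q
):
    # Relative error introduced by rounding to the 16-bit grid.
    rel_err = (q / SCALE - real) / real
    print(f"{name}: {real:+.6e} -> {q:6d}  (relative error {rel_err:+.3%})")
```

With this scaling, b0 and b2 become 10 and b1 becomes 20, so the b coefficients use only about 4 to 5 of the 16 available bits.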
I tried:
- direct form 1 and transposed direct form 2: the output is zero unless the input signal is very large, because the small b coefficients attenuate the signal so heavily in the FIR portion of the filter that it is lost;
- direct form 2: the filter works only with a very small input signal; otherwise it overflows.
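The zero-output behaviour of the direct form 1 case can be reproduced with a short Python sketch. It makes several assumptions that are not part of my actual implementation details above: Q14 coefficients (b = [10, 20, 10], a = [16384, -31275, 14931]), a truncating right shift of the accumulator after every sample, and a constant test input. Python integers do not overflow, so this only illustrates the truncation problem, not the overflow one.

```python
# Assumption: Q14 coefficients obtained by rounding the floating-point
# values times 2**14. The function name df1_fixed is hypothetical.
FRAC_BITS = 14
B_Q = [10, 20, 10]
A_Q = [16384, -31275, 14931]


def df1_fixed(x):
    """Direct form 1 biquad with integer state and a truncating output shift."""
    x1 = x2 = y1 = y2 = 0
    out = []
    for xn in x:
        # Feed-forward (FIR) part plus feedback part in one accumulator.
        acc = (B_Q[0] * xn + B_Q[1] * x1 + B_Q[2] * x2
               - A_Q[1] * y1 - A_Q[2] * y2)
        # Truncate back to the signal word length (arithmetic shift, floors).
        yn = acc >> FRAC_BITS
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        out.append(yn)
    return out


small = df1_fixed([100] * 200)  # constant input of 100
large = df1_fixed([500] * 200)  # constant input of 500

print("input 100 ->", small[-1])
print("input 500 ->", large[-1])
```

In this sketch the feed-forward sum for a constant input x settles at 40 * x, so for any x below 410 it never reaches one output LSB (16384) and the output stays stuck at zero, which matches what I observe.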
Which structure should I use for this biquad? Can it be implemented this way given the small b values, or should I use a 64-bit accumulator and perhaps 32-bit coefficients?
Thanks