My moving-average filter resulted in wrong result

mohamis288 · Mar 8, 2023

Hello,

I am simulating moving-average filter in modelsim. some confusing result have been observed and I explain them here.

My moving average filter is 7-tap and my verilog code is represented here:

Code:

`timescale 1ms / 1ps

// Module Name: FIR_Filter


module movingAverage_Filter(clk, reset, my_data_in, my_data_out);

parameter N = 8;

input clk, reset;
input [N-1:0] my_data_in;
output reg [N-1:0] my_data_out;

// coefficients defination
// Moving Average Filter, 7th order

//wire [7:0] b0 = 0_0001010;
//wire [7:0] b1 = 0_0010000;
//wire [7:0] b2 = 0_0101000;
//wire [7:0] b3 = 0_1000000;
//wire [7:0] b4 = 0_0101000;
//wire [7:0] b5 = 0_0010000;
//wire [7:0] b6 = 0_0001010;

wire [7:0] b0 = 0.078125;
wire [7:0] b1 = 0.125;
wire [7:0] b2 = 0.3125;
wire [7:0] b3 = 0.5;
wire [7:0] b4 = 0.3125;
wire [7:0] b5 = 0.125;
wire [7:0] b6 = 0.078125;


wire [N-1:0] x1, x2, x3, x4, x5, x6;

// Create delays i.e x[n-1], x[n-2], .. x[n-N]
// Instantiate D Flip Flops
DFF DFF0(clk, reset, my_data_in, x1); // x[n-1]
DFF DFF1(clk, reset, x1, x2);      // x[x[n-2]]
DFF DFF2(clk, reset, x2, x3);      // x[n-3]
DFF DFF3(clk, reset, x3, x4);     
DFF DFF4(clk, reset, x4, x5);
DFF DFF5(clk, reset, x5, x6);


//  Multitiplication
wire [N-1:0] Multi0, Multi1, Multi2, Multi3, Multi4, Multi5, Multi6; 
assign Multi0 = my_data_in * b0;
assign Multi1 = x1 * b1; 
assign Multi2 = x2 * b2; 
assign Multi3 = x3 * b3; 
assign Multi4 = x4 * b4; 
assign Multi5 = x5 * b5; 
assign Multi6 = x6 * b6; 

 
// Addition operation
wire [N-1:0] Add_final_value;
assign Add_final_value = Multi0 + Multi1 + Multi2 + Multi3 + Multi4 + Multi5 + Multi6;

// Final calculation to output
always@(posedge clk)
my_data_out <= Add_final_value;

endmodule


module DFF(clk, reset, my_data_in, data_delayed);
parameter N = 8;
input clk, reset;
input [N-1:0] my_data_in;
output reg [N-1:0] data_delayed;

always@(posedge clk)
begin
    if (reset)
    data_delayed <= 0;
    else
    data_delayed <= my_data_in;
    
end

endmodule

When I simulate this verilog code in modelsim, the result is completely wrong. in fact, we expect our output to be average of last seven input samples. but what we observed was just a 3-sample delayed version of input. in fact, in ideal simulation, modelsim ignore all the multiplier and some of delays. I also synthesized my code in code vision and I observed the synthesized circuit and understood that my observation were correct, i.e., multiplier were not present in my circuit and also input was connected directly to the output by first 3 DFF in between. input also was connected to the last 3 DFF (connected in series) which its output was no connect.

what is the problem do you think?

--- Updated Mar 8, 2023 ---

please give me a point. any help would be appreciated.

KlausST · Mar 8, 2023

Hi,

for a "moving average" I expect all coefficients to be identical = "1/filter_depth".
You now have different coefficients, thus I call it a regular "FIR filter"

So far so good.
Now your input data width is 8 and so is the output data width.
Not to get an overflow the sum of the coefficients have to be <=1. This is not the case here.
You may need to adjust on this.

Instead of giving vague informations like "result is completely wrong" and "to be average of" ...
You could give values. With this we are able to check
* your expectation
* the simuation results

Klaus

mohamis288 · Mar 10, 2023

mohamis288 said:
Hello,

I am simulating moving-average filter in modelsim. some confusing result have been observed and I explain them here.

My moving average filter is 7-tap and my verilog code is represented here:

Code:

`timescale 1ms / 1ps // Module Name: FIR_Filter module movingAverage_Filter(clk, reset, my_data_in, my_data_out); parameter N = 8; input clk, reset; input [N-1:0] my_data_in; output reg [N-1:0] my_data_out; // coefficients defination // Moving Average Filter, 7th order //wire [7:0] b0 = 0_0001010; //wire [7:0] b1 = 0_0010000; //wire [7:0] b2 = 0_0101000; //wire [7:0] b3 = 0_1000000; //wire [7:0] b4 = 0_0101000; //wire [7:0] b5 = 0_0010000; //wire [7:0] b6 = 0_0001010; wire [7:0] b0 = 0.078125; wire [7:0] b1 = 0.125; wire [7:0] b2 = 0.3125; wire [7:0] b3 = 0.5; wire [7:0] b4 = 0.3125; wire [7:0] b5 = 0.125; wire [7:0] b6 = 0.078125; wire [N-1:0] x1, x2, x3, x4, x5, x6; // Create delays i.e x[n-1], x[n-2], .. x[n-N] // Instantiate D Flip Flops DFF DFF0(clk, reset, my_data_in, x1); // x[n-1] DFF DFF1(clk, reset, x1, x2); // x[x[n-2]] DFF DFF2(clk, reset, x2, x3); // x[n-3] DFF DFF3(clk, reset, x3, x4); DFF DFF4(clk, reset, x4, x5); DFF DFF5(clk, reset, x5, x6); // Multitiplication wire [N-1:0] Multi0, Multi1, Multi2, Multi3, Multi4, Multi5, Multi6; assign Multi0 = my_data_in * b0; assign Multi1 = x1 * b1; assign Multi2 = x2 * b2; assign Multi3 = x3 * b3; assign Multi4 = x4 * b4; assign Multi5 = x5 * b5; assign Multi6 = x6 * b6; // Addition operation wire [N-1:0] Add_final_value; assign Add_final_value = Multi0 + Multi1 + Multi2 + Multi3 + Multi4 + Multi5 + Multi6; // Final calculation to output always@(posedge clk) my_data_out <= Add_final_value; endmodule module DFF(clk, reset, my_data_in, data_delayed); parameter N = 8; input clk, reset; input [N-1:0] my_data_in; output reg [N-1:0] data_delayed; always@(posedge clk) begin if (reset) data_delayed <= 0; else data_delayed <= my_data_in; end endmodule

When I simulate this verilog code in modelsim, the result is completely wrong. in fact, we expect our output to be average of last seven input samples. but what we observed was just a 3-sample delayed version of input. in fact, in ideal simulation, modelsim ignore all the multiplier and some of delays. I also synthesized my code in code vision and I observed the synthesized circuit and understood that my observation were correct, i.e., multiplier were not present in my circuit and also input was connected directly to the output by first 3 DFF in between. input also was connected to the last 3 DFF (connected in series) which its output was no connect.

what is the problem do you think?

--- Updated Mar 8, 2023 ---

please give me a point. any help would be appreciated.

UPDATE
the above code have been synthetized in design vision and this is the schematic view:

which is not correct for a FIR filter.

when I replaced the filter multiplier coefficient with the following lines:

wire [7:0] b0 = 0_0001010;
wire [7:0] b1 = 0_0010000;
wire [7:0] b2 = 0_0101000;
wire [7:0] b3 = 0_1000000;
wire [7:0] b4 = 0_0101000;
wire [7:0] b5 = 0_0010000;
wire [7:0] b6 = 0_0001010;

the schematic view is like this:

which seems correct, but output is not as expected. it is like this:

--- Updated Mar 10, 2023 ---

KlausST said:
Hi,

for a "moving average" I expect all coefficients to be identical = "1/filter_depth".
You now have different coefficients, thus I call it a regular "FIR filter"

So far so good.
Now your input data width is 8 and so is the output data width.
Not to get an overflow the sum of the coefficients have to be <=1. This is not the case here.
You may need to adjust on this.

Instead of giving vague informations like "result is completely wrong" and "to be average of" ...
You could give values. With this we are able to check
* your expectation
* the simuation results

Klaus

Hello,
@KlausST
input signal was 8 bits. and sum of FIR filter's coefficients was more than one. but, input signal is between -64 and 63 and we have simulated this filter in MATLAB and we are sure that the output signal does not exceed the input 127 or -128. I have updatted the question, so you can see the related picture.
thank you

FvM · Mar 11, 2023

Simulated in Matlab, how? With fixed package and correct bit widths? Otherwise the result is meaningless.

The original code has more than one problem
1. Real constants truncated to zero
2. 8x8 multiply 16 bit result truncated to 8 bit
3. Overflow in term sum, as already discussed

Now you fixed 1. by using integer constant, but not 2 and 3.

The simulator shows you each fault exactly, just need to look.

--- Updated Mar 11, 2023 ---

Regarding 2, to get a roughly correct result, you can strip 8 lower bits instead truncating the upper. Better keep full 16 bit resolution (plus up to 3 extra bits, see issue 3) til the final sum. Than truncate and round as necessary.

kaz1 · Mar 11, 2023

Lots of issues as others pointed out.
1) you need clocked process on mults and adder, apart from final output.
This way you get real FIR and also can check sim better.
2) allow full bit growth internally:
8 bits * 8 bits => 16 bits
16 bits + 16 bits +... => 19 bits.
for final output discard LSBs as suitable.
match matlab model and your rtl if checking both.

mohamis288 · Mar 11, 2023

kaz1 said:
Lots of issues as others pointed out.
1) you need clocked process on mults and adder, apart from final output.
This way you get real FIR and also can check sim better.
2) allow full bit growth internally:
8 bits * 8 bits => 16 bits
16 bits + 16 bits +... => 19 bits.
for final output discard LSBs as suitable.
match matlab model and your rtl if checking both.

Sorry I am new to Verilog. Can you please tell me how to discard LSBs in multiplication using a simple example?

kaz1 · Mar 11, 2023

I assume it is like this for discarding 3 LSBs and some MSBs assuming no overflow and 8 bit output:
my_data_out <= Add_final_value[10:3];

mohamis288 · Mar 11, 2023

I

kaz1 said:
I assume it is like this for discarding 3 LSBs and some MSBs assuming no overflow and 8 bit output:
my_data_out <= Add_final_value[10:3];

Thank you @kaz1,
It seems that add_final_value has just 8 bits. Because even 9th bit is out of bound. I also tried to do the same for previous statements like this:

assign Multi1 = (x1 * b1)[15:8];

But I got compilation error. What is your suggestion now?

Thank you

kaz1 · Mar 11, 2023

Try this:

// Multitiplication
wire [15:0] Multi0, Multi1, Multi2, Multi3, Multi4, Multi5, Multi6;
assign Multi0 = my_data_in * b0;
assign Multi1 = x1 * b1;
assign Multi2 = x2 * b2;
assign Multi3 = x3 * b3;
assign Multi4 = x4 * b4;
assign Multi5 = x5 * b5;
assign Multi6 = x6 * b6;

// Addition operation
wire [18:0] Add_final_value;
assign Add_final_value = Multi0 + Multi1 + Multi2 + Multi3 + Multi4 + Multi5 + Multi6;

// Final calculation to output
always@(posedge clk)
my_data_out <= Add_final_value[10:3];

Remember you need clocked process on mults and adder

FvM · Mar 11, 2023

kaz1 said:
Remember you need clocked process on mults and adder

Not necessarily, but depending on clock frequency hardware synthesis might not achieve timing closure.

kaz1 · Mar 11, 2023

FvM said:
Not necessarily, but depending on clock frequency hardware synthesis might not achieve timing closure.

There are 7 adders in parallel after mults (16*7 = 112 bits wide), unless it is kitchen based project then I expect your advice is not practical for a beginner.

FvM · Mar 11, 2023

The adders are maximal 19 bits wide. Even with full 19 bit output, the unregistered design achieves > 100 MHz with a recent low cost FPGA. With registered multiplier output and a pipelined adder (4+3 partial sums registered) > 200 MHz.

So yes, speed can be considerably increased with additional registers, but it might not be necessary for the actual application.

My advice would be to use appropriate rather than maximal pipelining.

mohamis288 · Mar 11, 2023

kaz1 said:
Try this:

// Multitiplication
wire [15:0] Multi0, Multi1, Multi2, Multi3, Multi4, Multi5, Multi6;
assign Multi0 = my_data_in * b0;
assign Multi1 = x1 * b1;
assign Multi2 = x2 * b2;
assign Multi3 = x3 * b3;
assign Multi4 = x4 * b4;
assign Multi5 = x5 * b5;
assign Multi6 = x6 * b6;

// Addition operation
wire [18:0] Add_final_value;
assign Add_final_value = Multi0 + Multi1 + Multi2 + Multi3 + Multi4 + Multi5 + Multi6;

// Final calculation to output
always@(posedge clk)
my_data_out <= Add_final_value[10:3];

Remember you need clocked process on mults and adder

even with this code, I got the wrong output. I also changed [10:3] but output is still wrong. I thought to change my code to the shifter version like this:

Code:

wire [N-1:0] Multi0, Multi1, Multi2, Multi3, Multi4, Multi5, Multi6; 
assign Multi0 = (my_data_in>>4)+(my_data_in>>6);
assign Multi1 = (my_data_in>>3); 
assign Multi2 = (my_data_in>>4)+(my_data_in>>2); 
assign Multi3 = (my_data_in>>1); 
assign Multi4 = (my_data_in>>4)+(my_data_in>>2);
assign Multi5 = (my_data_in>>3); 
assign Multi6 = (my_data_in>>4)+(my_data_in>>6); 

 
// Addition operation
wire [N-1:0] Add_final_value;
assign Add_final_value = Multi0 + Multi1 + Multi2 + Multi3 + Multi4 + Multi5 + Multi6;

// Final calculation to output
always@(posedge clk)
my_data_out <= Add_final_value;

in this way I am sure that overflow does not occur. According to MATLAB, I am sure that the outout doesn't exceed 8-bit, So I let it to be that 8-bit. this time, my output in modelsim is:

now it is better but something is wrong. I do not know what it is. let me know your thought.
this is my output in MATLAB:

kaz1 · Mar 11, 2023

I ran your coeffs in matlab: blue input, red output

I wonder what is your input. Try not change original code but show me your work, it should work

mohamis288 · Mar 11, 2023

kaz1 said:
I ran your coeffs in matlab: blue input, red output

View attachment 181715

I wonder what is your input. Try not change original code but show me your work, it should work

I thought it was not that much important.
here it is input:

--- Updated Mar 11, 2023 ---

kaz1 said:
I ran your coeffs in matlab: blue input, red output

View attachment 181715

I wonder what is your input. Try not change original code but show me your work, it should work

in MATLAB every thing is ok, BUT I can not understand the problem with my verilog code.

kaz1 · Mar 11, 2023

I suggest you keep simple. Try a clean sine wave without noise first. Also show me your original code.

FvM · Mar 11, 2023

Problem is that the sum of coefficients is about 1.45, respectively you need to scale output down (shift right) to avoid overflow with full range input signal. In the post #13 simulation waveform it's not scaled down. Better scale the coefficients so that sum is power of 2.

mohamis288 · Mar 11, 2023

FvM said:
Problem is that the sum of coefficients is about 1.45, respectively you need to scale output down (shift right) to avoid overflow with full range input signal. In the post #13 simulation waveform it's not scaled down. Better scale the coefficients so that sum is power of 2.

as shown in MATLAB code, output is in the allowed range (i.e. between -128 and 127). because I have limited the input to [-64,63] range. so the problem is not with the filter itself. the problem is just with verilog implementation.

--- Updated Mar 11, 2023 ---

kaz1 said:
I suggest you keep simple. Try a clean sine wave without noise first. Also show me your original code.

@kaz1
using the method in post #13, my output to sinusoidal wave is like this. at the edge of negative and positive, something is wrong. you know what is it? my input is in 2's complement format. do I need to change the coefficient type, for example, unsigned and signed? I'll check it.

kaz1 · Mar 11, 2023

mohamis288 said:
as shown in MATLAB code, output is in the allowed range (i.e. between -128 and 127). because I have limited the input to [-64,63] range. so the problem is not with the filter itself. the problem is just with verilog implementation.

To match matlab with RTL you need to do following:

hs = round(h*128); %your coeffs scaled in RTL
y = filter(hs,1,x); %x range +63,-64
dout = floor(y/128); %divide back range + 66

you will see max y close to +8500
You then scale matlab filter output by dividing by 128 and you then get max y close +66

if you divide y by 64 you get double of range i.e. +130

division by 128 = discard 7 LSBs.
division by 64 = discard 6 LSBs.

as to rounding/overflow you can check

Welcome to EDAboard.com

My moving-average filter resulted in wrong result

Full Member level 3

Advanced Member level 7

Full Member level 3

Super Moderator

Advanced Member level 1

Full Member level 3

Advanced Member level 1

Full Member level 3

Advanced Member level 1

Super Moderator

Advanced Member level 1

Super Moderator

Full Member level 3

Advanced Member level 1

Full Member level 3

Attachments

Advanced Member level 1

Super Moderator

Full Member level 3

Advanced Member level 1

Similar threads

Part and Inventory Search

Welcome to EDABoard.com

Sponsor