Why is this Verilog Code not working as expected?!

Status
Not open for further replies.

Mario875

Newbie level 4
Joined
Jul 28, 2021
Messages
6
Helped
0
Reputation
0
Reaction score
0
Trophy points
1
Activity points
138
Hi all,

I have created some verilog code to lock onto the output signal of a device and make a new signal with a greater pulse width. Now when I have the device powered on and I program the FPGA it locks on 100% of the time, no issues.

However, if I reset the device, but do not re-program the FPGA, I lose the lock (as expected), so I created a 'watchdog' reset that will essentially reset the IP and allow it to re-lock again. If I attach the 'reset_db' signal to an external push button, it works perfectly in that resetting the device loses the lock, but when I press the push button, the IP resets and re-locks as expected.

I thought, that's great, now let me try and automate the reset with a watchdog timer, code (with comments) below, and the strange thing is that when I simulate it, the code works as expected, but when I implement it, it just doesn't, almost as if some of the counters or states are not being properly reset.

Essentially the watchdog is clocked by the FPGA 100MHz clock and if the device is enabled, one of the signals resets the watchdog every 100us, but when the device is reset the signal will go low for 500ms and the watchdog timer is set to only 10ms, so when that happens the FPGA should hold the reset value high, then when the device comes back on and reset is held high, all states and counters will be reset, once that happens (hbstate==boot) then the reset signal is asserted low and the watchdog counter reset for it to repeat.

It just doesn't work and the output signals end up totally out of sync, but when I use a push button to manually reset instead of the watchdog counter, it works as expected??

Code:
.
.
.
reg [1:0] hbstate=2'b00, vbstate=2'b00; //initialise states to 0
reg reset_db=0; //initialise reset to 0

integer hbcnt=0, hbbootcnt=0, vbcnt=0, vbbootcnt=0, watchdog=1000000; //counters, values initialised as required, 1000000 for watchdog = 10ms

always @ (posedge clock) //FPGA clock
begin

watchdog<=watchdog-1;

if(ps2_hblank==1 && watchdog>=1) watchdog<=1000000; //if signal from device is present (square wave with 100us period) and watchdog is present, reset watchdog timer

else if(ps2_hblank==0 && watchdog<=0) //if signal is lost (drops for 500ms during a reset), and watchdog counter is less than or equal to 0 (10ms has elapsed), begin the reset
begin
    reset_db<=1; //turn on reset
    if(hbstate==boot) //if 1 of the states has reset (indicating the device has powered back on, as it is reliant on the devices own clock)
    begin
        watchdog<=1000000; //set watchdog timer to 10ms again
        reset_db<=0; //turn off reset
    end
end
end


always @ (posedge ps2_vclk) //device clock
begin

if(reset_db==1) //if reset is equal to 1, reset all counters and status' as below
begin
hbstate<=boot;
hbcnt<=0;
hbbootcnt<=0;

vbstate<=boot;
vbcnt<=0;
vbbootcnt<=0;
end
.
.
.


I have even tried the modified code as below, adding a 750ms delay after hbstate changes to 'boot' but still the exact same issue. Again, when I comment out the code in the 1st always block and use a push button to enable reset_db, it works fine?! What is going on??

Code:
.
.
.
reg [1:0] hbstate=2'b00, vbstate=2'b00, wdstate=2'b00;
reg reset_db=0;

integer hbcnt=0, hbbootcnt=0, vbcnt=0, vbbootcnt=0, watchdog=0, wdcnt=0;

always @ (posedge clock)
begin

watchdog<=watchdog+1;

case(wdstate)

boot :  begin
            if (ps2_hblank==1 && watchdog<=1000000) watchdog<=0;
            else if(watchdog>=1000001) wdstate<=start;
        end
      
start : begin
            reset_db<=1;
            if(hbstate==boot)
            begin
                wdcnt<=wdcnt+1;
                if(wdcnt>=75000000)
                begin
                    watchdog<=0;
                    wdcnt<=0;
                    reset_db<=0;
                    wdstate<=boot;
                end
            end
        end
endcase
end

always @ (posedge ps2_vclk)
begin

if(reset_db==1)
begin
hbstate<=boot;
hbcnt<=0;
hbbootcnt<=0;

vbstate<=boot;
vbcnt<=0;
vbbootcnt<=0;
end

case(hbstate)   
.
.
.
 

I’m not really a Verilog guy, but several things jump out at me.

1) You have two blocks using two different clocks. Is there a CDC issue?
2) Your If/elsif statements have some undefined conditions which will probably create latches.
3) Have you simulated this? That will tell you a lot.
 

1) You have two blocks using two different clocks. Is there a CDC issue?
Sure. The code implements a handshake synchronisation mechanism, unfortunately without sync registers for the handshake signals. This kind of design flaw isn't necessarily detected in simulation.

I believe, the watchdog function can be coded safe and also readable and straightforward.
 

2) Your If/elsif statements have some undefined conditions which will probably create latches.
I do not see any "if" that can cause latches. All "if"s I see are in clocked always blocks, and they create FFs with clock enables hold the value of the signals when the condition is not satisfied.

@Mario875 I do not see any errors in the part of the code you showed, except for the absence of CDC synchronizes. See, for example, https://www.eetimes.com/understanding-clock-domain-crossing-issues/

I suspect that the error in the part of the code you didn't show. There might be not all signals reset in your reset statement of the second always block (and why you repeat the same code twice there?). With the current watchdog logic, your reset is set for a brief moment, and the signals not mentioned in the reset block might stay not cleared. While when you press the button, it is very long event and the signals not covered by the reset are somehow cleared by themselves while the button is still pressed.
--- Updated ---

It's kinda unrelated, but the first always block is overcomplicated, hard to read, and uses more resources than it should. Also, what happens when ps2_hblank==1 && watchdog<=0 ?
The watchdog can go negative, and it is defined as an integer. This forces it to 32 bit width. You don't really need negative values at all.

Consider the following code:
Code:
localparam WATCHDOG_10MS_COUNT_VALUE = 1000000; // @ 100MHz

always @ (posedge clock)
begin
  if(reset_db==0) begin
    watchdog<=watchdog-1;

    if(ps2_hblank==1) watchdog<= WATCHDOG_10MS_COUNT_VALUE;
    else if(watchdog==0) reset_db<=1;

  end else begin
    reset_db<=1;

    if(hbstate==boot)
    begin
      watchdog<= WATCHDOG_10MS_COUNT_VALUE;
      reset_db<=0;
    end
  end
end

It's not only clearer, but you can redefine watchdog counter to be positive only and also use less bits as it never leaves [0, 1000000] range.

Also you don't need to repeat in English what is already written in Verilog. Not a single comment in the code you posted adds any understanding to the code.
reg reset_db=0; //initialise reset to 0
reset_db<=0; //turn off reset

Anyone can see that you assign 0 to the reset signal without these comments. Comments are needed when you need to communicate some information that you cannot communicate in Verilog/VHDL/other PL.
 
Last edited:

Hi,

Let's think outside the box a little bit. We have the .sof file and the .pof file. Which one did you program the FPGA with?
 

Hi all, thanks for the replies, will try to provide some responses below...


1) I didn't think that CDC would be an issue here as reset_db is being asserted in the 1st logic block, then held high until feedback comes in that the reset has occurred (hbstate==boot). Maybe I am wrong here? Some more explanation on what is going on would be great.

2) Which undefined conditions / possible latches?

3) Yes, simulation works fine, exactly as expected which is what is confusing more


Are you saying this is likely a CDC issue? If so, how would you recommend I fix it? This part of the design is small but important, however, if it is easier I could quite happily make the watchdog a separate IP then use the Xilinx XPM CDC IP to cross between domains if it makes my life easier, as it is just a small part of a much larger block design. So if it works, that is all I am after at the moment.


Hi, thanks, I will look into the CDC, I didn't think it would be an issue as the 1st always block is just asserting reset_db high until it gets feedback that hbstate==boot and therefore the reset has triggered, however as I mentioned above though if it is easier I could quite happily make the watchdog a separate IP then use the Xilinx XPM CDC IP to cross between domains if it makes my life easier, as it is just a small part of a much larger block design. So if it works, that is all I am after at the moment.

To be honest the error cannot be in part of the code I did not show as this is the only part of the code that controls the watchdog, the rest of the code is generating the new signal I want and that works perfectly every time at boot up, but also works perfectly when I attach the reset to a push button, so the issue has to be with the watchdog in the 1st always block.

As for repeating the reset twice, I am not, if you look once says hbxxxx then the others say vbxxxx, there are 6 parameters I need to reset, hence all 6 are listed.

As for resets not clearing which are "not mentioned" in the reset block...there are none which are not mentioned, I have 2 states and 4 counters I need to reset, they are all mentioned in the reset block.

I do agree the 1st always block was not overly clear, but thats why I also changed it in my 2nd code listing to use case statements, makes it a bit clearer (for me at least) and removes the negative values by having the watchdog count up from 0 instead of down from 1000000.

As for what happens when ps2_hblank==1 && watchdog<=0, that should never occur in normal circumstances but you have a fair point, however, see my 2nd code with the case statements that only uses ps2_hblank for resetting the watchdog, but if the watchdog exceeds a certain value it moves to the next state regardless of ps2_hblank value, so that should eliminate that potential issue, but I still see the same problems with things not resetting correctly.

As for the comments, they are not actually there in my own code (I know exactly what it is doing / trying to do), I just added it for people on here incase it helps them.

Hi,

Let's think outside the box a little bit. We have the .sof file and the .pof file. Which one did you program the FPGA with?

FPGA is programmed with a .bit file and .ltx files for the debug probes in Vivado, not .sof or .pof files...
 

...
FPGA is programmed with a .bit file and .ltx files for the debug probes in Vivado, not .sof or .pof files...
Yes, you're right. I was thinking it might have to do with file-dependable volatile and non-volatile programming that can be read up here https://techniex.com/creating-and-configuring-xilinx-fpga-with-mcs/ but it's not since it works with your reset_db signal pushbutton. It's more than likely an FSM issue. I don't understand verilog but could I ask if you could see the state diagram for the code?
 

    JWHassler

    Points: 2
    Helpful Answer Positive Rating
A CDC problem is brought up if an asynchronous variable controls the assignment of multiple bits.
e.g. in this code snippet, when hbstate is asynchronous (originated from a different clock domain), it's not guaranteed that all bits of variable watchdog are set in the same clock cycle. The temporary state may cause unexpected reaction of other logic parts. State machines can be even caught in an endless illegal state.
Code:
if(hbstate==boot)
 begin
     watchdog<=1000000;
     reset_db<=0;
 end
 


Sorry, I do not have state diagrams and Vivado will not create one from the Verilog. I would need to make one specifically just to upload to this thread.


Hmm, I was not aware that all bits of watchdog variable may not be set in the same clock cycle and hence cause issues when referencing hbstate to initiate the reset of the watchdog timer and set the reset_db to value 0

I think I will simply move the watchdog to a new project, create it's own IP and within my block design I will join the watchdog IP block (clocked from 100MHz) to the IP block for this design (clocked at 54MHz) using a Xilinx XPM CDC IP block to join the 2 together and eliminate issues with CDC.

This is my personal project I am performing in my free time, but I will try to provide an update on if this plan works in the next couple of days.

Thank you everyone, much appreciate you all pointing me in (hopefully) the right direction.
 

The usual solution is to register asynchronous input bits to the clock of the always block where you are reading it (typically with double register to avoid metastable states). Multibit input, e.g. state variables may need decoding/reencoding before registering it.
 

So bit of an update, I separated the watchdog into its own IP and used the Xilinx XPM CDC IP block to join the 2 together.

This was certainly required, but it was not the only issue. Turns out when I reset the device, even though its clock came back in 500ms, I could not get a good lock unless I waited a good 3-4 seconds! Once I figured this out I did try reverting back to my original design just to test, but it would not work again (unsurprisingly), so the CDC issue was definitely a problem, but then this quirk of the device was not helping, and hence it would work fine when using a push button as I always rebooted the device and waited a few seconds before pressing the button.

All is well now and it is working pretty reliably. I can occasionally trip it up and it ends up about 8500 clock cycles out, but if I reset again, it works fine. Think I just need to tweak the timing a bit.

Thanks all!
 

Status
Not open for further replies.
Cookies are required to use this site. You must accept them to continue using the site. Learn more…