Syncronous FIFO - flag generation

Status
Not open for further replies.
I'd like to share my FIFO design that uses an extra bit (MSB) to generate control flags.
I've chosen this approach because it can be used for both synchronous and asynchronous (with adjustments) FIFOs.

Code:
signal write_pointer , read_pointer : unsigned ( a downto 0 ) ;
signal write_address , read_address : unsigned ( a - 1 downto 0 ) ;
signal almost_full_threshold , almost_empty_threshold : unsigned ( a - 1 downto 0 ) ;

signal almost_full , full , almost_empty , empty : std_logic ;
signal msb_equal , pointer_equal : std_logic ; 


pointer_equal <= '1' when write_pointer = read_pointer else '0' ;

almost_full <= '1' when empty = '0' and read_address - write_address <= almost_full_threshold else '0' ;				
full <= '1' when address_equal = '1' and pointer_equal = '0' else '0' ;

almost_empty <= '1' when full = '0' and write_address - read_address <= almost_empty_threshold else '0' ;
empty <= '1' when pointer_equal = '1' else '0' ;
			
			
			
writing : process ( clock , reset ) is
begin
  if reset = '1' then    
    write_pointer <= ( others => '0' ) ;
  elsif rising_edge ( clock ) then
    if write_request = '1' and full = '0' then
      write_pointer <= write_pointer + 1 ;
    end if ;	
  end if ;
end process writing ;	 	
 
 
reading : process ( clock , reset ) is
begin
  if reset = '1' then    
    read_pointer <= ( others => '0' ) ;
  elsif rising_edge ( clock ) then
    if read_request = '1' and empty = '0' then
      read_pointer <= read_pointer + 1 ;
    end if ;	
  end if ;
end process reading ;

Would you change anything in the code ( while still maintaining the extra bit concept ) ?
 

Would you change anything in the code ( while still maintaining the extra bit concept ) ?
I would change the flag generation logic so that it is synchronized to the clock. Having the output status flags be combinatorial outputs that depend on every bit of both pointers as you have it will lead to a lower clock cycle performance in a system.

Kevin Jennings
 
Reactions: shaiko

    shaiko

    Points: 2
    Helpful Answer Positive Rating
Can you please give an example in code...

You did it the "right" way in #73.

As a general rule, try to avoid combinatorial outputs from a module. Internal signals like "new_write_address" in #73 can be combinatorial. I would use a (process local) variable for "new_write_address" in #73, but the resulting logic should be identical if it is always assigned a value like in #15 (new_write_pointer).
 

The important thing is to assign the flags inside a clocked process, so the outputs come directly from a register (as in #73). To do that you need the "lookahead".
 
Reactions: shaiko

    shaiko

    Points: 2
    Helpful Answer Positive Rating
std_match,

But in post #73 the "Full" and "Empty" flag are asserted with direct dependency on Read/Write requests.
With the "extra bit method", flags aren't generated by the Write and Read requests directly - but as a comparation result of the pointers.

How can I use the lookhead method then?
 


What you have right now, depends on the following type of combinatorial logic which you then use to generate the 'Full' flag.

pointer_equal <= '1' when write_pointer = read_pointer else '0'

To de-clutter this a bit, consider that the expression "W = R" is equivalent to "(W-R)=0". To get the 'early read' that the expression is about to go true, consider the expression "(W-R)=1". This expression will be true one before the "W=R" expression. What you need to do is recast your flag logic a bit so that you look for the read pointer to be one away from the write pointer. Now if that is true, and you have a write but no read, then you would set 'Full' to true within a clocked process.

Do the same type of thing for the other flags.

Kevin Jennings
 
Reactions: shaiko

    shaiko

    Points: 2
    Helpful Answer Positive Rating
K-J,
I understand.
But it will require evaluating the "(W-R)=1" condition under the read request. Correct?
 

K-J,
I understand.
But it will require evaluating the "(W-R)=1" condition under the read request. Correct?
That's correct. The advantage is that the long combinatorial logic path created by checking to see if "(W-R)=1" feeds into the flip flop that creates the full flag. As you have it written, the 'likely just as long' combinatorial path checking to see if "W=R" does not have that flip flop so the prop delay to 'Full' would be quite a bit longer.

Kevin Jennings
 
Reactions: shaiko

    shaiko

    Points: 2
    Helpful Answer Positive Rating
I'm trying to get to my point...
If the condition gets evaluated only under a read request then only a read request can update it and this isn't optimal - because writing to memory should also be able to effect the empty flags.

Hence, we'll have to evaluate the W-R=1 condition under write requests also. Of course this isn't a problem with synchronous clock domains - but what about asynchronous FIFOs? If I use the same concept, I'll have to synchronize the requests between domains which may prove unreliable.

I want to design my FIFO in such a way that the flag generation will depend only on the value of the pointers and won't require the conditions to be evaluated under either write or read requests.

My asynchronous example does just that - but it we'll be slow.
How would you suggest improving it without departing from the extra address bit method ?
 

Regardless of which implementation #73 or #81 neither will work as an asynchronous FIFO. Given that you want to support asynchronous operation you will have to gray code and synchronize the addresses across the clock domains.

If the FIFO operates with asynchronous clocks with a large difference in frequency, then you will require some way of transferring the write and read requests from faster-slower clock domains, which can be problematic. In the case of going from the faster to slower clock domain you will have to both stretch the request to be reliably captured on the slower clock domain and contiguous requests will have to be saved as each of them require pulse stretching. To go from the slower to faster clock domain the request will have to be truncated to a single clock cycle of the faster clock domain (relatively easy as an edge detection will work). In either case you will still need to deal with crossing the clock domains and mitigating metastability events.

I really think the idea of producing a generic FIFO controller that works for both asynchronous and synchronous clocks with any combination of frequencies while commendable is doomed to producing a non-optimal design for all but the worst case of an asynchronous FIFO with a large difference in clock frequencies between the write and read domains (unless a significant effort is placed on selective code compilation and or multiple architectures and using configurations to select the appropriate version).

Regards
 
Reactions: shaiko

    shaiko

    Points: 2
    Helpful Answer Positive Rating
I'm trying to get to my point...
If the condition gets evaluated only under a read request then only a read request can update it and this isn't optimal - because writing to memory should also be able to effect the empty flags.
I didn't say it was only affected by read request. Here is an outline of what I was saying:
Code:
if (read_request = '1') then
   Full <= '0';
elsif (write_request = '1') then
   if ((read_pointer-write_pointer) = 1) then -- Think I said it backwards in previous post...should be (R-W, not W-R)
      Full <= '1'
   end if;
end if;

-- Do something similar to generate 'Empty' except there you will be looking for "if ((write_pointer-read_pointer) = 1) then"

There is nothing inherently unreliable about the synchronizing...not sure what you mean

Unfortunately, it's quite a bit worse than just 'slow'. The write pointer will be synchronized with the write clock; the read pointer will be synchronized with the read clock. Any calculations you do between those two pointers will not be synchronized to any clock, they will be completely useless to anyone outside of a simulation environment
How would you suggest improving it without departing from the extra address bit method ?
Maybe Google for the source code to lpm_fifo_dc and see how they do it.

Kevin Jennings
 
Reactions: shaiko

    shaiko

    Points: 2
    Helpful Answer Positive Rating
shaiko,

You should probably just implement the FIFO in the Cummings paper: http://www.sunburst-design.com/papers/CummingsSNUG2002SJ_FIFO1.pdf

You should avoid using the FIFO2 implementation (yeah it was voted best paper, but that doesn't mean it's a better design).

This is one of the most robust implementations that I know of, at least I've never seen it fail on any design. I've had other less robust solutions that I've implemented specifically to achieve higher performance or reduce logic that did have failure modes, which I eventually had to fix or ensure the failure mode could never occur.

Regards
 
Reactions: shaiko

    shaiko

    Points: 2
    Helpful Answer Positive Rating
ads-ee,
thanks for the elaborate response.
Regardless of which implementation #73 or #81 neither will work as an asynchronous FIFO. Given that you want to support asynchronous operation you will have to gray code and synchronize the addresses across the clock domains.
Sure, I know that. That's what I meant in post #81 when I said "with adjustments".
Do you know a way to design a reliable asynchronous FIFO without having to synchronize requests between domains?

K-J,
There is nothing inherently unreliable about the synchronizing...not sure what you mean
If one clock domain is much faster then another, requests synchronization from the fast domain to the slow may fail.
 

If one clock domain is much faster then another, requests synchronization from the fast domain to the slow may fail.
If that is the way you do it, then yes you would have issues...but that is not the way to do it. Peruse the examples of good designs that have stood the test of time like the Cummings paper or lpm_fifo_dc. You're intentionally trying to re-invent the wheel on this and a dual clock fifo has many potential pitfalls. You would be better off studying working designs rather than asking people to review what you've written for one purpose (single clock fifo) and then think that it should be applicable to something much more complex (dual clock fifo). I realize you're trying to come up with your own code, but that does not imply that you can't learn from actual working designs.

At the end of the day though, a dual clock fifo will have to provide flags that are synchronized to the clocks. The typical arrangement that is most useful is (almost)full synchronized with the write clock; (almost)empty synchronized with the read clock. Having flags that are not synchronized to either clock will be of no use.

Kevin Jennings
 
Reactions: shaiko

    shaiko

    Points: 2
    Helpful Answer Positive Rating
I hate to say it, but this whole thread just shows that trying to re-invent the wheel can really be wasted effort.

Let the vendors look after their well made wheels. Concentrate on making the shiny new car.
 
Reactions: FvM

    FvM

    Points: 2
    Helpful Answer Positive Rating
K-J, TrickyDicky,

Thanks for your replies...
I'm not trying to invent the ultimate FIFO here...And I do study other designs and try to understand what makes them better then others.
Ultimately, I find that learning from my own code and tuning it up works best for me.
 

ads-ee,

I followed your suggestion and started learning the design in this article:
**broken link removed**

I see the in page 18, that the empty flag is asserted/deasserted combinatorially outside of a clocked "always".
Code:
assign rbinnext = rbin + (rinc & ~rempty);

My question:
If it's good for an asynchronous FIFO, why isn't it good for a synchronous FIFO?
 

Status
Not open for further replies.
Cookies are required to use this site. You must accept them to continue using the site. Learn more…