ctzof
If the offset is linear (which I'm assuming at this point), then just calculate all of the values instead of storing only some of them. If the offset isn't a linear function, then yes, you use interpolation, and that means you use math.
For LUTs in resource constrained designs, you might want to either move away from the LUT, run the LUT at a higher clock rate, or time share the LUT.
For a Xilinx BRAM, you get 2 reads per cycle per BRAM. Thus a 10 kbit LUT (one BRAM18) that has to serve 10 reads per cycle will need 5 BRAMs, i.e. the table is duplicated 5 times. The BRAMs _can_ run up near 500-600 MHz. If your normal design runs at 100 MHz, this means a single BRAM could take 10 input addresses per 100 MHz cycle and return 10 output values per 100 MHz cycle. You would need a small amount of logic running at 500 MHz, and would need appropriate pipelining to ensure the high-speed logic works out.
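The arithmetic above can be written down as a small sketch. The function name `bram_copies` and its parameters are made up for illustration; it only counts table copies and assumes the fast clock is an integer multiple of the design clock.

```python
import math

def bram_copies(reads_per_cycle, ports_per_bram=2, clock_ratio=1):
    """Number of identical LUT copies needed to serve all reads each design-clock cycle.

    ports_per_bram: reads one BRAM can do per BRAM clock cycle (2 for dual-port).
    clock_ratio:    BRAM clock divided by the design clock (1 = same clock).
    """
    reads_per_copy = ports_per_bram * clock_ratio
    return math.ceil(reads_per_cycle / reads_per_copy)

# 10 reads per cycle, BRAM on the same clock as the design
print(bram_copies(10))
# 10 reads per 100 MHz cycle, BRAM clocked at 500 MHz
print(bram_copies(10, clock_ratio=5))
```

With the numbers from the post, this gives 5 copies at the same clock and 1 copy when the BRAM runs five times faster than the rest of the design.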
If many things use the LUT, but only infrequently, you might look into some form of arbitration to provide access to a reasonable number of LUTs, but with variable latency. This is a similar idea -- serialize access to the LUT -- but it doesn't use a high speed clock. Routing and arbitration logic might become an issue.
(if you can have 512-1024 cycles of latency, you can also cycle through the entire LUT and broadcast the result.)
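As a rough behavioural model of that broadcast idea (not RTL, and the names are hypothetical): one counter walks through the whole table, every consumer compares the broadcast address with the one it is waiting for, and latches the data on a match. The table contents here are invented.

```python
def broadcast_lookup(table, requests):
    """Sweep the whole LUT once and broadcast (address, data) to all consumers.

    table:    list of LUT words.
    requests: the address each consumer is waiting for.
    Returns the value each consumer latched. The sweep costs len(table)
    cycles no matter how many consumers are listening.
    """
    latched = [None] * len(requests)
    for addr, data in enumerate(table):        # one table entry per cycle
        for i, wanted in enumerate(requests):  # in hardware: parallel comparators
            if wanted == addr:
                latched[i] = data
    return latched

# Made-up 1024-entry table, four consumers (two want the same address)
table = [(3 * a + 1) % 1024 for a in range(1024)]
print(broadcast_lookup(table, [0, 5, 1023, 5]))
```

The point of the scheme is that a single read port serves any number of consumers, at the price of a fixed latency equal to the table depth.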
--edit: Linear interpolation might help, but it is hard to say. You should try to shrink the LUT by a factor of two in order to make up for the two reads that each interpolated lookup needs.
Hi, thanks for the answer. In your opinion, is there a different approach to my problem than a LUT?
'Yes' is the short answer. But if you want more detailed answers, you're going to have to define what you're doing. So far, you haven't adequately defined function, performance or constraints. Without that info, you're only going to get speculative responses.
- Function: Exactly what function are you trying to implement, and over what input domain?
- Performance: How quickly do you need things? One per clock? Multiple clock cycles? Etc.
- Constraints: Is the FPGA or the FPGA family or maybe even just the supplier chosen? If so, which one? Are there resources that are likely limited because of other stuff that you have going on in your design? For example, maybe the rest of your design is pretty much locked in and you have only one spare LUT.
Kevin Jennings
Thanks for all the answers. As I said, I am rather new to Verilog and I don't have much experience with coding. Is there a reference on how I can share the LUT block many times in the design? Also, some LUTs in my design are 14x14 bit = 229 kb, and I have only 36 kb of RAM in my FPGA, so I probably have to stick with interpolation.
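To put numbers on that, here is a small sizing sketch. It assumes "36kb" means 36 x 1024 bits, that one table may use the whole memory, and that the interpolated table stores one extra entry for the right end of the last segment. All names are made up for illustration.

```python
def table_bits(addr_bits, data_bits, extra_entries=0):
    """Storage in bits for a table of 2**addr_bits (+ extra) words."""
    return ((1 << addr_bits) + extra_entries) * data_bits

BUDGET_BITS = 36 * 1024          # assumption: "36kb" = 36 x 1024 bits
FULL_TABLE = table_bits(14, 14)  # direct 14-bit-in, 14-bit-out LUT

def largest_fitting_addr_bits(data_bits, budget_bits):
    """Widest coarse-table address whose 2**n + 1 entries still fit the budget."""
    addr_bits = 0
    while table_bits(addr_bits + 1, data_bits, extra_entries=1) <= budget_bits:
        addr_bits += 1
    return addr_bits

coarse = largest_fitting_addr_bits(14, BUDGET_BITS)
print(FULL_TABLE, "bits for the direct table, budget is", BUDGET_BITS)
print(coarse, "coarse address bits,", 14 - coarse, "input bits left for interpolation")
```

Under these assumptions the direct table needs 229,376 bits, while a coarse table addressed by the upper 11 input bits fits, leaving the lower 3 bits as the interpolation fraction. With several tables sharing the same memory, the coarse address has to shrink further.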
In ads-ee question
The data in the line are actually measured data or to say it more accurately precomputed offset data points so the line doesn't follow any specific equation.
Let me repeat what has been stated before...
To share a memory, you either need a multi-port memory (FPGAs support dual-port memories), or you share it virtually by time-division multiplexing, which splits the memory bandwidth between the users.
Which way you go depends on how often you have to access the memory in a given amount of time.
FYI, your real question isn't about not knowing how to code this in Verilog, it's not understanding how to architect a design to do what you want within the context of the resources available in an FPGA. To help you with that will require a detailed specification on the data rates and clock frequencies of the design along with quantity of LUTs required.
As I said in a previous post, what I am really concerned with is the fact that some of the tables are 14x14 bit (229 kb) and the available memory of this FPGA is 36 kb, which means that not even a single table fits. That's why I want to use the interpolation approach.
The problem of sharing the non-linear function block between multiple channels is independent of using a direct look-up table or linear interpolation. There are of course several relations:
You should consider performing this task with an algebraic expression instead of a LUT.
I tend to disagree. Polynomial interpolation is an option if the function is explicitly defined this way, e.g. Pt100 or thermocouple linearisation. But the calculation is rather inconvenient with integer or fixed-point arithmetic. Piecewise linear interpolation, in contrast, is simple and straightforward, and it can be fitted much more easily to arbitrary calibration functions.
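To illustrate how little arithmetic piecewise linear interpolation needs, here is an integer-only software model of one lookup. It is a sketch, not RTL: the 11/3 split of a 14-bit input, the function names, and the example functions are all assumptions, and real calibration data would be loaded into the table instead of being sampled from a formula.

```python
ADDR_BITS = 11   # assumption: upper input bits that index the coarse table
FRAC_BITS = 3    # assumption: lower input bits used as interpolation fraction

def build_table(f):
    """Sample f at every 2**FRAC_BITS-th input, plus one end point."""
    step = 1 << FRAC_BITS
    return [f(i * step) for i in range((1 << ADDR_BITS) + 1)]

def interp_lookup(table, x):
    """Piecewise linear interpolation for a 14-bit input x, integers only."""
    idx = x >> FRAC_BITS                # upper bits select the segment
    frac = x & ((1 << FRAC_BITS) - 1)   # lower bits give the position inside it
    y0 = table[idx]                     # first read
    y1 = table[idx + 1]                 # second read (second port of a dual-port RAM)
    # y0 + (y1 - y0) * frac / 2**FRAC_BITS, truncated by the arithmetic shift
    return y0 + (((y1 - y0) * frac) >> FRAC_BITS)

# Hypothetical smooth curve standing in for measured offset data
curve = build_table(lambda x: (x * x) >> 14)
print(interp_lookup(curve, 8000), (8000 * 8000) >> 14)
```

Per lookup this is two table reads, one subtraction, one small multiply (3-bit fraction), a shift and an addition. A straight line is reproduced exactly, and for a curve the error depends on how much it bends within one 8-input segment.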