[SOLVED] Verilog Error : Too Few Parameters Passed To Task

mrflibble · Aug 18, 2014

Last time I checked a FIR filter works just fine in the time domain. If your argument is going to be "but but, something or other related to frequency domain *handwave*", well then bad news. Exactly the same "but but, something or other related to frequency domain *handwave*" argument will apply to your design. And every other signal processing algo for that matter. Your current design is not any more or less "time-domain" than a FIR filter is.

I'd assume FIR and IIR and such to be part of an EE/telecom curriculum, but apparently not.

Re: BRAM: yes you have to instantiate it. Step 1: read the documentation I mentioned (button in core generator). Step 2: in ISE you select the core, and then do "Show instantiation template" or something to that effect. You can copy/paste the template and fill in your signals.

AshkanYJM · Aug 18, 2014

mrflibble said:
Last time I checked a FIR filter works just fine in the time domain. If your argument is going to be "but but, something or other related to frequency domain *handwave*", well then bad news. Exactly the same "but but, something or other related to frequency domain *handwave*" argument will apply to your design. And every other signal processing algo for that matter. Your current design is not any more or less "time-domain" than a FIR filter is.

I'd assume FIR and IIR and such to be part of an EE/telecom curriculum, but apparently not.

Re: BRAM: yes you have to instantiate it. Step 1: read the documentation I mentioned (button in core generator). Step 2: in ISE you select the core, and then do "Show instantiation template" or something to that effect. You can copy/paste the template and fill in your signals.

x[n] + x[n-1] >>> time domain
X(z) + X(z) z^(-1) >>> frequency domain

These two approaches give the same results, but I am sure they are different. A FIR filter works in the frequency domain. It's all about the z^(-1) terms! How can you say FIR is in the time domain? Unless there's a point I'm missing, my dear friend.

FvM · Aug 18, 2014

Time and frequency domain are linked by bijective transformations. In that respect they can be considered different descriptions of the same thing.

E.g. an echo device is also a FIR filter (or an IIR filter for echo with feedback), and you can calculate its z-domain description.

Due to the duality of descriptions, it's your choice which you prefer. Just notice the hint that FIR hardware looks very similar to your "time-domain" signal manipulation. Presently you should focus on learning Verilog for hardware design anyway, I think.
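To make the "same thing, two descriptions" point concrete, here is a minimal sketch of y[n] = x[n] + x[n-1] as hardware. The module name, port names, sample width and the `sample_en` strobe are assumptions for illustration, not taken from the posted design. The register `x_d1` is exactly the z^(-1) term of H(z) = 1 + z^(-1).

```verilog
// Hypothetical two-tap FIR: y[n] = x[n] + x[n-1], i.e. H(z) = 1 + z^-1.
module two_tap_sum #(
    parameter WIDTH = 16                        // assumed sample width
) (
    input  wire                    clk,
    input  wire                    sample_en,   // assumed: one pulse per new sample
    input  wire signed [WIDTH-1:0] x,           // x[n]
    output reg  signed [WIDTH:0]   y            // one extra bit so the sum cannot overflow
);
    reg signed [WIDTH-1:0] x_d1 = 0;            // x[n-1], the z^-1 delay element

    always @(posedge clk) begin
        if (sample_en) begin
            y    <= x + x_d1;   // registered sum of current and previous sample
            x_d1 <= x;          // shift the one-element delay line
        end
    end
endmodule
```

A longer delay replaces the single register with a memory, but the structure (delay element plus adder) stays the same.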

B.t.w.: A great book related to your work is Zoelzer, Digital Audio Effects.

mrflibble · Aug 18, 2014

AshkanYJM said:
How can you say FIR is in the time domain?

I can say so quite easily, because ...

FvM said:
Time and frequency domain are linked by bijective transformations. In that respect they can be considered different descriptions of the same thing.

What he said. You know, Laplace & Fourier transform and all that.

AshkanYJM · Aug 18, 2014

I've been reading Verilog all day... Guys, I'm feeling helpless, alone at the end of the world :-( I'm only 22, and not being able to proceed with my code makes me feel dumb, as if I have wasted my life :-(

Anyway, a man must fight for what he believes, and I believe I can do this. Alright. Now I have built a block RAM (not sure whether a FIFO memory would be better or not). I also tried some FIR 5.2 version and, oops, more than 6000 lines of alien code :shock:

So I tried to put that block RAM into the circular buffer, get rid of those nasty creepy for loops, and do it the hardware way. Now I have to declare write and read pointers for my memory, right? To mention it again, I chose a true dual-port memory because I needed 2 read pointers...

Now I have to get rid of that lot of modules and use fewer; two might be right: one for the block RAM, one for the effect. Then connect them so that a top module works in the testbench with meaningful outputs... Right?

ads-ee · Aug 18, 2014

AshkanYJM said:
I've been reading Verilog all day... Guys, I'm feeling helpless, alone at the end of the world :-( I'm only 22, and not being able to proceed with my code makes me feel dumb, as if I have wasted my life :-(

It's more an issue of having a software mindset and not thinking of parallel hardware. You've been looking at Verilog as if it executes in a sequential fashion, i.e. like loops iterate N times in N cycles...so in the case of your code you thought that the loop would execute one index each clock cycle. Nope it's translated to hardware...so all indexes are implemented in parallel. For loops are usually reserved for use to make N copies of something, e.g. you need 10 SPI interfaces, so you can a) instantiate each one separately, b) use a for loop to generate the 10 instances.
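As a small illustration of option b), the sketch below uses a generate for loop to create ten copies of a trivial block. The modules `channel_reg` and `ten_channels` are hypothetical stand-ins (a real design would instantiate the SPI interface instead); the point is only that the loop is unrolled at elaboration time into ten parallel instances and does not run over ten clock cycles.

```verilog
// Hypothetical leaf module, used only for illustration.
module channel_reg #(parameter WIDTH = 8) (
    input  wire             clk,
    input  wire [WIDTH-1:0] d,
    output reg  [WIDTH-1:0] q
);
    always @(posedge clk) q <= d;
endmodule

module ten_channels #(parameter WIDTH = 8) (
    input  wire                clk,
    input  wire [10*WIDTH-1:0] d_flat,   // ten inputs packed into one bus
    output wire [10*WIDTH-1:0] q_flat    // ten outputs packed into one bus
);
    genvar i;
    generate
        for (i = 0; i < 10; i = i + 1) begin : g_chan
            // Elaborated into ten parallel instances, all active every clock.
            channel_reg #(.WIDTH(WIDTH)) u_chan (
                .clk (clk),
                .d   (d_flat[i*WIDTH +: WIDTH]),
                .q   (q_flat[i*WIDTH +: WIDTH])
            );
        end
    endgenerate
endmodule
```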

AshkanYJM said:
Anyway, a man must fight for what he believes, and I believe I can do this. Alright. Now I have built a block RAM (not sure whether a FIFO memory would be better or not). I also tried some FIR 5.2 version and, oops, more than 6000 lines of alien code :shock:

So I tried to put that block RAM into the circular buffer, get rid of those nasty creepy for loops, and do it the hardware way. Now I have to declare write and read pointers for my memory, right? To mention it again, I chose a true dual-port memory because I needed 2 read pointers...

I've never suggested using a FIR to do this. That's been FvM's and mrflibble's suggestion. I was wary of suggesting anything like that, as you may not know exactly how to use a FIR to implement this. Even more likely, you may not know how to translate all those software FIR examples into a Verilog hardware FIR filter design. If you do know how to do this, then it's a better implementation. Otherwise your original algorithm will work; you just need to get out of the software loop approach. You need some control logic (FSM or similar) to time the reads and writes to the RAM, and an address generator to produce the correct addresses. The actual calculation would still be used, but pipelined with the RAM interface.

AshkanYJM said:
Now I have to get rid of that lot of modules and use fewer; two might be right: one for the block RAM, one for the effect. Then connect them so that a top module works in the testbench with meaningful outputs... Right?

You don't "have to" reduce the number of modules, but if you do, it will be easier to "see the big picture" (can't see the forest for the trees). I find it's much harder to see the flow of logic in a design that is broken down into an excessively fine granularity. This is similar to the problem of having a design that isn't broken down enough (so you get lost in the "forest" of code).

Regards

mrflibble · Aug 18, 2014

AshkanYJM said:
Alright. Now I have built a block RAM (not sure whether a FIFO memory would be better or not). I also tried some FIR 5.2 version and, oops, more than 6000 lines of alien code :shock:

A fifo might be okay as well. This will depend on your memory access pattern, which you will know more about than me. If you do your read/writes as first in first out, then yes FIFO will do the trick. Such a choice usually is done at the design stage, so that's an idea for your next design. As for 6000 lines of alien code ... the idea is usually to read the pdf documentation of that code, so you don't have to go over those large amounts of code. Typically you only need to dive into core generated code if something went horribly wrong.

AshkanYJM said:
So I tried to put that block RAM into the circular buffer, get rid of those nasty creepy for loops, and do it the hardware way. Now I have to declare write and read pointers for my memory, right? To mention it again, I chose a true dual-port memory because I needed 2 read pointers...

Well, if you truly need 2 read pointers for whatever reason, then you will not be able to do it with a plain FIFO. So IF 2 read pointers are truly required, THEN forget about FIFOs. This of course depends rather heavily on whether you really, really require 2 read pointers. Do you do 2 separate reads at distinctly different locations every clock cycle?

AshkanYJM said:
Now I have to get rid of that lot of modules and use fewer; two might be right: one for the block RAM, one for the effect. Then connect them so that a top module works in the testbench with meaningful outputs... Right?

For testbench purposes that sounds about right.

FvM · Aug 18, 2014

Fortunately the audio sample rate is only a small fraction of the achievable system clock frequency, so you can easily time-multiplex many read operations on the circular buffer. No dual-port memory is actually needed.

mrflibble · Aug 18, 2014

Well, now you just messed up my perfectly loaded question. The idea was to find out the motivation for the multiple reads.

ads-ee · Aug 18, 2014

mrflibble said:
Well, if you truly need 2 read pointers for whatever reason, then you will not be able to do it with a plain FIFO. So IF 2 read pointers are truly required, THEN forget about FIFOs. This of course depends rather heavily on whether you really, really require 2 read pointers. Do you do 2 separate reads at distinctly different locations every clock cycle?

The reads from the RAM are based on the G tables that were in the rar file. I'm looking at the addressing, and the first table produces addresses that increment/decrement by 4. I don't know if there is any point where the table increments differently. The indices in the next two G tables change by +/-2. So all three tables look to be simple sawtooth waveforms, which are simple enough to implement in an always block as an increment/decrement-by-N counter that generates the addressing to the RAM.
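A rough sketch of such a counter, assuming the table ramps up from 0 to some maximum and back down again: the module name, the `step_en` strobe and the `MAX` turning point are assumptions, since the actual table limits are not known from the thread. `STEP` would be 4 for the first table and 2 for the other two.

```verilog
// Hypothetical up/down address generator replacing a lookup table of indices.
module updown_addr_gen #(
    parameter ADDR_W = 12,      // assumed address width
    parameter STEP   = 4,       // increment/decrement per step
    parameter MAX    = 1024     // assumed upper turning point, a multiple of STEP
) (
    input  wire              clk,
    input  wire              rst,
    input  wire              step_en,   // assumed: advance once per sample
    output reg  [ADDR_W-1:0] addr
);
    reg going_up;

    always @(posedge clk) begin
        if (rst) begin
            addr     <= 0;
            going_up <= 1'b1;
        end else if (step_en) begin
            if (going_up) begin
                addr <= addr + STEP;
                if (addr + STEP >= MAX) going_up <= 1'b0;  // next value is the top: turn around
            end else begin
                addr <= addr - STEP;
                if (addr - STEP == 0) going_up <= 1'b1;    // next value is zero: turn around
            end
        end
    end
endmodule
```

With the defaults the output runs 0, 4, 8, ..., 1024, 1020, ..., 4, 0, 4, ... with one step per `step_en` pulse.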

It's rather hard to determine what the actual intention of your algorithm is based on the original code as it wasn't representative of software or hardware. It might be helpful if you posted the software program (that was tested as working) that you based your Verilog code on as that would allow us to understand the intent.

Things like this:

Code:

Buff_Array1=Buff_Array>>1;
Audio_out=Buff_Array[0]+ divide(Buff_Array[t],Buff_Array1[t]);

You appear to be performing a divide by 2 on each element of the Buff_Array and adding it to the Buff_Array[0] value? I'm not sure if your Buff_Array values are all unsigned. I would probably assume so, as the values range up to 2400, but then at a different point in your code you perform a subtraction:

Code:

Audio_out=Buff_Array[0]- divide(Buff_Array[G],Buff_Array1[G]);

Is Buff_Array[0] always bigger than the divide by 2 values for every other Buff_Array entry?
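One way to sidestep the question is to treat the samples as signed and give the result one extra bit. The following is only a sketch under that assumption; the module and port names are made up, and it replaces the `divide` task with an arithmetic right shift, which is not necessarily what the original code intended.

```verilog
// Hypothetical mixer: dry sample plus or minus half of the delayed sample.
module mix_half #(
    parameter WIDTH = 16                       // assumed sample width
) (
    input  wire signed [WIDTH-1:0] dry,        // current sample, e.g. Buff_Array[0]
    input  wire signed [WIDTH-1:0] delayed,    // delayed sample from the buffer
    input  wire                    subtract,   // 0: add, 1: subtract
    output wire signed [WIDTH:0]   mixed       // one extra bit so the result cannot overflow
);
    // Arithmetic shift keeps the sign bit, so this is a signed divide by 2
    // (rounding toward minus infinity).
    wire signed [WIDTH-1:0] half = delayed >>> 1;

    assign mixed = subtract ? (dry - half) : (dry + half);
endmodule
```

With signed operands a negative difference is a valid result instead of an unsigned wrap-around.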

Regards

AshkanYJM · Aug 18, 2014

mrflibble said:
A fifo might be okay as well. This will depend on your memory access pattern, which you will know more about than me. If you do your read/writes as first in first out, then yes FIFO will do the trick. Such a choice usually is done at the design stage, so that's an idea for your next design. As for 6000 lines of alien code ... the idea is usually to read the pdf documentation of that code, so you don't have to go over those large amounts of code. Typically you only need to dive into core generated code if something went horribly wrong.

Well, if you truly need 2 read pointers for whatever reason, then you will not be able to do it with a plain FIFO. So IF 2 read pointers are truly required, THEN forget about FIFOs. This of course depends rather heavily on whether you really, really require 2 read pointers. Do you do 2 separate reads at distinctly different locations every clock cycle?

For testbench purposes that sounds about right.

The base functionality of all these processes is adding the current sample to a delayed sample. The circular buffer, and now the block RAM with an address generator, is for this: to have two samples ready at a time to add or subtract.

FvM said:
Fortunately the audio sample rate is only a small fraction of the achievable system clock frequency, so you can easily time-multiplex many read operations on the circular buffer. No dual-port memory is actually needed.

Well, given a sample rate of at least 40 kHz and needing, let's say, a 1 second delay, I will need 40000 blocks in my RAM. That's impossible to only time-multiplex.

- - - Updated - - -

ads-ee said:
The reads from the RAM are based on the G tables that were in the rar file. I'm looking at the addressing, and the first table produces addresses that increment/decrement by 4. I don't know if there is any point where the table increments differently. The indices in the next two G tables change by +/-2. So all three tables look to be simple sawtooth waveforms, which are simple enough to implement in an always block as an increment/decrement-by-N counter that generates the addressing to the RAM.

It's rather hard to determine what the actual intention of your algorithm is based on the original code as it wasn't representative of software or hardware. It might be helpful if you posted the software program (that was tested as working) that you based your Verilog code on as that would allow us to understand the intent.

Things like this:

Code:

Buff_Array1=Buff_Array>>1;
Audio_out=Buff_Array[0]+ divide(Buff_Array[t],Buff_Array1[t]);

You appear to be performing a divide by 2 on each element of the Buff_Array and adding it to the Buff_Array[0] value? I'm not sure if your Buff_Array values are all unsigned. I would probably assume so, as the values range up to 2400, but then at a different point in your code you perform a subtraction:

Code:

Audio_out=Buff_Array[0]- divide(Buff_Array[G],Buff_Array1[G]);

Is Buff_Array[0] always bigger than the divide by 2 values for every other Buff_Array entry?

Regards

Those sawtooths are meant to be sinusoids... But I didn't bother with that; a sawtooth doesn't make much difference here, and the more important issue was implementing the whole design.

Each of those sets of Gs, and finally the mixer (Audio_out=Buf....), has a meaning in audio processing.
Buff_Array[0] is a specific block (a constant read point), but its content changes on every clock, and so do the other blocks. For a standard delay, echo, chorus... the delayed sample must be multiplied by a constant less than 1 so that the human ear can perceive the effect, and so that the probability of samples cancelling each other becomes low.

FvM · Aug 18, 2014

AshkanYJM said:
Well, given a sample rate of at least 40 kHz and needing, let's say, a 1 second delay, I will need 40000 blocks in my RAM. That's impossible to only time-multiplex.

The problem is that you still don't understand the concept of a circular buffer. You have one write cycle and several read cycles (one for every tapped signal with a different delay) per audio sample clock. Surely not 40000.
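As a sketch of the pointer arithmetic involved: with a buffer depth that is a power of two, each read address is simply the write address minus the desired delay, and the wrap-around comes for free from the truncated subtraction. All names and the two delay values below are assumptions for illustration.

```verilog
// Hypothetical circular-buffer pointers: one write address, two tap addresses.
module circ_ptrs #(
    parameter ADDR_W  = 16,      // 2**16 = 65536 entries, more than 1 s at 40 kHz
    parameter DELAY_A = 40000,   // assumed tap: 1 s at 40 kHz
    parameter DELAY_B = 10000    // assumed tap: 0.25 s at 40 kHz
) (
    input  wire              clk,
    input  wire              rst,
    input  wire              sample_en,  // assumed: one pulse per audio sample
    output reg  [ADDR_W-1:0] wr_addr,    // where the newest sample is written
    output wire [ADDR_W-1:0] rd_addr_a,  // location written DELAY_A samples ago
    output wire [ADDR_W-1:0] rd_addr_b   // location written DELAY_B samples ago
);
    always @(posedge clk) begin
        if (rst)            wr_addr <= 0;
        else if (sample_en) wr_addr <= wr_addr + 1'b1;  // wraps at 2**ADDR_W
    end

    // Unsigned subtraction truncated to ADDR_W bits is the modulo wrap-around.
    assign rd_addr_a = wr_addr - DELAY_A;
    assign rd_addr_b = wr_addr - DELAY_B;
endmodule
```

So the memory holds 40000 or more samples, but each audio sample period needs only one write and one read per tap.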

mrflibble · Aug 18, 2014

The current sample is current. So why would you need to read it out of block ram? Since it is current you should keep it in a register as well. Why? Because that way you don't need a memory read for it.

As for this:

AshkanYJM said:
Well, given a sample rate of at least 40 kHz and needing, let's say, a 1 second delay, I will need 40000 blocks in my RAM. That's impossible to only time-multiplex.

Noooooope. What FvM says is quite possible.

Example A: memory clock of 40 kHz , dual read ports
Example B: memory clock of 80 kHz, single read port

Both A and B can get those two required reads done within a 25 microsecond period.
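A sketch of how the single-read-port variant might be sequenced when the system clock is much faster than the sample rate. It assumes a RAM whose read data is valid one clock after the address is presented; the module name, ports and state names are all hypothetical. Two reads plus the addition take four clocks per audio sample here.

```verilog
// Hypothetical sequencer: two reads through one RAM read port per audio sample.
module two_read_mux #(
    parameter WIDTH  = 16,
    parameter ADDR_W = 16
) (
    input  wire                    clk,
    input  wire                    rst,
    input  wire                    start,      // assumed: one-clock pulse per audio sample
    input  wire [ADDR_W-1:0]       addr_a,     // first tap address
    input  wire [ADDR_W-1:0]       addr_b,     // second tap address
    output reg  [ADDR_W-1:0]       rd_addr,    // to the RAM's single read port
    input  wire signed [WIDTH-1:0] rd_data,    // from the RAM, valid one clock after rd_addr
    output reg  signed [WIDTH:0]   sum,        // one extra bit for the addition
    output reg                     sum_valid   // high for one clock when sum is updated
);
    localparam IDLE = 2'd0, READ_B = 2'd1, GET_A = 2'd2, GET_B = 2'd3;

    reg [1:0]              state;
    reg signed [WIDTH-1:0] data_a;

    always @(posedge clk) begin
        sum_valid <= 1'b0;                     // default, overridden in GET_B
        if (rst) begin
            state <= IDLE;
        end else begin
            case (state)
                IDLE:   if (start) begin rd_addr <= addr_a; state <= READ_B; end
                READ_B: begin rd_addr <= addr_b;  state <= GET_A; end   // issue second read
                GET_A:  begin data_a  <= rd_data; state <= GET_B; end   // first word arrives
                GET_B:  begin                                           // second word arrives
                    sum       <= data_a + rd_data;
                    sum_valid <= 1'b1;
                    state     <= IDLE;
                end
            endcase
        end
    end
endmodule
```

At a 40 kHz sample rate even a 1 MHz clock leaves 25 clocks per sample, so there is plenty of room for the write and more taps.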

AshkanYJM · Aug 18, 2014

FvM said:
The problem is that you still don't understand the concept of a circular buffer. You have one write cycle and several read cycles (one for every tapped signal with a different delay) per audio sample clock. Surely not 40000.

Read my code for the circular buffer to see whether I understand it or not. I'm not saying I need 40000 read points; I'm saying I need a memory anyway. Maybe it's just a misunderstanding. You said no dual-port memory is needed. But you actually meant that it doesn't have to be dual port and a single port suffices, right?

mrflibble said:
The current sample is current. So why would you need to read it out of block ram? Since it is current you should keep it in a register as well. Why? Because that way you don't need a memory read for it.

As for this:

Noooooope. What FvM says is quite possible.

Example A: memory clock of 40 kHz , dual read ports
Example B: memory clock of 80 kHz, single read port

Both A and B can get those two required reads done within a 25 microsecond period.

Well, you mean I take out the current sample and, after a very fast clock pulse, take out the delayed sample too and add them to each other, right?

mrflibble · Aug 18, 2014

Could you list the formula your design has for the output in terms of delayed inputs? I imagine it's the weighted sum of a bunch of taps.
The number (and position) of taps should dictate how your reads are done.

AshkanYJM · Aug 19, 2014

I guess you are right. Read the attached pdf. It's very short and concise.

mrflibble · Aug 19, 2014

So just one read port required. Don't quite see why this is an fpga application, you could almost do this on a battery powered msp430. Maintain circular buffer and do all of 2 multiply adds per sample. Well okay, maybe an stm32 with some more memory for longer delays.

Anyways, that document seems to have enough hints I'd say.

AshkanYJM · Aug 19, 2014

There's something here I don't get about the BRAM.

A single-port BRAM has these signals as pinout:

CLKA: Port A operations are synchronous to this clock.
ADDRA: Addresses the memory space for port A Read and Write operations.
DINA: Data input to be written into the memory via port A.
DOUTA: Data output from Read operations via port A.
WEA: Enables Write operations via port A.

1. I am sure the write process must be done exactly at the sample rate of the A/D, which is 40 kHz or higher. But as said before, the clock for reading must be higher. How is that possible?
2. I can't understand the functionality of this RAM clearly. Suppose I want it to write hex FFFF into the block at address FAFB, and at the same time I want it to output the content of the block at address A80F. How can I do that? Is that done with WEA? When it is 1 the RAM writes DINA into the address, and when it is 0 the RAM reads DOUTA from the address? Can it be done simultaneously? I think the mismatch of clocks will result in errors.

ads-ee · Aug 19, 2014

Either use a clock that is 2x the sample frequency and perform a write-read cycle for every two clocks. Or use a simple dual port BRAM. One port is write only the second port is read only. This style of BRAM will have separate addresses for write and reads.
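For the second option, a behavioral sketch of a simple dual-port RAM with one write-only and one read-only port on a common clock. This is generic Verilog, not the Core Generator template; the names and sizes are assumptions, and whether a synthesis tool maps it onto block RAM (and whether this depth fits) depends on the tool and the device.

```verilog
// Hypothetical simple dual-port RAM: port 1 writes, port 2 reads, same clock.
module sdp_ram #(
    parameter WIDTH  = 16,   // assumed data width
    parameter ADDR_W = 16    // assumed address width
) (
    input  wire              clk,
    input  wire              we,        // write enable for the write port
    input  wire [ADDR_W-1:0] wr_addr,
    input  wire [WIDTH-1:0]  din,
    input  wire [ADDR_W-1:0] rd_addr,
    output reg  [WIDTH-1:0]  dout
);
    reg [WIDTH-1:0] mem [0:(1<<ADDR_W)-1];

    always @(posedge clk) begin
        if (we) mem[wr_addr] <= din;    // write port
        dout <= mem[rd_addr];           // read port, synchronous, one clock of latency
    end
endmodule
```

In this model, driving we=1, wr_addr=16'hFAFB, din=16'hFFFF and rd_addr=16'hA80F in the same clock performs the write and puts the content of A80F on dout after that clock edge.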

AshkanYJM said:
1. I am sure the write process must be done exactly at the sample rate of the A/D, which is 40 kHz or higher. But as said before, the clock for reading must be higher. How is that possible?

I suspect you don't even have a planned method of getting data into the part, do you? The A/D should be controlled by the FPGA in the best case scenario. The FPGA logic would be running at a minimum of 2x the sample frequency. I would strongly recommend using a clock that is at an even higher frequency. Besides reducing the latency through the design, you will also find that oscillators for clock frequencies in the low MHz range tend to be very inexpensive.

AshkanYJM said:
2. I can't understand the functionality of this RAM clearly. Suppose I want it to write hex FFFF into the block at address FAFB, and at the same time I want it to output the content of the block at address A80F. How can I do that? Is that done with WEA? When it is 1 the RAM writes DINA into the address, and when it is 0 the RAM reads DOUTA from the address? Can it be done simultaneously? I think the mismatch of clocks will result in errors.

The documentation should show timing diagrams for writing and reading from the RAM.

Is this by chance the first digital design class you've taken? A lot of what you seem to be having trouble with are pretty basic digital design techniques.

Regards

mrflibble · Aug 19, 2014

ads-ee said:
The documentation should show timing diagrams for writing and reading from the RAM.

This. Read the documentation for block ram.

https://www.xilinx.com/support/documentation/user_guides/ug383.pdf

Activating spoon feeding exponential back off timer ... now.
