mrflibble
Last time I checked, a FIR filter works just fine in the time domain. If your argument is going to be "but but, something or other related to frequency domain *handwave*", well then bad news. Exactly the same "but but, something or other related to frequency domain *handwave*" argument will apply to your design. And to every other signal processing algo, for that matter. Your current design is not any more or less "time-domain" than a FIR filter is.
I'd assume FIR and IIR and such to be part of an EE/telecom curriculum, but apparently not.
Re: BRAM: yes you have to instantiate it. Step 1: read the documentation I mentioned (button in core generator). Step 2: in ISE you select the core, and then do "Show instantiation template" or something to that effect. You can copy/paste the template and fill in your signals.
"how can you say FIR is in Time-Domain?"
I can say so quite easily, because ...
"Time and frequency domain are linked by bijective transformations. To that extent they can be considered different descriptions of the same thing."
What he said.
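To make the time-domain point concrete, here is a minimal Python sketch of a direct-form FIR computed sample by sample. It is a software model only, not synthesizable HDL, and the name `fir_filter` and the tap values are made up for illustration. A delayed tap with a gain is exactly an echo, which is why a delay effect can be viewed as a FIR filter.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n-k]."""
    delay_line = [0] * len(taps)  # stands in for the chain of sample registers
    output = []
    for x in samples:
        # Shift the new sample in; the oldest sample falls off the end.
        delay_line = [x] + delay_line[:-1]
        output.append(sum(h * d for h, d in zip(taps, delay_line)))
    return output


# Feeding a unit impulse returns the taps themselves (the impulse response).
impulse_response = fir_filter([1, 0, 0, 0, 0], [1, 2, 3])

# Hypothetical echo: the dry signal plus a half-amplitude copy 3 samples later.
echo = fir_filter([1, 0, 0, 0], [1, 0, 0, 0.5])
```

Nothing here touches the frequency domain: every output is a weighted sum of the current and past input samples.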
"I've been reading verilog all day... Guys, I'm feeling helpless, alone at the end of the world :-( I'm only 22 and not being able to proceed with my code I feel dumb, and assume I have wasted my life :-("
It's more an issue of having a software mindset and not thinking of parallel hardware. You've been looking at Verilog as if it executes in a sequential fashion, i.e. as if a loop iterates N times in N cycles, so in your code you thought the loop would execute one index each clock cycle. Nope, it's translated to hardware, so all indexes are implemented in parallel. For loops are usually reserved for making N copies of something, e.g. if you need 10 SPI interfaces, you can a) instantiate each one separately, or b) use a for loop to generate the 10 instances.
"Anyways, a man must fight for what he believes and I believe I can do this. Alright. Now I built a block ram (not sure if a FIFO memory would be better or not). Also tried some FIR 5.2 version and ooppss, more than 6000 lines of alien codes :shock:
So I tried to put that block ram into circularbuffer and get rid of those nasty creepy fors and do the hardware way. Now I've got to declare write and read pointers for my memory, right? To mention again, I chose a true dual port memory, because I needed 2 read pointers ..."
I've never suggested using a FIR to do this. That's been FvM's and mrflibble's suggestion. I was wary of suggesting anything like that, as you may not know exactly how to use a FIR to implement this, and are even less likely to know how to translate all those software FIR examples into a Verilog hardware FIR filter design. If you do know how to do this, then it's a better implementation. Otherwise your original algorithm will work; you just need to get out of using a software loop approach. You need some control logic (FSM or similar) to time the reads and writes to the RAM, and an address generator to produce the correct addresses. The actual calculation would still be used, but pipelined with the RAM interface.
"Now I have to get rid of that lot of modules, make them less, two might be proper. One for block ram, one for the effect... and then connect them in a way that a top module can work fine in the testbench with meaningful outputs... Right?"
You don't "have to" reduce the number of modules, but if you do, it will make it easier to "see the big picture" (can't see the forest for the trees). I find it's much harder to see the flow of logic in a design that is broken down into an excessively fine granularity. This is similar to the problem of having a design that isn't broken down enough (so you get lost in the "forest" of code).
"Alright. Now I built a block ram (not sure if a FIFO memory would be better or not). Also tried some FIR 5.2 version and ooppss, more than 6000 lines of alien codes :shock:"
A FIFO might be okay as well. This will depend on your memory access pattern, which you will know more about than me. If you do your reads/writes as first in, first out, then yes, a FIFO will do the trick. Such a choice is usually made at the design stage, so that's an idea for your next design.
"So I tried to put that block ram into circularbuffer and get rid of those nasty creepy fors and do the hardware way. Now I've got to declare write and read pointers for my memory, right? To mention again, I chose a true dual port memory, because I needed 2 read pointers ..."
Well, if you truly need 2 read pointers for whatever reason, then you will not be able to do it with a plain FIFO. So IF 2 read pointers are truly required, THEN forget about FIFOs. This of course depends rather heavily on whether you really, really require 2 read pointers. Do you do 2 separate reads at distinctly different locations every clock cycle?
"Now I have to get rid of that lot of modules, make them less, two might be proper. One for block ram, one for the effect... and then connect them in a way that a top module can work fine in the testbench with meaningful outputs... Right?"
For testbench purposes that sounds about right.
As for the 6000 lines of alien code: the idea is usually to read the PDF documentation of that code, so you don't have to go over those large amounts of code. Typically you only need to dive into core-generated code if something went horribly wrong.
Fortunately, the audio sample rate is only a small fraction of the achievable system clock frequency, so you can easily time-multiplex many read operations on the circular buffer. No dual port memory is actually needed.
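A rough Python model of that idea, under stated assumptions: one single-port memory, one access per memory clock, and a hypothetical class name `CircularDelayBuffer`. Per audio sample it does one write followed by one read per tap, and it counts the accesses so you can see how many memory clocks one sample period needs.

```python
class CircularDelayBuffer:
    """Software model of a single-port RAM used as a circular delay line."""

    def __init__(self, depth):
        self.ram = [0] * depth
        self.depth = depth
        self.write_ptr = 0
        self.accesses = 0  # one RAM access is assumed to cost one memory clock

    def process_sample(self, sample, tap_delays):
        """One audio sample period: a single write, then one read per tap."""
        self.ram[self.write_ptr] = sample
        self.accesses += 1
        taps = []
        for delay in tap_delays:
            # The read address trails the write pointer by `delay` samples,
            # wrapping around the end of the memory (delay must be < depth).
            read_addr = (self.write_ptr - delay) % self.depth
            taps.append(self.ram[read_addr])
            self.accesses += 1
        self.write_ptr = (self.write_ptr + 1) % self.depth
        return taps


buf = CircularDelayBuffer(depth=8)
outputs = [buf.process_sample(n, tap_delays=[1, 3]) for n in range(1, 6)]
```

With two taps, each sample period costs three accesses no matter how deep the memory is. A 1 second delay at 40 kHz needs a depth of 40000, but still only one write and a handful of reads per sample.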
The reads from the RAM are based on the G tables that were in the rar file. I'm looking at the addressing, and the first table produces addresses that increment/decrement by 4. I don't know if there is any point where the table increments differently. The next two G tables of indices change by +/-2. So all three tables look to be simple sawtooth waveforms, which are simple enough to implement in an always block as an increment/decrement-by-N counter that generates the addressing to the RAM.
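A small Python sketch of such a counter, modelling only the address sequence and not the hardware. The bounds, the start value, and the choice to reverse direction at the limits are assumptions for illustration; the real G tables may behave differently (a strict sawtooth would reset to the start instead of counting back down).

```python
def address_sequence(start, step, low, high, count):
    """Return `count` addresses from an up/down counter stepping by `step`.

    The counter reverses direction whenever the next step would leave
    the range [low, high].
    """
    addr, direction = start, 1
    addresses = []
    for _ in range(count):
        addresses.append(addr)
        nxt = addr + direction * step
        if nxt > high or nxt < low:
            direction = -direction
            nxt = addr + direction * step
        addr = nxt
    return addresses


by_four = address_sequence(start=0, step=4, low=0, high=12, count=8)
by_two = address_sequence(start=0, step=2, low=0, high=4, count=6)
```

In hardware this maps to one register for the address, one bit for the direction, and a comparator against the limits, updated once per read.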
It's rather hard to determine what the actual intention of your algorithm is based on the original code as it wasn't representative of software or hardware. It might be helpful if you posted the software program (that was tested as working) that you based your Verilog code on as that would allow us to understand the intent.
Things like this:
Code:
Buff_Array1=Buff_Array>>1;
Audio_out=Buff_Array[0]+ divide(Buff_Array[t],Buff_Array1[t]);
You appear to be performing a divide by 2 on each element of the Buff_Array and adding it to the Buff_Array[0] value? I'm not sure if your Buff_Array values are all unsigned. I would probably assume so, as the values range up to 2400, but then at a different point in your code you perform a subtraction:
Code:
Audio_out=Buff_Array[0]- divide(Buff_Array[G],Buff_Array1[G]);
Is Buff_Array[0] always bigger than the divide by 2 values for every other Buff_Array entry?
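Why the question matters, as a Python sketch: if the samples are held in unsigned registers and the subtrahend is larger, the result wraps around instead of going negative. The 16-bit width and the helper names are assumptions for illustration, and the original `divide` function is not modelled here because its definition was not posted.

```python
WIDTH = 16                 # assumed sample width in bits
MASK = (1 << WIDTH) - 1


def halve(value):
    """Divide by 2 the way `>> 1` does on an unsigned WIDTH-bit value."""
    return (value & MASK) >> 1


def unsigned_sub(a, b):
    """Subtract as a WIDTH-bit unsigned register would: wraps modulo 2**WIDTH."""
    return (a - b) & MASK


def to_signed(value):
    """Reinterpret a WIDTH-bit pattern as a two's complement number."""
    return value - (1 << WIDTH) if value & (1 << (WIDTH - 1)) else value


wrapped = unsigned_sub(100, halve(2400))   # 100 - 1200 in unsigned arithmetic
as_signed = to_signed(wrapped)             # same bits, read as signed
```

Here `wrapped` is 64436 while `as_signed` is -1100. The bits are identical, so whether the audio output is correct depends on declaring and interpreting the signals as signed.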
Regards
"Well, given a sample rate of 40 kHz at least... and needing let's say 1 second delay... I will need 40000 blocks in my ram. That's impossible to only time-multiplex."
The problem is that you still don't understand the concept of a circular buffer. You have one write cycle and several read cycles (one for every tapped signal with a different delay) per audio sample clock. Surely not 40000.
The current sample is current. So why would you need to read it out of block ram? Since it is current you should keep it in a register as well. Why? Because that way you don't need a memory read for it.
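As for this:
"Well, given a sample rate of 40 kHz at least... and needing let's say 1 second delay... I will need 40000 blocks in my ram. That's impossible to only time-multiplex."
Noooooope. What FvM says is quite possible.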
Example A: memory clock of 40 kHz , dual read ports
Example B: memory clock of 80 kHz, single read port
Both A and B can get those two required reads done within a 25 microsecond period.
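The arithmetic behind examples A and B, as a small Python sketch. The function names are made up, one access per port per memory clock is assumed, and the 4 MHz figure is just a hypothetical "low MHz" clock.

```python
def sample_period_us(sample_rate_hz):
    """Length of one audio sample period in microseconds."""
    return 1e6 / sample_rate_hz


def accesses_per_sample(memory_clock_hz, sample_rate_hz, ports=1):
    """Memory accesses that fit in one sample period (one per port per clock)."""
    return (memory_clock_hz // sample_rate_hz) * ports


period = sample_period_us(40_000)                            # 25 microseconds
example_a = accesses_per_sample(40_000, 40_000, ports=2)     # dual read ports
example_b = accesses_per_sample(80_000, 40_000, ports=1)     # single read port
faster = accesses_per_sample(4_000_000, 40_000, ports=1)     # hypothetical 4 MHz
```

Both A and B give 2 accesses per sample period. At the assumed 4 MHz a single port gets 100 accesses per sample, which is why a faster clock removes the need for extra ports.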
"1. I am sure the write process must be done exactly at the sample rate of the A/D, which is 40 kHz or higher. But as said before, the clock for reading must be higher. How is that possible?"
I suspect you don't even have a planned method of getting data into the part, do you? The A/D should be controlled by the FPGA in the best-case scenario. The FPGA logic would be running at a minimum of 2x the sample frequency. I would strongly recommend using a clock at an even higher frequency. Besides reducing the latency through the design, you will also find that the oscillators used to run a clock at frequencies in the low MHz range tend to be very inexpensive.
"2. I can't understand the functionality of this ram clearly. Suppose I want it to write a hex FFFF into the block with an address of FAFB, and at the same time I want it to utter out the content of the block with an address of A80F. How can I do that? Is that done with WEA? When 1 the RAM writes DINA into the address, when zero the RAM reads DOUTA from the address? Can it be done simultaneously? I think the mismatch of clocks will result in errors."
The documentation should show timing diagrams for writing and reading from the RAM.
"The documentation should show timing diagrams for writing and reading from the RAM."
This. Read the documentation for block ram.
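Until you have read the timing diagrams, a simplified Python model may help with the FFFF/FAFB/A80F question. It assumes a true dual port memory with both ports on one shared clock, a one-clock read latency, and reads that return the contents from before the clock edge. The class and its behaviour are assumptions for illustration only; the real core has configurable write modes and its own latency, and same-address collisions are not modelled here, so the documentation has the final word.

```python
class TrueDualPortRam:
    """Simplified model: two independent ports on one memory, shared clock."""

    def __init__(self, depth, width=16):
        self.mem = [0] * depth
        self.mask = (1 << width) - 1
        self.douta = 0
        self.doutb = 0

    def clock(self, addra, dina=0, wea=0, addrb=0, dinb=0, web=0):
        """One rising clock edge. Reads see the contents from before this edge."""
        next_a = self.mem[addra]
        next_b = self.mem[addrb]
        if wea:  # write enable high: port A stores DINA at ADDRA
            self.mem[addra] = dina & self.mask
        if web:  # port B has its own, independent write enable
            self.mem[addrb] = dinb & self.mask
        self.douta, self.doutb = next_a, next_b


ram = TrueDualPortRam(depth=65536)
ram.clock(addra=0xA80F, dina=0x1234, wea=1)                 # preload some data
ram.clock(addra=0xFAFB, dina=0xFFFF, wea=1, addrb=0xA80F)   # write A, read B
```

After the second clock, address FAFB holds FFFF and `doutb` holds the value stored at A80F. The write went through port A and the read through port B in the same cycle, so neither had to wait for the other.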