Memory Conflict in Xilinx

Status
Not open for further replies.

Gayathrirani

I've written Verilog code for pupil localisation (for iris recognition). The simulation is done and the original code synthesizes. But when I add a few extra lines (given below) for further computation, the following error occurs:
Portability:3 - This Xilinx application has run out of memory or has encountered a memory conflict. Current memory usage is 2082380 kb. You can try increasing your system's physical or virtual memory. If you are using a Win32 system, you can increase your application memory from 2GB to 3GB using the /3G switch in your boot.ini file. For more information on this, please refer to Xilinx Answer Record #14932. For technical support on this issue, please visit https://www.xilinx.com/support.
The extra lines:
Code:
cnt8 = 5'b0;
cnt1 = 5'b0;
k = 1;
repeat (24)
begin
  if (a1[k] == 1'd1)
  begin
    cnt1 = cnt1 + 5'b1;
    if (cnt1 == 5'b1)
      k_min = k;
    else if (cnt1 == white_pix_row[k] && cnt1 != 5'b0)
      k_max = k;
  end
  if (white_pix_row[k] == max_whitepix - 5'b00010)
  begin
    cnt8 = cnt8 + 5'b1;
    if (cnt8 == 5'b1)
      min = k;
    else
      max = k;
  end
  k = k + 1;
end
end
endmodule
I suspect that each iteration of the repeat statement takes separate memory even though I use the same variables. If not, can anyone tell me what in my code leads to the memory conflict? I can't figure out how my code is using 2082380 kb of memory.
 

Attachment: module iris.doc

How do you expect this code to map to gates and registers? The problem arises because your code looks like software, and software-style HDL ends up being either complete rubbish or completely uncompilable.

I suggest reviewing a good textbook on digital logic design. Then get out a piece of paper and draw the circuit ON PAPER before you start your HDL again.
 
Yes,
I had a look at this on the 28th, and I'm afraid it can only manage a 'repeat' of 9-10 on a 16 GB machine, and even with a significant swap file it does not get much further along.

The internal tables generated by the software to resolve the design are just too big and grow significantly each time the 'repeat' is increased.

I'm afraid each 'Repeat' DOES take different memory due to the interconnections that need to be resolved.
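
One way to avoid the unrolling altogether is to examine one index per clock cycle instead of all 24 in a single pass. The following is only a minimal sketch of that idea for the min/max half of the posted loop. The module name, the clk/start/done handshake, the 5-bit widths and the row_count input are all assumptions, not part of the original design.

```verilog
// Hypothetical sequential replacement for the second half of the repeat(24) loop.
// The parent is assumed to drive row_count with white_pix_row[k] for the current k,
// for example: assign row_count = white_pix_row[k];
module row_scan (
    input  wire       clk,
    input  wire       start,         // one-cycle pulse that begins a scan (assumed)
    input  wire [4:0] max_whitepix,  // width assumed from the 5-bit constants in the post
    input  wire [4:0] row_count,     // white_pix_row[k], supplied by the parent
    output reg  [4:0] k,             // index examined in the current cycle (1..24)
    output reg  [4:0] min,           // first k whose count equals max_whitepix - 2
    output reg  [4:0] max,           // last such k after the first one
    output reg        done           // set once index 24 has been examined
);
    reg busy;  // a scan is in progress
    reg seen;  // the first matching row has already been recorded

    initial begin
        busy = 1'b0;
        done = 1'b0;
    end

    always @(posedge clk) begin
        if (start) begin
            k    <= 5'd1;
            busy <= 1'b1;
            seen <= 1'b0;
            done <= 1'b0;
        end else if (busy) begin
            if (row_count == max_whitepix - 5'd2) begin
                if (!seen) begin
                    min  <= k;     // same role as the cnt8 == 1 branch
                    seen <= 1'b1;
                end else begin
                    max <= k;      // same role as the else branch
                end
            end
            if (k == 5'd24) begin
                busy <= 1'b0;
                done <= 1'b1;
            end else begin
                k <= k + 5'd1;
            end
        end
    end
endmodule
```

The hardware here is one comparator, a few registers and a counter, regardless of how many rows are scanned. The cost is latency, since the result is only valid once done goes high.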
 
This looks a bit optimistic.

Step 1: write something verilog-ish
Step 2: hope for the best.

What might work a little better:
Step 1: prototype the iris locating algorithm in OpenCV, either Python or C++ (I'd use Python for prototyping). Or MATLAB if you like.
Step 2: extract the elementary operations from the prototype
Step 3: design modules for these operations that you can map to the underlying FPGA resources
Step 4: write proper Verilog, which incidentally uses Verilog-2001 port style while we're at it

I hope you can see the difference between these two approaches.
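
For reference, this is roughly what the port-style remark in Step 4 means. Both modules below describe the same registered comparator. The module names, the 8-bit pixel input and the threshold of 127 are made up purely for illustration.

```verilog
// Verilog-1995 style: ports are named in the header and declared again in the body.
module thresh_95 (clk, pixel, above);
    input        clk;
    input  [7:0] pixel;
    output       above;
    reg          above;

    always @(posedge clk)
        above <= (pixel > 8'd127);  // hypothetical threshold
endmodule

// Verilog-2001 (ANSI) style: direction, type and width are given once, in the header.
module thresh_2001 (
    input  wire       clk,
    input  wire [7:0] pixel,
    output reg        above
);
    always @(posedge clk)
        above <= (pixel > 8'd127);  // hypothetical threshold
endmodule
```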
 
Thank you all for the suggestions. My coding style is software-oriented since I have more practice with software and I'm new to Verilog. I have now corrected the code: I replaced the software-style "repeat" and "if" statements with digital circuits such as a leading-one detector and a priority encoder. But I doubt that a 16 GB machine supports a repeat of only 9-10, because I still use repeat(1024) for the threshold part and it synthesizes well. I also finally understood that when the synthesis tool encounters a repeat statement, it unrolls the loop and allocates separate hardware resources for each iteration.
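
The corrected code was not posted, so the following is only a guess at the kind of circuit meant here: a combinational block that reports the lowest and highest set bit of a 24-bit flag vector. The module name, the [24:1] indexing, the any output and the "0 means none" convention are assumptions, and this is not the exact k_max condition from the original loop.

```verilog
// Hypothetical first/last-one detector over a 24-bit flag vector.
module first_last_one (
    input  wire [24:1] a1,     // bit k set means row k qualifies (assumed encoding)
    output reg  [4:0]  k_min,  // lowest set index, 0 if no bit is set
    output reg  [4:0]  k_max,  // highest set index, 0 if no bit is set
    output wire        any     // high when at least one bit is set
);
    integer i;

    assign any = |a1;

    always @* begin
        k_min = 5'd0;
        k_max = 5'd0;
        // Descending scan: the last assignment wins, so k_min ends at the lowest set bit.
        for (i = 24; i >= 1; i = i - 1)
            if (a1[i]) k_min = i[4:0];
        // Ascending scan: the last assignment wins, so k_max ends at the highest set bit.
        for (i = 1; i <= 24; i = i + 1)
            if (a1[i]) k_max = i[4:0];
    end
endmodule
```

These for loops are unrolled too, but each iteration only adds one multiplexer stage to a priority chain. There are no counters carried from one iteration to the next and no comparisons against an indexed array.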
 

But I doubt whether repeat of only 9-10 is supported by a 16GB machine. Because I still use the repeat(1024) for the threshold part and it synthesizes well.
Maybe it's because you aren't considering what the loop unrolls to for your repeat(24) versus your repeat(1024). I suspect the repeat(1024) loop is rather simple and therefore doesn't end up with exponential memory growth as the tool unrolls it.

You have to think about hardware, not software, as you write RTL. For example, suppose you have to delay an 8-bit value by 100 clock cycles (and you don't have any RAM resources in your FPGA). You could code this as 100 separate lines of Verilog in an edge-triggered always block, or you could just do it with a repeat (note: a for loop would be better). In this case you want to replicate an 8-bit register 100 times and connect the registers as an 8-bit-wide shift register.
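
A minimal sketch of that delay line, written with a for loop. The module, parameter and signal names are illustrative only.

```verilog
// Delays din by DEPTH clock cycles using registers only.
module delay_line #(
    parameter WIDTH = 8,
    parameter DEPTH = 100
) (
    input  wire             clk,
    input  wire [WIDTH-1:0] din,
    output wire [WIDTH-1:0] dout
);
    reg [WIDTH-1:0] stage [0:DEPTH-1];  // one WIDTH-bit register per delay stage
    integer i;

    always @(posedge clk) begin
        stage[0] <= din;
        // Unrolls into DEPTH-1 identical register-to-register connections.
        for (i = 1; i < DEPTH; i = i + 1)
            stage[i] <= stage[i-1];
    end

    assign dout = stage[DEPTH-1];
endmodule
```

Here the loop is just shorthand for a regular, repeated structure: each iteration describes one register stage and nothing else.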
 
