Place & Route takes too long

Status
Not open for further replies.

SharpWeapon

Hello,

My design is taking far too long in the place & route stage. It went fine up to a certain phase, but then it ran forever. After around 40 minutes I stopped it. Here is the place and route progress report:

Code:
Phase  1  : 33641 unrouted;      REAL time: 36 secs 
Phase  2  : 23627 unrouted;      REAL time: 42 secs 
Phase  3  : 4206 unrouted;      REAL time: 1 mins 29 secs 
Phase  4  : 4218 unrouted; (Setup:1612, Hold:149225, Component Switching Limit:0)     REAL time: 1 mins 47 secs 
Updating file: toplevel.ncd with current fully routed design.
Phase  5  : 0 unrouted; (Setup:1612, Hold:147854, Component Switching Limit:0)     REAL time: 1 mins 54 secs 
Phase  6  : 0 unrouted; (Setup:1612, Hold:147854, Component Switching Limit:0)     REAL time: 1 mins 55 secs 
Phase  7  : 0 unrouted; (Setup:0, Hold:177889, Component Switching Limit:0)     REAL time: 2 mins 20 secs 
Phase  8  : 0 unrouted; (Setup:0, Hold:177889, Component Switching Limit:0)     REAL time: 2 mins 20 secs

Btw, I am using ISE 14.7, the latest version from Xilinx, with a full license. Any suggestions?

Thanks!
 

Was Phase 8 the last line output by PAR, or is there more to the report?

If Phase 8 was the last output from PAR, then it's probably due to some poor constraints that can't realistically be met, such as synchronous transfers between clocks generated in the logic fabric, which result in large hold time violations.

I'm assuming the problem is hold time, because the report shows a hold score of 177889, i.e. 177.889 ns of total hold time violation, which is excessive in any design. Something under a couple of thousand (i.e., 2 ns) is more or less normal, but 177,889 isn't.

Regards
 
I had the same problem once with Ultiboard. I exploded the over-crowded areas a bit and auto-routing completed within seconds.
 

Hey ads-ee, thanks! Yeah, Phase 8 is the last one. I have embedded some external code in my design to handle some interfacing issues, and that is probably what is causing the problem. How can I trace which signal is causing the hold time violation?

mrinalmani, thanks for the reply but what exactly did you do?
 


I had the same problem once with ultiboard. I exploded the over-crowded areas a bit and auto-routing completed within seconds.
I highly doubt placement is a problem as the design completed routing.
Phase 5 : 0 unrouted; (Setup:1612, Hold:147854, Component Switching Limit:0) REAL time: 1 mins 54 secs
See it has 0 unrouted nets in 1 min 54 seconds.
Now PAR tries to meet timing...
But once it fixes the setup times it ends up with even more hold time violations... This really looks like a design + constraint problem.

Regards

- - - Updated - - -

Hey ads-ee, thanks! Yeah, phase 8 is the last one. I have embedded some external code to my design for some interfacing issues, probably that is causing the problem. How can I trace which signal is causing hold time violation?
It's not going to be a single signal, more likely a whole bunch of them.

You can't really trace anything until PAR completes.
My suggestion: turn down the effort level and let it finish routing the design. I think there is a way to stop it from retrying until it meets timing, but I don't remember offhand what that was. Make sure the advanced options are showing. Once it finishes, you can bring up trace and look at where the hold violations occur. My suspicion is that they are between two different clock domains that the tool assumes are synchronous.

mrinalmani, thanks for the reply but what exactly did you do?
Ignore this response as it's not the root of your problem.

Regards
 

It has now been running for two hours+ on a Core 2 Duo machine with 4 GB of RAM and is still going. On another, slower machine, the same project has already taken four hours+ and has not finished yet. Your suspicion might be right: I have two asynchronous clock domains, but that is part of the design.
 

Are both clocks using clock nets (BUFG or similar)?
Have you set constraints to ignore timing for all signals that cross the clock domains?
What are the clock frequencies?
 


Your suspicion might be true, I have two asynchronous clock domains, but that is part of the design.

If it is trying to meet timing for those async domains as well that might make for interesting coffee breaks. If you didn't do so already you should put a TIG (timing ignore) constraint on those async signals. And then double check in the logs that those constraints are actually applied, because that sometimes is a bit sneaky.
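
For readers wondering what such a constraint looks like, here is a minimal UCF sketch. The net names clk_a and clk_b, the group names, and the TIMESPEC names are placeholders and are not taken from the poster's design; substitute the actual clock nets that drive each domain.

```ucf
# Hypothetical clock net names; replace with the real clock nets in your design.
NET "clk_a" TNM_NET = "grp_clk_a";
NET "clk_b" TNM_NET = "grp_clk_b";

# Tell the tools to ignore timing on paths that cross between the two groups,
# in both directions.
TIMESPEC "TS_a_to_b_tig" = FROM "grp_clk_a" TO "grp_clk_b" TIG;
TIMESPEC "TS_b_to_a_tig" = FROM "grp_clk_b" TO "grp_clk_a" TIG;
```

A TIG only removes the paths from timing analysis. The crossing itself still needs proper synchronizers, and the constraint is only appropriate when the two clocks really are unrelated.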
 

Is the FPGA close to its limit? If yes, you could try the following:

- Oversize your FPGA (choose a device one or two sizes bigger than the one you are using).
- Run place and route.
- Place and route should finish sooner, since more logic resources are available. Once it has finished, you can check your critical path in timing analysis.

It may give you a hint of what you need to optimize.
 

Thank you both, std_match and mrflibble, for your replies. Btw, one of the machines finished just now. Here is the log file. I didn't quite understand how to move forward after reading the report, though. And please tell me how I should apply the timing ignore constraint. Thanks!
 

Attachments

  • implimentation report.txt
  • Timing report.txt
  • UCF.txt


need the timing report to understand the path.

The constraints that have a problem look to be out of an MMCM. You should post the timing report and the UCF file you are using.

...and ignore the replies about using bigger FPGAs, as I've said before you've got design and/or constraint issues. It's looking more like constraint problems given the -0.887 ns Hold slack. Something must be specified incorrectly.

Regards

- - - Updated - - -

FYI you could have aborted the run after it wrote out the updated ncd file:

Updating file: toplevel.ncd with current fully routed design.

Then read that ncd and ucf into trce to get the timing information against the constraints.
 
The highlighted part is where your problem resides.

I'd need to see the specific code that generates the clocks and perhaps the code for the interface between the two domains.

As the two clocks are multiples of each other, I'd assume the path may be a valid path between the two domains. The amount of skew between the two clocks suggests they may not both be generated from the same MMCM?

You should rerun trace using the full_path switch to make it show the clock paths.


Your UCF contains this line:
NET "CLK_AB_P" CLOCK_DEDICATED_ROUTE = FALSE;
which isn't a good idea as this means the clock isn't using the dedicated routing from the package pin to the clock buffers/MMCM/PLLs. According to the document for the ML605 the pin pair is K26(FMC_LPC_LA00_CC_P)/K27(FMC_LPC_LA00_CC_N) correct?:
Code:
NET "CLK_AB_N" LOC="K27";
NET "CLK_AB_P" LOC="K26";
Not sure why you used CLOCK_DEDICATED_ROUTE = FALSE, since that pair of pins are clock capable.

Am I wrong in assuming you didn't use a MMCM/PLL to generate the divide by 2 clock: clk_122_88MHz? If so that is the source of your problem and will be a classic example of why you never want to generate a clock using the core logic. Show me the code for the clock generation, so I don't have to guess.

Regards
 
The amount of skew between the two clocks suggest they may not both be generated from the same MMCM?
Correct! I had two MMCMs. I just assumed both would be synchronous (in fact they were, both in simulation and post-synthesis). Now I have one MMCM, and the clock is passed to all modules from this MMCM only. PAR now takes a very short time with all timing constraints met. Yaay!

a classic example of why you never want to generate a clock using the core logic
What is the best way to do it then?
 

Your problem was a result of one of the MMCMs not being able to use dedicated routing, as it was in a different bank.

If you need to use multiple MMCMs (e.g. you have a very large number of clocks to generate or they don't have a common VCO frequency), you can:
a) feed the FPGA with two versions of the same external clock.
b) feed the single clock through a BUFG, which will then feed both MMCMs.

To avoid generating clocks in the fabric, use a PLL/MMCM/DCM. Alternatively, instead of making a clock, make a pulse that is a single clock period wide and use it as an enable for any flip-flop that would have used the generated clock. This way you end up with a single-clock design with all the related domains on the same clock, but only enabled every Nth clock.

Regards
 
Thanks, that really helped. One last question though: should the 'Clock Path Skew' always be 0.00, or is it acceptable if it is very small or negative (and why is it negative, btw)? If so, what is the reference for that? Is it the time specified in the brackets of 'Slack (setup paths): [~]ns (requirement..)'?
 

Clock path skew can be positive, negative, or 0. It's normally referenced to the source clock. If the branch of the clock that drives the source register is shorter than the branch driving the destination register, the destination clock is delayed with respect to the source clock, and you end up increasing setup margin but decreasing hold margin. The opposite can also occur, where the source clock has a longer clock path than the destination clock. In this case the source clock is delayed with respect to the destination clock, which decreases the setup margin and increases the hold margin.

With your non-optimal input clock routing you were ending up with the first situation where the destination clock had so much positive skew relative to the source clock that the result was a huge decrease in the hold margin.
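
As a rough illustration of the two cases above, the usual textbook slack relations can be written as follows. This is a simplified sketch and not the exact formula the Xilinx timing report uses: the symbols are introduced here for illustration, clock uncertainty and jitter are left out, and the data delay is the maximum delay for setup and the minimum delay for hold.

```latex
% Skew referenced to the source clock: positive when the destination clock arrives later.
t_{skew} = t_{clk,dst} - t_{clk,src}

% Setup slack: positive skew gives the data more time to arrive.
\text{slack}_{setup} = T_{clk} + t_{skew} - \left( t_{cq} + t_{data,max} + t_{su} \right)

% Hold slack: positive skew eats directly into the hold margin.
\text{slack}_{hold} = \left( t_{cq} + t_{data,min} \right) - \left( t_{h} + t_{skew} \right)
```

Here $T_{clk}$ is the clock period, $t_{cq}$ the clock-to-output delay of the source register, and $t_{su}$ and $t_{h}$ the setup and hold requirements of the destination register. A large positive $t_{skew}$ therefore shows up as a large negative hold slack, which matches the situation described above.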

Regards
 