[SOLVED] post route result coming early

Status
Not open for further replies.

dipin

Hi,

I have a problem with my design. When I run a post-route simulation, the output arrives earlier than expected.

new.PNG

In this waveform the output should change on the next rising edge of the clock (190,000 ps), but it changes before that. How is that possible?

If the output were delayed, that would be fine, but it is arriving early.
How can I explain this?


This is my testbench:
Code:
module sqrt_test();

  reg  [31:0] in_data;
  reg         clk;
  reg         reset;
  reg  [31:0] count;

  wire [16:0] out_data;
  wire [17:0] remainder;
  //wire [67:0]temp_in_data1;
  //wire [63:0]xtemp_in_data_1;

  sqrt U1(
    .in_data(in_data),
    .clk(clk),
    .reset(reset),
    .out_data(out_data),
    .remainder(remainder)
    // .temp_in_data1(temp_in_data1),
    // .xtemp_in_data_1(xtemp_in_data_1)
  );

  initial begin
    clk     = 1;
    reset   = 1'b1;
    in_data = 32'd0;
    count   = 0;

    repeat(10) @(negedge clk);
    reset = 1'b0;
    count = 40001;
  end

  always @(posedge clk) begin
    if (!reset) begin
      in_data = count;
      count   = count + 1;
    end
  end

  always begin
    #5 clk <= ~clk;
  end

endmodule

and here is my design

Code:
module sqrt(
            in_data,
            clk,
            reset,
            out_data,
            remainder
             );
             
parameter IN_WIDTH = 31; // INPUT WIDTH
parameter OUT_WIDTH = IN_WIDTH >> 1;
parameter IN_CAL = IN_WIDTH >> 2;            
parameter N = 4*(IN_CAL+1); 
parameter Q = N >> 1;         
            
input [N-1:0] in_data;
input clk;
input reset;

output [Q:0]    out_data;
output [Q+1:0] remainder;


reg [Q:0]    out_data;
reg [Q+1:0] remainder;

reg [Q:0] xa_out_data [12:0];
reg [Q:0] temp_out_data;

reg [2*N+3:0] temp_in_data[13:1];

reg [2*N-1:0] xtemp_in_data_1;
reg [2*N-1:0] xtemp_in_data_2;
reg [2*N-1:0] xtemp_in_data_3;
reg [2*N-1:0] xtemp_in_data_4;
reg [2*N-1:0] xtemp_in_data_5;
reg [2*N-1:0] xtemp_in_data_6;
reg [2*N-1:0] xtemp_in_data_7;
reg [2*N-1:0] xtemp_in_data_8;
reg [2*N-1:0] xtemp_in_data_9;
reg [2*N-1:0] xtemp_in_data_10;
reg [2*N-1:0] xtemp_in_data_11;
reg [2*N-1:0] xtemp_in_data_12;

reg [N-1:0] temp_sub_result_1;
reg [N-1:0] temp_sub_result_2;
reg [N-1:0] temp_sub_result_3;
reg [N-1:0] temp_sub_result_4;
reg [N-1:0] temp_sub_result_5;
reg [N-1:0] temp_sub_result_6;
reg [N-1:0] temp_sub_result_7;
reg [N-1:0] temp_sub_result_8;
reg [N-1:0] temp_sub_result_9;
reg [N-1:0] temp_sub_result_10;
reg [N-1:0] temp_sub_result_11;
reg [N-1:0] temp_sub_result_12;


always @(posedge clk)begin
  
  if(reset) begin
    
    temp_in_data[1]  <= 0;
    temp_in_data[2]  <= 0;
    temp_in_data[3]  <= 0;
    temp_in_data[4]  <= 0;
    temp_in_data[5]  <= 0;
    temp_in_data[6]  <= 0;
    temp_in_data[7]  <= 0; 
    temp_in_data[8]  <= 0;
    temp_in_data[9]  <= 0;
    temp_in_data[10] <= 0;
    temp_in_data[11] <= 0;
    temp_in_data[12] <= 0;
    temp_in_data[13] <= 0;
    
    xtemp_in_data_1 <= 0;
    xtemp_in_data_2 <= 0;
    xtemp_in_data_3 <= 0;
    xtemp_in_data_4 <= 0;
    xtemp_in_data_5 <= 0;
    xtemp_in_data_6 <= 0;
    xtemp_in_data_7 <= 0;
    xtemp_in_data_8 <= 0;
    xtemp_in_data_9 <= 0;
    xtemp_in_data_10<= 0;
    xtemp_in_data_11<= 0;
    xtemp_in_data_12<= 0;
    
    temp_sub_result_1 <= 1;
    temp_sub_result_2 <= 1;
    temp_sub_result_3 <= 1;
    temp_sub_result_4 <= 1;
    temp_sub_result_5 <= 1;
    temp_sub_result_6 <= 1;
    temp_sub_result_7 <= 1;
    temp_sub_result_8 <= 1;
    temp_sub_result_9 <= 1;
    temp_sub_result_10<= 1;
    temp_sub_result_11<= 1;
    temp_sub_result_12<= 1;
    
    xa_out_data[0]  <= 0;
    xa_out_data[1]  <= 0;
    xa_out_data[2]  <= 0;
    xa_out_data[3]  <= 0;
    xa_out_data[4]  <= 0;
    xa_out_data[5]  <= 0;
    xa_out_data[6]  <= 0;
    xa_out_data[7]  <= 0;
    xa_out_data[8]  <= 0;
    xa_out_data[9]  <= 0;
    xa_out_data[10] <= 0;
    xa_out_data[11] <= 0;
    xa_out_data[12] <= 0;
     
    remainder   <= 0;
    out_data    <= 0;
    
    
    
  end else  begin
    
            temp_in_data[1][N+1:2]      <= in_data[N-1:0];
            xtemp_in_data_1[N+3:4]      <= in_data[N-1:0];
            temp_sub_result_1           <= 1;
            xa_out_data[0]              <= 0;
            out_data                    <= xa_out_data[IN_CAL+1];
            remainder                   <= temp_in_data[IN_CAL+2][N+3+Q:N+2]; 
      
if(condition)begin
   if(condition) begin
   ...............
   end else begin
   .............
   end   
end else begin 
   if(condition) begin
   ................
   end else begin
   ................
   end
end
   . .
...........
.........
.........
If anybody knows the cause, please help.
Thanks & regards
 

Don't use blocking assignments in an edge triggered always block. You did that in the testbench. With so much of the RTL missing there is no way to determine if you have some coding problem in your pipeline.
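
A minimal sketch of what that advice looks like for the posted testbench stimulus, assuming the same signal names as the original testbench and no DUT attached (the module name `tb_stimulus_sketch` is made up for illustration). The sequence of values on `in_data` is the same as with blocking assignments; what changes is that the updates are scheduled after every block triggered by the same clock edge has sampled its inputs, which avoids a race in RTL simulation.

```verilog
`timescale 1ns / 1ps

// Hypothetical stand-alone sketch: only the stimulus part of a testbench.
module tb_stimulus_sketch;
  reg        clk     = 1'b0;
  reg        reset   = 1'b1;
  reg [31:0] in_data = 32'd0;
  reg [31:0] count   = 32'd40000;

  // Free-running 100 MHz clock. A blocking assignment is fine here
  // because nothing else writes clk.
  always #5 clk = ~clk;

  // Stimulus registers use nonblocking assignments, so they behave like
  // flip-flop outputs: in_data receives the value count had before the edge.
  always @(posedge clk) begin
    if (!reset) begin
      in_data <= count;
      count   <= count + 1;
    end
  end

  initial begin
    repeat (10) @(negedge clk);
    reset = 1'b0;               // released away from the rising edge
    repeat (5) @(negedge clk);
    $display("in_data=%0d count=%0d", in_data, count);
    $finish;
  end
endmodule
```

Note that this only removes the zero-delay race. In a post-route simulation the clock reaches the flip-flops after some routing delay, so stimulus that changes exactly at the clock edge can still violate hold requirements unless it is delayed explicitly.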
 

Hi,
Don't use blocking assignments in an edge triggered always block. You did that in the testbench. With so much of the RTL missing there is no way to determine if you have some coding problem in your pipeline.

I have changed the testbench, but it is still not working. Can I send the code through e-mail?

My problem is that the output is not changing at the rising edge.
I have written everything inside the always block.

Thanks & regards
 

dipin said:
I have changed the testbench, but it is still not working. Can I send the code through e-mail?
No. Read the forum rules: sending your code to me via e-mail is a violation of them.
Either zip the files and post them in a reply or figure out the problem yourself.

dipin said:
My problem is that the output is not changing at the rising edge.
You're still not understanding: the outputs won't change on the rising edge of the clock if the delays called out in the library models of the gate-level simulation tell the simulator to delay the output X ns after the inputs toggle.

dipin said:
I have written everything inside the always block.
I know. This is the second thread you've started on this subject, and you still haven't responded with useful information that will allow us to help you.

I may have asked the following questions before and gotten no answer, so I'll ask them again:

1. What timescale are you running the simulation with? (1ns/1ps?)
2. Are there # delays in the library models you are using in your gate-level simulation? (FPGA library models are usually poorly written; look at an ASIC library model for comparison if you doubt that ;-))
3. Which vendor's parts are you using? (i.e. Xilinx, Altera, Microsemi, Actel, etc.)

- - - Updated - - -

Also you never responded to sharath666's request of posting the STA report. Did you even do this?
 
Hi,
Thanks for the reply, ads-ee.

1. `timescale 1ns / 1ps

2. I didn't use any IP for this.

3. I am using ISE Design Suite 14.4.

And this is the STA report generated by ISE:
--------------------------------------------------------------------------------
Release 14.4 Trace (nt64)
Copyright (c) 1995-2012 Xilinx, Inc. All rights reserved.

F:\xilinx_14.4_install_folder\14.4\ISE_DS\ISE\bin\nt64\unwrapped\trce.exe
-intstyle ise -v 3 -s 1L -n 3 -fastpaths -xml sqrt.twx sqrt.ncd -o sqrt.twr
sqrt.pcf

Design file: sqrt.ncd
Physical constraint file: sqrt.pcf
Device,package,speed: xc6vlx75tl,ff484,C,-1L (PRODUCTION 1.11 2012-12-04)
Report level: verbose report

Environment Variable Effect
-------------------- ------
NONE No environment variables were set
--------------------------------------------------------------------------------

INFO:Timing:2698 - No timing constraints found, doing default enumeration.
INFO:Timing:3412 - To improve timing, see the Timing Closure User Guide (UG612).
INFO:Timing:2752 - To get complete path coverage, use the unconstrained paths
option. All paths that are not constrained will be reported in the
unconstrained paths section(s) of the report.
INFO:Timing:3339 - The clock-to-out numbers in this timing report are based on
a 50 Ohm transmission line loading model. For the details of this model,
and for more information on accounting for different loading conditions,
please see the device datasheet.



Data Sheet report:
-----------------
All values displayed in nanoseconds (ns)

Setup/Hold to clock clk
------------+------------+------------+------------+------------+------------------+--------+
Source | Max Setup to clk (edge) | Process Corner | Max Hold to clk (edge) | Process Corner | Internal Clock(s) | Clock Phase |
------------+------------+------------+------------+------------+------------------+--------+
in_data<0> | -0.106(R)| FAST | 1.695(R)| SLOW |clk_BUFGP | 0.000|
in_data<1> | -0.157(R)| FAST | 1.770(R)| SLOW |clk_BUFGP | 0.000|
in_data<2> | 0.132(R)| FAST | 1.306(R)| SLOW |clk_BUFGP | 0.000|
in_data<3> | -0.150(R)| FAST | 1.758(R)| SLOW |clk_BUFGP | 0.000|
in_data<4> | -0.258(R)| FAST | 1.887(R)| SLOW |clk_BUFGP | 0.000|
in_data<5> | -0.159(R)| FAST | 1.788(R)| SLOW |clk_BUFGP | 0.000|
in_data<6> | 0.024(R)| FAST | 1.531(R)| SLOW |clk_BUFGP | 0.000|
in_data<7> | -0.271(R)| FAST | 1.909(R)| SLOW |clk_BUFGP | 0.000|
in_data<8> | -0.318(R)| FAST | 1.980(R)| SLOW |clk_BUFGP | 0.000|
in_data<9> | -0.364(R)| FAST | 2.062(R)| SLOW |clk_BUFGP | 0.000|
in_data<10> | -0.284(R)| FAST | 1.951(R)| SLOW |clk_BUFGP | 0.000|
in_data<11> | -0.307(R)| FAST | 1.966(R)| SLOW |clk_BUFGP | 0.000|
in_data<12> | -0.332(R)| FAST | 2.001(R)| SLOW |clk_BUFGP | 0.000|
in_data<13> | -0.353(R)| FAST | 2.050(R)| SLOW |clk_BUFGP | 0.000|
in_data<14> | -0.350(R)| FAST | 2.055(R)| SLOW |clk_BUFGP | 0.000|
in_data<15> | -0.212(R)| FAST | 1.844(R)| SLOW |clk_BUFGP | 0.000|
in_data<16> | -0.262(R)| FAST | 1.923(R)| SLOW |clk_BUFGP | 0.000|
in_data<17> | -0.120(R)| FAST | 1.717(R)| SLOW |clk_BUFGP | 0.000|
in_data<18> | -0.073(R)| FAST | 1.673(R)| SLOW |clk_BUFGP | 0.000|
in_data<19> | -0.238(R)| FAST | 1.871(R)| SLOW |clk_BUFGP | 0.000|
in_data<20> | -0.444(R)| FAST | 2.203(R)| SLOW |clk_BUFGP | 0.000|
in_data<21> | -0.482(R)| FAST | 2.238(R)| SLOW |clk_BUFGP | 0.000|
in_data<22> | -0.424(R)| FAST | 2.148(R)| SLOW |clk_BUFGP | 0.000|
in_data<23> | -0.405(R)| FAST | 2.122(R)| SLOW |clk_BUFGP | 0.000|
in_data<24> | -0.429(R)| FAST | 2.153(R)| SLOW |clk_BUFGP | 0.000|
in_data<25> | -0.408(R)| FAST | 2.139(R)| SLOW |clk_BUFGP | 0.000|
in_data<26> | -0.415(R)| FAST | 2.143(R)| SLOW |clk_BUFGP | 0.000|
in_data<27> | -0.429(R)| FAST | 2.154(R)| SLOW |clk_BUFGP | 0.000|
in_data<28> | -0.398(R)| FAST | 2.118(R)| SLOW |clk_BUFGP | 0.000|
in_data<29> | -0.357(R)| FAST | 2.055(R)| SLOW |clk_BUFGP | 0.000|
in_data<30> | -0.414(R)| FAST | 2.063(R)| SLOW |clk_BUFGP | 0.000|
in_data<31> | -0.447(R)| FAST | 2.100(R)| SLOW |clk_BUFGP | 0.000|
reset | 4.239(R)| SLOW | 1.428(R)| SLOW |clk_BUFGP | 0.000|
------------+------------+------------+------------+------------+------------------+--------+

Clock clk to Pad
-------------+-----------------+------------+-----------------+------------+------------------+--------+
Destination | Max (slowest) clk (edge) to PAD | Process Corner | Min (fastest) clk (edge) to PAD | Process Corner | Internal Clock(s) | Clock Phase |
-------------+-----------------+------------+-----------------+------------+------------------+--------+
out_data<0> | 7.486(R)| SLOW | 3.273(R)| FAST |clk_BUFGP | 0.000|
out_data<1> | 7.910(R)| SLOW | 3.474(R)| FAST |clk_BUFGP | 0.000|
out_data<2> | 7.914(R)| SLOW | 3.480(R)| FAST |clk_BUFGP | 0.000|
out_data<3> | 7.823(R)| SLOW | 3.441(R)| FAST |clk_BUFGP | 0.000|
out_data<4> | 7.549(R)| SLOW | 3.313(R)| FAST |clk_BUFGP | 0.000|
out_data<5> | 7.517(R)| SLOW | 3.287(R)| FAST |clk_BUFGP | 0.000|
out_data<6> | 7.887(R)| SLOW | 3.469(R)| FAST |clk_BUFGP | 0.000|
out_data<7> | 7.927(R)| SLOW | 3.487(R)| FAST |clk_BUFGP | 0.000|
out_data<8> | 7.781(R)| SLOW | 3.421(R)| FAST |clk_BUFGP | 0.000|
out_data<9> | 7.829(R)| SLOW | 3.447(R)| FAST |clk_BUFGP | 0.000|
out_data<10> | 7.816(R)| SLOW | 3.445(R)| FAST |clk_BUFGP | 0.000|
out_data<11> | 8.089(R)| SLOW | 3.576(R)| FAST |clk_BUFGP | 0.000|
out_data<12> | 7.242(R)| SLOW | 3.144(R)| FAST |clk_BUFGP | 0.000|
out_data<13> | 7.131(R)| SLOW | 3.092(R)| FAST |clk_BUFGP | 0.000|
out_data<14> | 7.091(R)| SLOW | 3.062(R)| FAST |clk_BUFGP | 0.000|
out_data<15> | 7.109(R)| SLOW | 3.075(R)| FAST |clk_BUFGP | 0.000|
remainder<0> | 7.891(R)| SLOW | 3.538(R)| FAST |clk_BUFGP | 0.000|
remainder<1> | 7.782(R)| SLOW | 3.489(R)| FAST |clk_BUFGP | 0.000|
remainder<2> | 7.709(R)| SLOW | 3.433(R)| FAST |clk_BUFGP | 0.000|
remainder<3> | 7.605(R)| SLOW | 3.402(R)| FAST |clk_BUFGP | 0.000|
remainder<4> | 7.388(R)| SLOW | 3.284(R)| FAST |clk_BUFGP | 0.000|
remainder<5> | 7.247(R)| SLOW | 3.218(R)| FAST |clk_BUFGP | 0.000|
remainder<6> | 7.243(R)| SLOW | 3.221(R)| FAST |clk_BUFGP | 0.000|
remainder<7> | 7.230(R)| SLOW | 3.215(R)| FAST |clk_BUFGP | 0.000|
remainder<8> | 7.441(R)| SLOW | 3.299(R)| FAST |clk_BUFGP | 0.000|
remainder<9> | 7.232(R)| SLOW | 3.208(R)| FAST |clk_BUFGP | 0.000|
remainder<10>| 7.240(R)| SLOW | 3.198(R)| FAST |clk_BUFGP | 0.000|
remainder<11>| 7.305(R)| SLOW | 3.235(R)| FAST |clk_BUFGP | 0.000|
remainder<12>| 7.249(R)| SLOW | 3.228(R)| FAST |clk_BUFGP | 0.000|
remainder<13>| 7.244(R)| SLOW | 3.221(R)| FAST |clk_BUFGP | 0.000|
remainder<14>| 7.203(R)| SLOW | 3.192(R)| FAST |clk_BUFGP | 0.000|
remainder<15>| 7.336(R)| SLOW | 3.255(R)| FAST |clk_BUFGP | 0.000|
remainder<16>| 7.429(R)| SLOW | 3.305(R)| FAST |clk_BUFGP | 0.000|
remainder<17>| 7.404(R)| SLOW | 3.326(R)| FAST |clk_BUFGP | 0.000|
-------------+-----------------+------------+-----------------+------------+------------------+--------+

Clock to Setup on destination clock clk
---------------+---------+---------+---------+---------+
Source Clock | Src:Rise Dest:Rise | Src:Fall Dest:Rise | Src:Rise Dest:Fall | Src:Fall Dest:Fall |
---------------+---------+---------+---------+---------+
clk | 4.845| | | |
---------------+---------+---------+---------+---------+


Analysis completed Tue Jan 06 14:20:46 2015
--------------------------------------------------------------------------------

Trace Settings:
-------------------------
Trace Settings

Peak Memory Usage: 446 MB

thanks & regards
 

Give us the timing report showing the path/slack from in_data to out_data. You will have to pick this path from the final timing report.
 

There is no slack to report: the OP hasn't created a UCF file, so there are no timing constraints. INFO:Timing:2698 says so.

- - - Updated - - -

Dipin,
I wasn't referring to IP. A netlist instantiates gates, LUTs, etc. Those primitives come from the simprim library.

In the past I've found that many of Xilinx's simulation primitives used #1 delays, which would mean that every layer of logic you go through might be delaying your result by another 1 ns. I typically use a 1ps/1ps timescale because of this.
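
To illustrate why the timescale matters for `#1` delays, here is a small sketch with two made-up modules (`unit_ns` and `unit_ps`, neither of which comes from the simprim library). The same `#1` means 1 ns under `1ns/1ps` and 1 ps under `1ps/1ps`. Whether this applies to a given library model depends on which `timescale directive is in effect when that model is compiled, so treat this as an illustration of the language rule only.

```verilog
// Hypothetical demo modules; both are intended to be simulated as top-level modules.
`timescale 1ns / 1ps
module unit_ns;
  // Under 1ns/1ps, #1 is a delay of one nanosecond.
  initial #1 $display("unit_ns: #1 elapsed, time = %0t", $realtime);
endmodule

`timescale 1ps / 1ps
module unit_ps;
  // Under 1ps/1ps, the same #1 is a delay of one picosecond.
  initial #1 $display("unit_ps: #1 elapsed, time = %0t", $realtime);
endmodule
```

The printed format of `%0t` depends on the simulator's global precision and any `$timeformat` call, so the numbers shown may be in picoseconds for both modules.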
 

hi,
There is no slack to report: the OP hasn't created a UCF file, so there are no timing constraints. INFO:Timing:2698 says so.

- - - Updated - - -

Dipin,
I wasn't referring to IP. A netlist instantiates gates, LUTs, etc. Those primitives come from the simprim library.

In the past I've found that many of Xilinx's simulation primitives used #1 delays, which would mean that every layer of logic you go through might be delaying your result by another 1 ns. I typically use a 1ps/1ps timescale because of this.

If it were a delay problem, the whole output would be delayed, right?
My problem is not about delay. My problem is that my output is coming early. In the waveform above the output should change at or after 190,000 ps (the rising edge of the clock), but it changes before that. I can't find any reason that explains it.
In my synthesis report the maximum frequency is 310 MHz.
I am currently using a 100 MHz clock.
This is the first time I am doing all of this, so I am in trouble now.
Is there any way to solve this?

Thanks & regards
 

It's highly likely that you are interpreting your netlist simulation results incorrectly. Without the files: RTL, netlist, and testbench I don't think anyone will be able to help determine what you are misinterpreting.

- - - Updated - - -

I suggest you run the simulation with a 1MHz clock, so that any # delay problems are insignificant in comparison to the clock period.
 

Hi,
Thanks, ads-ee.
I have simulated at 1 MHz.
In the behavioral simulation the output appears at the 10th clock, which is correct according to my design,

but in the post-place-and-route simulation it appears in the 9th clock cycle (which is wrong).
I have been stuck on this for almost 3 weeks.

Thanks & regards
 

Then surely you need to analyze this particular path. That is why I am asking you to look into the timing report.
 
In the behavioral simulation the output appears at the 10th clock, which is correct according to my design,
but in the post-place-and-route simulation it appears in the 9th clock cycle (which is wrong).

So how do you know which is correct? Did you design the pipeline with exactly 10 registers from input to output, or did you run a simulation and pronounce it 10? Could it be that you really only have a 9-register pipeline and your behavioral simulation is wrong?
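
One way to settle that question by measurement is to push a recognizable marker value into the pipeline and count rising edges until it shows up at the output. The sketch below assumes a made-up 3-register stand-in module `pipe3` (it is not the poster's `sqrt`), an arbitrary marker value, and a hypothetical `SETTLE` parameter for the wait after each edge.

```verilog
`timescale 1ns / 1ps

// Hypothetical stand-in for the DUT: a plain 3-register pipeline.
module pipe3 (input clk, input reset, input [7:0] d, output reg [7:0] q);
  reg [7:0] s1, s2;
  always @(posedge clk) begin
    if (reset) begin
      s1 <= 8'd0; s2 <= 8'd0; q <= 8'd0;
    end else begin
      s1 <= d; s2 <= s1; q <= s2;
    end
  end
endmodule

module tb_latency;
  parameter SETTLE = 1;        // ns to wait after each edge before checking q
  reg        clk   = 1'b0;
  reg        reset = 1'b1;
  reg  [7:0] d     = 8'd0;
  wire [7:0] q;
  integer    cycles;

  pipe3 dut (.clk(clk), .reset(reset), .d(d), .q(q));

  always #5 clk = ~clk;        // 100 MHz

  initial begin
    repeat (4) @(negedge clk);
    reset = 1'b0;
    // Drive the marker on a falling edge so it is stable around the next rising edge.
    @(negedge clk) d = 8'hA5;
    cycles = 0;
    // Count rising edges until the marker reaches the output.
    while (q !== 8'hA5) begin
      @(posedge clk);
      #(SETTLE);               // let the registers update before looking at q
      cycles = cycles + 1;
    end
    $display("marker appeared after %0d rising edges", cycles);
    $finish;
  end
endmodule
```

For this stand-in the count should match its three registers. Running the same kind of measurement against both the RTL and the netlist would show whether the two really differ by a cycle; for a netlist, `SETTLE` would have to be longer than the clock-to-output delay and shorter than the clock period.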

You've written your clock in a way that I have never used, and therefore I can't assess whether it is 100% correct. Perhaps it is causing delta-time issues between the testbench and the DUT (behavioral or netlist).

The standard method is to put the clock in an initial block (by itself) in a forever loop.
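
A minimal sketch of that clock style, assuming a made-up module name `tb_clock_sketch` and a hypothetical `HALF_PERIOD` parameter so the same testbench can be run at 100 MHz or slowed to 1 MHz:

```verilog
`timescale 1ns / 1ps

module tb_clock_sketch;
  // Half period in ns: 5 gives 100 MHz, 500 gives 1 MHz.
  parameter HALF_PERIOD = 5;

  reg clk;

  // Clock lives alone in its own initial block, toggled in a forever loop
  // with a blocking assignment.
  initial begin
    clk = 1'b0;
    forever #(HALF_PERIOD) clk = ~clk;
  end

  // Stop after ten full clock periods so the sketch terminates.
  initial begin
    #(20 * HALF_PERIOD);
    $display("ran 10 clock periods, time = %0t", $time);
    $finish;
  end
endmodule
```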
 

Hi,
Thank you for the reply, ads-ee.

Actually, I have 10 registers from input to output, including the output register.

Code:
output [Q:0]    out_data;
output [Q+1:0] remainder;


reg [Q:0]    out_data;
reg [Q+1:0] remainder;
And this is my updated testbench; the result is still the same.
Code:
`timescale 1ns / 1ps

////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Create Date: 10:48:40 01/06/2015
// Design Name: sqrt
// Module Name: C:/Users/dipin.divakar/Desktop/ipcoretest/ddd/test.v
// Project Name: ddd
// Target Device:
// Tool versions:
// Description:
//
// Verilog Test Fixture created by ISE for module: sqrt
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
////////////////////////////////////////////////////////////////////////////////

module test;

// Inputs
reg [31:0] in_data;
reg clk;
reg reset;
reg [31:0]count;

// Outputs
wire [16:0] out_data;
wire [17:0] remainder;

// Instantiate the Unit Under Test (UUT)
sqrt uut (
.in_data(in_data),
.clk(clk),
.reset(reset),
.out_data(out_data),
.remainder(remainder)
);

initial begin
// Initialize Inputs
in_data = 0;
clk = 0;
reset = 1;

// Wait 100 ns for global reset to finish
#100;
reset =0;
count=32'd40000;
// Add stimulus here

end

always@(posedge clk) begin

if(!reset) begin

in_data = count;
count = count+1;

end

end

initial begin
forever begin

#5 clk <=~clk;

end
end
endmodule

thanks and regards

- - - Updated - - -

Hi,
Timing report:

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
------------------
-----------------------------------+------------------------+-------+
Clock Signal | Clock buffer(FF name) | Load |
-----------------------------------+------------------------+-------+
clk | BUFGP | 320 |
-----------------------------------+------------------------+-------+

Asynchronous Control Signals Information:
----------------------------------------
No asynchronous control signals found in this design

Timing Summary:
---------------
Speed Grade: -1

Minimum period: 3.220ns (Maximum Frequency: 310.592MHz)
Minimum input arrival time before clock: 0.929ns
Maximum output required time after clock: 0.682ns
Maximum combinational path delay: No path found

Timing Details:
---------------
All values displayed in nanoseconds (ns)

=========================================================================
Timing constraint: Default period analysis for Clock 'clk'
Clock period: 3.220ns (frequency: 310.592MHz)
Total number of paths / destination ports: 268568 / 287
-------------------------------------------------------------------------
Delay: 3.220ns (Levels of Logic = 30)
Source: temp_in_data_347 (FF)
Destination: temp_in_data_348 (FF)
Source Clock: clk rising
Destination Clock: clk rising

Data Path: temp_in_data_347 to temp_in_data_348
Cell:in->out   fanout   Gate Delay   Net Delay   Logical Name (Net Name)
---------------------------------------- ------------
FDR:C->Q 7 0.280 0.517 temp_in_data_347 (temp_in_data_347)
LUT2:I0->O 1 0.053 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_lut<2> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_lut<2>)
MUXCY:S->O 1 0.219 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<2> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<2>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<3> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<3>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<4> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<4>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<5> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<5>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<6> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<6>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<7> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<7>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<8> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<8>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<9> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<9>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<10> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<10>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<11> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<11>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<12> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<12>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<13> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<13>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<14> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<14>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<15> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<15>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<16> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<16>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<17> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<17>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<18> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<18>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<19> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<19>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<20> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<20>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<21> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<21>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<22> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<22>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<23> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<23>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<24> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<24>)
MUXCY:CI->O 1 0.015 0.000 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<25> (Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_cy<25>)
XORCY:CI->O 2 0.180 0.718 Msub_temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT_xor<26> (temp_in_data[7][63]_temp_sub_result_7[31]_sub_55_OUT<26>)
LUT5:I0->O 1 0.053 0.000 Mcompar_GND_1_o_temp_in_data[7][63]_LessThan_52_o_lut<9> (Mcompar_GND_1_o_temp_in_data[7][63]_LessThan_52_o_lut<9>)
MUXCY:S->O 1 0.219 0.000 Mcompar_GND_1_o_temp_in_data[7][63]_LessThan_52_o_cy<9> (Mcompar_GND_1_o_temp_in_data[7][63]_LessThan_52_o_cy<9>)
MUXCY:CI->O 28 0.015 0.568 Mcompar_GND_1_o_temp_in_data[7][63]_LessThan_52_o_cy<10> (GND_1_o_temp_in_data[7][63]_LessThan_52_o)
LUT6:I5->O 1 0.053 0.000 temp_in_data[7][31]_temp_in_data[7][31]_MUX_791_o1 (temp_in_data[7][31]_temp_in_data[7][31]_MUX_791_o)
FDR:D -0.012 temp_in_data_358
----------------------------------------
Total 3.220ns (1.417ns logic, 1.803ns route)
(44.0% logic, 56.0% route)

=========================================================================
Timing constraint: Default OFFSET IN BEFORE for Clock 'clk'
Total number of paths / destination ports: 312 / 312
-------------------------------------------------------------------------
Offset: 0.929ns (Levels of Logic = 1)
Source: reset (PAD)
Destination: temp_in_data<1>_30 (FF)
Destination Clock: clk rising

Data Path: reset to temp_in_data<1>_30
Cell:in->out   fanout   Gate Delay   Net Delay   Logical Name (Net Name)
---------------------------------------- ------------
IBUF:I->O 280 0.003 0.606 reset_IBUF (reset_IBUF)
FDR:R 0.320 temp_in_data<1>_30
----------------------------------------
Total 0.929ns (0.323ns logic, 0.606ns route)
(34.8% logic, 65.2% route)

=========================================================================
Timing constraint: Default OFFSET OUT AFTER for Clock 'clk'
Total number of paths / destination ports: 34 / 34
-------------------------------------------------------------------------
Offset: 0.682ns (Levels of Logic = 1)
Source: out_data_15 (FF)
Destination: out_data<15> (PAD)
Source Clock: clk rising

Data Path: out_data_15 to out_data<15>
Cell:in->out   fanout   Gate Delay   Net Delay   Logical Name (Net Name)
---------------------------------------- ------------
FDR:C->Q 1 0.280 0.399 out_data_15 (out_data_15)
OBUF:I->O 0.003 out_data_15_OBUF (out_data<15>)
----------------------------------------
Total 0.682ns (0.283ns logic, 0.399ns route)
(41.5% logic, 58.5% route)

=========================================================================

Cross Clock Domains Report:
--------------------------

Clock to Setup on destination clock clk
---------------+---------+---------+---------+---------+
Source Clock | Src:Rise Dest:Rise | Src:Fall Dest:Rise | Src:Rise Dest:Fall | Src:Fall Dest:Fall |
---------------+---------+---------+---------+---------+
clk | 3.220| | | |
---------------+---------+---------+---------+---------+

=========================================================================


Total REAL time to Xst completion: 25.00 secs
Total CPU time to Xst completion: 24.86 secs

-->

Total memory usage is 298096 kilobytes

Number of errors : 0 ( 0 filtered)
Number of warnings : 2145 ( 0 filtered)
Number of infos : 113 ( 0 filtered)
I think this is the timing report sharath666 asked for.


Regards
 

I was expecting a path from in_data to out_data. You have to look at the post-P&R report to find out what is actually happening.
 

Actually, I have 10 registers from input to output, including the output register.
Code:
output [Q:0]    out_data;
output [Q+1:0] remainder;


reg [Q:0]    out_data;
reg [Q+1:0] remainder;

Really? Not according to your specification of IN_WIDTH.
dipin said:
Code:
parameter IN_WIDTH = 31; // INPUT WIDTH
parameter OUT_WIDTH = IN_WIDTH >> 1;
parameter IN_CAL = IN_WIDTH >> 2;            
parameter N = 4*(IN_CAL+1); 
parameter Q = N >> 1;

So I see the following:
IN_WIDTH: 31
OUT_WIDTH: 31 >> 1 = 15
IN_CAL: 31 >> 2 = 7
N: 4*(7+1) = 32
Q: 32 >> 1 = 16

10 is not equal to 16, so what was your point with that code snippet? It certainly doesn't prove you designed a 10 stage pipeline. It only proves you don't know how to tell us how many pipeline stages your design has.
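
The arithmetic above can be checked by letting the simulator evaluate the same parameter expressions. This is a small sketch with a made-up module name (`sqrt_params_check`); the expressions are copied from the posted `sqrt` module, and the index printed on the last line is the `xa_out_data[IN_CAL+1]` element that the posted code assigns to `out_data`.

```verilog
module sqrt_params_check;
  // Same parameter expressions as in the posted sqrt module.
  parameter IN_WIDTH  = 31;
  parameter OUT_WIDTH = IN_WIDTH >> 1;
  parameter IN_CAL    = IN_WIDTH >> 2;
  parameter N         = 4*(IN_CAL+1);
  parameter Q         = N >> 1;

  initial begin
    $display("IN_WIDTH=%0d OUT_WIDTH=%0d IN_CAL=%0d N=%0d Q=%0d",
             IN_WIDTH, OUT_WIDTH, IN_CAL, N, Q);
    // Port widths that follow from the declarations [Q:0] and [Q+1:0].
    $display("out_data width  = %0d bits", Q + 1);
    $display("remainder width = %0d bits", Q + 2);
    // Index of the pipeline array element that feeds out_data.
    $display("out_data <= xa_out_data[%0d]", IN_CAL + 1);
    $finish;
  end
endmodule
```

The port widths say nothing about pipeline depth. The array index that feeds `out_data` is probably a better starting point for counting stages, although the elided part of the design would be needed to be sure.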

I am giving up with trying to help you, since you keep supplying incorrect and/or useless/almost_useless information.

If you really want help (it will have to be from someone other than me)...post a zip file with all of your code including the testbench(s), and the netlist you've been trying to run. I've wasted too much time trying to get you to give me some information to work with, so I can determine what your problem is. All I've been doing so far is trying to guess what you have done in your design, which has been very unproductive.

To put it another way, I probably would have found the problem in under an hour if I could have run a simulation of your RTL and the netlist. So your 3 weeks of unproductive effort would have been reduced to 1-2 days of effort. Unfortunately you'll have to get someone else to do that.

Good luck
 

Hi,

I have a problem with my design. When I run a post-route simulation, the output arrives earlier than expected.

View attachment 112930

In this waveform the output should change on the next rising edge of the clock (190,000 ps), but it changes before that. How is that possible?

If the output were delayed, that would be fine, but it is arriving early.
How can I explain this?

It's coming out early because the input 'in_data' is violating the setup time requirement. You need to generate in_data in a realistic manner which means that it will be switching some number of nanoseconds after the rising edge of the clock, not one simulation delta (i.e. 0 ns) after the clock.

Kevin Jennings
 

Kevin,

I'm willing to respond to your post ;-)
K-J said:
It's coming out early because the input 'in_data' is violating the setup time requirement. You need to generate in_data in a realistic manner which means that it will be switching some number of nanoseconds after the rising edge of the clock, not one simulation delta (i.e. 0 ns) after the clock.
That really can't be determined. The OP does generate the clock to the DUT and to the in_data input based on the same clock, but as they use the wrong type of assignment they may see problems with how the in_data gets sampled in either the RTL or the netlist. I've pointed this out in their other thread (which I'm too lazy to make a link to). I've also mentioned in this thread that they are probably misinterpreting their simulation results.

Blocking assignments should never be used in an edge sensitive always block (which I've mentioned before). Of course like most answers in this thread nothing was done about it.

Code:
always@(posedge clk) begin
  if(!reset) begin
    in_data = count;
    count = count+1;
  end
end



Basically the OP doesn't know how to examine a pipeline in a waveform viewer or interpret what the simulation waveform is showing them. Given that it has been 3 weeks and the problem is still not resolved, I think the OP doesn't have the "tools" to make it as an engineer. Not everyone is cut out to be an engineer.
 

Kevin,

I'm willing to respond to your post ;-)

That really can't be determined.
I determined it simply by looking at the waveforms that were posted. While the outputs were changing at various times after the clock, the input was coincident with the rising edge, which implies that it would be violating setup, hold, or both.

The OP does generate the clock to the DUT and to the in_data input based on the same clock, but as they use the wrong type of assignment they may see problems with how the in_data gets sampled in either the RTL or the netlist.
Not disagreeing, but that has nothing to do with how the testbench creates the input stimulus which at present must be violating the setup/hold time requirements of the design. Until that is cleaned up, one should not expect pre-route sim to match post-route sim.

Simply put, unless you model delays in the testbench, you won't be able to use the same testbench for a post-route sim as you were using for the pre-route sim.

We should probably stop now and let the OP chew on it for a fortnight or two.

Kevin Jennings
 

hi,
It's coming out early because the input 'in_data' is violating the setup time requirement. You need to generate in_data in a realistic manner which means that it will be switching some number of nanoseconds after the rising edge of the clock, not one simulation delta (i.e. 0 ns) after the clock.
I had tried this one, but everything is the same.

That really can't be determined. The OP does generate the clock to the DUT and to the in_data input based on the same clock, but as they use the wrong type of assignment they may see problems with how the in_data gets sampled in either the RTL or the netlist. I've pointed this out in their other thread (which I'm too lazy to make a link to). I've also mentioned in this thread that they are probably misinterpreting their simulation results.
I had tried with nonblocking assignments as well; the output still remains the same.

Another thing: I added to the waveform window the first two registers, where the input from the testbench enters the design. From that I can see that the data reaches the input registers early, and every register in the design changes at that same instant (on the negative edge).


To ads-ee:
I will post it here. Can you please check it?

Thanks & regards
 

I had tried this one, but everything is the same.
Post the same waveforms from the original post showing that you delayed in_data by more than 2.1ns relative to clk. Post another showing that reset has been delayed by more than 1.428 ns after clk.

Kevin Jennings
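
A minimal sketch of stimulus delayed relative to the clock, assuming the signal names of the posted testbench, no DUT attached, and a hypothetical `TB_DELAY` parameter. The 3 ns value is an assumption chosen against the posted trace report: it is above the largest hold figures listed there (2.238 ns for in_data, 1.428 ns for reset) and, with a 10 ns period, leaves 7 ns before the next edge, which is above the largest setup figure (4.239 ns for reset).

```verilog
`timescale 1ns / 1ps

// Hypothetical stand-alone sketch: stimulus only, no DUT instantiated.
module tb_delayed_stimulus;
  // Assumed delay in ns between the rising clock edge and a stimulus change.
  parameter real TB_DELAY = 3.0;

  reg        clk     = 1'b0;
  reg        reset   = 1'b1;
  reg [31:0] in_data = 32'd0;
  reg [31:0] count   = 32'd40000;

  always #5 clk = ~clk;        // 100 MHz, 10 ns period

  // Intra-assignment delay: count is sampled at the edge, and in_data
  // changes TB_DELAY ns later, like an upstream register with clock-to-out delay.
  always @(posedge clk) begin
    if (!reset) begin
      in_data <= #(TB_DELAY) count;
      count   <= count + 1;
    end
  end

  initial begin
    repeat (10) @(posedge clk);
    #(TB_DELAY) reset = 1'b0;  // reset also released TB_DELAY ns after an edge
    repeat (3) @(posedge clk);
    #(TB_DELAY + 1) $finish;
  end

  // Print each change of the stimulus with its timestamp.
  initial $monitor("%0t reset=%b in_data=%0d", $time, reset, in_data);
endmodule
```

The same testbench can then drive both the pre-route and the post-route simulation, since the inputs no longer change at the exact instant of the clock edge.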
 
