The transistor netlist will match regardless of routing, but the parasitic burden varies in size and distribution as you vary the constraints you route to. Perhaps your "simple" config lets things (i.e. detailed timing) slide to the point that this affects outcomes.
My first concern in such a situation would be that what's been made is routing that is functional at low speed but marginal at speed (this can pass many checks). Try the best and worst timing corners to expose that: if the results differ between corners (when strobed such that, generally, things line up), the difference points to a timing margin problem rather than a functional one.
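
If it helps, here is a rough sketch of how that corner comparison could be automated once you have the strobed values out of each simulation. It assumes you have already exported, per signal, the value captured at each strobe point for the best-corner and worst-corner runs. The function name `corner_mismatches`, the dict layout, and the sample data are all made up for illustration and are not tied to any particular simulator or tool.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: signal name -> values captured at each strobe point.
Samples = Dict[str, List[int]]


def corner_mismatches(best: Samples, worst: Samples) -> List[Tuple[str, int, int, int]]:
    """Return (signal, strobe_index, best_value, worst_value) for every disagreement."""
    mismatches = []
    # Only compare signals present in both runs, in a stable order.
    for name in sorted(best.keys() & worst.keys()):
        for i, (b, w) in enumerate(zip(best[name], worst[name])):
            if b != w:
                mismatches.append((name, i, b, w))
    return mismatches


# Made-up strobed data, purely illustrative.
best = {"q0": [0, 1, 1, 0], "q1": [1, 1, 0, 0]}
worst = {"q0": [0, 1, 1, 0], "q1": [1, 0, 0, 0]}

for name, i, b, w in corner_mismatches(best, worst):
    print(f"{name} strobe {i}: best={b} worst={w}")
```

An empty result only says the two corners agree at the strobe points you chose. Any mismatch is a place to go look at the extracted parasitics on that path.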