I suspect something happens between the 2nd and 3rd trigger that results in a FIFO overflow condition, which obviously never recovers. Have you run it enough times to determine whether there is any periodicity in the slippage?
This is why I suggested trying a ping-pong buffer with the read and write pointers always starting at address 0 so the data is always aligned with the t=0 position. Can't have slippage in that type of design.
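To make the idea concrete, here is a rough behavioral model in Python (not HDL, and not your design). The class name, the `depth` parameter and the rule that extra samples are dropped once a frame is full are all my own assumptions for illustration. The point is only that the t=0 trigger forces the write address back to 0 and the reader always starts at address 0 of a completed bank, so a frame can't come out shifted.

```python
class PingPongBuffer:
    """Behavioral sketch of a two-bank (ping-pong) capture buffer.

    Hypothetical model: names and the drop-when-full policy are assumptions.
    """

    def __init__(self, depth):
        self.depth = depth
        self.banks = [[None] * depth, [None] * depth]
        self.write_bank = 0     # bank currently being filled
        self.write_addr = 0     # write pointer within that bank
        self.ready_bank = None  # last completely filled bank, if any

    def trigger(self):
        """t=0 trigger: restart the write pointer at address 0."""
        self.write_addr = 0

    def write(self, sample):
        """Store one sample; returns False if the frame is already full."""
        if self.write_addr >= self.depth:
            return False  # assumption: samples past the frame end are dropped
        self.banks[self.write_bank][self.write_addr] = sample
        self.write_addr += 1
        if self.write_addr == self.depth:
            # Bank is full: hand it to the reader and swap banks.
            self.ready_bank = self.write_bank
            self.write_bank ^= 1
        return True

    def read_frame(self):
        """Reader always starts at address 0 of the completed bank."""
        if self.ready_bank is None:
            return None
        frame = list(self.banks[self.ready_bank])
        self.ready_bank = None
        return frame


buf = PingPongBuffer(depth=4)
buf.trigger()
for s in range(4):
    buf.write(s)
first = buf.read_frame()     # sample 0 sits at index 0

buf.trigger()
buf.write(10)
buf.write(11)                # partial frame, interrupted
buf.trigger()                # next trigger realigns to address 0
for s in (20, 21, 22, 23):
    buf.write(s)
second = buf.read_frame()    # still aligned: 20 sits at index 0
```

Even when a frame is cut short, the next trigger restarts at address 0, so the error can't accumulate from one trigger to the next the way it can with a free-running FIFO.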
You still haven't indicated whether you've added any diagnostics to the design to determine if you're getting overflow/underflow in your FIFOs. That is the first thing I would do. Just run the signals to a spare pin and use a scope with persistence to see whether you get any overflow or underflow.
If you have Chipscope/SignalTap I would connect ILAs to the FIFO write and FIFO read logic, FIFO flags, data DMA requests from the PCIe core and use a synchronized (to both clock domains) version of the t=0 trigger as a capture trigger.
I get the feeling you haven't had a lot of experience debugging designs in hardware; it's a skill you should strive to develop. I'm assuming you have a TB for the design and it worked fine there? If so, try modifying the stimulus to more closely match the hardware if you can (I suspect there is a mismatch between the TB stimulus and the actual PCIe transactions).
- - - Updated - - -
BTW, did you use the core generator to build a dual-clock asynchronous FIFO? You didn't use a single-clock FIFO by mistake? Also, if you did add/use the FIFO flags like empty/full: empty should only be used in the read clock domain and full should only be used in the write clock domain. If you use them in the wrong domain you might see issues with data slippage. You really need to add overflow/underflow detection and send it to pins that you can monitor.
Not trying to be condescending, just want to make sure you understand correctly, as I just checked your profile and found out you're still a student.
Overflow detection would be done by looking for writes when the FIFO is full, in the write clock domain.
Underflow detection would be done by looking for reads when the FIFO is empty, in the read clock domain.
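As a rough sketch of that logic, here is a small Python behavioral model (again, not HDL). The names `wr_en`, `full`, `rd_en` and `empty` are placeholders for whatever your FIFO ports are called, and making the flags sticky is my own assumption: a latched flag is easier to catch on a pin than a single-cycle pulse.

```python
class FifoFlagMonitor:
    """Sticky overflow/underflow detectors, one per clock domain.

    Hypothetical model: signal names and the sticky behavior are assumptions.
    """

    def __init__(self):
        self.overflow = False   # would live in the write clock domain
        self.underflow = False  # would live in the read clock domain

    def on_write_clock(self, wr_en, full):
        # Overflow: a write attempted while the FIFO reports full.
        if wr_en and full:
            self.overflow = True

    def on_read_clock(self, rd_en, empty):
        # Underflow: a read attempted while the FIFO reports empty.
        if rd_en and empty:
            self.underflow = True

    def clear(self):
        # Models a reset of both sticky flags.
        self.overflow = False
        self.underflow = False


mon = FifoFlagMonitor()
mon.on_write_clock(wr_en=True, full=False)   # normal write, no flag
mon.on_write_clock(wr_en=True, full=True)    # write while full sets overflow
mon.on_write_clock(wr_en=False, full=False)  # flag stays set afterwards
mon.on_read_clock(rd_en=True, empty=False)   # normal read, no flag
```

Each detector only looks at signals from its own clock domain, which is the same rule as for the full and empty flags themselves.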