The skid buffer has no "functional" purpose. It is inserted to improve timing, like "normal" pipelining.
The problem is that one signal (ready) goes in the other direction, from the slave back to the master. To get the full timing improvement, that signal also needs a register. The master will then see ready one clock cycle "too late" when the slave sets it low, so it can send one more word that the slave will not accept.
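If there is only one register in the skid buffer, the ready signal going back to the master can only be active for one clock cycle at a time. Once the register holds a word, the buffer has to take ready low until that word has been passed on, because the slave can set ready=0 at any time and the master would see it too late to stop the next word.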
This means that the ready going back to the master must be 0 every second clock cycle, even if the slave holds ready=1 all the time. The throughput is cut in half.
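
The halved throughput can be checked with a small cycle-level model. This is only a sketch under assumptions: the master always has a word to send (0, 1, 2, ...), the ready seen by the master comes straight from a register, and the names simulate_single_register, slave_ready and ready_trace are made up for the illustration.

```python
def simulate_single_register(slave_ready, n_cycles):
    """Hypothetical model: one data register, registered ready to the master.

    slave_ready(cycle) returns the slave's ready for that cycle.
    Returns the words taken by the slave and the per-cycle ready to the master.
    """
    reg, full = None, False   # the single data register and its occupancy
    ready_out = True          # registered ready seen by the master (empty at reset)
    next_word = 0             # assumption: the master always has data to send
    received = []
    ready_trace = []

    for cycle in range(n_cycles):
        ready_trace.append(ready_out)
        out_fire = full and slave_ready(cycle)   # slave takes the stored word
        in_fire = ready_out                      # master writes when it sees ready=1

        if out_fire:
            received.append(reg)
            full = False
        if in_fire:
            reg, full = next_word, True
            next_word += 1

        # Next cycle's ready is decided now: it may only be 1 if the register
        # is certain to be empty, since the slave's next ready is unknown.
        ready_out = not full

    return received, ready_trace


if __name__ == "__main__":
    words, ready = simulate_single_register(lambda cycle: True, 10)
    print(len(words), "words in 10 cycles")
    print(ready)
```

In this model, with the slave ready in every cycle, the ready going back to the master alternates between 1 and 0, and only every second cycle carries a word.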
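The skid buffer with two registers solves that problem. The second register can accept the one extra write that the master makes after the slave sets ready=0, so the ready going back to the master can stay at 1 for as long as the slave keeps accepting data.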
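
A two-register skid buffer can be modelled the same way. Again this is a sketch, not a reference design: the master is assumed to always have a word to send, and the names simulate_skid_buffer, main and skid are hypothetical. The ready to the master is taken directly from the "skid register full" flag, so it is still a registered signal.

```python
def simulate_skid_buffer(slave_ready, n_cycles):
    """Hypothetical model of a two-register skid buffer.

    slave_ready(cycle) returns the slave's ready for that cycle.
    Returns the words taken by the slave, in order.
    """
    main, main_full = None, False   # register that drives the slave
    skid, skid_full = None, False   # extra register that catches the late word
    next_word = 0                   # assumption: the master always has data to send
    received = []

    for cycle in range(n_cycles):
        ready_out = not skid_full                    # comes straight from a flip-flop
        out_fire = main_full and slave_ready(cycle)  # slave takes the main word
        in_fire = ready_out                          # master writes when it sees ready=1

        if out_fire:
            received.append(main)

        if out_fire or not main_full:
            # The main register is free at the end of this cycle.
            if skid_full:
                main, main_full = skid, True         # drain the skid register first
                skid_full = False
            elif in_fire:
                main, main_full = next_word, True    # normal pass-through
            else:
                main_full = False
        elif in_fire:
            # Slave stalled, but the master has not seen it yet:
            # the extra word lands in the skid register.
            skid, skid_full = next_word, True

        if in_fire:
            next_word += 1

    return received


if __name__ == "__main__":
    print(simulate_skid_buffer(lambda cycle: True, 10))
    print(simulate_skid_buffer(lambda cycle: cycle not in (3, 4), 10))
```

With the slave always ready, this model delivers one word per cycle after the first cycle of latency. When the slave stalls, the word already on its way is held in the skid register and delivered in order once the slave is ready again.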