See this timing diagram from the PIC24 reference manual, which illustrates the clock polarity and sampling time alternatives:
View attachment 178944
The diagram shows 8-bit SPI operation; the device also supports 16-bit frames.
The diagram shows 8 clock cycles separated by dashed vertical lines. Depending on the SPI mode, SCK does not necessarily change at the start of the first cycle, but SDO is always set at a cycle boundary. SDI is sampled either in the middle of the cycle (SMP = 0, normal operation) or, delayed, at the end of the cycle (SMP = 1).
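To make the timing concrete, here is a minimal sketch that lists when SDO changes and when SDI is sampled within one frame, in units of one SCK period. It only models the description above, not the PIC24 hardware; the function name `frame_events` and its parameters are made up for illustration.

```python
def frame_events(smp: int, bits: int = 8):
    """Return (sdo_times, sdi_times) for one frame.

    Times are in units of one SCK period, counted from the frame start.
    Hypothetical model of the description above, not vendor code.
    """
    if smp not in (0, 1):
        raise ValueError("smp must be 0 or 1")
    # SDO is always updated at the cycle boundary.
    sdo_times = [float(n) for n in range(bits)]
    # SMP = 0: sample in the middle of the cycle; SMP = 1: at the cycle end.
    offset = 0.5 if smp == 0 else 1.0
    sdi_times = [n + offset for n in range(bits)]
    return sdo_times, sdi_times
```

With `smp=0` the first sample falls at 0.5 SCK periods, with `smp=1` at 1.0, which is the same instant at which the next bit is put on SDO.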
I see that I caused confusion by mentioning "additional internal clock cycles". They do in fact exist in most microcontroller SPI interfaces, but as fast system clock cycles, not as divided SCK clock cycles. In most implementations, the whole SPI interface is designed as synchronous logic clocked by the system clock ("bus clock" in the NXP block diagram), but the simplified diagram abstracts away the internal clock and appears to use only the baud rate generator output clock. If the logic really uses only a divided clock, that clock must run at no less than 2*SCK frequency.
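The factor of 2 can be illustrated with a small simulation of synchronous logic that derives SCK from a faster internal clock. This is a rough sketch under my own assumptions (an even divider, SCK toggled by a counter), not the actual PIC24 or NXP implementation; `generate_sck` is a hypothetical name.

```python
def generate_sck(divider: int, sys_ticks: int, idle_level: int = 0):
    """Return the SCK level after each internal clock tick.

    divider = internal clock cycles per SCK period (assumed even).
    Each SCK period needs two edges, so divider cannot be below 2.
    """
    if divider < 2 or divider % 2:
        raise ValueError("divider must be even and at least 2")
    half = divider // 2
    level = idle_level
    count = 0
    trace = []
    for _ in range(sys_ticks):
        count += 1
        if count == half:   # half an SCK period has elapsed
            level ^= 1      # toggle SCK on this internal clock edge
            count = 0
        trace.append(level)
    return trace
```

With `divider=2`, SCK toggles on every internal clock tick, which is the fastest SCK this kind of logic can produce.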