Single-path architectures have to compromise speed versus matching, because what good matching needs (low Vdsat, large area, low current density) is exactly the opposite of what speed needs (high fT, low parasitic capacitance, and so on).
But if the input capacitance is not critical, you can combine two paths with different optimizations. For example, you can use a speed-optimized two-stage design together with a high-gain, low-offset three-stage design, with both sharing a common output stage.
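To put rough numbers on the speed-versus-matching trade-off described above, here is a minimal sketch using two textbook first-order models: Pelgrom's mismatch law, sigma(dVth) = A_VT / sqrt(W*L), and the long-channel square-law estimate fT ≈ 3*mu*Vov / (4*pi*L^2). The function names, the A_VT and mobility values, and the two device sizes are illustrative assumptions, not values from this thread or from any particular process.

```python
import math

def sigma_vth_mv(a_vt_mv_um, w_um, l_um):
    """Pelgrom mismatch: sigma(dVth) in mV for a device of W x L (um)."""
    return a_vt_mv_um / math.sqrt(w_um * l_um)

def ft_hz(mu_cm2_per_vs, vov_v, l_um):
    """Square-law estimate fT = gm / (2*pi*Cgs) = 3*mu*Vov / (4*pi*L^2)."""
    mu = mu_cm2_per_vs * 1e-4   # cm^2/Vs -> m^2/Vs
    l = l_um * 1e-6             # um -> m
    return 3.0 * mu * vov_v / (4.0 * math.pi * l * l)

# Assumed process numbers (hypothetical, for illustration only).
A_VT = 5.0    # mV*um
MU_N = 400.0  # cm^2/Vs

# "Matching" device: large area, long channel, low overdrive.
match_dev = {"w": 100.0, "l": 2.0, "vov": 0.1}
# "Speed" device: small area, short channel, higher overdrive.
fast_dev = {"w": 10.0, "l": 0.2, "vov": 0.3}

for name, d in (("matching", match_dev), ("speed", fast_dev)):
    s = sigma_vth_mv(A_VT, d["w"], d["l"])
    f = ft_hz(MU_N, d["vov"], d["l"])
    print(f"{name:9s} sigma(Vth) = {s:.2f} mV, fT = {f / 1e9:.2f} GHz")
```

The square-law model overestimates fT for short channels, so only the trend should be read from this: the device sized for low offset is much slower, which is why splitting the job between two paths can help.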
I will use this opamp in a sample-and-hold, so I am thinking of using the two-path approach and will try it. However, the clocks of the sample-and-hold switches increase the opamp settling time. I think the cause is clock feedthrough and charge sharing. I added dummy switches to compensate, but that did not solve the problem.
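A rough first-order estimate of the two switch effects may help to judge how large the disturbance is that the opamp has to recover from. The sketch below uses the usual hand formulas for an NMOS switch turning off: clock feedthrough through the gate overlap capacitance, dV = -Vclk * Cov / (Cov + CH), and channel charge injection, dV = -k * W*L*Cox*Vov / CH, with k the fraction of channel charge that ends up on the hold capacitor. All names and numbers are assumptions for illustration, not values from my circuit.

```python
def clock_feedthrough_v(c_ov_f, c_hold_f, v_clk_swing_v):
    """Voltage step on the hold cap from the falling clock edge
    coupling through the gate overlap capacitance (capacitive divider)."""
    return -v_clk_swing_v * c_ov_f / (c_ov_f + c_hold_f)

def channel_charge_step_v(w_um, l_um, cox_f_per_um2, vov_v, c_hold_f, split=0.5):
    """Voltage step from channel charge dumped onto the hold cap.
    split is the assumed fraction going to the hold side (0.5 is the
    usual assumption behind a half-size dummy switch)."""
    q_ch = w_um * l_um * cox_f_per_um2 * vov_v  # total channel charge, C
    return -split * q_ch / c_hold_f

# Hypothetical example values.
C_HOLD = 1e-12  # 1 pF hold capacitor
C_OV = 1e-15    # 1 fF gate overlap capacitance
V_CLK = 1.8     # clock swing, V
COX = 8e-15     # 8 fF/um^2
W, L = 10.0, 0.18
VOV = 1.0       # switch overdrive when on, V

dv_cft = clock_feedthrough_v(C_OV, C_HOLD, V_CLK)
dv_inj = channel_charge_step_v(W, L, COX, VOV, C_HOLD)
print(f"clock feedthrough: {dv_cft * 1e3:.2f} mV")
print(f"charge injection : {dv_inj * 1e3:.2f} mV")
```

The split fraction is not fixed at 0.5 in practice; it depends on the clock fall time and on the impedances on both sides of the switch, which is one possible reason why a half-size dummy switch cancels only part of the step.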