This is somewhere between a Class E PA and a DC-DC
converter output stage, I guess.
You definitely will want some sort of anti-shoot-through
to keep from burning all your power before it gets off chip.
I use "ballistic" (designed asymmetry) mostly, and just
hand-fiddle the taper until the timing lines up right at
the slow/hot corner.
I'd say you want the final stage and both the N and P
predrivers to be near zero shoot-through.
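
For a rough feel of how much asymmetry that takes, here
is a minimal sketch. It treats each power FET gate as a
single-pole RC node and checks, for a falling output
edge, that the PMOS is off before the NMOS starts to
conduct. The function names, time constants and
thresholds are all made up for illustration; a real
check belongs in the simulator at the slow/hot corner.

```python
import math

def time_to_cross(tau, v_start, v_end, v_th):
    """Time for a single-pole RC node heading v_start -> v_end to reach v_th."""
    return tau * math.log((v_end - v_start) / (v_end - v_th))

def dead_time(tau_off, tau_on, vdd, vth_n, vth_p):
    """Margin in seconds between PMOS turn-off and NMOS turn-on.

    Both gates are assumed to start rising from 0 V at the same instant.
    A positive result means no overlap in this crude model.
    """
    # PMOS gate rises 0 -> VDD; the device is off once Vgate > VDD - |Vth_p|.
    t_p_off = time_to_cross(tau_off, 0.0, vdd, vdd - vth_p)
    # NMOS gate rises 0 -> VDD; the device starts conducting once Vgate > Vth_n.
    t_n_on = time_to_cross(tau_on, 0.0, vdd, vth_n)
    return t_n_on - t_p_off

# Hypothetical numbers: 3.3 V rail, 0.7 V thresholds.
skewed = dead_time(tau_off=20e-12, tau_on=200e-12, vdd=3.3, vth_n=0.7, vth_p=0.7)
symmetric = dead_time(tau_off=50e-12, tau_on=50e-12, vdd=3.3, vth_n=0.7, vth_p=0.7)
print(f"skewed margin:    {skewed * 1e12:.1f} ps")
print(f"symmetric margin: {symmetric * 1e12:.1f} ps")
```

With equal time constants the margin comes out negative
in this model, which is the overlap you are trying to
design out.
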
I would use multiple bond wires and try to fly power &
ground into the interior of the power stage if allowed.
Using separate bonds for the high and low side drive
legs could be a good idea too; it helps isolate the
device that's supposed to be turning off from the
output dV/dt (somewhat).
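
To put a number on why multiple bonds help, a quick
L*di/dt estimate is enough. This sketch assumes ideal
paralleling (no mutual inductance between wires, so it
is optimistic) and uses made-up values for wire
inductance, current step and edge time.

```python
def supply_bounce(l_wire_h, n_wires, delta_i_a, t_edge_s):
    """Approximate L*di/dt bounce in volts across n parallel bond wires.

    Assumes the wires share current equally and ignores mutual inductance.
    """
    if n_wires < 1:
        raise ValueError("need at least one bond wire")
    l_eff = l_wire_h / n_wires
    return l_eff * delta_i_a / t_edge_s

# Hypothetical: 1 nH per wire, 1 A current step in 1 ns.
for n in (1, 2, 4):
    print(n, "wire(s):", supply_bounce(1e-9, n, 1.0, 1e-9), "V")
```
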
125MHz full swing is going to be a challenge. On 0.5um
SOI I've gotten to 1GHz full swing in very limited logic
and 400MHz chip-scale clocks, but nothing like a power
driver, and I had to stick to a 2:1 taper to make it go
(of course, if you're not stuck with the MIL temp range,
that's a nice bonus). I think on JI technology you are
at about the same place. On that same 0.5 SOI node
others have made 2GHz Class E PAs at about 30%
efficiency. On this node I've done DC-DCs to 10A at
85% efficiency at 5MHz, but it's starting to fade
there. Note this is a 3.3V rated technology, not 5V.
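
The 2:1 taper has a price in stage count, which is easy
to estimate. A minimal sketch, assuming a fixed
stage-to-stage ratio and a hypothetical 1000:1 ratio
between the power FET gate capacitance and the first
buffer's input capacitance:

```python
import math

def stages_needed(c_load, c_in, taper):
    """Number of buffer stages to step from c_in up to c_load at a fixed taper."""
    if taper <= 1.0:
        raise ValueError("taper must be greater than 1")
    return max(1, math.ceil(math.log(c_load / c_in) / math.log(taper)))

# Hypothetical 1000:1 capacitance ratio; only the ratio matters here.
for taper in (2.0, 3.0, 4.0):
    print(f"taper {taper}: {stages_needed(1000.0, 1.0, taper)} stages")
```

For that ratio a 2:1 taper needs 10 stages against 5 at
4:1, so the sharper edges cost you predriver area and
power.
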
Also, forget 5V at a 0.5um gate length unless they
managed to engineer the hell out of the drains.
Otherwise, expect to be using a thick-oxide I/O device
with an L longer than the advertised 0.5um.
Your alternative is to go to a shorter node like
0.25um and stack FETs. That is as short as you can
go and still stand off 5.5V with a stack of 2, and
you're probably pushing the well voltage limits (the
rated ones, at least).
You will gain a lot of speed but also need to split
the powertrain into high side and low side, make a
stiff-enough midrail, have level shifting that comes up
clean and so on.
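
The stacking arithmetic is simple enough to script. This
sketch assumes the stack shares voltage equally (which
is the midrail's job) and assumes a 2.75V per-device
limit, i.e. a 2.5V nominal device plus 10%. Check your
own PDK for the real rating.

```python
def per_device_voltage(v_supply, n_stack):
    """Voltage across each device if the stack shares the supply equally."""
    return v_supply / n_stack

def stack_ok(v_supply, n_stack, v_rated):
    """True if equal sharing keeps every device at or under its rating."""
    return per_device_voltage(v_supply, n_stack) <= v_rated + 1e-12

V_SUPPLY = 5.5   # worst-case rail from the discussion above
V_RATED = 2.75   # assumed device limit, not from any specific PDK

print("midrail target:", per_device_voltage(V_SUPPLY, 2), "V")
print("stack of 2 ok: ", stack_ok(V_SUPPLY, 2, V_RATED))
```

Any midrail droop or level shifter glitch eats directly
into that margin, since at 5.5V there is none to spare.
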