It is generally a bad idea to try to match currents when the hFE values are this mismatched, and the result depends greatly on how you measured hFE versus the actual operating range from no load to full load. Since the worst case is full load, I would suggest the approach I described if you have no other choice: use base resistors (Rb) instead of emitter resistors (Re), since the power (Watt) rating of the resistors will then be smaller.
Naturally, all devices must be at the same temperature on the same heatsink, which helps mitigate thermal runaway, although variations due to junction-to-case thermal resistance (Rjc) are unavoidable. I find that the optimal passive solution is to choose series equalizing resistors in the same range as the ESR of the devices, in this case the ESR of the Vbe diode when saturated.
Below I chose Rb * hFE = 6 approximately for each case. Each base resistor is then seen at its emitter as an added output impedance of about Rb/hFE.
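To make the reflected-impedance idea concrete, here is a minimal sketch, assuming an idealized model in which all devices have identical Vbe, share one base drive node, and share one emitter node. Under those assumptions the load current divides in proportion to each device's emitter-side conductance. The gain and resistor values are hypothetical and are not taken from the simulation.

```python
def emitter_referred_resistance(rb_ohms, hfe):
    """Base resistor as seen from the emitter: Rb / (hFE + 1), roughly Rb / hFE."""
    return rb_ohms / (hfe + 1)


def current_shares(rb_list, hfe_list):
    """Fraction of the total load current carried by each device.

    Idealized assumption: identical Vbe, common base drive, common emitter
    node, so current divides in proportion to emitter-side conductance.
    """
    conductances = [
        1.0 / emitter_referred_resistance(rb, hfe)
        for rb, hfe in zip(rb_list, hfe_list)
    ]
    total = sum(conductances)
    return [g / total for g in conductances]


# Hypothetical gains for three mismatched devices (assumption, not measured data).
hfe_values = [20, 40, 80]

# Same Rb on every base: the highest-gain device takes the most current.
equal_rb = [4.0, 4.0, 4.0]

# Rb scaled so every device shows the same 0.1 Ohm at its emitter.
scaled_rb = [0.1 * (h + 1) for h in hfe_values]

print("equal Rb :", current_shares(equal_rb, hfe_values))
print("scaled Rb:", current_shares(scaled_rb, hfe_values))
```

This ignores Vbe mismatch and the change of hFE with current and temperature, so treat it as a first estimate before simulating.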
This is a rough cut showing the voltage drop across the series pass transistors as 8 V, which sets the power dissipated in each transistor. Since Ib is < 10% of Ic, most of the temperature rise comes from Vce*Ic, and Vce/Ic is the effective ESR, or Rce, of each device. I expect less variation in Rce than in hFE, so this should balance the transistors just as effectively as using separate Re's, which would get hotter.
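A rough sketch of the power comparison, assuming the 40 A is split evenly over four devices and each has hFE = 20 (both numbers are assumptions for illustration only). It compares a 0.1 Ohm emitter resistor with a base resistor that gives about the same emitter-referred resistance (Rb = Re * hFE).

```python
def dissipation(total_current_a, n_devices, vce_v, hfe, r_emitter_equiv_ohms):
    """Per-device power in the transistor, in an Re, and in an equivalent Rb.

    Assumes perfectly even sharing and Rb = Re * hFE (so Rb / hFE matches Re).
    """
    ic = total_current_a / n_devices      # collector current per device
    ib = ic / hfe                         # base current per device
    rb = r_emitter_equiv_ohms * hfe       # base resistor with the same reflected value
    return {
        "transistor_w": vce_v * ic,               # Vce * Ic dominates the heating
        "re_w": ic ** 2 * r_emitter_equiv_ohms,   # emitter resistor carries Ic
        "rb_w": ib ** 2 * rb,                     # base resistor carries only Ib
    }


# 40 A total and 8 V drop come from the text; 4 devices, hFE = 20 and
# 0.1 Ohm are hypothetical values.
result = dissipation(40.0, 4, 8.0, 20, 0.1)
for name, watts in result.items():
    print(f"{name}: {watts:.2f} W")
```

With these assumptions the base resistor dissipates 1/hFE of what the emitter resistor would, which is the reason for preferring Rb when resistor wattage matters.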
In the above simulation, I chose a DC level of 4 V with an AC swing of +/-1 V to drive an output current of 40 A into a 0.1 Ohm load. This is purely to show current sharing; the actual output voltage could be anything higher, as I assume the bypass is an emitter-follower arrangement. It could be 40 V into a 1 Ohm load for 40 A and give the same results. The supply could then be 8 V above the 40 V with a reasonable variation, but in this case it just shows the drop above the regulated output voltage.
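The scaling argument can be checked with a few lines of arithmetic. This is only a sketch: the function name is hypothetical, and the fixed 8 V headroom is assumed to stay constant as the output voltage changes.

```python
def operating_point(v_out, r_load_ohms, headroom_v=8.0):
    """Load current, supply voltage and total pass-stage dissipation.

    Assumes the pass transistors drop a fixed headroom above the output.
    """
    i_load = v_out / r_load_ohms
    return {
        "i_load_a": i_load,
        "v_supply": v_out + headroom_v,
        "pass_dissipation_w": headroom_v * i_load,
    }


low = operating_point(4.0, 0.1)    # the simulated case: 4 V into 0.1 Ohm
high = operating_point(40.0, 1.0)  # the scaled case: 40 V into 1 Ohm

print(low)
print(high)
```

Both cases give the same load current and the same total heat in the pass stage, so the sharing result does not depend on which output voltage is used.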
Unless you give more details, there is nothing more I can say.