To think about the transient thermal behavior, start
with the thermal stack of the product.
You have the core device, inside a die, inside a package,
on a board.
Each "assembly" has a different ultimate heat capacity
(your CTH). The die is simplest; call it all silicon: volume
X*Y*Z times density times specific heat, and there's your
gross CTH. You may also find a thermal time constant
formula for this prismatic shape, using the dimensions
and the material constants.
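As a rough illustration of that hand calculation, here is a minimal Python sketch. It assumes an all-silicon die, approximate room-temperature silicon properties, and one-dimensional heat flow through the die thickness, so the time constant is simply R*C of the slab. The function name `die_thermal` and the 3 mm x 3 mm x 0.3 mm die are made up for the example, not taken from any datasheet.

```python
# Rough die heat capacity and 1-D thermal time constant for a silicon prism.
# Material values are approximate room-temperature figures for silicon
# (assumptions for this sketch; they drift with temperature and doping).
RHO_SI = 2329.0  # density, kg/m^3
CP_SI = 700.0    # specific heat, J/(kg*K)
K_SI = 150.0     # thermal conductivity, W/(m*K)


def die_thermal(x_m, y_m, z_m):
    """Return (C_th [J/K], R_th [K/W], tau [s]) for heat flowing through thickness z."""
    volume = x_m * y_m * z_m                 # m^3
    c_th = RHO_SI * CP_SI * volume           # lumped heat capacity
    r_th = z_m / (K_SI * x_m * y_m)          # 1-D conduction resistance of the slab
    tau = r_th * c_th                        # equals z^2 * rho * cp / k
    return c_th, r_th, tau


# Hypothetical die: 3 mm x 3 mm x 0.3 mm
c_th, r_th, tau = die_thermal(3e-3, 3e-3, 0.3e-3)
print(f"C_th = {c_th * 1e3:.2f} mJ/K, R_th = {r_th:.3f} K/W, tau = {tau * 1e3:.2f} ms")
```

For that hypothetical die this gives a few mJ/K and a time constant on the order of a millisecond. Treat it as an order-of-magnitude check only; the real heat source is not spread evenly over the full X-Y area.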
The device inside the die is a fraction of the volume and
of the X-Y extents (see remark about heat concentration
in the drift region, and look at a layout). Its thermal time
constant and CTH will be lower (if CTH is framed by
"what is my maximum die temp reached?" against a
time-varying load).
The package adds a heavy metal heat slug and a better
"DC" thermal path. The time for the device to reach
steady-state temp increases; the rate of heat removal
also increases, but so does the time for heat to get from
the drift region to the package baseplate. I'd treat this as
a two-part problem, as the materials are now
heterogeneous in type and size.
It would be comforting if your calculations returned
numbers of the same order as the steady-state and
pulsed power ratings (but then they must also
comprehend the vendor's test setup, heatsink,
airflow, etc., which goes beyond any hand calcs and
becomes the province of multiphysics or thermal
simulators).