I don't quite understand the "random walk" part of his argument; I am not sure whether he is changing the path arrival times or just the channel gains themselves. Also, I have not seen the reference he is talking about.
However, the power in the last tap being half the power in the first tap is similar to the half-life argument from chemistry: for an exponential decay with rate parameter lambda, the half-power point falls at a delay of log(2)/lambda. Since we are only considering multipath components with at least 1/2 the power of the largest one, that is where the last multipath component sits; in your notation, with lambda = 1/T_m, it is T_m*log(2), where log is the natural logarithm. Goldsmith probably discusses how the paths are distributed between the first and the last multipath components, but reasonable assumptions are that they are either randomly or uniformly distributed.
I think the model I provided above will behave similarly to the one in the paper you discuss; it should at least give reasonable results. As far as I can tell without looking at Goldsmith's book, the decaying function is just a simple exponential, with the last multipath component located at the point where it would have half the power of the first multipath component.
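To make the half-power argument concrete, here is a minimal sketch of such an exponential profile. It is not taken from Goldsmith or from the paper. The function name exponential_pdp, the evenly spaced taps, the unit power at zero delay, and the example values of T_m and the tap count are all my own assumptions for illustration.

```python
import math

def exponential_pdp(t_m, num_taps):
    """Return (delays, powers) for an exponential power delay profile.

    Assumptions (illustrative only): taps are evenly spaced, the first tap
    is at delay 0 with unit power, and the last tap is placed where the
    power has dropped to half of the first tap, i.e. at t_m * ln(2).
    """
    if num_taps < 2:
        raise ValueError("need at least two taps")
    last_delay = t_m * math.log(2)        # half-power point, ln(2)/lambda with lambda = 1/t_m
    step = last_delay / (num_taps - 1)    # even spacing between first and last tap
    delays = [i * step for i in range(num_taps)]
    powers = [math.exp(-tau / t_m) for tau in delays]
    return delays, powers

# Hypothetical example values: T_m = 1 microsecond, 5 taps.
delays, powers = exponential_pdp(t_m=1e-6, num_taps=5)
for tau, p in zip(delays, powers):
    print(f"delay = {tau:.3e} s, relative power = {p:.4f}")
```

If you would rather place the intermediate paths at random, you could draw the interior delays from a random number generator instead of using a fixed step; the first and last taps stay where they are.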