Let's start with "old" technologies (technology node ~0.25 um or older).
According to classical MOSFET theory, the Vt of a long-channel device is independent of the channel/gate length.
If the channel length is decreased, the depletion regions around the source/body and drain/body p-n junctions start to overlap, and the barrier for electron injection from source to channel is lowered (even at zero Vds).
Barrier lowering leads to lower Vt.
People talk about the Vt roll-off curve: the dependence of Vt on gate length L, where Vt is constant at long L and decreases monotonically (in magnitude, for both NMOS and PMOS) as L gets smaller.
This is the classical short-channel effect: Vt decreases as L decreases.
In newer processes, there is a so-called halo or pocket implant, where the substrate/body/channel is more heavily doped near the source/drain junctions. This is done to suppress drain-induced barrier lowering (DIBL), i.e. the decrease of Vt with increasing Vds.
When L gets shorter, the halo regions take up a larger share of the channel and eventually overlap, leading to a higher effective substrate doping and thus a higher Vt. This is called the reverse short-channel effect: Vt increases as L decreases.
With a further decrease of L, the depletion regions overlap, the barrier is lowered, and Vt gets smaller.
Thus, Vt is constant at large L, increases as L gets shorter, and decreases sharply at even shorter L.
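
The three regimes (flat at long L, a hump from the halo, then roll-off) can be sketched with a purely phenomenological curve. This is not a compact model and not data for any real process: the function name `vt_vs_length`, the two-exponential form, and every parameter value are assumptions chosen only to reproduce the qualitative shape.

```python
import math

def vt_vs_length(L_um, vt_long=0.50, dv_halo=0.08, l_halo=0.30,
                 dv_sce=0.40, l_sce=0.06):
    """Illustrative Vt(L) in volts; all parameters are hypothetical.

    vt_long : long-channel Vt (V)
    dv_halo : strength of the reverse short-channel (halo) term (V)
    l_halo  : length scale over which the halo term sets in (um)
    dv_sce  : strength of the classical short-channel roll-off (V)
    l_sce   : length scale of the roll-off (um), shorter than l_halo
    """
    halo_term = dv_halo * math.exp(-L_um / l_halo)  # raises Vt as L shrinks
    sce_term = dv_sce * math.exp(-L_um / l_sce)     # lowers Vt at very short L
    return vt_long + halo_term - sce_term

if __name__ == "__main__":
    # Sweep from long to short channel and print the curve.
    for L in (10.0, 1.0, 0.5, 0.25, 0.15, 0.10, 0.05):
        print(f"L = {L:5.2f} um  ->  Vt = {vt_vs_length(L):.3f} V")
```

Setting `dv_halo=0` in this sketch removes the hump and leaves only the monotonic roll-off of the older technologies.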
So, depending on the technology, doping details, range of L, etc., Vt-versus-L curves may look different, even for different devices in the same technology.
In real life, there may be further complications due to the accuracy of compact models (for simulations), or to the details of how Vt is measured or defined (for measurements), etc.
The same thing happens with the dependence of Vt on channel/gate width.
The classical narrow-channel effect, an increase of Vt with decreasing W, was typical for technologies using LOCOS isolation.
The inverse narrow-channel effect, a decrease of Vt with decreasing W, is typical for technologies with STI (shallow-trench isolation).
There are physical reasons why Vt depends on W differently for different isolation technologies; they are explained in good textbooks, in numerous publications, and on the web.
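
The opposite signs of the width dependence for the two isolation schemes can be put into an equally rough sketch. The 1/W form, the function name `vt_vs_width`, and the coefficient `k_v_um` are illustrative assumptions, not values from any real process or compact model.

```python
def vt_vs_width(W_um, vt_wide=0.50, k_v_um=0.02, isolation="STI"):
    """Illustrative Vt(W) in volts; all parameters are hypothetical.

    vt_wide   : Vt of a wide device (V)
    k_v_um    : magnitude of the width-dependent shift (V*um)
    isolation : "LOCOS" -> Vt rises as W shrinks (narrow-channel effect)
                "STI"   -> Vt falls as W shrinks (inverse narrow-channel effect)
    """
    sign = {"LOCOS": +1.0, "STI": -1.0}[isolation]
    return vt_wide + sign * k_v_um / W_um

if __name__ == "__main__":
    # Compare both isolation types from wide to narrow devices.
    for W in (10.0, 1.0, 0.5, 0.2):
        print(f"W = {W:5.2f} um  LOCOS: {vt_vs_width(W, isolation='LOCOS'):.3f} V"
              f"  STI: {vt_vs_width(W, isolation='STI'):.3f} V")
```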
Max
-------------