Another way to look at this: the velocity of an acoustic wave is about 340m/s and you need to resolve 1mm, which corresponds to 0.001m / 340m/s = 2.9us, while the minimum Tburst for the TCT40-16R/T is 200us. The accepted range resolution of a pulsed radar or sonar is [Pwidth x V]/2, or 34mm given the minimum Tburst. As others have pointed out, the period of 40kHz is 25us, which again is much longer than the required 2.9us resolution.
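The arithmetic above can be checked with a few lines of Python. This is only a sketch of the numbers quoted in this post; the function names are made up for illustration, and 340m/s is the rounded speed of sound used above.

```python
SPEED_OF_SOUND_M_S = 340.0  # rounded value used in this post


def time_for_distance_s(distance_m, v=SPEED_OF_SOUND_M_S):
    """Time the wave needs to travel distance_m (the post's 1mm figure)."""
    return distance_m / v


def range_resolution_m(pulse_width_s, v=SPEED_OF_SOUND_M_S):
    """Pulsed radar/sonar range resolution: [Pwidth x V] / 2."""
    return pulse_width_s * v / 2.0


def period_s(freq_hz):
    """Period of one carrier cycle."""
    return 1.0 / freq_hz


if __name__ == "__main__":
    print(f"time for 1mm:        {time_for_distance_s(0.001) * 1e6:.1f} us")
    print(f"resolution @ 200us:  {range_resolution_m(200e-6) * 1e3:.1f} mm")
    print(f"period of 40kHz:     {period_s(40e3) * 1e6:.1f} us")
```

Both the 200us burst and the 25us carrier period are far longer than the 2.9us that a 1mm step represents.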
++++++++++++++++++
There are low cost linear optical encoders that can resolve 20um. One of these would give a self-calibrating zero paper height whenever the roller is against the platen, and could then measure the actual paper thickness if that is of use. It would definitely detect multiple sheets. Being self-calibrating and optical, it should prove reliable.
A lower cost solution may be to use a rotary optical encoder. If the roller wheel were on an arm 25.4mm long, a full 360 degree arc would be 2Pi x 25.4mm = 160mm, so a 1mm sheet would rotate the encoder about 1/160 x 360, or about 2.25 degrees. A low cost encoder with only 256 pulses per turn can resolve 1.4 degrees over the small range you require.
The approximation assumes the arm is nearly parallel with the paper and is only used over a small range. Once a real geometry is laid out, using tan(θ) = paper/arm, with θ corrected for the actual starting angle, will improve the estimate.
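As a rough check of these numbers, the sketch below compares the arc-length approximation with the tan(θ) = paper/arm form. It assumes the arm starts parallel to the paper (starting angle of zero); the function names are hypothetical, and the 25.4mm arm and 256 pulses per turn are the values from this post.

```python
import math

ARM_MM = 25.4  # arm length used in this post


def small_angle_deg(height_mm, arm_mm=ARM_MM):
    """Arc-length approximation: height as a fraction of the full circle."""
    return height_mm / (2.0 * math.pi * arm_mm) * 360.0


def atan_angle_deg(height_mm, arm_mm=ARM_MM):
    """tan(theta) = paper/arm, assuming the arm starts parallel to the paper."""
    return math.degrees(math.atan(height_mm / arm_mm))


def encoder_step_deg(counts_per_turn):
    """Angle represented by one encoder count."""
    return 360.0 / counts_per_turn


def counts_for_sheet(height_mm, counts_per_turn, arm_mm=ARM_MM):
    """How many encoder counts a sheet of the given thickness produces."""
    return atan_angle_deg(height_mm, arm_mm) / encoder_step_deg(counts_per_turn)


if __name__ == "__main__":
    print(f"approximation: {small_angle_deg(1.0):.3f} deg")
    print(f"atan form:     {atan_angle_deg(1.0):.3f} deg")
    print(f"256/turn step: {encoder_step_deg(256):.3f} deg")
    print(f"counts for 1mm at 256/turn: {counts_for_sheet(1.0, 256):.2f}")
```

For a 1mm sheet the two forms agree to within about a thousandth of a degree, so the approximation is fine at this scale. A longer arm lowers the angle per mm, and a shorter one raises it.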
These encoders look and mount like a potentiometer. A Bourns ENC1J-D28-L00256L is an example. With its two channel quadrature output it actually has 512 pulses per turn and can report whether a step is up or down. (Do you need an accurate end-of-paper detector?)
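To show how a two channel quadrature output reports direction, here is a minimal decoder sketch. It is not taken from the Bourns data sheet: the sign convention (which direction counts as "up") is an assumption, and the code counts every edge on both channels, so it produces four counts per full A/B cycle. How that maps onto the pulses-per-turn figure depends on which edges your hardware or firmware actually counts.

```python
# Lookup indexed by (previous_state << 2) | new_state, where state = (A << 1) | B.
# +1 / -1 are single steps in either direction; 0 means no change or an
# invalid jump (both channels changed at once). Sign convention is assumed.
_STEP = [
    0, +1, -1, 0,
    -1, 0, 0, +1,
    +1, 0, 0, -1,
    0, -1, +1, 0,
]


def decode(samples):
    """Accumulate a signed position from a sequence of (A, B) samples."""
    it = iter(samples)
    a, b = next(it)
    prev = (a << 1) | b
    count = 0
    for a, b in it:
        new = (a << 1) | b
        count += _STEP[(prev << 2) | new]
        prev = new
    return count


if __name__ == "__main__":
    up = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
    down = list(reversed(up))
    print(decode(up), decode(down))
```

One full cycle in one direction gives +4 and the same cycle reversed gives -4, which is how the roller arm moving up can be told apart from it dropping back down.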
+++++++++++++++++++
One approach I have seen used, but do not find reliable, is a Sharp optical paper tray sensor. They are OK for paper/no paper detection but depend on paper brightness when used for your application. Here is one we have used: GP2A200LCS0
Taken from the data sheet:
GP2A200LCS0F Series are OPIC output, reflective
photo-interrupters with emitter and detector facing the
same direction in a molding that provides non-contact
sensing.
The emitter and detector are set at angles to each other. They are designed to distinguish between light and dark, but because of the angle they are fairly distance sensitive. At a distance of 2.5mm it works well in our application, but I do not trust it. I do not know what performance you would see with a 1mm step.