Those delays are quite wrong!
When you use a command like 'delay(40)' you are expecting a delay of 40ms, which is possibly the right delay your LCD needs. However, when you call your delay function, all it does is count 40 down to zero to create the delay. With a 12MHz clock you execute 3,000,000 instructions per second (the clock divided by 4). Your compiler may be producing a loop that is as small as 2 instructions per count, so 40 counts take only about 80 instruction cycles. When you want 40ms you actually get only about 0.027ms, which is far too short.
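To make that arithmetic easy to check, here is a rough sketch in plain C that can be run on a PC. The function names are made up for illustration, and it assumes one instruction cycle per four clock cycles, as the 12MHz and 3,000,000 figures imply. The number of instructions per loop count is a guess, so look at your compiler's listing for the real value.

```c
#include <stdint.h>

/* Instruction cycles per second, assuming 4 clock cycles per instruction. */
static uint32_t instr_per_second(uint32_t fosc_hz)
{
    return fosc_hz / 4u;
}

/* Approximate time, in whole microseconds (rounded down), that a simple
 * count-down loop takes. instr_per_count is an assumption about the
 * compiled loop, not a measured value. */
static uint32_t loop_delay_us(uint32_t fosc_hz, uint32_t counts,
                              uint32_t instr_per_count)
{
    uint64_t cycles = (uint64_t)counts * instr_per_count;
    return (uint32_t)(cycles * 1000000u / instr_per_second(fosc_hz));
}
```

With these assumptions, loop_delay_us(12000000, 40, 2) gives 26 (microseconds) and loop_delay_us(12000000, 40, 1) gives 13, both a long way short of the 40,000 microseconds that were wanted.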
You have three options to fix it:
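1. use HTC's built-in delay functions if it has them (I don't use HTC so I'm not certain, but most compilers have delay functions in them).
2. increase the value you pass to the function by a factor of up to 3,000 (there are 3,000 instruction cycles per ms, so divide that by the number of instructions each count takes) - but beware of exceeding the limit for an int value. At one instruction per count, delay(40); becomes delay(120000); which will not fit in a 16-bit int.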
3. slow down the delay function so each count you request from it takes longer.
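As a rough sketch of option 3, the function below runs an inner loop once for every millisecond requested, so the value you pass stays small. The names delay_ms, counts_for_ms and the INSTR_PER_COUNT value of 2 are assumptions for illustration only. The real loop cost depends on what your compiler generates, so expect to tune COUNTS_PER_MS against a scope or the simulator.

```c
#include <stdint.h>

#define FOSC_HZ         12000000UL                /* 12MHz clock */
#define INSTR_PER_MS    (FOSC_HZ / 4UL / 1000UL)  /* 3000 instruction cycles per ms */
#define INSTR_PER_COUNT 2UL                       /* assumed; check the compiler listing */
#define COUNTS_PER_MS   (INSTR_PER_MS / INSTR_PER_COUNT)

/* Total inner-loop counts needed for a given delay. Useful for seeing
 * when a single counter would overflow a 16-bit int. */
unsigned long counts_for_ms(unsigned int ms)
{
    return (unsigned long)ms * COUNTS_PER_MS;
}

/* Busy-wait for roughly 'ms' milliseconds. The volatile counter stops
 * the compiler from optimising the empty loop away. The overhead of the
 * outer loop is ignored, so the result is only approximate. */
void delay_ms(unsigned int ms)
{
    while (ms--) {
        volatile unsigned int n = (unsigned int)COUNTS_PER_MS;
        while (n--) {
            /* burn time */
        }
    }
}
```

With this, delay(40); would be written as delay_ms(40); and the inner counter never goes above 1,500, so it stays well inside an int. For option 1, if HTC here is HI-TECH C for PIC, I believe it provides a __delay_ms() macro that needs _XTAL_FREQ defined, but check the compiler manual before relying on that.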
Brian.