Got it folks. Ok, the reason is:
In the FFT we work with an orthogonal basis, but the basis vectors are not orthonormal! Each basis vector has a magnitude of √N. How? Read on...
Each component of the basis vector is a complex number of the form cos θ + j·sin θ. Its squared magnitude is sin²θ + cos²θ = 1, so each component contributes 1 when we compute <x, x>. So the magnitude of a basis vector is √[N×1] = √N.
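To check the √N claim numerically, here is a minimal sketch in plain Python. The helper names `basis_vector` and `inner`, and the choice N = 8, are mine for illustration only.

```python
import cmath
import math

def basis_vector(k, N):
    """k-th DFT basis vector: components e^{j*2*pi*n*k/N} for n = 0..N-1."""
    return [cmath.exp(2j * cmath.pi * n * k / N) for n in range(N)]

def inner(u, v):
    """Complex inner product <u, v> = sum of u[n] * conj(v[n])."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

N = 8  # illustrative size, any N >= 1 works
b1 = basis_vector(1, N)
b3 = basis_vector(3, N)

# Each component has unit magnitude, so <b1, b1> is N and the norm is sqrt(N).
norm_b1 = math.sqrt(inner(b1, b1).real)

# Different basis vectors are orthogonal, so this is zero up to rounding error.
cross = abs(inner(b1, b3))

print(norm_b1, math.sqrt(N))
print(cross)
```

So the basis is orthogonal, but each vector has length √N rather than 1.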
We conveniently divide the net result by √N (ideally). Consider a matrix of twiddle factors, say W, where each column is a basis vector. For the IDFT, we simply multiply each frequency component's amplitude X(k) by the corresponding basis vector (a complex sinusoid) and sum up, thereby reconstructing the original signal. On expanding that sum and rearranging, you'll find that sample n is the dot product of X with row n of W. Therefore in matrix form it becomes x = W·X, or equivalently Xᵀ·Wᵀ if you write the spectrum as a row vector (X is an N×1 column vector and W is N×N; n is the row index, k is the column index). Finally, divide by √N once because the basis vectors in W are not normalized, and once more by √N if X(k) was itself computed with the unnormalized W. Hence we divide by √N·√N = N when taking the IDFT.
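The round trip can be sketched in plain Python as below. This is a direct O(N²) evaluation of the sums, not an FFT, and the function names `dft` and `idft` and the test signal are my own choices for illustration.

```python
import cmath

def dft(x):
    """Unnormalized forward DFT: X(k) = sum_n x[n] * e^{-j*2*pi*n*k/N}."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse DFT: weight each basis vector by X(k), sum, then divide by N."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

x = [1.0, 2.0, 0.0, -1.0]   # arbitrary test signal
X = dft(x)
x_back = idft(X)

# Without the 1/N, the reconstruction comes out N times too large.
unscaled = [v * len(x) for v in x_back]

max_err = max(abs(a - b) for a, b in zip(x, x_back))
print(max_err)
```

Dropping the division in `idft` returns the original signal scaled by N, which is the √N·√N factor described above.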
Some authors prefer to maintain symmetry and divide by √N (or by √(2π) in the continuous domain) in both the DFT and the IDFT. Many prefer to divide by N (or 2π) in just the IDFT for simplicity, maybe because it's cheaper to divide by N than to compute a square root on a computer. The beauty is that W is symmetric (Wᵀ = W), so its conjugate transpose is simply conj(W), and hence the formula for the IDFT is the same as for the DFT except for a change in the sign of the exponent (if you choose to preserve the symmetry).
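Here is a small sketch of the symmetric convention, where a single routine serves as both DFT and IDFT and only the sign of the exponent changes. The function name `unitary_dft` and its `sign` parameter are hypothetical, chosen just for this example.

```python
import cmath
import math

def unitary_dft(v, sign):
    """Symmetric-convention transform, scaled by 1/sqrt(N) in both directions.

    sign = -1 gives the forward DFT, sign = +1 gives the inverse.
    """
    N = len(v)
    scale = 1.0 / math.sqrt(N)
    return [scale * sum(v[n] * cmath.exp(sign * 2j * cmath.pi * n * k / N)
                        for n in range(N))
            for k in range(N)]

x = [1.0, 2.0, 0.0, -1.0]     # arbitrary test signal
X = unitary_dft(x, -1)        # forward
x_back = unitary_dft(X, +1)   # inverse: same code, opposite sign

max_err = max(abs(a - b) for a, b in zip(x, x_back))

# With normalized basis vectors the transform preserves energy (Parseval).
energy_time = sum(abs(a) ** 2 for a in x)
energy_freq = sum(abs(a) ** 2 for a in X)
print(max_err, energy_time, energy_freq)
```

With this scaling the signal's energy is the same in both domains, which is one reason some authors like the symmetric form.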
Hope that helped someone. And do correct me if I'm wrong. Have a great day.