Is this a single register with 1024 bits? If so, it does not really matter: it will be implemented with 1024 flip-flops. Although you only get a handful of flip-flops per CLB, the FPGA tools will spread them across as many CLBs as needed and route the clock and control lines so that timing is met.
If you want a number of addressable registers, each this wide, then you have to use internal RAM. The synthesis tool will infer RAM when it recognizes that it can, but the inferred result is often less efficient than what you can achieve by hand. For an efficient implementation, generate the memory with one of the wizards or CoreGen.
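To get a rough feel for the two cases, here is a back-of-the-envelope sketch in Python. The resource figures are assumptions for illustration only (8 flip-flops per CLB, and a block RAM configured as 512 words of 36 bits); the function names are hypothetical, and the real numbers should come from your device's data sheet.

```python
from math import ceil


def flop_register_cost(width_bits, flops_per_clb=8):
    """Return (flip-flops, CLBs) for one register built from flip-flops.

    flops_per_clb=8 is an assumed, illustrative value, not a device fact.
    """
    return width_bits, ceil(width_bits / flops_per_clb)


def block_ram_cost(num_regs, width_bits, block_width_bits=36, block_depth=512):
    """Return the number of block RAMs for num_regs addressable registers.

    The 512 x 36 block geometry is an assumption; real devices offer
    several width/depth configurations.
    """
    # Blocks placed side by side to reach the full word width.
    blocks_wide = ceil(width_bits / block_width_bits)
    # Blocks stacked to reach the required number of addresses.
    blocks_deep = ceil(num_regs / block_depth)
    return blocks_wide * blocks_deep


if __name__ == "__main__":
    print(flop_register_cost(1024))   # (1024, 128)
    print(block_ram_cost(16, 1024))   # 29
```

Under these assumptions the width dominates: a 1024-bit word needs 29 blocks in parallel whether you store 16 registers or 512, which is one reason a hand-configured memory can end up looking quite different from an inferred one.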