Yep, most of the Spartan 3/6 family have 18-bit multiplier primitives built into the silicon. There is normally a table listing the number of CLBs, IO pads and 18x18 multipliers on the first or second page of the family datasheet for Spartan 3 or 6. Perhaps the Virtex FPGAs have bigger multipliers that would be faster / more efficient than chaining 18-bit ones together, although the cost is a lot higher.
To combine the multipliers you can simply use the CORE Generator IP wizard available in the free version of Xilinx ISE (WebPACK). This lets you set the input width to 32 bits and the output width to 64 bits, and configure things like signed or unsigned operands, or strobes that indicate when a multi-cycle operation is complete. The wizard will report how many 18x18 multipliers it requires to perform the specified function. It can additionally use CLBs to make up for the rest, although this will exhaust your combinational logic resources very quickly...
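To get a feel for why a 32x32 multiply needs several 18x18 blocks, here is a rough software sketch of the arithmetic in Python. It is only an illustration, not what the wizard actually generates: the function names are made up, it handles unsigned operands only, and it assumes each 18-bit signed multiplier carries a 17-bit unsigned chunk. The real core may partition, pipeline or handle signs differently, so trust the multiplier count the wizard reports over this.

```python
CHUNK_BITS = 17  # assumption: an 18-bit signed multiplier holds a 17-bit unsigned value


def split_chunks(value, width, chunk_bits=CHUNK_BITS):
    """Split an unsigned integer into little-endian chunks of chunk_bits each."""
    mask = (1 << chunk_bits) - 1
    count = -(-width // chunk_bits)  # ceiling division
    return [(value >> (i * chunk_bits)) & mask for i in range(count)]


def multiply_with_chunks(a, b, width=32):
    """Return (product, partial_product_count) for unsigned a * b."""
    a_chunks = split_chunks(a, width)
    b_chunks = split_chunks(b, width)
    product = 0
    partials = 0
    for i, a_part in enumerate(a_chunks):
        for j, b_part in enumerate(b_chunks):
            # Each a_part * b_part is small enough for one 18x18 multiplier;
            # the shift and add stand in for the adder logic that sums them.
            product += (a_part * b_part) << ((i + j) * CHUNK_BITS)
            partials += 1
    return product, partials
```

Calling `multiply_with_chunks(0xFFFFFFFF, 0xFFFFFFFF)` gives the same product as a plain `a * b` and counts 4 partial products, i.e. four 18x18 multiplies for one unsigned 32x32 multiply under these assumptions.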
Note: if you want to work with floating point numbers, I believe there is a separate IP core that implements FPU functions like addition, division, etc. (also free in ISE). This will also use the 18x18 multiplier primitives automatically.