I don't see any reason a pipelined divider can't work. The pipeline depth only sets the latency: regardless of how many cycles a single division takes, a fully pipelined divider can still accept new operands and deliver one result every clock cycle once the pipeline has filled. Your application needs 2.3E6 divisions/sec, which is perfectly doable with even a single divider. Depending on your device, you can also run multiple dividers in parallel.
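To make the latency-versus-throughput point concrete, here is a rough software model of a pipelined divider. It is only a sketch under my own assumptions: unsigned 16-bit operands, nonzero divisors, a restoring algorithm with one pipeline stage per quotient bit, and made-up names (`WIDTH`, `divide_step`, `simulate`). A real vendor divider core may be organized quite differently.

```python
from collections import deque

WIDTH = 16  # assumed operand width in bits; one pipeline stage per quotient bit


def divide_step(state):
    """One restoring-division stage: consumes one dividend bit, yields one quotient bit."""
    dividend, divisor, remainder, quotient, bit = state
    # Shift the next dividend bit (MSB first) into the partial remainder.
    remainder = (remainder << 1) | ((dividend >> bit) & 1)
    quotient <<= 1
    if remainder >= divisor:
        remainder -= divisor
        quotient |= 1
    return (dividend, divisor, remainder, quotient, bit - 1)


def simulate(pairs):
    """Clock a WIDTH-stage pipeline; return one (cycle, quotient, remainder) per division."""
    stages = [None] * WIDTH  # stages[i] models the register after stage i
    inputs = deque(pairs)
    results = []
    cycle = 0
    while inputs or any(s is not None for s in stages):
        cycle += 1
        new = None  # a bubble enters when there are no more operands
        if inputs:
            n, d = inputs.popleft()
            new = (n, d, 0, 0, WIDTH - 1)
        # Every stage works in the same clock cycle, each on a different division.
        prev = [new] + stages[:-1]
        stages = [divide_step(s) if s is not None else None for s in prev]
        if stages[-1] is not None:
            _, _, r, q, _ = stages[-1]
            results.append((cycle, q, r))
    return results


if __name__ == "__main__":
    for cycle, q, r in simulate([(1000, 7), (65535, 255), (12345, 100), (5, 9)]):
        print(f"cycle {cycle}: quotient={q} remainder={r}")
```

In this model the first result appears after `WIDTH` cycles, and every later result follows one cycle behind the previous one. As a sanity check on the numbers: if you assume a 100 MHz clock (my assumption, you didn't state one), 2.3E6 divisions/sec leaves roughly 43 clock cycles per division, so even a non-pipelined iterative divider might be fast enough.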
It's hard to tell what an "optimal" solution is when you give us so few details. Without knowing more, I'd suggest you get the most expensive FPGA you can afford, one with lots of DSP blocks.