Re: OFDM implementation
So, I believe you are not implementing any channel impairments (multipath, fading, ...) or any frequency or timing offset. In that case you don't need to implement FEC encoding in the Tx and decoding in the Rx, and you don't need synchronization or similar blocks in the demodulator.
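You will need to implement just a BPSK modulator, an IFFT in the Tx and an FFT in the Rx, and a BPSK slicer (just a comparator against zero that checks the sign of the real part, i.e. whether the sample is to the right or left of the axis). You may also need to implement the cyclic prefix.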
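To make the chain concrete, here is a minimal loopback sketch in Python (a software model, not HDL). It assumes an ideal channel, a bit-to-symbol mapping of 0 to +1 and 1 to -1, and a naive DFT standing in for the FFT/IFFT core. The function names (`dft`, `bpsk_map`, `ofdm_tx`, `ofdm_rx`) and the sizes used are made up for illustration.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) DFT; a stand-in for the FFT/IFFT core."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def bpsk_map(bits):
    # Assumed mapping: bit 0 -> +1.0, bit 1 -> -1.0
    return [1.0 - 2.0 * b for b in bits]

def ofdm_tx(bits, cp_len):
    """One OFDM symbol: BPSK map, IDFT, then prepend the cyclic prefix."""
    time = dft(bpsk_map(bits), inverse=True)
    prefix = time[len(time) - cp_len:]  # copy of the last cp_len samples
    return prefix + time

def ofdm_rx(samples, cp_len):
    """Drop the cyclic prefix, DFT, then slice on the sign of the real part."""
    freq = dft(samples[cp_len:])
    return [0 if s.real >= 0 else 1 for s in freq]

bits = [0, 1, 1, 0, 1, 0, 0, 1]  # one bit per subcarrier (8 subcarriers)
rx_bits = ofdm_rx(ofdm_tx(bits, cp_len=2), cp_len=2)
print("loopback ok:", rx_bits == bits)
```

With no channel in between, the received bits should match the transmitted ones, which is a handy reference when checking the FPGA blocks one at a time.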
Depending on the number of subcarriers, you may need a pulse-shaping filter (square-root raised cosine), but if the number of subcarriers is high, you don't need one.
If the Tx data goes out of the FPGA and is received back through a D/A and A/D, then you will need a resampling filter, but if the whole Tx/Rx chain is inside the FPGA, then you don't.
For the FFT you can find Verilog/VHDL cores and sample code on the web. You usually don't design these fundamental blocks yourself; you just use IP cores or available code.
A more complicated example of what you are designing can be found in the PHY layer (chapter 8.3) of the IEEE 802.16-2004 (WiMAX) standard. If you can, look there and study the chain.
One thing that might be important in your design is how you make the best use of your bits as the signal goes through the FFT and filtering. You should use the full range of your coefficients and outputs (say 18 bits) so that you get the lowest noise floor and spur levels (if that is important for you).
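A rough way to see why the full range matters: the sketch below quantizes a sine wave to an 18-bit signed fixed-point value and measures the signal-to-quantization-noise ratio, once near full scale and once at 1/64 of full scale (6 bits of unused headroom). The helper names (`quantize`, `sqnr_db`), the test tone, and the scaling convention of values in [-1, 1) are assumptions for illustration, not anything tied to a particular FFT core.

```python
import math

def quantize(x, bits):
    """Round x (nominally in [-1, 1)) to a signed fixed-point value with `bits` bits."""
    scale = 2 ** (bits - 1)
    q = round(x * scale)
    q = max(-scale, min(scale - 1, q))  # saturate instead of wrapping
    return q / scale

def sqnr_db(amplitude, bits=18, n=1000):
    """Signal-to-quantization-noise ratio of a sine tone, in dB."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        s = amplitude * math.sin(2 * math.pi * 37 * i / n)
        e = s - quantize(s, bits)
        signal_power += s * s
        noise_power += e * e
    return 10 * math.log10(signal_power / noise_power)

full = sqnr_db(0.99)        # uses almost the whole 18-bit range
small = sqnr_db(0.99 / 64)  # leaves about 6 bits unused
print(f"near full scale: {full:.1f} dB, 1/64 scale: {small:.1f} dB")
```

As a rule of thumb, each unused bit costs about 6 dB, so the second case should come out roughly 36 dB worse than the first.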
The remaining modules you might build would probably be a random signal generator (an LFSR pseudo-random generator) at the input, and a comparator at the output that compares the received signal with the transmitted signal and counts the errors. You can then add some kind of noise to the Tx signal as a channel effect and simulate your design to see what BER you get.
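Here is a small Python sketch of that test bench idea. It assumes a 7-bit LFSR with the PRBS7 polynomial x^7 + x^6 + 1 as the bit source, Gaussian noise added directly to the BPSK symbols (skipping the OFDM chain for brevity), and hypothetical names `lfsr_bits` and `measure_ber`. Your own polynomial, noise model, and bit counts will differ.

```python
import random

def lfsr_bits(n, state=0x7F):
    """PRBS7 (x^7 + x^6 + 1) Fibonacci LFSR; the seed must be non-zero."""
    out = []
    for _ in range(n):
        new = ((state >> 6) ^ (state >> 5)) & 1  # XOR of the two tap bits
        state = ((state << 1) | new) & 0x7F      # shift in the new bit
        out.append(new)
    return out

def measure_ber(n_bits, noise_sigma, seed=1):
    """Send LFSR bits as BPSK through additive Gaussian noise and count errors."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    tx = lfsr_bits(n_bits)
    errors = 0
    for b in tx:
        sample = (1.0 - 2.0 * b) + rng.gauss(0.0, noise_sigma)
        rx = 0 if sample >= 0 else 1  # slicer: compare with zero
        if rx != b:
            errors += 1
    return errors / n_bits

for sigma in (0.0, 0.5, 1.0):
    print(f"sigma={sigma}: BER={measure_ber(20000, sigma):.4f}")
```

In hardware the receiver side would typically run an identical LFSR, delayed by the chain latency, to regenerate the reference bits for the comparator.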
A lot can be done in such a project, but if you are going to graduate soon, I recommend that you have a look at the 802.16 standards so that when you are interviewed, you can claim that you are familiar with them.
If you have any specific questions, let me know. I have only worked with Xilinx, though.
Regards - TS