First, design the instruction set in Word or on the back of an envelope.
Write an instruction set simulator.
Write an assembler.
Write a C compiler.
Run some benchmarks and make sure the instruction set gives you the performance you need.
Develop a microarchitecture for the CPU.
Model the microarchitecture in the simulator and check that performance is still good.
Start designing the CPU's basic blocks: adders, multipliers, bus units, register files, instruction decoders, etc. Some blocks will be written in an HDL such as Verilog. Other blocks will be full custom designs (i.e. hand-drawn schematics, then layout). Verify each block in simulation.
Pull the blocks together into larger modules.
Verify RTL simulations of the complete CPU against the simulation model.
Prototype in an FPGA.
Synthesize the Verilog to gates for the target technology.
Run automated place and route to create a layout, then combine it with the full custom blocks.
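Run the sign-off checks: design rule checking (DRC), layout versus schematic (LVS), static timing analysis (STA), logic equivalence checking (LEC), and automatic test pattern generation (ATPG).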
Tapeout.
Cross fingers.
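
To make the early steps (instruction set, simulator, assembler, benchmark) more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the five-instruction toy ISA, the four registers r0..r3, the tuple encoding, and the names `assemble` and `simulate`. A real instruction set simulator would model memory, flags, and timing, but the overall shape is similar.

```python
# Toy ISA (hypothetical, for illustration only): four registers r0..r3.
#   li   rd, imm      rd = imm
#   add  rd, rs, rt   rd = rs + rt
#   sub  rd, rs, rt   rd = rs - rt
#   bnez rs, off      if rs != 0: pc = pc + off (relative to the next instruction)
#   halt              stop
OPCODES = {"li": 0, "add": 1, "sub": 2, "bnez": 3, "halt": 4}


def assemble(source):
    """Turn assembly text into a list of (opcode, a, b, c) tuples."""
    program = []
    for line in source.splitlines():
        line = line.split("#")[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mnemonic, *operands = line.replace(",", " ").split()
        # "r2" -> 2, "-3" -> -3; registers and immediates share one format here
        args = [int(tok.lstrip("r")) for tok in operands]
        args += [0] * (3 - len(args))  # pad unused operand slots with zero
        program.append((OPCODES[mnemonic], *args))
    return program


def simulate(program, max_steps=10_000):
    """Run the program; return (registers, number of instructions executed)."""
    regs = [0, 0, 0, 0]
    pc = 0
    executed = 0
    while executed < max_steps:
        op, a, b, c = program[pc]
        executed += 1
        pc += 1
        if op == 0:
            regs[a] = b
        elif op == 1:
            regs[a] = regs[b] + regs[c]
        elif op == 2:
            regs[a] = regs[b] - regs[c]
        elif op == 3:
            if regs[a] != 0:
                pc += b
        elif op == 4:
            return regs, executed
    raise RuntimeError("step limit reached without halt")


# A tiny "benchmark": sum the integers 5 down to 1 into r0.
SOURCE = """
    li   r0, 0        # accumulator
    li   r1, 5        # loop counter
    li   r2, 1        # constant one
    add  r0, r0, r1   # loop body starts here
    sub  r1, r1, r2
    bnez r1, -3       # back to the add while r1 != 0
    halt
"""

regs, executed = simulate(assemble(SOURCE))
print(regs[0], executed)  # prints "15 19"
```

The instruction count returned by `simulate` is the crude stand-in for the benchmark step: if adding or removing an instruction from the ISA changes how many instructions the benchmark needs, you find out here, long before any hardware exists. The same simulator can later serve as the reference model that the RTL simulations are checked against.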