I have always wanted a good synthesis tool that can create a latch-based design from code that was written to use flip-flops. A latch-based design has some performance benefits, but there are not enough tools to support this type of design.
Below is an example of a latch-based design that is used to increase the frequency and lower the power of a 2-stage pipelined design.
// flip-flop design (the original 2-stage pipeline)
always @(posedge clk)
  y0 <= a_in * b_in;
always @(posedge clk)
  x0 <= (y0 & c_in) | (d_in & f_in);
// -----------------------------------------------------
// latch design (using clk_div2, a clock created by dividing clk by 2)
// The logic is duplicated. There is a silicon cost for the area of the 2 multipliers, and both are needed to keep the pipelining of the original code. It is always a trade-off: some speed for 2X the silicon area.
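// A minimal sketch of one way clk_div2 could be generated, since the example does not show it.
// This is an assumption, not part of the original design: a toggle flip-flop on clk, with
// rst_n as a hypothetical active-low reset so that clk_div2 starts from a known value.
reg clk_div2;
always @(posedge clk or negedge rst_n)
  if (!rst_n) clk_div2 <= 1'b0;
  else        clk_div2 <= ~clk_div2;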
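// stage 1, phase 0: transparent while clk_div2 is low
always @(*)
  if (!clk_div2) y0_p0 <= a_in * b_in;
// stage 1, phase 1: transparent while clk_div2 is high
always @(*)
  if (clk_div2) y0_p1 <= a_in * b_in;
// stage 2: each latch reads the stage 1 latch that is currently holding its value
always @(*)
  if (clk_div2) x0_p1 <= (y0_p0 & c_in) | (d_in & f_in);
always @(*)
  if (!clk_div2) x0_p0 <= (y0_p1 & c_in) | (d_in & f_in);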
// The example above shows how duplicate logic is created to generate the y0 and x0 signals. This is a quick example: the a_in, b_in, c_in, d_in, and f_in signals should be duplicated in the same way as y0 and x0, but I tried to keep this example short.
// The clock speed can be increased because of the time borrowed by the multiplier. Latches are also faster than flip-flops, and an LC resonant clock can be used.
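// -----------------------------------------------------
// A possible way to recombine the two phases into a single output (a sketch under assumptions,
// not part of the original example). x0_out is a hypothetical wire with the same width as x0_p0.
// While clk_div2 is high, x0_p1 is transparent and x0_p0 holds its value, so the holding latch
// is selected. The roles swap while clk_div2 is low. A real design would also need to deal
// with glitches at the moment the select changes.
assign x0_out = clk_div2 ? x0_p0 : x0_p1;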