paulpawlenko
Junior Member level 2
Maybe I should then. A DFF w/ scan should have around 30 transistors. Your XOR should have another 10T or so. We are now up to 40T or so. It seems you are now using a NOR? I can't keep track anymore. Truth is... your 10T figure is either a sales pitch or the innocence of someone who is not a circuit designer. You can't build an ASIC that is not testable. You need DFFs w/ scan, period.
I really don't want to discuss the merit of the overall idea, I still think you are reinventing the wheel and that you have no clue of scalability.
So let's keep the discussion on the circuit side of things. Have you considered that your design has timing issues? Back-to-back flop connections will suffer from hold problems. Your PSP needs one or more delay cells. Your transistor count is now 46T or so. Let's round it up to 50; you are at 5x your initial budget already.
Now, because the datapath is so short, it is possible you could run this at 4GHz or so in a modern technology. Good luck powering up all these cells running this fast. You will inevitably run into power distribution issues. Maybe you can only run 1/4 of the PSPs that fast while the rest of the system has to be put to sleep.
Clock distribution for this 'canvas' would be challenging. This surely looks like one of those designs where the clock tree is responsible for 50% of the power consumption or so. Not to mention the overhead in area, as all those buffers have to be put somewhere on the die. Maybe you are up to 60T per PSP now.
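For anyone following the arithmetic, here is a rough tally of the per-PSP estimates quoted above. It is only a sketch: the 6T for delay cells and the 10T for clock buffering are inferred from the running totals in the post (40 to 46, then 50 to 60), and the names are made up for illustration.

```python
# Running tally of the per-PSP transistor estimates quoted in the post.
# ASSUMPTION: the 6T delay-cell figure and the 10T clock-buffer share are
# inferred from the quoted running totals, not stated directly.
INITIAL_BUDGET = 10  # the claimed 10T design

contributions = [
    ("DFF with scan", 30),
    ("XOR", 10),
    ("hold-fix delay cells (inferred)", 6),
]


def tally(items):
    """Return (name, running_total) after each contribution."""
    totals = []
    running = 0
    for name, count in items:
        running += count
        totals.append((name, running))
    return totals


running_totals = tally(contributions)
exact_total = running_totals[-1][1]       # 30 + 10 + 6
rounded_total = 50                        # "let's round it up to 50"
with_clock_overhead = rounded_total + 10  # "maybe you are up to 60T"

for name, total in running_totals:
    print(f"{name:32s} running total: {total}T")
print(f"rounded:       {rounded_total}T, {rounded_total / INITIAL_BUDGET:.0f}x the budget")
print(f"with clocking: {with_clock_overhead}T, {with_clock_overhead / INITIAL_BUDGET:.0f}x the budget")
```

None of these counts come from a real cell library; they are the ballpark figures from the post.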
This is what you get when a mechanical engineer turned into software developer declares himself a visionary. A bunch of GARBAGE. If you show up at TSMC or some other foundry with this idea, even their junior engineers would cringe.
You previously messaged me:
"Here is a fact for you: if your processor has only 10 transistors, it doesn't have a single flop. If it has no flop, it isn't a synchronous design."
So do you now understand that it does computation, hence has flops?
Yes, I have considered timing issues. For power distribution, note the number of power pins on any i7 and make sure you are comparing apples to apples.
Instead of trying to pry out the IP that I chose not to divulge, why don't you do your analysis on the NP4P, since that design is freely available?
The claim for NP4P is this: I can reach 200 petaflops for the 23-city TSP on a device that fits into a standard PC box.
That beats the SUMMIT supercomputer using minimal hardware, so it's a big claim. If the design is garbage, you should be able to show why.
Running at 1 GHz and using 300 mm wafers, my methods should require fewer than 10 wafers, thereby fitting into a PC box.
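To make the claim easier to check, here is a rough sketch of what those numbers imply per clock cycle. It assumes "200 petaflops" means 2e17 operations per second, treats each wafer as a full 300 mm circle with no edge exclusion or yield loss, and uses the upper bound of 10 wafers; the variable names are just for illustration.

```python
import math

# Figures taken from the claim above.
CLAIMED_OPS_PER_SEC = 200e15  # 200 petaflops, read as operations per second
CLOCK_HZ = 1e9                # 1 GHz
WAFER_DIAMETER_MM = 300
MAX_WAFERS = 10               # "fewer than 10 wafers", so an upper bound

# Operations that must complete every clock cycle to hit the claimed rate.
ops_per_cycle = CLAIMED_OPS_PER_SEC / CLOCK_HZ

# ASSUMPTION: whole circular wafer is usable silicon (optimistic).
wafer_area_mm2 = math.pi * (WAFER_DIAMETER_MM / 2) ** 2
total_area_mm2 = MAX_WAFERS * wafer_area_mm2

# Required compute density if the work is spread over all the silicon.
ops_per_cycle_per_mm2 = ops_per_cycle / total_area_mm2

print(f"operations per cycle:      {ops_per_cycle:.3g}")
print(f"area of one wafer:         {wafer_area_mm2:,.0f} mm^2")
print(f"total area ({MAX_WAFERS} wafers):    {total_area_mm2:,.0f} mm^2")
print(f"ops per cycle per mm^2:    {ops_per_cycle_per_mm2:,.0f}")
```

This says nothing about whether the design works. It only turns the headline figure into a per-cycle, per-area number that a reviewer can compare against a cell-level estimate.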
The NP4P design is listed, and I have a functional SPICE model running for 6 cities. You can use that as your baseline.
I'll send you the SPICE files if you like and you can run it yourself.
Show me why this design will not work as advertised.
And I do have an ASIC company (which outsources to TSMC) working on an estimate.