A Revolutionary Massively Parallel Processing Architecture

paulpawlenko · Jul 23, 2018

ThisIsNotSam said:
Maybe I should then. A DFF w/ scan should have around 30 transistors. Your XOR should have another 10T or so. We are now up to 40T or so. It seems you are now using a NOR? I can't keep track anymore. Truth is... your 10T figure is either a sales pitch or the innocence of someone who is not a circuit designer. You can't build an ASIC that is not testable. You need DFFs w/ scan, period.

I really don't want to discuss the merit of the overall idea, I still think you are reinventing the wheel and that you have no clue of scalability.

So let's keep the discussion on the circuit side of things. Have you considered that your design has timing issues? Back-to-back flop connections will suffer from hold problems. Your PSP needs one or more delay cells. Your transistor count is now 46T or so. Let's round it up to 50; you are at 5x your initial budget already.
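The hold concern raised here can be illustrated with the standard hold-slack inequality. This is only a minimal sketch: the function name and every delay value below are hypothetical placeholders, not figures from the design under discussion.

```python
def hold_slack_ps(clk_to_q_min, logic_min, hold_time, clock_skew):
    """Return hold slack in picoseconds; a negative value is a hold violation.

    Data launched by one flop must not arrive at the next flop before that
    flop's hold window closes:
        clk_to_q_min + logic_min >= hold_time + clock_skew
    """
    return (clk_to_q_min + logic_min) - (hold_time + clock_skew)


# Hypothetical numbers: two flops wired back to back, no logic in between.
back_to_back = hold_slack_ps(clk_to_q_min=20, logic_min=0,
                             hold_time=15, clock_skew=25)

# Same path with a hypothetical 30 ps delay cell inserted.
with_delay_cell = hold_slack_ps(clk_to_q_min=20, logic_min=30,
                                hold_time=15, clock_skew=25)

print("back to back:", back_to_back, "ps")
print("with delay cell:", with_delay_cell, "ps")
```

With these made-up values the back-to-back path has negative slack and the delay cell restores a positive margin, which is the reason extra cells get added to short flop-to-flop paths.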

Now, because the datapath is so short, it is possible you could run this at 4GHz or so in a modern technology. Good luck powering up all these cells running this fast. You will inevitably run into power distribution issues. Maybe you can only run 1/4 of the PSPs that fast while the rest of the system has to be put to sleep.

Clock distribution for this 'canvas' would be challenging. This surely looks like one of those designs where the clock tree is responsible for 50% of the power consumption or so. Not to mention the overhead in area, as all those buffers have to be put somewhere on the die. Maybe you are up to 60T per PSP now.
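The running transistor estimate in this post can be tallied against the claimed 10T budget. A small sketch using only the approximate counts quoted above; the labels and the helper name are illustrative.

```python
# Per-PSP transistor estimates as quoted in the post (all approximate).
INITIAL_BUDGET = 10  # the claimed 10T per PSP

estimates = [
    ("scan DFF", 30),
    ("scan DFF + XOR", 40),
    ("plus delay cell(s)", 46),
    ("rounded up", 50),
    ("plus clock-tree buffers", 60),
]


def overhead_factors(steps, budget):
    """Return (label, count, count / budget) for each estimate."""
    return [(label, count, count / budget) for label, count in steps]


for label, count, factor in overhead_factors(estimates, INITIAL_BUDGET):
    print(f"{label:<25} {count:>3}T  {factor:.1f}x budget")
```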

This is what you get when a mechanical engineer turned into software developer declares himself a visionary. A bunch of GARBAGE. If you show up at TSMC or some other foundry with this idea, even their junior engineers would cringe.

You previously messaged me:
"Here is a fact for you: if your processor has only 10 transistors, it doesn't have a single flop. If it has no flop, it isn't a synchronous design."
So do you now understand that it does computation, hence has flops?

Yes, I have considered timing issues. For power distribution, note the number of power pins on any i7 and make sure you are comparing apples to apples.
Instead of trying to pry out the IP that I chose not to divulge, why don't you do your analysis on the NP4P, since that design is freely available?

The claim for NP4P is this: I can reach 200 petaflops for the 23-city TSP on a device that fits into a standard PC box.
That beats the SUMMIT supercomputer using minimal hardware, so it's a big claim. If the design is garbage, you should be able to show why.
Running at 1 GHz on 300 mm wafers, my method should require fewer than 10 wafers, thereby fitting into a PC box.
The NP4P design is listed and I have a functional SPICE model running for 6 cities; you can use that as your baseline.
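As a rough sanity check on these numbers, the sketch below works out how many operations per clock cycle the claim implies and how densely they would have to be packed. It assumes, purely for illustration, exactly 10 wafers, one operation per unit per cycle, and the full circular wafer area being usable; none of these assumptions come from the NP4P specification.

```python
import math

TARGET_OPS_PER_S = 200 * 10**15   # 200 petaflops, as claimed
CLOCK_HZ = 10**9                  # 1 GHz, as claimed
WAFER_COUNT = 10                  # upper bound quoted in the post
WAFER_DIAMETER_MM = 300

# Operations that must complete on every clock cycle across the device.
ops_per_cycle = TARGET_OPS_PER_S // CLOCK_HZ

# Share per wafer, assuming the load is spread evenly (an assumption).
ops_per_cycle_per_wafer = ops_per_cycle // WAFER_COUNT

# Full circular wafer area; real usable area would be smaller.
wafer_area_mm2 = math.pi * (WAFER_DIAMETER_MM / 2) ** 2

ops_per_cycle_per_mm2 = ops_per_cycle_per_wafer / wafer_area_mm2

print(f"ops per cycle:           {ops_per_cycle:,}")
print(f"ops per cycle per wafer: {ops_per_cycle_per_wafer:,}")
print(f"wafer area:              {wafer_area_mm2:,.0f} mm^2")
print(f"ops per cycle per mm^2:  {ops_per_cycle_per_mm2:.0f}")
```

Under these assumptions the claim amounts to a few hundred operations completing per square millimetre on every cycle, which gives a concrete figure to compare against a per-cell area estimate.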
I'll send you the SPICE files if you like and you can run it yourself.
Show me why this design will not work as advertised.
And I do have an ASIC company (which outsources to TSMC) working on an estimate.

ThisIsNotSam · Jul 23, 2018

I am sure you don't know what a wafer is. You don't know what hold timing is either, do you? You don't know how power is distributed in an ASIC. It's not about the pins. That was a problem back in the 80s.

Please don't send me anything. I have spent enough time analyzing your GARBAGE for free.

Who the hell designs in spice? The 80s called...

Do not put any of your money on this. You don't have the knowledge to propose anything in this domain. I will leave this discussion now, someone has to do real research around here, right?

paulpawlenko · Jul 23, 2018

ThisIsNotSam said:
I am sure you don't know what a wafer is. You don't know what hold timing is either, do you? You don't know how power is distributed in an ASIC. It's not about the pins. That was a problem back in the 80s.

Please don't send me anything. I have spent enough time analyzing your GARBAGE for free.

Who the hell designs in spice? The 80s called...

Do not put any of your money on this. You don't have the knowledge to propose anything in this domain. I will leave this discussion now, someone has to do real research around here, right?

Each time I address your actual issues, instead of making a counterargument you just start ranting vaguely without any real point.
Instead of having you guess at the internals of my programmable architecture, I offered an open technical specification for you to refute and, instead of proving me incorrect, as you should easily be able to do, you punt. Not surprising.

Your responses repeatedly show a lack of understanding of basic concepts behind the design intent, yet you insist that it won't work on the basis of these very misconceptions, which I keep pointing out. The fact that you tirelessly rant, flame and misrepresent does not in itself prove incompetence, but it does show technical desperation and defines who you are.

I do not anticipate any timing problems at all.
I anticipate identifying defective cells by testing each function through simple I/O, then marking the defects and ensuring the compiler bypasses each defective cell. I anticipate critical paths (less than 0.01%) will be overdesigned. Yes, manufacturing defects are a concern because I, admittedly, do not have a lot of experience or knowledge in this area. But if Intel can reliably produce processors as complex as theirs, I seriously doubt that an architecture as simple and symmetrical as 4P will pose any major hurdles. The same holds true for power distribution, which I, again admittedly, have only examined generally over the entire computing surface. The architecture is a specification for a family of designs, so going into too much detail for a single circuit is counterproductive, as the layouts will certainly evolve before the first chip is ever built. Will 4P require significant research prior to production? Absolutely. But that is the price to be paid for the high performance, scalability and security offered by 4P.
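The test, mark and bypass flow described here could look roughly like the sketch below. It is only an illustration: the grid layout, the `cell_passes_test` callback and the `place` function are hypothetical names, not part of the 4P toolchain.

```python
def find_defects(cells, cell_passes_test):
    """Run the per-cell I/O test and return the set of failing cells."""
    return {cell for cell in cells if not cell_passes_test(cell)}


def place(tasks, cells, defects):
    """Assign each task to a usable cell, skipping cells marked defective."""
    usable = [cell for cell in cells if cell not in defects]
    if len(tasks) > len(usable):
        raise ValueError("not enough working cells for the requested tasks")
    return dict(zip(tasks, usable))


# Hypothetical 3x3 grid of cells in row-major order.
grid = [(row, col) for row in range(3) for col in range(3)]

# Stand-in for a real hardware test: pretend two cells failed.
known_bad = {(0, 1), (2, 2)}
defects = find_defects(grid, lambda cell: cell not in known_bad)

placement = place(["a", "b", "c"], grid, defects)
print(placement)
```

A real flow would also have to reroute connections around the skipped cells, which this sketch does not attempt.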

The simple fact is that von Neumann processors run serial instructions and are outdated; a better solution is needed to exploit parallelism.
Everyone knows this. Where is your design, Sam?

ThisIsNotSam · Jul 23, 2018

Go study ASICs.

ads-ee · Jul 24, 2018

Closing an unproductive thread that poses no technical question and offers no answers (and it is now at post #44).

If you wish you may create a blog on the site, but avoid starting any threads that do not ask a question or respond to a question with an answer (or clarification question).

If you do create a blog make sure to follow the rules for forum blogs.
