paulpawlenko
I would like to start a technical discussion about a massively parallel architecture called Practical Plentiful Parallel Processing, or 4P. These processors are designed by effectively inserting tiny processors between memory cells, thereby creating a very large parallel hardware canvas. The main idea is to keep processing localized unless and until the data needs to be streamed off chip for IO such as graphics, network or disk. All data is referenced by a data stream consisting of a single address and an element count. This drastically simplifies the hardware by eliminating the need for addressing hardware and three levels of cache. The hardware simplicity allows at least an order of magnitude more performance per unit area of silicon.
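To make the "single address and element count" idea concrete, here is a minimal software sketch in Python. It is only a model under my own assumptions: the names `Stream` and `read_stream` are hypothetical, and the flat list standing in for memory says nothing about how real 4P hardware lays out its cells.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stream:
    """Hypothetical stream descriptor: a start address plus an element count."""
    address: int
    count: int


def read_stream(memory: List[int], s: Stream) -> List[int]:
    """Return the elements covered by the stream, in order."""
    if s.address < 0 or s.count < 0 or s.address + s.count > len(memory):
        raise ValueError("stream falls outside memory")
    return memory[s.address:s.address + s.count]


# A flat list stands in for the memory cells (an assumption of this sketch).
memory = list(range(100, 116))       # 16 cells holding 100..115
pixels = Stream(address=4, count=3)  # one address, one element count
print(read_stream(memory, pixels))   # [104, 105, 106]
```

The point of the sketch is that a consumer never sees individual element addresses, only the descriptor.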
This hardware canvas operates much like a physical IC layout, except that it is entirely programmable. Modules are similar to procedures in conventional languages, except that they are instantiated by placing them onto a physical area of the hardware canvas, where they operate on data as it streams through them, much like commands in a UNIX shell pipeline. Modules can have multiple inputs and outputs and can operate asynchronously by sleeping until a true value on a specified input line signals them to wake. Simple, serial modules occupy only a small amount of canvas, while parallel SIMD modules, such as graphics, can have identical modules arranged in rows and columns running in parallel over large portions of the canvas.
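The shell-pipeline analogy can be sketched in software with Python generators. This is an illustration only: `scale` and `wake_on` are hypothetical names, and the wake behaviour shown (asleep until the first true value on the wake line, awake afterwards) is one possible reading of the description above, not a statement of how 4P modules actually work.

```python
from typing import Callable, Iterable, Iterator


def scale(stream: Iterable[int], factor: int) -> Iterator[int]:
    """A simple serial module: multiply each element as it streams through."""
    for x in stream:
        yield x * factor


def wake_on(data: Iterable[int],
            wake_line: Iterable[bool],
            fn: Callable[[int], int]) -> Iterator[int]:
    """A module with two inputs: a data stream and a wake line.

    Elements that arrive while the module is asleep are ignored. Once a
    true value appears on the wake line, the module stays awake
    (an assumption of this sketch).
    """
    awake = False
    for value, signal in zip(data, wake_line):
        awake = awake or signal
        if awake:
            yield fn(value)


data = [1, 2, 3, 4]
wake = [False, False, True, False]

# Roughly `scale 2 | add-one`, with the second stage gated by the wake line.
result = list(wake_on(scale(data, 2), wake, lambda x: x + 1))
print(result)  # [7, 9]
```

Generators run one stage at a time, so this models only the data flow between modules, not the parallelism of modules placed on the canvas.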
The streaming IO makes communication between physical chips logically identical to communication on chip, except for the additional latency incurred when starting a new stream. With some exceptions, physical pins are functionally programmable and can be allocated as a resource by the operating system, just as canvas area can. One exception is the main control pin on each physical chip, which retains the highest level of authority to control the operation of any area on that chip at all times. The security areas of the operating system are typically loaded through this pin over a secure physical connection or a secure network connection.
Contrast this with conventional processor architectures, where address pins are mandated by the hardware specification. By allowing general use of hardware resources such as pins and processing canvas space, the architecture gives the program developer tremendous power that is simply not possible with conventional, instruction/operand-based designs. Conventional designs are also greatly limited by the serial instruction streams they process. The 4P hardware canvas runs every module in parallel, with every memory cell being updated every clock tick.
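As a rough software model of "every memory cell being updated every clock tick", the sketch below computes all new cell values from a snapshot of the previous tick, so no cell sees a neighbour's half-updated state. The function `tick`, the left-neighbour wiring, and the addition rule are all assumptions chosen for illustration. A real canvas would do this in hardware for all cells at once, whereas Python evaluates the cells one after another.

```python
from typing import Callable, List


def tick(cells: List[int], rule: Callable[[int, int], int]) -> List[int]:
    """Advance every cell by one clock tick.

    Each new value depends only on the old values of the cell and its
    left neighbour (hypothetical wiring; the leftmost cell sees 0).
    """
    return [
        rule(cells[i - 1] if i > 0 else 0, cells[i])
        for i in range(len(cells))
    ]


def add_left(left: int, me: int) -> int:
    """Example rule: each cell adds its left neighbour's value to its own."""
    return left + me


cells = [1, 0, 0, 0]
for _ in range(3):
    cells = tick(cells, add_left)
print(cells)  # [1, 3, 3, 1]
```

After three ticks the cells hold a row of Pascal's triangle, which only comes out right because every cell reads values from the previous tick rather than values already updated in the current one.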
I could keep writing for hours about why I consider this architecture so greatly superior to conventional architectures, but I am just trying to get the conversation started. Please feel free to post any questions or comments. I look forward to reading them.
Paul