I remember the first time the von Neumann architecture was laid out for me. My reaction was "whoa, that's bottlenecked," and I immediately thought it would make more sense to do the computation where the memory is, or to replace "memory" with a huge pile of registers, or anything other than what I was looking at.
This is really exciting stuff. I can't help but think that marrying this approach with HP's memristor technology would send us screaming along an amazing architecture path for the next several decades.
But then again, I'm concerned that the limited use cases being presented are basically already handled by various custom (and cheap and power-efficient) DSPs. Is all that's really being envisioned here just a lower-power alternative to DSPs? I think the vision can be much bolder.
Of course it would make sense to do the computation where the memory is. The trouble is that the memory area is dramatically larger than the computation area.
Imagine you were a reference librarian, asked for facts like some kind of ancient Google. Suppose your library were the size of your bedroom: you could find facts very quickly, because you only have to cross the room. Now suppose you are smack dab in the middle of the Library of Congress. You are where the memory is, but you're still going to spend half your time just running about the building because of its sheer size!
The only ways to solve that problem are:
- Make memory smaller. Engineers have been hard at work at this for decades.
- Use less memory. This is slower.
- Use a memory hierarchy. This is what we do today, and is analogous to you sitting in a bedroom-sized library with the Library of Congress just down the street, and a young courier who fetches you books from it.
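To put rough numbers on the courier analogy, here is a minimal sketch of the standard average-memory-access-time calculation. The function name `average_access_time` and every latency and miss rate below are made up for illustration; they are not measurements of any real machine.

```python
def average_access_time(levels, backing_latency):
    """Average access time for a stack of caches in front of a slow store.

    levels: list of (hit_latency, miss_rate) pairs, fastest level first.
    backing_latency: latency of the final, slowest store.
    All values are illustrative assumptions in arbitrary time units.
    """
    total = backing_latency
    # Work from the slowest cache back to the fastest: each level costs its
    # own hit latency plus, on a miss, the cost of everything behind it.
    for hit_latency, miss_rate in reversed(levels):
        total = hit_latency + miss_rate * total
    return total


# The "bedroom library" alone versus going down the street every time.
no_hierarchy = average_access_time([], backing_latency=100)
with_bedroom = average_access_time([(1, 0.1)], backing_latency=100)
print(no_hierarchy, with_bedroom)
```

The point of the sketch is only that a small, fast level pays off when most requests hit it. If the miss rate climbs, the courier's trips dominate again.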
The other challenge is speed. We can't have a huge pile of registers because fast memory is less dense than slow memory. So 1KB of CPU registers occupies a lot more space than 1KB of DRAM, but DRAM is a poor choice for registers because of how slow it is.
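A back-of-the-envelope sketch of the density gap, assuming the textbook cell designs: a 6-transistor SRAM cell versus a DRAM cell with one transistor and one capacitor. Real register files add ports and wiring on top of this, and transistor count is only a proxy for area, so read the ratio as a rough lower bound rather than a measurement.

```python
BITS_PER_KB = 1024 * 8

# Textbook cell structures, used here as assumptions. Multi-ported
# register cells need more than six transistors per bit.
SRAM_TRANSISTORS_PER_BIT = 6  # classic 6T SRAM cell
DRAM_TRANSISTORS_PER_BIT = 1  # 1T1C cell: one transistor plus one capacitor


def transistors_for(kilobytes, transistors_per_bit):
    """Transistor count needed to store the given number of kilobytes."""
    return kilobytes * BITS_PER_KB * transistors_per_bit


sram = transistors_for(1, SRAM_TRANSISTORS_PER_BIT)
dram = transistors_for(1, DRAM_TRANSISTORS_PER_BIT)
print(sram, dram, sram // dram)
```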