I'm thinking about data structures that are currently outperformed by smarter, cache-line-friendly algorithms: think balanced mutable binary search trees versus clever trie or B-tree implementations. These index structures are of course just one example of the wide range of algorithms that have to take cache lines into account nowadays.
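To make concrete what I mean by cache-line friendly, here's a minimal sketch (my own toy layout, not any particular library's): a pointer-chasing BST node next to a B-tree-style node that packs seven keys into one contiguous block, assuming 64-byte cache lines, so a single fetch covers the whole within-node search.

    /* Toy illustration only: contrasting a pointer-per-node BST layout with a
     * B-tree-style node whose keys are packed so one 64-byte cache-line fetch
     * pays for several comparisons. Names and sizes are my own assumptions. */
    #include <stdint.h>
    #include <stdio.h>

    struct bst_node {                   /* one key per heap allocation:        */
        int64_t key;                    /* every step of a search chases a     */
        struct bst_node *left, *right;  /* pointer to a likely-cold cache line */
    };

    #define FANOUT 8                    /* 7 keys * 8 bytes = 56 bytes of keys */
    struct btree_node {
        _Alignas(64) int64_t keys[FANOUT - 1]; /* keys packed into one line    */
        int nkeys;
        struct btree_node *children[FANOUT];
    };

    /* Within-node search: a linear scan over keys that already sit in cache. */
    static int find_child(const struct btree_node *n, int64_t key) {
        int i = 0;
        while (i < n->nkeys && key >= n->keys[i])
            i++;
        return i;                       /* index of the child subtree to follow */
    }

    int main(void) {
        struct btree_node n = { .keys = {10, 20, 30, 40, 50, 60, 70}, .nkeys = 7 };
        printf("sizeof(bst_node)   = %zu bytes\n", sizeof(struct bst_node));
        printf("sizeof(btree_node) = %zu bytes\n", sizeof(struct btree_node));
        printf("key 35 descends into child %d\n", find_child(&n, 35));
        return 0;
    }

The numbers aren't a benchmark, just a way of seeing that the BST spends a pointer hop (and likely a cache miss) per comparison, while the packed node amortizes one fetch over seven keys.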
Do you think these "old" algorithms might see some kind of renaissance in the future?
Imagine a RISC CPU with a chunk of static RAM, running slow (100 MHz) and fully synchronously, with no need for any of the usual tweaks like caches, pipelines, or branch prediction. Such a CPU would be very small, and if you sized the RAM correctly, you could tile a whole grid of them; the combined throughput would more than make up for the slow clock rate, and it would still be fairly low power.
In the end, though, I think we'll end up programming arrays of 4x4-bit lookup tables that talk to their neighbors; not directly, of course, but through a few layers of optimizing compilers. The advantage is that it could be fast, fault-tolerant, and fully parallel. You could compute as fast as you could shovel data in one end and out the other.
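For what it's worth, here's roughly how I picture that fabric, as a toy simulation: I'm reading "4x4 bit" as 4 input bits mapped to 4 output bits (a 16-entry table per cell), each cell reading its left neighbor's latched output, with the whole row ticking synchronously so nibbles stream in one end and out the other. The wiring and the table contents are purely my own assumptions, just to show the shape of the thing.

    /* Toy sketch of a clocked row of 4-in/4-out lookup-table cells.
     * Assumptions: each cell is a 16-entry table, each reads only its left
     * neighbor, and all outputs latch together on a synchronous tick. */
    #include <stdint.h>
    #include <stdio.h>

    #define WIDTH 8                      /* cells in the pipeline row          */

    typedef struct {
        uint8_t table[16];               /* 4 input bits -> 4 output bits      */
        uint8_t out;                     /* latched output, low 4 bits used    */
    } lut_cell;

    /* One synchronous tick: every cell looks up its neighbor's *previous*
     * output, then all outputs latch at once, like a clocked fabric. */
    static void tick(lut_cell row[WIDTH], uint8_t input_nibble) {
        uint8_t next[WIDTH];
        next[0] = row[0].table[input_nibble & 0xF];
        for (int i = 1; i < WIDTH; i++)
            next[i] = row[i].table[row[i - 1].out & 0xF];
        for (int i = 0; i < WIDTH; i++)
            row[i].out = next[i];
    }

    int main(void) {
        lut_cell row[WIDTH] = {0};
        /* Fill every cell with the same toy function: add 1 modulo 16. */
        for (int i = 0; i < WIDTH; i++)
            for (int v = 0; v < 16; v++)
                row[i].table[v] = (uint8_t)((v + 1) & 0xF);

        /* Shovel a stream of nibbles in one end; after WIDTH ticks the first
         * result falls out the other end, fully pipelined. */
        for (int t = 0; t < 12; t++) {
            tick(row, (uint8_t)t);
            printf("tick %2d: output nibble = %x\n", t, (unsigned)row[WIDTH - 1].out);
        }
        return 0;
    }

With every cell holding the same trivial +1 table, the only interesting part is the clocking: each nibble takes WIDTH ticks to cross the row, but a new one can enter on every tick, which is the "compute as fast as you can shovel data" property.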