Technolog: Enter a parallel universe
This month, I was fortunate enough to be in on a conference call with Microsoft researcher Jim Gray. Jim is one of those people who immediately make you feel stupid, being able to talk effortlessly about any of his numerous ultra-technical research interests with barely a pause for thought.
Jim's research topics are many and varied (research.microsoft.com). One of his papers particularly grabbed my attention, first because it described a form of what you could call guerrilla computing: hijacking hardware resources to do a job when the resources in question were specifically designed to do something entirely different. And second, because those resources are available on almost any PC and can run programs far faster than your CPU could ever manage.
You're probably familiar with the claims card manufacturers make for their graphics hardware: when ATi launched its X1900 series, the company said the GPU boasts more than one teraflops of computing power. That's a thousand billion floating-point calculations every second.
So there you are, with a graphics card loaded up with 128MB of local memory - the newest ATi workstation cards have a full gigabyte - attached to a GPU by a connection with ten times the bandwidth of the connection between the main processor and system RAM. And a GPU with ten times the raw power of the system's primary CPU. Yes, that's right: memory bandwidth and GPU power are both around ten times greater than your puny Core Duo and accompanying components. Doesn't it seem a shame to use it just for games?
That's what Gray and his co-researchers thought.
At this point, you may be wondering how this can be true and why we're not all running Windows on nVidia and ATi GPUs, leaving Intel to twist in the wind. Well, there are limitations. The GPU is a relatively flexible computing device, but not as flexible as a full-blown CPU. The vertex and pixel shader execution units are designed for strictly sequential operations; as soon as you want to perform looping and branching, it all gets more difficult, although the latest GPUs supporting Shader Model 3 are more flexible in this regard. It all means that some types of computing problem are more suited to GPU execution than others; problems involving large array calculations are best.
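To make the looping-and-branching point concrete, here is a minimal sketch in ordinary Python. It is not real shader code, and the function names are invented for illustration. It shows the same per-element calculation written first with a branch, and then in the "compute both and select" form that suits hardware built for straight-line execution.

```python
def clamp_branchy(values, limit):
    """Per-element clamp written the usual CPU way: with a branch."""
    result = []
    for v in values:
        if v > limit:
            result.append(limit)
        else:
            result.append(v)
    return result


def clamp_branch_free(values, limit):
    """Same calculation with no data-dependent branch.

    A 0/1 mask selects between the two candidate results, so every
    element follows exactly the same sequence of arithmetic steps.
    The for loop only stands in for the array elements that parallel
    hardware would process side by side.
    """
    result = []
    for v in values:
        mask = int(v > limit)  # 1 if over the limit, else 0
        result.append(mask * limit + (1 - mask) * v)
    return result
```

Both functions return the same list for the same input. The second simply trades a decision for a little extra arithmetic, which is the sort of rewrite a large array calculation tends to tolerate well.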
On top of that, programming a GPU for efficient operation isn't easy. Graphics cards achieve their huge aggregate floating-point power because they're massively parallel processors. Forget dual cores; a GPU has dozens of the things. I called the pixel and vertex shaders execution units: that's just another word for a processor. Each is capable of running simple programs - shaders - which calculate geometry transformations in the case of vertex shaders, and pixel values in the case of pixel shaders. The guerrilla GPU programmers hijack these processors and fool them into doing unrelated tasks in the guise of shaders.
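The trick can be imitated in a few lines of plain Python. What follows is a rough sketch under stated assumptions, not how any real graphics API works: run_pixel_shader and add_shader are hypothetical names, the "textures" are just nested lists, and the calls run one after another rather than on dozens of execution units at once.

```python
def run_pixel_shader(shader, width, height, *textures):
    """Stand-in for the GPU: call `shader` once for every output pixel.

    On real hardware these calls would be spread across many execution
    units. Running them in sequence here is safe only because no call
    depends on the result of any other.
    """
    return [[shader(x, y, *textures) for x in range(width)]
            for y in range(height)]


def add_shader(x, y, tex_a, tex_b):
    """A 'shader' that is really doing array addition, not drawing."""
    return tex_a[y][x] + tex_b[y][x]


# Two 2x3 arrays of numbers disguised as textures (made-up data).
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]]
total = run_pixel_shader(add_shader, 3, 2, a, b)
```

Here the "image" that comes back in total is simply the element-by-element sum of the two input arrays. The graphics hardware neither knows nor cares that the numbers were never meant to be colours.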
The massively parallel nature of a GPU touches upon a problem I've talked about before; namely, that programming highly parallel computing systems is immensely difficult, because humans think in sequential terms. While a GPU finds it easy to do lots of things in parallel, a human has difficulty working out how to make it do those things efficiently. When I asked Jim if he saw a time when the average programmer could fire up their favourite development tool and start solving the problem at hand using parallel coding as easily as he or she could write serial C code, his answer was a straight 'no'.
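A small illustration of the gap between the two ways of thinking is the running total of a list, a problem chosen here purely as an example rather than taken from Gray's work. The sequential version is the one most programmers would write without a second thought. The parallel version, a sketch of the well-known doubling-step scan, produces the same answer but takes real effort to see. Both function names are invented for this sketch.

```python
def running_total_sequential(values):
    """The obvious way: each result depends on the one before it."""
    out, total = [], 0
    for v in values:
        total += v
        out.append(total)
    return out


def running_total_parallel(values):
    """Doubling-step scan, simulated in ordinary Python.

    In each pass, every element of the new list is built only from the
    old list, so all elements in a pass could be computed at the same
    time. The number of passes grows with log2 of the list length.
    """
    data = list(values)
    step = 1
    while step < len(data):
        data = [data[i] + data[i - step] if i >= step else data[i]
                for i in range(len(data))]
        step *= 2
    return data
```

The parallel form does more additions in total, yet needs far fewer passes than the list has elements. Spotting that kind of trade is exactly what doesn't come naturally to a mind that works one step at a time.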
