Core i7-980X PC versus eight-core Xeon workstation

30 Jul 2010
[Image: smallpt render output]

While writing about photo-realistic 3D graphics rendering for issue 192 of the magazine, I've been getting myself back up to speed with the state of 3D graphics and looking into the absolute best techniques for achieving realistic lighting. And along the way I've gained a new insight into the sheer speed of the latest CPUs.

Turns out the best 3D rendering algorithm is a hugely intensive method known as path tracing, which is sort of like ray tracing's dad. The theory behind the method actually pre-dates ray tracing, but it's only now that PCs are getting fast enough for experimental dabbling at home.

The good part is that, while it needs a heck of a lot of computing power to do, path tracing is actually a fairly simple technique to implement.
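To give a feel for why the method is both simple and hungry, here's a minimal sketch of the core idea rather than anything taken from smallpt: a path tracer estimates the light arriving at a point by averaging lots of randomly chosen directions. The function name and the toy integral (the cosine term over a hemisphere, whose exact answer is pi) are my own choices for illustration.

```cpp
#include <random>

// Monte Carlo estimate of the integral of cos(theta) over a hemisphere.
// The exact answer is pi. A path tracer averages random samples of the
// incoming light in the same way, with a real scene in place of this
// toy integrand.
double estimate_cosine_integral(int samples, unsigned seed) {
    const double pi = 3.14159265358979323846;
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    double sum = 0.0;
    for (int i = 0; i < samples; ++i) {
        // For directions spread uniformly over a hemisphere,
        // cos(theta) is itself uniformly distributed on [0, 1).
        double cos_theta = uniform(rng);
        // Each direction has probability density 1 / (2 * pi), so the
        // sample is weighted by 2 * pi.
        sum += cos_theta * 2.0 * pi;
    }
    return sum / samples;
}
```

With only 100 samples the answer wobbles noticeably from run to run, which is where the grain in a low-sample render comes from. The error shrinks with the square root of the sample count, so halving the noise costs four times the work.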

But where to get a path-tracing application to play with?

Well, Kevin Beason has written a beautiful example of minimalist programming with his path-tracing renderer, smallpt. It's a complete functioning renderer, with a 3D scene (based on the research-standard Cornell box scene) embedded into the program.

Smallpt generates and saves to disk the fully rendered, near-photorealistic image you can see above. And it's written in a ridiculously compact 99 lines of C++ code. That's the entire renderer, including the scene itself.

Kevin provides only the source code on his site, but I fancied running smallpt. So I spent a couple of hours getting it to compile under Visual C++ Express 2010, which is completely free to download from here.

The code assumes you're using the open-source GCC compiler, and it includes some Linux/GCC programming tricks that don’t work under Windows, but a bit of tweaking later I had it rendering the Cornell-box scene.
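As one illustration of the sort of tweak involved: smallpt gets its random numbers from the POSIX function erand48(), which Microsoft's C runtime doesn't provide. The sketch below is a hypothetical stand-in built on the standard C++ random library, not a record of the changes actually made for this build, and the PortableRng name is invented for the example. (Similarly, the M_PI constant only appears under Visual C++ if _USE_MATH_DEFINES is defined before the maths header is included.)

```cpp
#include <random>

// Hypothetical replacement for POSIX erand48(): returns doubles in [0, 1).
// It does not reproduce erand48's exact number sequence, so a render made
// with it will show a different noise pattern, though the image should
// converge to the same result.
struct PortableRng {
    std::mt19937 engine;

    explicit PortableRng(unsigned seed) : engine(seed) {}

    double next() {
        std::uniform_real_distribution<double> uniform(0.0, 1.0);
        return uniform(engine);
    }
};
```

Giving each image row its own generator, seeded from the row number, keeps the rows independent of one another, which matters once the rows are handed out to different threads.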

It is awfully compute-intensive, though, taking over 12 minutes to render a grainy 100-samples-per-pixel version on my Core 2 6300 everyday office PC.

Aha! This was a perfect opportunity to put my new quad-core Core 2 Q9400 system, which our lovely IT department built me a couple of weeks ago, through its paces. I added a few lines to the code of smallpt to get it to give me an overall time in seconds for the complete render, and set it going.
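The timing code needn't be anything fancy. Here's a rough sketch of one way to do it in modern C++, assuming a C++11 compiler (newer than the one used for this test, so not the exact lines that went into smallpt); the time_seconds helper is a made-up name.

```cpp
#include <chrono>

// Runs the supplied work once and returns the elapsed wall-clock time in
// seconds. steady_clock is used because it never jumps backwards, unlike
// the system clock. Wall-clock time is what matters here: a CPU-time
// counter can add up the time of every thread and hide the benefit of
// multiple cores.
template <typename Work>
double time_seconds(Work&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto finish = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(finish - start).count();
}
```

Wrapping the whole render loop in a lambda and passing it to time_seconds gives a single figure to print when the render finishes.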

Straight away, render time came down to 252 seconds - just over four minutes.

Then I remembered my dual-Xeon workstation muscle machine, originally a test ‘white box’ from Intel that, ahem, never found its way back to them. The only reason I don’t use it as an everyday machine is its excessively loud industrial-level cooling system. But with its dual quad-core Xeon processors, which cost some frightening amount of money when new, this was the perfect job for the Beast.

I set up the machine in a corner of the PC Pro Labs (well away from complaints about the noise) and installed Windows 7 Ultimate x64, just to make the test fair since that's what's running on my other PCs.

Then I fired up smallpt.exe and postponed making my next cup of tea, knowing it would rip through the render before I could even rise from my chair.

Oh.

Turns out my once-mighty eight-core workstation, barely over three years of age, is now slower for raw compute speed, and by a heck of a margin, than my quad-core machine.

In fact, its two Xeon X5340 CPUs took 493 seconds to churn through the smallpt render: getting on for twice as long as my quad-core.

Deflated, I switched off the machine, then wandered over to Mike Jennings in his own corner of the Labs, engrossed in a graphics-card group test for the next issue of PC Pro.

“What’s the CPU in your test rig, Mike?”

“Oh, it’s a Core i7 980X. Six cores. Really fast!”

“Ah. Fast you say? Um, mind if I use it when you’re done?”

“Sure.”

So I did.

It’s not often I class a computer as astonishingly fast, but hell’s teeth this one certainly is.

The render completed in 73 seconds. That’s almost three-and-a-half times faster than my nearly-new Q9400 machine, and nearly seven times faster than my not-exactly-old, dual-Xeon workstation that was worth a good four thousand pounds when it was new.

Let's consider those results on a per-socket basis.

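With this pure-CPU, highly multithreaded task, the latest generation of enthusiast-level Intel CPUs is over thirteen times faster per processor than the professional-level Xeon CPU of three-and-a-bit years ago. And about nine times faster per physical core (493 seconds across eight cores against 73 seconds across six), or roughly four-and-a-half times faster per hardware thread if you count the 980X's twelve Hyper-Threading threads.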
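For anyone who wants to check the sums, here's a small sketch of the arithmetic using the render times quoted in this post. The helper name is my own. The idea is to multiply each machine's render time by the number of sockets, cores or threads it used, then compare the totals.

```cpp
#include <cassert>

// How many times faster each "unit" (socket, core or hardware thread) of
// the fast machine is than each unit of the slow one, assuming the render
// scales evenly across units. Time multiplied by units gives the total
// unit-seconds spent on the same job.
double speedup_per_unit(double slow_seconds, int slow_units,
                        double fast_seconds, int fast_units) {
    assert(slow_seconds > 0.0 && fast_seconds > 0.0);
    assert(slow_units > 0 && fast_units > 0);
    return (slow_seconds * slow_units) / (fast_seconds * fast_units);
}
```

Feeding in 493 seconds for the dual-Xeon machine and 73 seconds for the Core i7-980X gives about 13.5 per socket (two sockets against one), about 9.0 per physical core (eight against six) and about 4.5 per hardware thread (eight against the 980X's twelve with Hyper-Threading). Whole machine against whole machine, it's about 6.8.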

I knew all this before, but seeing that machine chew through the render with such ferocious speed really brings home the level of engineering achievement that Intel continues to manage, year after year.

Try it yourself

If you want to try the unofficial PC Pro smallpt render test on your machine, you can download my compiled version here.

But wait! The multithreading needs the Microsoft OpenMP support DLL, vcomp90.dll, and the program won't work without it.

The free-but-faffy way to get it is to install the Microsoft Visual C++ 2008 Redistributable Package from here.

Once the redistributable is installed, search for vcomp90.dll - it should be hiding in a subfolder somewhere within C:\Windows\winsxs - and just copy it to the same folder as smallpt.exe.
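As for why that DLL is needed at all: smallpt spreads its work across cores using OpenMP, where a single compiler directive in front of a loop tells the compiler to share the loop's iterations among threads. The sketch below shows the pattern with an invented shade() placeholder standing in for the real per-pixel work; it's an illustration, not smallpt's own loop.

```cpp
#include <vector>

// Hypothetical stand-in for the expensive per-pixel calculation.
double shade(int x, int y) {
    return x + 10.0 * y;
}

// Fills a width-by-height image, handing out rows to threads one at a
// time. Dynamic scheduling means a thread that finishes an easy row
// simply picks up the next one. If OpenMP isn't enabled (/openmp in
// Visual C++, -fopenmp in GCC) the pragma is ignored and the loop runs
// on a single core with the same result.
std::vector<double> render(int width, int height) {
    std::vector<double> image(static_cast<std::size_t>(width) * height);
    #pragma omp parallel for schedule(dynamic, 1)
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            image[static_cast<std::size_t>(y) * width + x] = shade(x, y);
        }
    }
    return image;
}
```

Each row writes only to its own slice of the image, so the threads never touch the same pixel and no locking is needed.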

Now double-click the smallpt.exe file and the renderer will open in a command-prompt box, churn away for a while and save the rendered image file to the same folder when the render is complete. It'll also give you the time taken to render when it's finished.

You can open the resulting .ppm image using GIMP for Windows.
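The .ppm format is about as simple as image files get, which is presumably why a 99-line program can afford to write one. Here's a hedged sketch of a writer for the plain-text "P3" flavour; the Pixel struct and write_ppm function are names made up for this example.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

struct Pixel {
    unsigned char r, g, b;
};

// Writes a plain-text (P3) PPM file: a short header giving the width,
// height and maximum colour value, then one "R G B" triple per pixel,
// row by row from the top. Returns false if the pixel count is wrong or
// the file can't be written.
bool write_ppm(const char* path, int width, int height,
               const std::vector<Pixel>& pixels) {
    if (width <= 0 || height <= 0) return false;
    if (pixels.size() != static_cast<std::size_t>(width) * height) return false;

    std::FILE* file = std::fopen(path, "w");
    if (!file) return false;

    std::fprintf(file, "P3\n%d %d\n255\n", width, height);
    for (const Pixel& p : pixels) {
        std::fprintf(file, "%d %d %d\n", p.r, p.g, p.b);
    }
    return std::fclose(file) == 0;
}
```

Because the file is plain text you can open it in Notepad and read the pixel values directly, although it's far bulkier than a PNG or JPEG of the same picture.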

Let us know your results, for machines both old and new.

Has anybody out there got a machine that will break the minute mark?

Update:

Check out the posts below and you'll see that Intel itself has risen to the challenge. Read all about the superchilled Intel test rig.
