PCPro-Computing in the Real World Printed from www.pcpro.co.uk

Posted on July 30th, 2010 by David Fearon

Core i7-980X PC versus eight-core Xeon workstation

[Image: smallpt output]

Having been writing about photo-realistic 3D graphics rendering for issue 192 of the magazine, I’ve been getting myself back up to speed with the state of 3D graphics and looking into the absolute best techniques for achieving realistic lighting. And along the way I’ve got a new insight into the sheer speed of the latest CPUs.

Turns out the best 3D rendering algorithm is a hugely intensive method known as path tracing, which is sort of like ray tracing’s dad. The theory behind the method actually pre-dates ray tracing, but it’s only now that PCs are getting fast enough for experimental dabbling at home.

The good part is that, while it needs a heck of a lot of computing power to do, path tracing is actually a fairly simple technique to implement.
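The statistical idea at the heart of path tracing can be illustrated without any 3D code at all. The sketch below is not taken from smallpt; it is a toy example that estimates a simple integral by averaging random samples, much as a path tracer averages random light paths for each pixel. The function name `estimate_mean_square` and the fixed seed are illustrative assumptions.

```cpp
#include <cstdint>
#include <random>

// Toy Monte Carlo estimator (not smallpt code): approximates the integral of
// x*x over [0, 1], whose exact value is 1/3, by averaging random samples.
// A path tracer does the same kind of averaging per pixel, with random light
// paths in place of random x values.
double estimate_mean_square(std::uint32_t samples, std::uint32_t seed = 1234u) {
    if (samples == 0) {
        return 0.0;  // nothing to average
    }
    std::mt19937 rng(seed);  // fixed seed so repeated runs give the same result
    double sum = 0.0;
    for (std::uint32_t i = 0; i < samples; ++i) {
        // Map the 32-bit generator output to a value in [0, 1).
        const double x = rng() / 4294967296.0;
        sum += x * x;
    }
    return sum / samples;
}
```

Comparing `estimate_mean_square(100)` and `estimate_mean_square(1000000)` with the exact answer of 1/3 shows the noise falling as the sample count rises. The expected error shrinks only with the square root of the sample count, so halving the grain costs roughly four times the work, which is why clean renders are so compute-hungry.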

But where to get a path-tracing application to play with?

Well, Kevin Beason has written a beautiful example of minimalist programming with his path-tracing renderer, smallpt. It’s a complete functioning renderer, with a 3D scene (based on the research-standard Cornell box scene) embedded into the program.

Smallpt generates and saves to disk the fully rendered, near-photorealistic image you can see above. And it’s written in a ridiculously compact 99 lines of C++ code. That’s the entire renderer, including the scene itself.

Kevin provides only the source code on his site, but I fancied running smallpt. So I spent a couple of hours getting it to compile under Visual C++ Express 2010, which is completely free and which you can download from here.

The code assumes you’re using the open-source GCC compiler and includes some Linux/GCC programming tricks that don’t work under Windows, but after a bit of tweaking I had it rendering the Cornell-box scene.

It is awfully compute-intensive though, taking over 12 minutes to render a grainy 100-samples-per-pixel version on my Core 2 6300 everyday office PC:

[Image: grainy 100-samples-per-pixel render]

Aha! This was a perfect opportunity to put my new quad-core Core 2 Q9400 system, which our lovely IT department built me a couple of weeks ago, through its paces. I added a few lines to the code of smallpt to get it to give me an overall time in seconds for the complete render, and set it going.
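The timing change itself isn't shown in this post, so the following is only a guess at its general shape: a minimal sketch that wraps a piece of work in wall-clock timing using the standard `<chrono>` header. It assumes a C++11 compiler, so it is not necessarily how it was done under Visual C++ 2010. `time_seconds` is a hypothetical helper name, and the render call is left as a placeholder.

```cpp
#include <chrono>
#include <utility>

// Runs `work` once and returns the elapsed wall-clock time in seconds.
// steady_clock is used because it never jumps backwards if the system
// clock is adjusted part-way through a long render.
template <typename Work>
double time_seconds(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Something like `double secs = time_seconds([] { /* render the scene */ });` followed by printing `secs` would give an overall figure of the kind quoted in the results here.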

Straight away, render time came down to 252 seconds – just over four minutes.

Then I remembered my dual-Xeon workstation muscle machine, originally a test ‘white box’ from Intel that, ahem, never found its way back to them. The only reason I don’t use it as an everyday machine is its excessively loud industrial-level cooling system. But with its dual quad-core Xeon processors, which cost some frightening amount of money when new, this was the perfect job for the Beast.

I set up the machine in a corner of the PC Pro Labs (well away from complaints about the noise) and installed Windows 7 Ultimate x64, just to make the test fair since that’s what’s running on my other PCs.

Then I fired up smallpt.exe and postponed making my next cup of tea, knowing it would rip through the render before I could even rise from my chair.

Oh.

Turns out my once-mighty eight-core workstation, barely over three years of age, is now slower for raw compute speed, and by a heck of a margin, than my quad-core machine.

In fact its two Xeon X5340 CPUs took 493 seconds to churn through the smallpt render: getting on for twice as long as my quad-core.

Deflated, I switched off the machine, then wandered over to Mike Jennings in his own corner of the Labs, engrossed in a graphics-card group test for the next issue of PC Pro.

“What’s the CPU in your test rig, Mike?”

“Oh, it’s a Core i7 980X. Six cores. Really fast!”

“Ah. Fast you say? Um, mind if I use it when you’re done?”

“Sure.”

So I did.

It’s not often I class a computer as astonishingly fast, but hell’s teeth this one certainly is.

The render completed in 73 seconds. That’s almost three-and-a-half times faster than my nearly-new Q9400 machine, and nearly seven times faster than my not-exactly-old, dual-Xeon workstation that was worth a good four thousand pounds when it was new.

Let’s consider those results on a per-socket basis.

With this pure-CPU, highly multithreaded task, the latest generation of enthusiast-level Intel CPU is over thirteen times faster per processor than the professional-level Xeon CPU of three-and-a-bit years ago. And about nine times faster per core.
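The per-socket and per-core comparison is simple arithmetic on the two render times, sketched here on the assumption that the work divides evenly across sockets and cores. `speedup_per_unit` is a hypothetical helper, not part of smallpt. With the figures above (493 seconds on two quad-core Xeons, 73 seconds on one six-core 980X) it gives about 13.5 per socket. Per core the ratio is about 9 if the 980X's six physical cores are counted, or about 4.5 if its twelve Hyper-Threading threads are counted instead.

```cpp
// Throughput ratio per unit (socket, core or thread) between two machines
// that ran the same render. Times are in seconds; a result above 1 means
// each unit in the new machine gets through more work than each unit in
// the old one.
double speedup_per_unit(double old_seconds, int old_units,
                        double new_seconds, int new_units) {
    // Unit-seconds: how long a single unit would need for the whole job,
    // assuming perfect scaling across units.
    const double old_unit_seconds = old_seconds * old_units;
    const double new_unit_seconds = new_seconds * new_units;
    return old_unit_seconds / new_unit_seconds;
}
```

`speedup_per_unit(493, 2, 73, 1)` is the per-socket figure and `speedup_per_unit(493, 8, 73, 6)` the per-core one.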

I knew all this before, but seeing that machine chew through the render with such ferocious speed really brings home the level of engineering achievement that Intel continues to manage, year after year.

Try it yourself

If you want to try the unofficial PC Pro smallpt render test on your machine, you can download my compiled version here.

But wait! The multithreading needs the Microsoft OpenMP support DLL, vcomp90.dll, and the program won’t work without it.

The free-but-faffy way to get it is to install the Microsoft Visual C++ 2008 Redistributable Package from here.

Once the redistributable is installed, search for vcomp90.dll – it should be hiding in a subfolder somewhere within C:\Windows\winsxs – and just copy it to the same folder as smallpt.exe.

Now double-click the smallpt.exe file and the renderer will open in a command-prompt box, churn away for a while and save the rendered image file to the same folder when the render is complete. It’ll also give you the time taken to render when it’s finished.

[Image: smallpt rendering in a command-prompt window]

You can open the resulting .ppm image using GIMP for Windows.
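PPM is about the simplest image format going, which makes it a natural fit for a very compact program. As an illustration (a sketch, not smallpt's own output code), the hypothetical function `to_ppm` below builds the plain-text "P3" variant of the format: a header giving the width, height and maximum colour value, then the red, green and blue value of every pixel written out as ordinary numbers.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Builds a plain-text (P3) PPM image: a short header, then one
// "red green blue" triple of 0-255 values per pixel, row by row.
// `rgb` must hold width * height * 3 values; an empty string is
// returned if it does not.
std::string to_ppm(int width, int height, const std::vector<int>& rgb) {
    if (width <= 0 || height <= 0 ||
        rgb.size() != static_cast<std::size_t>(width) * height * 3) {
        return "";
    }
    std::ostringstream out;
    out << "P3\n" << width << ' ' << height << "\n255\n";
    for (std::size_t i = 0; i < rgb.size(); i += 3) {
        out << rgb[i] << ' ' << rgb[i + 1] << ' ' << rgb[i + 2] << '\n';
    }
    return out.str();
}
```

Writing the returned string to a file with a .ppm extension should give something GIMP will open; a 1x1 pure-red image is just the three header lines followed by `255 0 0`.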

Let us know your results, for machines both old and new.

Has anybody out there got a machine that will break the minute mark?

Update:

Check out the posts below and you’ll see that Intel itself has risen to the challenge. Read all about the superchilled Intel test rig.

Posted in: Hardware, Software

211 Responses to “ Core i7-980X PC versus eight-core Xeon workstation ”

  1. Chris Says:
    July 30th, 2010 at 2:46 pm

    Is it possible to post your modified source as Norton 2010 took it upon itself to delete the binary as soon as it was downloaded?

     
  2. David Fearon Says:
    July 30th, 2010 at 2:57 pm

    Will post as soon as I’ve cleaned it up to make it less embarrassing Chris :-)

     
  3. Hooch Says:
    July 30th, 2010 at 3:15 pm

    I couldn’t get smallpt to run. I get an error: “The application was unable to start correctly (0xc000007b)”. I also found 4 versions of vcomp90.dll in various subfolders in the winsxs directory. I just copied the most recent one.

     
  4. Craig Says:
    July 30th, 2010 at 3:15 pm

    I sympathise completely with the deflated feeling you went through.

    My younger brother is doing a degree in architecture. As such he uses things like AutoCAD and 3D Studio Max every day. It became clear at Christmas his machine was no longer up to the task. One of his projects was going to be late if he didn’t get the render going soon. The computer suites were full of people who were rendering for over 24 hours, machines were crashing etc. He handed me his project and I ran it overnight on my Q6600 and it took just under 6 hours and was done, deadline made.

    I then rebuilt his machine on the cheap (i5 760, 4GB DDR3 dual and a cheap GeForce GT 240 with a gig (came to about £350 I think although not certain)). I also popped in a non-stock cooler to give it a hope in hell of not overheating. I also (and here is the key) installed the CUDA plugin for 3D Studio Max and re-ran the same render at the same settings etc.

    9 mins – Didn’t even get warm.

     
  5. George Says:
    July 30th, 2010 at 3:18 pm

    I got the dll here: http://www.dll-files.com/dllindex/dll-files.shtml?vcomp90 worked perfectly.. 1.7MB or 27KB? hmmm :)

     
  6. Laurent Says:
    July 30th, 2010 at 3:30 pm

    Interesting, my single socket Xeon 5345 did it in 331 seconds, by some margin quicker than your dual socket Xeon 5340. Surely these two CPUs are not that different?

     
  7. David Wright Says:
    July 30th, 2010 at 3:34 pm

    My Sony Vaio laptop did it in 202 seconds (1.6Ghz Core i7-620QM, 8GB RAM)

     
  8. David Fearon Says:
    July 30th, 2010 at 3:44 pm

    @Laurent – that is indeed interesting! Maybe I’ll head back down to the Labs and poke around in the BIOS of the Xeon machine, see if I can get more out of it.

    @Hooch – you need the DLL version that matches your machine architecture. Easiest thing is just to try each one to find one that works!

     
  9. milliganp Says:
    July 30th, 2010 at 3:51 pm

    353 seconds on Opteron 1352 Quad Core 2.1GHz. Not earth-shattering but it’s a point on a graph.

     
  10. Phil Says:
    July 30th, 2010 at 3:52 pm

    My stock i7-930 did it in 120 seconds, just goes to show how much faster the i7-980X really is!

     
  11. Neil Deadman Says:
    July 30th, 2010 at 3:55 pm

    287 seconds

    Intel Xeon W3520 @ 2.67GHz
    Windows 7 Professional 32-bit
    2GB RAM

    ———————————–

    660 seconds

    Windows 7 Enterprise 32-bit
    Intel Core 2 Duo E4500 @ 2.20GHz
    3.5GB RAM

     
  12. David Wright Says:
    July 30th, 2010 at 4:08 pm

    Oops, typo, that should be a Core i7-720QM. I also re-ran it with all other apps closed (I had done it with chat clients running and doing a remote support session to install a scanner). That brought it down to 199 seconds.

     
  13. Jujhar Says:
    July 30th, 2010 at 4:26 pm

    190 seconds on a Windows 7, Dell Precision m6500 running a Quad Core Intel Core i7 820QM @ 1.73Ghz

     
  14. Kevin Waite Says:
    July 30th, 2010 at 4:37 pm

    Thanks David, now I have CPU envy!
    My results on an AMD Phenom 9950 Quad at 2.6GHz with 4G RAM on Win7 was 303 seconds.

     
  15. Andrew Farmer Says:
    July 30th, 2010 at 4:46 pm

    303 seconds on E8400 @ 4Ghz 4Gb RAM
    Win 7 Pro

     
  16. Andrew Farmer Says:
    July 30th, 2010 at 4:54 pm

    Down to 291 seconds with Kaspersky off!

     
  17. IBrown Says:
    July 30th, 2010 at 5:05 pm

    Do I see a new PC Pro review benchmark in the making?

     
  18. N Says:
    July 30th, 2010 at 5:29 pm

    I didn’t manage to break the 60s barrier. In fact, it was a slightly disappointing 1935s on my 1.6Ghz Atom. I have now shelved my plans to use it to render Toy Story 4..

     
  19. Vic Says:
    July 30th, 2010 at 6:05 pm

    XP Pro: i7-930 stock speed

    111 seconds!

    Wow!

     
  20. JohnAHind Says:
    July 30th, 2010 at 6:18 pm

    114 seconds on my newly built Core i7 930 over-clocked to 4GHz!

    Thanks for this – was looking for a way to max out all 8 threads!

    Nice to see the old “Hit Enter to exit” joke making a new appearance. Almost as good as “Press start button to stop your computer”!

     
  21. JohnAHind Says:
    July 30th, 2010 at 6:23 pm

    @Vic: Thanks for raining on my parade! For the record I’m on Windows 7, so there does seem to be a significant performance penalty.

     
  22. Alex Says:
    July 30th, 2010 at 7:21 pm

    Core i7, OC’d to 3570Mhz (170.0 x 21) = 94 seconds

     
  23. jrbarnett Says:
    July 30th, 2010 at 8:35 pm

    Core 2 Duo E6600, just over 3.5 years old, Win7 Pro 32 bit:
    652 seconds.

     
  24. billynw10 Says:
    July 30th, 2010 at 8:36 pm

    Oh dear, Core i5 @ 2.67 GHz, ended up 185 secs and grainy. I shut down NIS 2010 and it took 4 secs longer, and I thought my graphics card was CUDA-enabled. Sigh.

     
  25. Captainford Says:
    July 30th, 2010 at 9:01 pm

    Core i7 920 @4.00ghz 89 seconds while watching a movie as well. I love this machine!!

     
  26. Mike Baldwin Says:
    July 30th, 2010 at 9:57 pm

    Well mine did it in 225 sec on an AMD 1055T – 6 core ,O/C to 3.4ghz with 2gb of 800mhz ddr2 ram running on win7 – 64bit

     
  27. StoneDecroze Says:
    July 30th, 2010 at 10:08 pm

    Dell XPS M1530, 3 years old W7 Home Premium x64 4GB RAM

    569 seconds

    Core 2 Duo T7500 @ 2.20GHz

    Self built desktop

    225 seconds

    Phenom II X4 965 3.4 Ghz 4 Core

    W7 Home Premium x64 8GB RAM.

    I wonder what the Phenom could do if it was overclocked????

     
  28. Captainford Says:
    July 30th, 2010 at 10:40 pm

    Sorry should have been more specific, Core i7 920@ 4.00 ghz, 6gb ddr 3, EX58-UD5 motherboard. Running windows 7 pro 64 bit. Just tried again and this time got an 88. Most impressed as I bought this machine in oct 09 and it still seems to be holding its own :-) .

     
  29. dermotcd Says:
    July 30th, 2010 at 11:03 pm

    Dell Studio 15 core i5-520M, 4GB DDR3 RAM, windows 7 home premium took 233 seconds

     
  30. Philip Says:
    July 30th, 2010 at 11:29 pm

    237 seconds on a Core i5-520M, 4GB DDR3 Windows 7 Professional 32-bit in a Dell Latitude E4310.

     
  31. Eugene Says:
    July 30th, 2010 at 11:45 pm

    I ran the render prog on my 2 notebooks. The first one, an “oldy” with a T4400 CPU, took 547sec.
    The second notebook, a more recent one with an i3-310M CPU, only took 294sec. Not bad I think, even though hopelessly slow compared to some of the other machines listed in the comments.
    The machines run Win7 Home Premium (the old one) and Win7 Pro, both 32bit.

     
  32. Alex Says:
    July 31st, 2010 at 12:17 am

    I changed the OC setting on my i7-920 to (185×20) shut down all non-essential stuff and got the time down to 90 seconds. That’s with Vista x64.

     
  33. Alex Says:
    July 31st, 2010 at 12:28 am

    190×20 gives 87seconds but the temperatures were right up there. Definitely not a long term overclock!

     
  34. Captainford Says:
    July 31st, 2010 at 12:53 am

    I also managed an 87 with nothing running except sophos but I don’t think I will do any better without overclocking more and I am not going to do that. I am more than happy with 87.

     
  35. Lomskij Says:
    July 31st, 2010 at 1:23 am

    79 seconds with i7 920 @ 4.20 GHz (Vista x64). That was as far as I dared to push my CPU. Running it at 3.00 GHz took 104 seconds, so simple maths dictate that 980X overclocked to 4.10 GHz should break one minute record.

     
  36. JohnAHind Says:
    July 31st, 2010 at 10:36 am

    Now I’m thinking I may not actually be overclocking – my system is very similar to @Captainford except the CPU is a 930 and the MB a Gigabyte X58A-UDR3. This was bought as a “pre-overclocked” bundle of MB, CPU and memory. But the “system” panel in W7 shows “930 @ 2.80GHz 2.79 GHz”. Can any overclocker tell me: should the second frequency figure be the actual overclocked speed? (i.e. 4.0GHz in my case) .. or do I have to go into the BIOS to read the actual speed?
    It’s all very confusing – I have Piriform Speccy – it shows “Stock Core Speed” at 2800MHz, but it then shows an individual “Core Speed” for each of the four cores at only 1619.2MHz. What is this about?

     
  37. Alex Says:
    July 31st, 2010 at 11:39 am

    @JohnAHind
    The best utility I have for monitoring a system whilst doing an overclock is Core Temp. It’s a free download that shows actual speed, base clock speed, frequency multiplier, individual core loading and individual core temp (very useful that one). Always keep your i7s below 100C and preferably below 90C.

    Windows only ever reports the stock speeds of your chip irrespective of any OC settings you may have specified in the BIOS.

    What happens with i7s is that when they aren’t being used the multiplier drops thus reducing the overall speed of the CPU and saving energy. This is where the 1619Mhz comes from.
    You won’t see the full speed until you load up the CPU for which the benchmark in question is ideal.

    If your machine isn’t OC’d to start off with, disable turbo boost and gradually increase the base clock in 5Mhz steps doing an intensive 10min test each time to ensure stability.

    @Lomskij
    That’s an impressive OC. I would be interested to hear how you did it – what BIOS settings you are using and what CPU cooler you’ve got.

     
  38. Mike Says:
    July 31st, 2010 at 12:13 pm

    On an old athlon X2 4400 win 7 and 2Gb, Nvidia 240, took 636 seconds. Not a good image either-grainy

     
  39. Lomskij Says:
    July 31st, 2010 at 12:31 pm

    @Alex
    Actually it’s a very basic OC: i7 920 D0 @ 4.20 GHz, CPU Ratio = 21, BCLK frequency = 200, core 1.41V.
    Mobo asus rampage II extreme, cooling: megahalems + 2x 120mm nexus fans (paste mx-2), case: lian-li pc-x2000 with front fascia removed.
    Running chassis fans @ 1,000 rpm and cpu fans @ 1,600 rpm, cpu temperature sits around 40C idle and 50C load, which is pretty good.

     
  40. Mike Baldwin Says:
    July 31st, 2010 at 12:32 pm

    I have just tried an interesting experiment. With 2 cores active on the AMD 1055T 6 core processor, render time was 389 secs. So you would think that with 6 cores active it should be 2/3rds better at 130 secs.
    In fact it is just over 1/3rd better at 223 secs at best. Just goes to show what happens when you have to share the available on die level 1 & 2 cache between 2 cores then 6.

     
  41. Dave A Says:
    July 31st, 2010 at 1:44 pm

    AMD Phenom II with Liquid cooling and 4GB Ram, stock speed 3 cores on, 393 seconds, oc to 3ghz from 2.6ghz and down to 283 seconds next either more oc or stock speed with 4th core unlocked

     
  42. Dave A Says:
    July 31st, 2010 at 2:00 pm

    Just overclocked CPU (Phenom II 710 2.6 Stock) to 3.5 ghz, 273 seconds

     
  43. Vic Says:
    July 31st, 2010 at 3:26 pm

    Booted XP in Safemode – and it ran 112 seconds. Hmmm, and I got the 111 seconds with Media player playing too. But I’m sure I can get it to the ninetys if I overclock. (Nice – this i7)

     
  44. JohnAHind Says:
    July 31st, 2010 at 4:00 pm

    @Alex: Many thanks, Core Temp does give much more useful and credible information than Windows or Speccy! After actually applying the overclock (182×22 = 4GHz), I got a much more impressive 83 seconds, although it did max out two of the four cores at 100c during the smallpt run.

     
  45. Hooch Says:
    July 31st, 2010 at 4:21 pm

    Got it working. 118s on Core i7 920 @ stock, 6GB DDR3, GA-EX58-UD5, W7. Thought my CPU was going to melt!

     
  46. Lomskij Says:
    July 31st, 2010 at 8:11 pm

    Ok, going extreme: 73 seconds! :-)
    Short time OC: i7 920 @ 4.5GHz (21 x 215, 1.5V), cpu reached lovely 72C during the test. I guess here my race ends, as trying to push the frequency any higher makes windows fall to BSOD :-/

     
  47. Rich Says:
    July 31st, 2010 at 8:20 pm

    Core i7 920 at 4GHz with Asus Rampage II Extreme. 83 seconds.
    Core temp showed temperature rose to 69C. Interestingly, with hyper threading turned off time increased to 127s.

     
  48. Sarcen Says:
    July 31st, 2010 at 11:39 pm

    On my Dell Studio 1557 i7 Q720 laptop Win7 professional 64 bit with 4GB DDR3, no OC, I managed it in 195 sec – This seems slightly better than Dave Wright’s (#7 & 12) Sony (similar spec but he has twice the memory – not sure why)

     
  49. MarkD Says:
    August 1st, 2010 at 5:03 am

    Q8300 @ 2.5GHz 4GB Win7 Pro
    Norton 2010 off – 266 Sec
    Norton 2010 on – 267 Sec

     
  50. David Wright Says:
    August 1st, 2010 at 9:29 am

    @Sarcen possibly different back ground tasks running on our machines – mine is probably still clogged up with some Sony guff. ;-)

     
  51. Fith Says:
    August 1st, 2010 at 10:40 am

    So how well would it run if you implemented this article?

    http://www.pcpro.co.uk/news/358027/gpu-compiler-could-turn-desktops-into-supercomputers

     
  52. Mike Baldwin Says:
    August 1st, 2010 at 11:33 am

    It would be interesting if the source code was handed to Nvidia, ATI & Microsoft to see what times we could get if it was rewritten in CUDA (Nvidia)/Direct Compute (ATI) and DX11 (Microsoft) to take advantage of the GPU. I wonder what the times would be then?

     
  53. JohnAHind Says:
    August 1st, 2010 at 2:56 pm

    @Rich: What cooling are you using? I have the same CPU and overclock as you and am getting the same benchmark result, but the “High” temperature is 100c. I am suspicious of this value as it is too much of a round number – I suspect it is the max of the measurement system not the actual maximum temperature, which is worrying (though no sign of instability). I have a Zalman sealed system water-cooler on the CPU so I would expect to get pretty optimal cooling performance and am disappointed my temperatures are so much higher than yours.
    Interestingly Core Temp reports the multiplier increasing from x12 to x22. I would hope the multiplier would be automatically limited to keep the temperature safe, but Core Temp shows it sitting at x22 even after repeated runs with all four cores maxed at 100c.
    The total power consumed by the system rose from 180w to 334w when running Smallpt!

     
  54. JohnAHind Says:
    August 1st, 2010 at 3:02 pm

    @David Fearon: Could you maybe make a couple of improvements:
    1. Trap the exception if the program cannot write the output file (for example if it is in Program Files and file protection prevents this). Most of us are not very interested in the output!
    2. Make an auto-repeat option so the program can be used for long term burn-in tests.

     
  55. Paul B Says:
    August 1st, 2010 at 5:19 pm

    Core i5-750@2.67Ghz, 4GB Ram, Win 7 x 64 HP:
    194 seconds – Turbo Boost Off
    183 seconds – Turbo boost On

    Nice to know Turbo Boost makes some difference.

     
  56. Lomskij Says:
    August 1st, 2010 at 9:39 pm

    @JohnAHind: something is definitely wrong either with your cooling system or with the temperature sensors. Any single core on my i7 920 never exceeded 75C, and I run it at 21 x 200 @ 4.20GHz, 1.41v, on air. And your system doesn’t go into BSOD when reaching such ridiculous temperatures?!
    By the way, what’s your voltage?

     
  57. Captainford Says:
    August 1st, 2010 at 11:58 pm

    Hi, I bumped up my system to 190 x 21 (i thought I was running at 4ghz before but I wasn’t.) now getting a score of 82 which I am most pleased with. However during the test my core temps got to 93,93,90 and 89 respectively. This while within the limits of the processor seems very hot to me. I have got an antec 902 case with all fans on full and a Noctua CPU cooler which I am not sure of the model but it is enormous. Is this just the fact that some chips are better (cooler) than others or is there something wrong? Idle temps are 55-60. Any help would be most kindly appreciated.

     
  58. Rich Says:
    August 2nd, 2010 at 8:37 am

    @JohnAHind
    Yeah, I have to agree with Lomskij, I think something is wrong with your cooling. I have a megahalems prolimatech cooler in an antec 1200 case. i7 920 idles at around 42-38 C and maxes out at 75C (at 4ghz) if I leave prime95 running for an hour. I think the i7 920 will not slow down until it reaches above 100C, so maybe you are right at the edge…

     
  59. JohnAHind Says:
    August 2nd, 2010 at 10:49 am

    @Lomskij@Rich: I misreported some things, first I have a 930 CPU, not 920, second my cooler is a Corsair H50. The 182×22 overclock was in a profile supplied by OCUK and included a vcore boost to 1.3v and a DRAM boost to 1.64v. The overclock does eventually trigger a MB overtemp alarm set at 90c so I guess the measured values are not far wrong.
    I took the overclock off and Core Temp now reports about 40c at no load rising to 70c when running Smallpt (with the overclock it was 57c at no load maxing out at 100c as reported before).
    Definitely looks like the Corsair is not doing its stuff – its pump and fan are definitely running and the radiator is not excessively hot, so either the circulation is too low or the contact with the chip not good enough – I will need to investigate when I get the time.
    Thanks again for your help and to David for giving us this displacement activity!

     
  60. David Fearon Says:
    August 2nd, 2010 at 11:38 am

    @billynw10 – the executable is purely a CPU test, it won’t make use of any CUDA or GPU features. But for those who’ve wondered above how fast a GPU version of smallpt would be, check out http://davibu.interfree.it/opencl/smallptgpu/smallptGPU.html

     
  61. Lavan Nallainathan Says:
    August 2nd, 2010 at 4:24 pm

    Intel Xeon E5530 4GB RAM 131 Seconds.

    Will run it on our dual X5670 with 32GB RAM tomorrow and post results.

     
  62. Philip Says:
    August 2nd, 2010 at 4:45 pm

    Work PC: Xeon W3520 @ 2.67 Ghz with 12 GB RAM – 116 Seconds.

    I would imagine overclocking would make this a touch better – not going to risk melting my work PC though…!

     
  63. billynw10 Says:
    August 2nd, 2010 at 8:25 pm

    Thanks for the info, still unsure why it would be grainy. Going to check out your link cheers again

     
  64. billynw10 Says:
    August 2nd, 2010 at 8:26 pm

    @David Fearon: Thanks for the info, still unsure why it would be grainy. Going to check out your link cheers again

     
  65. JeffL Says:
    August 2nd, 2010 at 11:23 pm

    Q6600 Quad Core OC to 3GHz, 4GB of RAM, Windows 7 64bit = 248 seconds. Fairly happy with that.

     
  66. TrevorH Says:
    August 3rd, 2010 at 1:43 am

    Assuming these tests were done at 100 samples per pixel then the Linux version seems to be dramatically quicker. I compiled from the original source not the modified version and mine completes 100 samples per pixel in 35 seconds. Core i7 980x not overclocked.

     
  67. M Ahmed Says:
    August 3rd, 2010 at 12:30 pm

    Windows 7 Ultimate x64
    Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
    4GB RAM

     
  68. JohnAHind Says:
    August 3rd, 2010 at 4:18 pm

    Just for the record in case I have put anyone off the Corsair H50, after reseating the pump/block on the chip I can now do 82s on my i7 930 @4GHz with the temp maxing at 80c.
    @TrevorH: I think this is probably the compiler rather than Linux – better optimisation will work wonders on an app like this and David used the free version of Visual C++. If he publishes the source I will try a recompile on the commercial version.
    @M Ahmed: Not much point giving us your spec unless you also give the benchmark time!

     
  69. Eric Says:
    August 3rd, 2010 at 8:39 pm

    Dell PowerEdge R710
    Intel Xeon E5620 @ 2.40 GHz (x2)
    36.0 GB RAM
    Windows Server 2008 Enterprise x64

    137 seconds

     
  70. Alistair Kemp (Intel) Says:
    August 4th, 2010 at 11:21 am

    Well here at Intel we could not resist the challenge, particularly when in our IT department we have overclocking nut Steve “DaFidgie” Anderson. We gave him the challenge of beating 60s but restricted him to a single-socket setup. Last night he ran it on his rig. Result: 50 seconds to complete.
    Setup details here:
    http://www.flickr.com/photos/inteluknewsroom/4859279593/

     
  71. Mark Laurence Says:
    August 4th, 2010 at 11:38 am

    I just might have to do some testing on overclocked dual X5680s :)

     
  72. Mark Laurence Says:
    August 4th, 2010 at 1:15 pm

    App does not seem to work properly with dual CPUs. We tested on an EVGA SR2 with overclocked X5680s at 4.4ghz. 57s with one CPU enabled, 80s with both enabled, 133s with both enabled HT off. Exactly the opposite from what you would expect.

     
  73. David Fearon Says:
    August 4th, 2010 at 2:53 pm

    @Mark Laurence: interesting result that tallies with my disappointment at my own dual Xeon’s results. It’s possible that with such a tight loop there’s a thread-dispatch overhead or somesuch that’s negating the advantage. I’ll try tweaking the OpenMP pragma to make the threading slightly coarser (it’s line 79 in Kevin Beason’s code but I moved it down to the beginning of the x loop – line 82 – since that gave better performance on my Core 2 Duo) and see if that improves things on multiple socket systems.
    It doesn’t surprise me too much that performance is degraded with HT on though; if all the threads are performing the same operations – as they are in this case – then the virtual cores are likely fighting each other for the same physical resources.

     
  74. Aftab Says:
    August 5th, 2010 at 8:04 am

    Q9650

    243s @ 3.0 GHz
    180s @ 3.6 GHz

    25.9% improvement in time with a 20% overclock

    Interestingly 185s @ 3.7 GHz. I think it was overheating a bit too much and throttling CPU speed.

     
  75. M Ahmed Says:
    August 5th, 2010 at 3:36 pm

    It helps if I posted the time… :D

    168s

     
  76. Nick St Aubyn Says:
    August 5th, 2010 at 10:27 pm

    Another slow dual CPU result – dual Xeon X5472 with 16Gb RAM gave me 412 seconds…

     
  77. Rob E Says:
    August 6th, 2010 at 1:56 pm

    Just to show you how far things have come in a relatively short time… I’ve recently decommissioned a couple of servers that have gone end-of-life as they’re around 5 years old. I replaced them with Dell R710s at pretty much exactly the same spec as Eric’s (above) so no need to repeat that result.
    However, I thought I’d give it a go on one of the old boxes.

    Xeon MP 2.2Ghz x 4 (quad socket, so 8 physical cores)
    4GB RAM
    Server 2003

    710 Seconds. Those servers were the dog’s danglies when we shelled out £12,000 each for them back in the day. Now they’re only marginally faster than David’s E6300.
    The march of progress eh?

     
  78. JohnAHind Says:
    August 6th, 2010 at 4:19 pm

    @David Fearon: Au contraire re HT – it is definitely significantly faster with it on, as I think Mark Laurence said. I just tested it on my i7 930 @4GHz: 4core/8thread – 82secs, 4core/4thread – 124secs. Hyperthreading is actually more valuable here than overclocking. Interestingly this is also reflected in the temperatures – 12degrees hotter with HT on!

     
  79. Mike Woods Says:
    August 6th, 2010 at 5:18 pm

    From the sublime to the ridiculous: MSI Wind U130 netbook (1.6GHz Atom) – 2334 Sec! However, this is a bit of a cheat as I’ve upped the RAM to 1GB.

     
  80. Rob E Says:
    August 6th, 2010 at 8:51 pm

    Ooo now there’s a challenge. Who can bring in the worst result? I’m sure there’s a manky old PII floating around somewhere in our ‘retired box’ area that we haven’t got round to skipping yet…

     
  81. leif Says:
    August 7th, 2010 at 1:10 am

    leif@patroler:~/src/smallpt$ time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m49.354s

     
  82. n10cities Says:
    August 7th, 2010 at 1:16 am

    289 seconds – AMD Phenom II X3 Black Edition 2.6 GHz OC’d to 3.2, 4 GB RAM

     
  83. Spoo Says:
    August 7th, 2010 at 1:24 am

    So very tempted to try this on my AMD K6/2. However, I’d have to compile it myself as the machine runs Linux right now and I’m not sure I can face the faff!

     
  84. Brian Says:
    August 7th, 2010 at 1:58 am

    Core i7 920 running at 3.6GHz and 12GB of RAM, got it rendered in 92 seconds. ACPI temp never got above 41C and the highest any core got was 56C. Running a custom liquid cooling setup with all Koolance parts, cooling the CPU and both HD 5870s.

     
  85. Fry-kun Says:
    August 7th, 2010 at 2:04 am

    [xxxx@xxxx smallpt]$ time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m29.127s
    user 5m41.044s
    sys 0m0.093s

    Ha, 29 seconds :)
    But of course I cheated – this server from my work has 32G RAM and 2 6-core CPUs (AMD 2427). OTOH, no overclocking of any kind.

     
  86. Catty Nebulart Says:
    August 7th, 2010 at 2:08 am

    319 seconds to render on my aging Phenom II X4 980, using Wine.

    62 seconds when compiled from source.

     
  87. Noob Says:
    August 7th, 2010 at 2:13 am

    Ran it on my stock 1055t – 6 cores 2gig ram. smallpt 100 completed in only 44 seconds!

    Haha. Beaten Intel’s liquid cooling using AMD’s stock fan!

     
  88. OrangeTide Says:
    August 7th, 2010 at 2:32 am

    42.5s – dual i7
    % time ./smallpt 100
    Rendering (100 spp) 100.00%
    ./smallpt 100 325.30s user 0.16s system 766% cpu 42.462 total
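
Several of the timings in this thread include user, sys and real figures, and the ratio between them says how many cores the render kept busy. The following is a minimal sketch of that arithmetic; the function names `cpuPercent` and `busyCores` are made up for illustration and are not part of smallpt or of any shell.

```cpp
#include <cassert>

// Percent CPU in the style printed by some shells' `time`:
// (user + sys) / real * 100. Values above 100 mean several cores were busy.
double cpuPercent(double userSec, double sysSec, double realSec) {
    assert(realSec > 0.0);
    return (userSec + sysSec) / realSec * 100.0;
}

// The same ratio expressed as an average number of busy cores.
double busyCores(double userSec, double sysSec, double realSec) {
    return cpuPercent(userSec, sysSec, realSec) / 100.0;
}
```

With the figures from comment 88, `cpuPercent(325.30, 0.16, 42.462)` lands between 766 and 767, which agrees with the 766% the shell printed. Applied to comment 85 (user 5m41.044s, real 29.127s), `busyCores` gives roughly 11.7 on a 12-core machine.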

     
  89. PIBM Says:
    August 7th, 2010 at 2:50 am

    My 2 years old dual xeon dev server took 32 seconds to run it. There must have been something wrong on your setup

    time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m32.824s
    user 4m17.610s
    sys 0m0.060s

     
  90. Lee Says:
    August 7th, 2010 at 3:28 am

    475 seconds with AMD Athlon 64 X2 5600 @ 2.8GHZ, 2GB Ram, Windows 7 32 bit. Not that it matters for the render, ATI Radeon HD 5700. And yes, it does run Crysis.

     
  91. Pat Bowen Says:
    August 7th, 2010 at 3:37 am

    274 sec. on a 3-year-old Core 2 Quad Q9300 (2.5 GHz)

    Gonna go try this on XP (second 1/2 of my dual-boot sys.) to see if I can improve on that a little.

     
  92. Pat Bowen Says:
    August 7th, 2010 at 3:56 am

    Oops! Doesn’t appear to work on the 64-bit edition of XP Pro.

    Oh, well.

     
  93. JF Laplante Says:
    August 7th, 2010 at 4:08 am

    159 secs

    AMD Phenom II 1090T @ 4.2ghz on air
    8gb RAM
    Win 7 64 bits

    Result #87 seems strange…

    I had several versions of vcomp90.dll in the WinSxS dir. Some of them prevented smallpt.exe from starting.

     
  94. Rob Says:
    August 7th, 2010 at 4:17 am

    Seems to run *much* faster on Linux than windows. 58s on a Phenom II 955 on Fedora 13.

     
  95. Tom Says:
    August 7th, 2010 at 4:23 am

    There is something seriously wrong with the optimizations in your windows binary…

    Ran in 36 seconds on a 4 x 8224 SE AMD opteron IBM x-server running linux (8 total cores at 3.2GHz)

     
  96. Rob Says:
    August 7th, 2010 at 4:23 am

    Dell 910, Intel X7560 32 cores, 512GB, CentOS 5.5, GCC 4.4.4

    time ~/smallpt 100
    Rendering (100 spp) 100.00%
    real 0m7.284s
    user 6m42.990s
    sys 0m0.107s

     
  97. Smart_Monkey Says:
    August 7th, 2010 at 4:33 am

    Very disappointed with my result. 236 seconds running a Phenom II x4 965BE @ 3.6ghz, 4gb OCZ BlackEdition (7-7-7-30 @ 1300mhz) and Windows 7 Professional 64-bit. I was hoping <150 seconds.

    I had to use a 32bit version of the dll for the application to work.

     
  98. ROn Says:
    August 7th, 2010 at 4:46 am

    Rendering with 100 samples per pixel: 100.00%

    RENDER COMPLETE. Render time: 669 seconds.
    4 year old dell walmart special AMD X2 2.1ghz AM2 socket

     
  99. Private Name Says:
    August 7th, 2010 at 4:46 am

    Takes 22 seconds on a Quad-CPU, Quad-Core Xeon X5560 @ 2.80GHz, using 14 out of the 16 cores (was run inside a KVM instance with access to 14 of the cores).

    # time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m21.964s
    user 5m0.981s
    sys 0m0.231s

     
  100. drphilngood Says:
    August 7th, 2010 at 5:04 am

    Rendering (100 spp) 100.00%
    real 0m31.440s
    user 4m6.590s
    sys 0m0.020s

     
  101. drphilngood Says:
    August 7th, 2010 at 5:06 am

    addendum: forgot specs =(
    870 Lynnfield

     
  102. Anonymous Says:
    August 7th, 2010 at 5:37 am

    Gave this an honest try on an IBM BlueGene/P supercomputer, but I can’t get it to compile with automatic vector optimizations (mpixlcxx -o smallpt -O5 smallpt.cpp -qarch=450 segfaults on run)… and I don’t want to futz with the source code. Without the optimizations (mpixlcxx -o smallpt smallpt.cpp -qarch=450) it DOES “run”, but 256 CPUs took > 3 minutes to compute smallpt at only 4 samples/pixel (slower than an Intel Atom).

     
  103. schatterjee Says:
    August 7th, 2010 at 5:37 am

    The multi-processor results seem strange. My i7 980x did it in 62 seconds.

     
  104. doop Says:
    August 7th, 2010 at 5:49 am

    Intel Core i5 650 @ 3.2ghz (stock)
    4gb Dual Channel DDR3

    ubuntu 10.04

    time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 1m11.334s
    user 4m40.100s
    sys 0m0.240s

     
  105. drphilngood Says:
    August 7th, 2010 at 6:23 am

    @ schatterjee
    I guess we could post screenshots; here’s mine: http://img.photobucket.com/albums/v490/drphilngood/Screenshot-smallpt-revised.png

     
  106. Andy0x2a Says:
    August 7th, 2010 at 6:44 am

    time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 17m55.874s
    user 32m10.921s
    sys 0m4.116s

    Fuck yah! 1075s on an Intel(R) Atom(TM) CPU N280 @ 1.66GHz
    running Ubuntu Netbook Remix 10.4

     
  107. Zatraz Says:
    August 7th, 2010 at 6:59 am

    Takes 4.2s on a IBM Power 780, 32 cores @ 4.1GHz

    # time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m4.244s
    user 2m15.144s
    sys 0m0.083s

     
  108. J_Fellenbaum Says:
    August 7th, 2010 at 7:23 am

    616 seconds
    E6600 @ 2.4ghz

    My system always runs hot, but I hit 101*C in core 1 and 99* C in core 2 at 99% complete. Guess its a good thing I don’t normally stress this system too much.

     
  109. Rhubarb Says:
    August 7th, 2010 at 7:46 am

    I can beat that with 28 seconds :D
    time ./smallpt 100
    Rendering (100 spp) 100%
    real 0m28.278s
    user 3m43.530s
    sys 0m0.270s
    Running: Ubuntu 10.04 64bit
    6Gig DDR3 RAM
    Core i7 975 @ 4.15GHz (single socket, 4 Cores, 8 with HT)
    Running the original smallpt app – I’m guessing the windows version / compiler isn’t as efficient as gcc-4.4

     
  110. ayembee Says:
    August 7th, 2010 at 7:51 am

    101 seconds; Core i7 940 @3.2, win7x64, 6GB RAM. All 8 threads pegged at 100%, wow. Haven’t seen a CPU do that before!

     
  111. John Klos Says:
    August 7th, 2010 at 8:01 am

    Eight core (plus hyperthreading) 2.26 GHz Mac Pro, Mac OS X 10.6.4:

    g++ smallpt.cpp -o smallpt -O3 -fopenmp -ffast-math
    time ./smallpt 100
    Rendering (100 spp) 100.00%
    391.575u 0.267s 0:25.43 1540.8% 0+0k 0+7io 0pf+0w

    Yes, you read that properly – less than 26 seconds when running 16 threads.

    Still waiting for the VAX to finish…

     
  112. NZ-Antz Says:
    August 7th, 2010 at 8:54 am

    616 Seconds on my HP 6910 Laptop (CPU T8100 @2.10GHz 4GB Memory Windows 7 x64)

     
  113. Jason Pritchard Says:
    August 7th, 2010 at 8:58 am

    David can you post your source code for the windows rewrite? I would like to see how you got it to work in .net.

     
  114. Jason Pritchard Says:
    August 7th, 2010 at 8:59 am

    Can you post the source code you compiled in .net? I want to see how you changed it.

     
  115. davek Says:
    August 7th, 2010 at 9:55 am

    Mobile AMD Athlon 4 @ 1.2GHz. (circa 2004)
    Ubuntu 10.04 Netbook Remix
    $ time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 34m49.818s
    user 23m24.616s
    sys 0m8.641s

    Hmm. Maybe it would be worth replacing this lappy.

    Predicted finish on Via C3 @ 600Mhz > 6000 seconds.

     
  116. Ian C Says:
    August 7th, 2010 at 11:37 am

    332s on E7300@3.33GHz (and ~40C core temps) WinXP Pro 32bit

     
  117. davek Says:
    August 7th, 2010 at 12:18 pm

    8567.931 seconds Via C3 @ 600Mhz running Debian Lenny.
    Almost 9000!!!

     
  118. Jason Says:
    August 7th, 2010 at 1:05 pm

    Render Time: 342 seconds
    ASUS P5Q Pro LGA 775 Intel P45 ATX Intel Motherboard
    Intel Core 2 Duo E8500 Wolfdale 3.16GHz LGA 775
    Mushkin Blackline 4GB (2 x 2GB) DDR2 1066 (PC2 8500) Dual Channel
    Vista Ultimate SP1 64-bit
    ————————
    Render Time with WINE: 1618 Seconds
    Render Time compiled on box: 767 Seconds
    BIOSTAR 945GC Micro 775 LGA Micro ATX Intel Motherboard
    Intel Celeron 420 Conroe-L 1.6GHz LGA 775
    CORSAIR 2GB (2 x 1GB) DDR2 SDRAM DDR2 667 (PC2 5300)
    Gentoo Linux (Kernel 2.6.33)
    WINE version 1.1.12
    ————————
    Render Time: with WINE:1574 Seconds
    Render Time compiled on box: 773 Seconds
    Dell Inspiron 2200 Laptop (Intel Pentium M 1.7GHz & 512MB Ram)
    Gentoo Linux (Kernel 2.6.27.12)
    WINE version 1.1.44

     
  119. Pelle Says:
    August 7th, 2010 at 1:41 pm

    On an dual core i3 laptop I get:
    275 sec in Win7-64,
    98 sec in Ubuntu-64 and
    157 sec with Ubuntu-32 as guest in Virtual Box (Win7-64 host)!
    Conclusion: either Windows really sucks when it comes to computing or, alternatively, the adaptation to Windows was not as easy as the author thought.

     
  120. ahmet Says:
    August 7th, 2010 at 1:59 pm

    ubuntu 10.04 64bit
    intel i7 980x (not overclocked)
    8gb

    time ./smallpt 100

    Rendering (100 spp) 100.00%
    real 0m22.649s
    user 4m25.870s
    sys 0m0.020s

     
  121. Henry Says:
    August 7th, 2010 at 7:18 pm

    On a Pentium 133 (no MMX)
    $ time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 292m56.066s
    user 291m55.199s
    sys 0m7.108s

     
  122. Sean Benton Says:
    August 7th, 2010 at 7:24 pm

    AMD users, fret not. There appears to be something wrong with the optimizations in the windows binary. I ran smallpt 100 in an ubuntu virtual box machine on my Phenom II X4 965 @3.4 Ghz and got this:

    sean@sean-ubuntu-2:~/Desktop/smallpt$ time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 1m22.444s
    user 5m22.656s
    sys 0m1.688s

     
  123. Wayne Says:
    August 7th, 2010 at 7:33 pm

    106s
    Almost 2-year old stock Core i7 940.

     
  124. vivo Says:
    August 7th, 2010 at 8:36 pm

    These timings are with NO overclocking but some tinkering with gcc options.
    With small toys like this program it can yield a 20% improvement.

    root# time nice -n -19 ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m57.522s
    user 3m41.974s
    sys 0m0.352s

    g++ -O3 \
    -m64 \
    -march=nocona -pipe \
    -ffast-math \
    -ftree-parallelize-loops=8 \
    -funroll-all-loops \
    -fopenmp \
    smallpt.cpp \
    -o smallpt

     
  125. vivo Says:
    August 7th, 2010 at 8:42 pm

    0m36.904s

    Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz

    time nice -n -19 ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m36.904s
    user 4m50.778s
    sys 0m0.040s

    g++ -O3 \
    -m64 \
    -march=core2 -pipe \
    -ffast-math \
    -ftree-parallelize-loops=8 \
    -funroll-all-loops \
    -fopenmp \
    smallpt.cpp \
    -o smallpt

     
  126. Dan Says:
    August 7th, 2010 at 8:47 pm

    88 seconds on my O/C’ed i7-920 d0 to 4.0 GHz running win7 x64 pro, 12 GB RAM. EVGA 3x SLI m/b.

     
  127. vivo Says:
    August 7th, 2010 at 8:56 pm

    Wait, it may not be so important to use those flags. I’ve tried again on the i7 950 with:
    g++ -O3 -m64 -fopenmp smallpt.cpp -o smallpt

    time nice -n -19 ./smallpt 100 ; sensors
    Rendering (100 spp) 100.00%
    real 0m38.208s
    user 5m1.787s
    sys 0m0.020s

    cpu ended at a maximum of 71°C

    so maybe gcc is getting better?
    gcc (Gentoo 4.4.4-r1 p1.0, pie-0.4.5) 4.4.4

    Why am I getting such low times compared to those of the majority of people here?

    and yes, I’ve checked image.ppm and it’s good
    ???

     
  128. Matt Says:
    August 7th, 2010 at 9:41 pm

    Very old Sony Vaio PII 300 with 160MB Ram, 119 minutes 34 seconds running Ubuntu.

     
  129. Rob E Says:
    August 7th, 2010 at 10:30 pm

    @Henry.
    Boo, you spoilt all my fun. I thought we had some right old sheds to hand, but you win. I curse the decision we made last year to skip the 286 that we’d been using to cook EPROMs. With a bit of foresight, victory would have been mine!
    For the record, it all bowled along plenty quick on my OC’ed 920, but I’m still staggered by how crap it ran on the quad socket server rig. Wow.

     
  130. Stan Says:
    August 7th, 2010 at 10:38 pm

    Intel I7 Core 950
    3 Gig memory
    Linux Fedora 11 Kernel 2.6.3.10
    ——–
    real 0m36.860s
    user 4m50.667s
    sys 0m0.097s
    ______
    He Shoots! He Scores!

     
  131. John Klos Says:
    August 7th, 2010 at 11:41 pm

    Amiga 1200, 60 MHz m68060:
    :
    34727.345u 9979.182s 12:34:08.70
    :
    That’s 579 minutes or so of CPU time (the user figure; the wall-clock run was about 12.5 hours). If a 16 thread Xeon 5500 series system can do it in 26 seconds, then that’s roughly 1350 times faster. Divide that by 16 (the number of threads the Xeon system runs simultaneously) and again by 37.75 (the ratio of clock speed (2266/60)), and the Xeon is only 2.23 times faster per thread per clock than the m68060!
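
The normalisation in comment 131 can be written out as a short calculation. This is only a sketch of that back-of-envelope arithmetic; the function name `perThreadPerClockSpeedup` is invented here, and the sketch assumes, as the comment does, that the render scales perfectly across threads and that a hyperthread counts as a full thread.

```cpp
#include <cassert>

// Hypothetical helper: raw speedup of the new machine over the old one,
// divided by the new machine's thread count and by the clock-speed ratio.
double perThreadPerClockSpeedup(double oldSec, double newSec,
                                int newThreads,
                                double newMHz, double oldMHz) {
    assert(oldSec > 0.0 && newSec > 0.0 && newThreads > 0);
    assert(newMHz > 0.0 && oldMHz > 0.0);
    double raw = oldSec / newSec;          // how many times faster overall
    double perThread = raw / newThreads;   // spread over the threads
    return perThread / (newMHz / oldMHz);  // remove the clock advantage
}
```

Feeding in the unrounded figures from the thread, `perThreadPerClockSpeedup(34727.345, 26.0, 16, 2266.0, 60.0)`, gives roughly 2.2, in line with the 2.23 quoted from the rounded 1350.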

     
  132. spreeuw Says:
    August 8th, 2010 at 12:14 am

    AMD Athlon(tm) 64 X2 Dual Core Processor 5600+
    32bit 2.6.33.2
    time -p ./smallpt 100
    Rendering (100 spp) 100.00%
    real 521.05

     
  133. plod Says:
    August 8th, 2010 at 2:13 am

    127s on hp Z800 with single E5540 at a measly 2.53GHz on XP 64

     
  134. Vengeance Says:
    August 8th, 2010 at 2:35 am

    Intel i7 920 @ 4.0ghz w/ HT
    Win 7 Pro x64
    6GB RAM
    83 seconds while doing other things in the background.

     
  135. AussieBrusader Says:
    August 8th, 2010 at 4:33 am

    Toshiba TE2100
    WinXP SP3
    512MB RAM
    2596 Sec

     
  136. Not N Says:
    August 8th, 2010 at 5:35 am

    Could someone organize these, like the Hexus Pi-Fast Challenge did back in the day?
    http://pifast.hexus.net/pifast.php

     
  137. Luizg Says:
    August 8th, 2010 at 7:14 am

    root@eris:/tmp# time ./smallpt 5000
    Rendering (5000 spp) 100.00%
    real 6m3.586s
    user 367m35.048s
    sys 0m0.250s
    Or for 100spp:
    root@eris:/tmp# time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m7.849s
    user 7m37.749s
    sys 0m0.160s

    This is a quad socket Xeon X7560 system. 32 cores at 2.26ghz (With HT on, 64 threads)

     
  138. JohnDoe Says:
    August 8th, 2010 at 10:13 am

    Intel Pentium 4 (Northwood) 2.53GHz, Win XP, 768MB RAM.
    Render time: 2367 seconds.
    Maybe I should try on my linux pc (CEL @ 300MHz, 256MB ram)…

     
  139. Pete Says:
    August 8th, 2010 at 10:35 am

    640s on a Core2 E4500 2.2Ghz
    Vista 32 4G ram

    Actually, FinalRender for 3DSMAX is a lot faster than that little program. I can get much higher quality in a quarter the time.

     
  140. thiscodeiswrong Says:
    August 8th, 2010 at 4:36 pm

    This code is wrong.
    It relies on linux’s erand function behavior.
    But the compiled version uses different erand’s and that’s why this is so much slower than linux’s version.

    Really, this is the code author’s fault. The computational complexity of the code depends on random values from the erand function. Way to go.

     
  141. N30N Says:
    August 8th, 2010 at 5:03 pm

    Processor: Intel(R) Core(TM) i7 CPU 920 @ 3.8GHz,
    OS: ArchLinux x86_64,
    Result: 100spp in 28.4 seconds.

     
  142. Ho Tuan Says:
    August 8th, 2010 at 10:06 pm

    21 SECONDS.
    Dual AMD Opteron 12 Core (24 Cores total), with NO SPECIAL COOLING (just a basic Dynatron A6 fan for each CPU).

    Here’s the catch: I had to run it in Ubuntu 9.04 x64. The Windows version distributed on your site was behaving erratically (it kept alternating between freezing for ~20 seconds and bursting, so it took 595 seconds to finish on Windows, which was unjustifiably slow). Any ideas why it’s bottlenecked on Windows?

    Unless Intel’s rig was similarly bottlenecked on Windows, I think the AMD Opteron Magny Cours beats the Intel i7 on performance AND price (total machine cost less than $2500) hands down.

     
  143. Ho Tuan Says:
    August 8th, 2010 at 10:13 pm

    Running the process with high priority on Ubuntu, I got it down to LESS THAN 19.9 SECONDS, using Dual AMD Opteron 6168 Magny Cours CPUs.

    time nice -n -20 ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m19.900s
    user 7m33.940s
    sys 0m0.110s

     
  144. Ho Tuan Says:
    August 8th, 2010 at 10:15 pm

    oops, I meant to say LESS THAN 20 SECONDS…

    I will now remove one of the CPUs to see how a single CPU AMD Opteron 6168 compares to the i7 980x.

     
  145. Ho Tuan Says:
    August 8th, 2010 at 10:42 pm

    38.36 SECONDS –> SINGLE CHIP AMD OPTERON 6168 Magny Cours(12 Cores total).

    I measured this after removing the second CPU from my ASUS KGPE-D16 board. Again, this was done with NO SPECIAL COOLING beyond a basic Dynatron A6 fan. Compared to Intel’s monstrosity, they should be embarrassed with 50 seconds.

    time nice -n -20 ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m38.360s
    user 7m30.590s
    sys 0m0.080s

    The cost for this CPU is only $750USD and the fan cost me 20 bucks (compared to $999 for the i7 980x + God knows how much for Intel’s freezer box). AMD, are you listening?

     
  146. furinkan Says:
    August 8th, 2010 at 10:57 pm

    Ooh, you might be onto something there, thiscodeiswrong (#140).

    I think it’s unfair to say it’s the author’s “fault”. They targeted the code to GCC, so it’s really an oversight of this blog’s writer. The behavioral difference in random function is an implementation detail.

    For reference, I tried this on my x3650, 2-way E5420 (they don’t do HT). All stock, production server, RAM is irrelevant: 40sec for 100spp.

    Buoyed by this rather pleasing speed, I’m running a 3840×2400 render at 10k spp. For more fun, I’ve tweaked the scene a little, making the red wall specular, added another small mirror-ball, and reduced transmission/reflection on the balls to make it look a bit realer.

    Here’s one I prepared earlier:
    http://furinkan.meidokon.net/img/smallpt_1920x1200_5000spp_hacked_layout_tweaked_walls.jpg

     
  147. drphilngood Says:
    August 8th, 2010 at 11:17 pm

    @ Ho Tuan
    “Intel’s monstrosity”, is at a huge disadvantage to boxen, like yours, testing with Linux, and would beat your scores if running Linux while destroying them if not limited to one socket, as well. Not to mention the fact that there are much cheaper i7s that would also beat your score.
    AMD make great CPUs, at a good price, but so does Intel.

     
  148. Ho Tuan Says:
    August 9th, 2010 at 12:12 am

    @ drphilngood
    Point taken. I want to see how the Intel machine performs on Linux. I am genuinely curious. For standardization, let’s say Ubuntu. I couldn’t really do a fair test on a Windows machine because of issues with the version of smallpt Fearon distributed.

    Regarding your point that cheaper i7’s beating out an AMD Opteron Magny Cours, it is totally unfounded. Perhaps you should instead criticize the fact that it is unfair for me to compare a 12 core (single-socket) and 24 core (dual-socket) system to Intel’s 6 core CPU, or that it is unfair to compare server grade CPUs to consumer CPUs. Beyond that, it should not be at all surprising that a 12 or 24 core system could beat out a 6 core system at a task that is inherently multithreaded (even if, per core, the AMD chip is way lacking).

    That said, I think it is ridiculous that many of the Intel i7s are more expensive than the majority of AMD’s server-grade CPUs. Kudos to Intel for pushing the frontiers on per-core performance, but it is hard to justify the cost of that performance to customers when you’re playing a game of diminishing returns.

    My company builds HPCs and servers around Harvard/MIT and it is ridiculous how expensive Intel chips are.

    Let’s just say that there is a good reason why the world’s fastest supercomputer, the Cray XT5 uses only AMD Opterons. You simply can’t get that much computing power with Intel without breaking the bank.

     
  149. drphilngood Says:
    August 9th, 2010 at 12:33 am

    @ Ho Tuan
    Ho Tuan said:
    “Regarding your point that cheaper i7’s beating out an AMD Opteron Magny Cours, it is totally unfounded.”
    ___________________________________
    See the scores above. My own i7 870(see comments 100 & 101, above), in a rig I built about a year ago, cost only around $500USD, then, which is cheaper than the $750USD, you claim to have paid for your AMD Opteron 6168 Magny Cours.
    Here are the scores:
    Yours -> “38.36 SECONDS –> SINGLE CHIP AMD OPTERON 6168 Magny Cours(12 Cores total).”
    Mine -> 31.440s(see comment #100 & 101) Intel i7 870(4 Cores total)
    …and there are many similar scores, above.
    Now, how is my statement, “unfounded”?

     
  150. Anonymous Says:
    August 9th, 2010 at 2:40 am

    A version that uses Boost Random is posted here: http://pastebin.org/459419. Same performance on Linux. Should give much better performance on Windows. Tested with Boost 1.38 and 1.41.

     
  151. Ho Tuan Says:
    August 9th, 2010 at 4:04 am

    @drphilngood
    I stand corrected. I didn’t notice your posts amidst the flurry of discussion.
    Question: Your “rig I built about a year ago” somehow beat out a stock i7 980x on Linux, the prize machine in question. How the heck did you manage that?
    ————————
    Post #66 TrevorH:
    Assuming these tests were done at 100 samples per pixel then the Linux version seems to be dramatically quicker. I compiled from the original source not the modified version and mine completes 100 samples per pixel in 35 seconds. Core i7 980x not overclocked.
    ———————————
    Ultimately, I’m guilty of making an unfair comparison between server and consumer chips. Server chips are underclocked and then priced higher in exchange for guaranteed reliability and cooler chips. The Opteron 6168 is, by design, underclocked to 1.9GHz (even though the die, if sold as a Phenom, could in theory run at ~3 GHz with stock coolers). My Opteron 6168 stayed at 45C at 100% utilization for the duration of the test on stock coolers. Can your i7 do that?

     
  152. drphilngood Says:
    August 9th, 2010 at 4:31 am

    @ Ho Tuan
    Like many, if not most, of us here, I overclock, but you knew that, didn’t ya’? Didn’t you mention, earlier, that your rig beat “Intel’s monstrosity”, that’s “overclocked to nearly 5GHz.” ;-)
    Hopefully, a screenshot will eliminate any lingering doubts you might have, though:
    http://img.photobucket.com/albums/v490/drphilngood/Screenshot-smallpt-revised.png

     
  153. drphilngood Says:
    August 9th, 2010 at 4:42 am

    addendum:
    Sorry, I thought the rest of your post was directed at TrevorH.
    No, but each core idles at between 30C & 34C and the hottest core only reached 59C during the test. In addition, the cores jumped back to idle temps within a couple of seconds of completion.

     
  154. Stan Says:
    August 9th, 2010 at 5:34 am

    Ok. Here is a fast one:
    Pentium II 450Mhz 256Meg ram
    Linux Fedora 4 Kernel 2.6.11
    74 Minutes 9 Seconds.
    PS: Ho Tuan, is AMD paying you?

     
  155. yggdrasil Says:
    August 9th, 2010 at 6:10 am

    84 Seconds
    •i7 860 @ 4GHZ.
    •8GB 1333 RAM.
    •Asus P7P55D Deluxe

     
  156. Nikola Says:
    August 9th, 2010 at 6:15 am

    Done!
    486@500MHz (AMD Geode LX800, Alix1c board with 256MB, ubuntu server 8.4, while running some extra things triggered by crond, stock kernel)
    =178 minutes, 22 seconds

    That’s over 10 700 seconds… beat that, Davek (#115)! Only Henry (#121) has a worse time; I’ll have to pull my 486@66 from the attic to beat him!

     
  157. Ho Tuan Says:
    August 9th, 2010 at 6:48 am

    @drphilngood
    So let me get this straight…
    A single, STOCK, AMD Opteron Magny Cours ($750), that is arguably way underclocked, finishes the job in 38 seconds, 3 seconds behind a single, STOCK, i7 980x ($1000). Double the AMD up and it finishes in 19.9 seconds (cost $1500). Last I checked, you can’t double up an i7, but you can double up the i7 980x’s server version, the Xeon 5670 (total cost $3200). Now the Xeon 5670 is like a slightly underclocked i7 980x, so it’s not surprising that it lags behind Magny Cours in multithread-application benchmarks (http://www.anandtech.com/show/2978/amd-s-12-core-magny-cours-opteron-6174-vs-intel-s-6-core-xeon).
    Huh…
    In my line of work, servers have to work reliably at low TDP without racking up our customer’s AC bill, so overclocks don’t count in the cost/performance comparison. Also, 59C is way too hot if I need to build HPC clusters capable of crunching numbers at 100% capacity 24/7 for weeks on end.
    @Stan
    No. AMD is not paying me. Unlike Intel, I doubt they have the extra cash to pay me.

     
  158. Dr. Nick Says:
    August 9th, 2010 at 7:03 am

    Hiiiiiiii everybody!

    @Stan
    Agreed–AMD should totally pay @Ho Tuan.
    @DrFeelinGood
    Is Intel paying *you*?

     
  159. drphilngood Says:
    August 9th, 2010 at 7:09 am

    @ Dr. Nick
    Shhhhh, don’t tell or the IRS will put me in a higher tax bracket, again. :p

     
  160. drphilngood Says:
    August 9th, 2010 at 7:40 am

    @ Ho Tuan
    That’s all fine and good but, if you remember, our conversation started because you thought you had beaten the score of “Intel’s monstrosity”, and stated that, “they should be embarrassed with 50 seconds.” So, I thought I’d do you a favor and give you the 411 on why there was such a huge discrepancy in the scores. After that, I have just been answering your questions.
    Yes, you keep telling us that your Magny Cours’ are server chips; we all understand that. However, the OP finished the article by asking for “results, for machines both old and new.” So, if you feel that your machine is at a disadvantage, I’m sorry but it was YOUR decision to post YOUR results. Perhaps, later on, they’ll have some sort of server benchmark that you’ll do better in.
    _________________________________________
    To all:
    I support both AMD and Intel; I just want a fast CPU and couldn’t care less where it comes from. *Looks over at server containing recycled AMD socket 939 4800+ that cost me $1K USD a half-dozen years ago* Since I realize that only competition will ensure that CPU performance keeps improving without prices skyrocketing, I hope both companies survive so their competition will continue. ;-)

     
  161. David Fearon Says:
    August 9th, 2010 at 10:47 am

    @thiscodeiswrong: The Windows-compiled version does in fact use erand48(), taken from the FreeBSD resource at http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/gen/erand48.c
    @vivo and others compiling it with gcc/g++: yes it is faster when compiled under g++ but I never said it wasn’t :-) . Possibly the erand48() that Kevin Beason uses is quicker, or perhaps sqrt() is super-optimized by g++ (there’s a lot of sqrt in there).
    It was intended as a quick CPU test, not an exercise in code optimization. Nonetheless I’m interested to see where the speed-up comes from under g++. If anyone fancies profiling the code please feel free!

     
  162. jackinthebox Says:
    August 9th, 2010 at 2:46 pm

    Running Ubuntu 10.04 on Core i7 980x without overclock.

    root@localhost:/test# time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m36.231s
    user 6m31.300s
    sys 0m0.110s

     
  163. jackinthebox Says:
    August 9th, 2010 at 3:10 pm

    @drphilngood
    No, you weren’t answering @Ho Tuan’s questions, you were failing to see his point. He was comparing the performance of Intel’s, high-priced, super-overclocked rig to a low price, standard AMD high-performance chip that seemed to behave just as well, maybe even better, to make a point on economics. You obviously missed that by throwing your overclocked i7 at him to refute his econ point.
    It was premature for him to claim that he “beat” the Intel rig, but @Ho Tuan acknowledged in his first post that we still don’t know how the Intel rig does in Linux, so it’s hard to tell. Based on everyone else’s score, he still had one of the best single-chip scores (esp. for a non-overclocked, single-chip stock CPU), and with two Opterons, the best score, with the exception of that quad-socket Xeon X7560 (7.849s!), but those chips cost $4,000 a pop.
    I’m an Intel guy myself, but even I will admit that AMD clearly wins on economics, and you’d be hard pressed to find someone who doesn’t agree. I just want to have the fastest chips, even if I go broke ;) .
    If you want to continue your pissing contest, take it to the other forum (though it looks like there’s more AMD love over there).

     
  164. The_Nephilim Says:
    August 9th, 2010 at 3:16 pm

    E8400 @ 4.0GHZ 262 Seconds!!

     
  165. Anonymous Says:
    August 9th, 2010 at 4:25 pm

    @David Fearon: It isn’t GCC, it’s the Linux implementation of erand48() that’s at play. All of the other erand48() implementations I’ve seen appear to use a mutex. This affects real scalability. Linux doesn’t because it offers a re-entrant version, erand48_r(). The version used in the smallpt test isn’t really thread safe, contrary to the software author’s assertions.

    The version posted above should be about as fast as the Linux implementations on all platforms. And it is thread safe without needing a mutex, since there is a separate PRNG assigned to each thread.
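
The idea in comment 165, one generator per unit of parallel work with no shared state, can be sketched as follows. This is not the pastebin code and not smallpt's own sampling: it assumes the C++11 `<random>` header rather than Boost Random or erand48(), and `rowAverage` and `renderRows` are hypothetical names that stand in for the per-row work of a renderer.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for rendering one image row: the average of n
// uniform samples in [0, 1). The generator is a local object seeded from
// the row index, so rows share no state and no lock is needed.
double rowAverage(int row, int n) {
    assert(n > 0);
    std::mt19937 rng(static_cast<std::uint32_t>(row) * 2654435761u + 1u);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += uniform(rng);
    return sum / n;
}

// Rows are independent, so the loop can be split across threads.
// The pragma only takes effect when compiled with OpenMP (g++ -fopenmp);
// without it the loop runs serially and produces the same values.
std::vector<double> renderRows(int rows, int n) {
    assert(rows >= 0);
    std::vector<double> out(rows);
    #pragma omp parallel for schedule(dynamic, 1)
    for (int y = 0; y < rows; ++y) out[y] = rowAverage(y, n);
    return out;
}
```

Because each row's generator is seeded only from the row index, the output does not depend on how many threads run or in what order they pick up rows, which is the property that a mutex-guarded global generator gives up speed to provide.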

     
  166. drphilngood Says:
    August 9th, 2010 at 4:33 pm

    @ jackinthebox
    I will respectfully disagree. Perhaps if you read more of our discussion, you will see things differently.
    Yes, he had a great score with two CPUs, and a good one with one, but from his posts in this thread and in this one:
    http://www.pcpro.co.uk/blogs/2010/08/06/intels-own-superchilled-test-rig/#comments
    …it was obvious he didn’t understand that he was comparing his Linux score to the Intel rig’s Windows score and I told him so.
    The OP asked us to post our smallpt scores, not to debate which CPU designer/manufacturer is better/cheaper, so I fail to see any pertinent point that you say I missed. Furthermore, I wasn’t “throwing” my “overclocked i7” at him; I never mentioned it *to him* until he said, “Regarding your point that cheaper i7’s beating out an AMD Opteron Magny Cours, it is totally unfounded.” Perhaps you should read/reread our entire exchange.
    Yes, as I’ve already acknowledged, I believe that AMD makes great chips and I support both with my purchases. I’m not sure I would say “that AMD clearly wins on economics”, but they very well might and I never said they didn’t, either.
    Finally, your “pissing contest”, remark seems strange, and out of place, in a thread in which the OP asked for benchmark scores. However, I don’t feel that I’ve misbehaved, in any way, and I don’t think it is your place to tell me “take it to the other forum”, nor was it necessary.

     
  167. alfred Says:
    August 9th, 2010 at 6:47 pm

    What modifications did you make to get smallpt.cpp to compile in MSVC++ 2010? Can you make the code available?

     
  168. Nicko Says:
    August 9th, 2010 at 7:33 pm

    My dual 2.66 GHz MacPro (X5550, total 8 cores/16 threads) nails this in 21.3 seconds. As pointed out by JohnAHind above, this is probably a result of better code (and perhaps better OpenMP support) from gcc 4.2 compared to the free version of Visual C++ rather than any difference in the operating system.

    lyon:~ nicko$ time ./smallpt 100
    Rendering (100 spp) 100.00%
    real 0m21.332s
    user 5m20.907s
    sys 0m0.699s

     
  169. jackinthebox Says:
    August 9th, 2010 at 10:38 pm

    Ouch, 239 seconds on Win7 with a Phenom II x4 955 and 4GB of DDR2 RAM.

    @drphilngood
    Everyone was contributing benchmark scores to this list, including @Ho Tuan, until you decided to open your mouth. The ratio of your ego to actual test score contribution on this list converges on infinity. Do us all a favor and muzzle yourself. It’s guys like you that take the fun out of this.

     
  170. James Says:
    August 10th, 2010 at 4:48 am

    213 seconds on my Core i3 530 2GB DDR3

     
  171. drphilngood Says:
    August 10th, 2010 at 5:44 am

    Just for laughs, I decided to try running it with Wine and my 870 Lynnfield managed 100 samples per pixel in 88 seconds. If anyone wants to also try it, but experiences problems, just lemme’ know and I’ll try to help. =)
    Think I’ll try running it on the ole’ x2, now.
    @jackinthebox
    You are entitled to your opinion but please stop with the personal attacks. I have nothing else to say to you.

     
  172. Ho Tuan Says:
    August 10th, 2010 at 7:11 am

    @jackinthebox
    I appreciate your support, though let’s try to keep it peaceful from now on.
    Let’s all simma down now.

     
  173. AMD12 Says:
    August 10th, 2010 at 2:23 pm

    2xAMD8431 (12 cores total)
    Linux RHEL5.5 64 bit

    $ g++ -O3 -march=amdfam10 -fopenmp -ffast-math smallpt.cpp -o smallpt
    $ time ./smallpt 100 2>/dev/null >/dev/null

    real 0m30.271s
    user 5m53.548s
    sys 0m0.950s

     
  174. drphilngood Says:
    August 10th, 2010 at 2:38 pm

    @AMD12
    Awesome share; duplicating your test knocked nearly two full seconds off my score. Kudos!!
    real 0m29.650s
    user 3m54.020s
    sys 0m0.020s
    {Lynnfield 870}

     
  175. John Klos Says:
    August 10th, 2010 at 5:10 pm

    My last one – here’s a VAXstation 4000/60:
    :
    249192.062u 4637.635s 71:55:41.77
    :
    Seventy-one hours. Darn – I thought it’d have been a little faster…

     
  176. Zing Says:
    August 13th, 2010 at 8:12 am

    My Athlon 64 X2 workstation: 667s. Stupidly expensive 4×4 Xeon system from Dell: 815s, but the app was only using half the cores (why?) so let’s call it 407s.

     
  177. Henry Says:
    August 17th, 2010 at 1:57 pm

    @John Klos
    Oh no! I’ve been beaten!

     
  178. Stuart Says:
    September 3rd, 2010 at 3:29 pm

    Xeon E5430 @ 2.66Ghz
    IBM ThinkStation.
    Quad core I think, running Server 2008.
    smallpt took 363 seconds! I’m pretty impressed!

     
  179. John Says:
    September 12th, 2010 at 2:40 am

    E4400 2gb ram with 64bit win7 took 745sec

     
  180. Jonathan Says:
    September 19th, 2010 at 6:17 pm

    My AMD Phenom II X4 955 Black Edition processor at its stock speed of 3.2GHz, with 4GB of G.Skill 1600MHz DDR3 RAM, running Windows 7 Pro 64-bit, managed 240s! I’m so pleased!

     
  181. Boyter Says:
    September 21st, 2010 at 11:10 pm

    Just a few….

    AMD Phenom(tm) 9550 Quad-Core Processor
    Rendering (100 spp) 100.00%
    real 1m27.508s
    user 5m46.590s
    sys 0m0.270s

    Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
    Rendering (100 spp) 100.00%
    real 2m41.740s
    user 10m11.720s
    sys 0m0.020s

    Intel(R) Xeon(R) CPU E5506 @ 2.13GHz (4 core)
    Rendering (100 spp) 100.00%
    real 1m55.870s
    user 7m41.781s
    sys 0m0.084s

    AMD Sempron(tm) Processor 2600+
    Rendering (100 spp) 100.00%
    real 14m27.438s
    user 13m11.683s
    sys 0m1.731s

     
  182. Ben Says:
    November 7th, 2010 at 1:34 pm

    Sony Vaio P Series: shameful girlie/handbag laptop that barely handles YouTube.
    1833 seconds – performance you can be truly proud of.

     
  183. Yosmil Says:
    December 3rd, 2010 at 10:24 am

    NEC Versa L1101
    Pentium M 1.8GHz, 1GB DDRI RAM
    Windows 7 Ultimate 32-bit
    (Using X-OS Transformation Pack)

    1781 seconds

     
  184. Regan Klarin Says:
    February 28th, 2011 at 9:35 am

    mine goes here: Intel i3 2.93GHz
    32 seconds

     
  185. EG Says:
    March 14th, 2011 at 6:50 pm

    288 seconds
    2x Dual Core Intel Xeon 5160 3.0Ghz
    4Gb ram

     
  186. Andy Says:
    March 19th, 2011 at 12:15 am

    My dual-core AMD FX-60 2.61GHz with twin Nvidia 7800GT cards (SLI): 539 seconds. I was quite pleased considering it is now 4 and a bit years old.

     
  187. Dan Says:
    May 29th, 2011 at 8:15 am

    Intel Core 2 DUO E8500 @ 3.8ghz
    4gb ddr 2 ram
    win 7 x64

    312 seconds…..

     
  188. oswaldo Says:
    June 23rd, 2011 at 2:24 am

    Intel i7 2600K @ 4.1ghz
    16gb hyperTX 1666
    win7 x64
    71 seconds…. end….

     
  189. Abhi Says:
    July 25th, 2011 at 6:48 am

    Intel core i7 740QM @ 1.86Ghz
    8GB RAM
    win7 x64
    3 runs: 175~177 seconds

     
  190. PCR Says:
    August 28th, 2011 at 6:26 am

    Intel Atom N 230 (single core, Ubuntu 10.04 (64 bit) command line only)
    12 min, 18.466 s
    Intel P4, 3.06 GHz, Ubuntu 11.04 32 bit, 12 min 42.878
    Intel i5 460 (4GB Ram) Win 7 Pro 64 bit, 248 seconds
    AMD Athlon 3500+ (2 GB RAM) Ubuntu 10.04 (64 bit) 366 s

     
  191. Ironchimp Says:
    September 10th, 2011 at 6:19 pm

    Intel i7 2600k @ 4.5 Ghz
    Corsair Vengeance 16GB
    Win 7 x64
    67 seconds

     
  192. dt192 Says:
    October 20th, 2011 at 6:08 am

    Laptop with Intel i7 Q720 Quad Core @ 1.6 GHz – 364s [6m 4s]

     
  193. shangshuiw Says:
    November 6th, 2011 at 3:48 pm

    Maybe you should try seeing both sides of this issue instead of assuming that yours is the only valid opinion. Id still read it, I like the way you write. But I can see some people getting upset

     
  194. Li Tracey Says:
    November 21st, 2011 at 11:27 pm

    The things i have seen in terms of computer system memory is the fact there are specifications such as SDRAM, DDR etc, that must fit in with the technical specs of the motherboard. If the personal computer’s motherboard is very current while there are no operating system issues, modernizing the memory space literally takes under a couple of hours. It’s on the list of easiest personal computer upgrade techniques one can visualize. Thanks for discussing your ideas.

     
  195. Shaun Pugh Says:
    December 20th, 2011 at 5:47 pm

    time ./smallpt 100

    Rendering (100 spp) 100.00%
    real 0m32.854s
    user 4m15.115s
    sys 0m0.162s

    This is on an early 2011 15″ Macbook Pro, core i7 2.3 Ghz (2820QM) running under OSX 10.7.2 with 16GB RAM.

     
  196. Shaun Pugh Says:
    December 20th, 2011 at 5:50 pm

    I didn’t realise how long ago this article was written, but I wondered about the 1 minute challenge and just over a year later that doesn’t seem too hard to beat. It’s good to see that we are indeed seeing some progress with CPU technologies. :o )

     
  197. S Kenny Says:
    January 22nd, 2012 at 10:08 am

    intel i7 3930K@4.6ghz, 16gb , windows ult 64, took 51 seconds

     
  198. Misio Says:
    April 9th, 2012 at 12:16 pm

    102s
    i7 920 @3GHz
    6GB ram

     
  199. Neil Says:
    April 21st, 2012 at 11:56 am

    209 Seconds
    Asus N53SV
    Intel i5-2410M
    Win7 Pro 64bit

     
  200. Nektarios Says:
    May 13th, 2012 at 3:24 am

    I don’t understand… I ran your program on a dual Xeon E5 2687W 3.1GHz, and it scored 135 seconds. With 16 cores (32 with HT) I was expecting a lot better… Any thoughts as to why?

     
  201. Pickster Says:
    May 16th, 2012 at 4:01 pm

    Something seems up with the multi threading on this little program, that or it can be very memory bandwidth limited.

    I was going to say your numbers for the workstation seem off but decided to put my money where my mouth is.

    First I tried my C2D laptop, a T7700 running at 2.6GHz, which got 491 seconds.

    Then I tried my workstation, running two Intel E5345s at 2.45GHz. With the same tech in each and almost the same clock speed, the workstation should be almost 4x faster. In fact it finished in 374 seconds.

    The last test I tried was on my i7 2600K at 4.5GHz, which finished in 66 seconds.

    Looking at my numbers and yours for the workstations, I think it’s fair to say that memory bandwidth limited those times. My Xeon CPUs only have a small clock speed advantage over yours, not enough to make up around 100 seconds. However, the E5345s have a 1333 FSB (in fact mine is slightly overclocked to make it 1400 FSB) and the E5340s only have a 1066 FSB. Given this difference, and comparing it to the time difference, I think we can conclude both workstations are bandwidth limited on this particular program.

     
  202. Nektarios Says:
    May 22nd, 2012 at 1:54 pm

    It’s not an FSB bus issue. I say this because I ran another test, this time with two smallpts running at the same time. The result? 62/64 seconds for both! 32 cores in total, with 16 being physical cores. Any idea why this is the case?
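
One way to read these figures is as throughput (renders finished per second) rather than as time per render. The sketch below assumes the single-instance time of 135 seconds reported for the same machine earlier in the thread, and takes the slower of the two concurrent runs (64 seconds) as the finish time for both. The function name `throughput` is only illustrative.

```python
def throughput(renders, wall_seconds):
    """Renders completed per second of wall-clock time."""
    return renders / wall_seconds

single = throughput(1, 135.0)  # one smallpt instance on its own
paired = throughput(2, 64.0)   # two instances started together
gain = paired / single
print(f"two instances deliver {gain:.1f}x the throughput of one")
```

A gain well above 2x would mean that a single instance was leaving much of the machine idle rather than being held back by shared memory bandwidth, which fits the point made in the comment. It does not, on its own, identify the cause.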

     
  203. mnfisher Says:
    October 28th, 2012 at 12:46 pm

    Got 77s on an i7 2600k (homebuilt machine, Win7 64bit) – overclocked to 4.2GHz, and processor throttled back to 3.6 at about 50% complete. Just using the stock Intel cooler so the overclock will have to go!

     
  204. SEBASCHELL Says:
    November 1st, 2012 at 5:53 am

    I7-920@ 2.67Ghz 6Gb ram, Win 7 64Bit, render time : 121 sec.

     
  205. Janusz Says:
    January 14th, 2013 at 7:04 pm

    Good software, but there is a problem with dual-processor (and larger) systems.
    Core i7-980X: 79 sec. But the dual Xeon test gives the wrong result. Please fix it.

     
  206. peter Says:
    March 23rd, 2013 at 1:15 pm

    77s on a HP Compaq 8300CMT i7-3770

     
  207. Steve Says:
    April 1st, 2013 at 8:44 pm

    1868s on a FS Amilo pro V2030 laptop,1G ram, Celeron M 370 1500Mhz

    310s on a FS Celsius D2587, 2G ram, E7200 Core 2 Duo 2533Mhz

     
  208. Matt Says:
    April 29th, 2013 at 11:02 pm

    98 Seconds …
    AMD Phenom X6 1075T 3.0Ghz
    8GB RAM, Win7 64bit.

     
  209. Joseph Says:
    August 14th, 2013 at 9:37 pm

    This is a unique way to benchmark. I wish all the stats and examples were in a spreadsheet for better comparison.
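
A spreadsheet-friendly version is easy to sketch. The snippet below is a minimal illustration using Python's standard `csv` module. It covers only a handful of the results posted in this thread, and the column names and the `to_csv` helper are assumptions made for this example, not anything the article provides.

```python
import csv
import io

# (commenter, CPU as described in the comment, render time in seconds)
results = [
    ("jackinthebox", "Phenom II X4 955", 239),
    ("James", "Core i3 530", 213),
    ("John", "E4400", 745),
    ("Ironchimp", "i7 2600K @ 4.5GHz", 67),
    ("S Kenny", "i7 3930K @ 4.6GHz", 51),
    ("Matt", "Phenom X6 1075T 3.0GHz", 98),
]

def to_csv(rows):
    """Return the rows as CSV text, fastest result first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["commenter", "cpu", "seconds"])
    for row in sorted(rows, key=lambda r: r[2]):
        writer.writerow(row)
    return buffer.getvalue()

table = to_csv(results)
print(table)
```

The output can be pasted into any spreadsheet. Bear in mind that the times in this thread come from different compilers and operating systems (the Windows build from the article versus gcc builds on Linux and Mac OS X), so rows are not always directly comparable.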

     
  210. John Says:
    February 28th, 2014 at 12:25 pm

    138 seconds on dual quad core i7 2.39 GHz
    8GB ram, Win 7 64bit

     
  211. David Says:
    September 29th, 2014 at 11:32 pm

    Core i7 3770k not over clocked with 16gb ram and Windows 7 Pro 64bit 78 seconds

     
