<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PC Pro blog &#187; statistics</title>
	<atom:link href="http://www.pcpro.co.uk/blogs/tag/statistics/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.pcpro.co.uk/blogs</link>
	<description>Blogging in the real world</description>
	<lastBuildDate>Wed, 08 Feb 2012 16:54:13 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>The top fallacy in statistics: sample size</title>
		<link>http://www.pcpro.co.uk/blogs/2010/10/07/the-top-fallacy-in-statistics-sample-size/</link>
		<comments>http://www.pcpro.co.uk/blogs/2010/10/07/the-top-fallacy-in-statistics-sample-size/#comments</comments>
		<pubDate>Thu, 07 Oct 2010 12:51:00 +0000</pubDate>
		<dc:creator>Tim Danton</dc:creator>
				<category><![CDATA[Rant]]></category>
		<category><![CDATA[Ballmer]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://www.pcpro.co.uk/blogs/2010/10/07/the-top-fallacy-in-statistics-sample-size/</guid>
		<description><![CDATA[ In my foolishness, I signed up for a ten-week module on statistics whilst studying for my Maths degree. And I hated it with a vengeance. It soon became crystal clear that I found 99 out of 100 topics exceptionally dull.
However, with the spiralling number of surveys appearing in the media with each passing year, [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.pcpro.co.uk/blogs/wp-content/uploads/2010/10/Statisticsandsamplesizes.jpg"><img style="border-right-width: 0px; margin: 0px 0px 5px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="pen showing diagram" src="http://www.pcpro.co.uk/blogs/wp-content/uploads/2010/10/Statisticsandsamplesizes_thumb.jpg" border="0" alt="pen showing diagram" width="464" height="348" /></a> In my foolishness, I signed up for a ten-week module on statistics whilst studying for my Maths degree. And I hated it with a vengeance. It soon became crystal clear that I found 99 out of 100 topics exceptionally dull.</p>
<p>However, with the spiralling number of surveys appearing in the media with each passing year, having a certain amount of knowledge about statistics has come to my aid on numerous occasions. Because it turns out that even intelligent people don’t really understand statistics at all.</p>
<p>Here, I’d simply like to address the number one, burning misunderstanding people have about statistics: <em>the sample size has to be similar in number to the total population in a study</em>.</p>
<p><span id="more-25981"></span></p>
<p>No. Honestly, no. I realise it’s not wholly obvious, but no.</p>
<p>Take the recent example of the survey of Microsoft employees via Glassdoor.com, which showed that <a href="http://www.pcpro.co.uk/blogs/2010/10/06/what-microsoft-employees-think-of-steve-ballmer/">around half of them were unhappy with Steve Ballmer’s performance</a>.</p>
<p>Because the sample size – that is, the number of people surveyed – was around 1,000, and the overall number of Microsoft employees is around 80,000, one leading Microsoft blogger sent a tweet saying: “Those surveyed for that report equates to about 0.625% of Microsoft employees&#8230; again, hardly representative at all. Seems very flawed.”</p>
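<p>The 0.625% stems from the roughly 500 Microsoft employees who weren’t happy with Ballmer (500 out of 80,000); the full sample of around 1,000 people is about 1.25% of the workforce.</p>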
<p>But it’s not flawed at all. To explain why, I’ll analyse that survey backwards.</p>
<p>We’ll assume that Microsoft has precisely 80,000 employees, and that precisely 50% of them don’t believe Ballmer is doing a good job.</p>
<p>If we repeatedly surveyed 383 people (randomly chosen each time), then 19 times out of 20 – that is, 95% of the time – we’d get a result showing that between 45% and 55% of them didn’t believe Ballmer was doing a good job.</p>
<p>To switch that into statistical speak, that’s a 95% confidence level with a 5% margin of error (that is, 50% plus or minus 5%).</p>
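<p>For anyone who would rather see this than take it on trust, the repeated-survey idea can be simulated. What follows is a minimal sketch in Python, not the method used to produce the figures in this post: the function name, the number of simulated surveys and the random seed are illustrative assumptions, and the workforce is modelled as exactly 80,000 people of whom exactly half are unhappy, as assumed above.</p>

```python
import random

POPULATION = 80_000   # assumed exact headcount, as in the text
UNHAPPY = 40_000      # assumed: exactly 50% are unhappy with Ballmer


def share_within_margin(sample_size, trials=2_000, margin=0.05, seed=1):
    """Fraction of simulated surveys whose result lands within `margin` of 50%."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Employees numbered 0..UNHAPPY-1 are the unhappy ones.
        # rng.sample draws without replacement, like a real survey.
        respondents = rng.sample(range(POPULATION), sample_size)
        unhappy = sum(1 for person in respondents if person < UNHAPPY)
        if abs(unhappy / sample_size - 0.5) <= margin:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    share = share_within_margin(383)
    print(f"{share:.1%} of simulated surveys fell between 45% and 55%")
```

<p>The printed share should sit in the region of 95%. It will wobble a little with the seed and the number of trials, because the simulation is itself only a sample.</p>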
<p>I know what you’re thinking: 95% confidence level isn’t enough. So let’s go for 99%. Assuming the same conditions – 80,000 employees, 50% unhappy – then we’d need a sample size of 659 people.</p>
<p>To put that into plain English, with a sample size of 659 we’d expect 99 out of every 100 surveys to show between 45% and 55% being unhappy with Ballmer’s performance.</p>
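<p>The sample sizes of 383 and 659 can be reproduced with a few lines of Python. This is a sketch under stated assumptions rather than the exact calculation behind the figures above: it uses the textbook normal approximation for a proportion together with a finite population correction, and the function and parameter names are made up for illustration.</p>

```python
import math
from statistics import NormalDist


def required_sample_size(population, confidence, margin, proportion=0.5):
    """Smallest sample giving the requested margin of error at the given confidence.

    Assumes the normal approximation for a proportion, then shrinks the
    result with a finite population correction.
    """
    # Two-sided z-score, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size for an effectively infinite population.
    n_infinite = z * z * proportion * (1 - proportion) / (margin * margin)
    # Finite population correction for a workforce of known size.
    n_finite = n_infinite / (1 + (n_infinite - 1) / population)
    return math.ceil(n_finite)


if __name__ == "__main__":
    print(required_sample_size(80_000, 0.95, 0.05))
    print(required_sample_size(80_000, 0.99, 0.05))
```

<p>Under those assumptions the two calls return 383 and 659, matching the figures above. Note how little the 80,000 matters: push the population up to 80 million and the answers barely move.</p>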
<blockquote><p>9,985 out of 10,000 surveys would show 45% to 55% of Microsoft employees were Ballmer-unhappy</p></blockquote>
<p>What happens if we up the sample size to 1,000? The confidence level increases to 99.85%, so 9,985 out of 10,000 surveys would show 45% to 55% of Microsoft employees were anti-Ballmer.</p>
<p>In fact, <a href="http://www.glassdoor.com/Reviews/Microsoft-Reviews-E1651.htm" target="_blank">the Glassdoor.com ratings</a> are based on a sample size of 1,119, giving a confidence level of 99.92%. Pretty strong.</p>
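<p>Going the other way, from a sample size to a confidence level, is just as short. Again this is a hedged sketch: it assumes the same normal approximation with a finite population correction, a true split of exactly 50% and a 5% margin of error, and the function name is illustrative rather than anything from a standard library.</p>

```python
import math
from statistics import NormalDist


def confidence_level(population, sample_size, margin=0.05, proportion=0.5):
    """Chance that a random survey lands within `margin` of the true proportion."""
    # Finite population correction: sampling a known, limited workforce.
    fpc = (population - sample_size) / (population - 1)
    # Standard error of the surveyed proportion.
    std_error = math.sqrt(proportion * (1 - proportion) / sample_size * fpc)
    # Margin expressed in standard errors, then converted to a two-sided probability.
    z = margin / std_error
    return 2 * NormalDist().cdf(z) - 1


if __name__ == "__main__":
    for n in (1_000, 1_119):
        print(n, f"{confidence_level(80_000, n):.2%}")
```

<p>The results should land very close to the 99.85% and 99.92% quoted above; small differences in the last decimal place are possible depending on rounding and the exact formula used.</p>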
<p>One final point: the biggest problem with any survey is finding a truly random sample. Glassdoor.com doesn’t appear to vet its survey respondents (other than for insults, trade secrets or defamation), so you or I could contribute our own reviews should we so wish.</p>
<p>You could also argue that, as a recruitment-orientated website, it will be biased towards current Microsoft employees looking to leave, or ex-employees with a grudge.</p>
<p>But neither of those potential flaws would explain why <a href="http://www.glassdoor.com/Reviews/Oracle-Reviews-E1737.htm" target="_blank">Oracle’s CEO</a> gets such a high approval rating when his company rating is actually <em>lower</em> than Microsoft’s. As such, we can have reasonable confidence in Glassdoor.com’s sample-gathering techniques, while Steve Ballmer should think about how he can change his employees’ minds.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pcpro.co.uk/blogs/2010/10/07/the-top-fallacy-in-statistics-sample-size/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Stretching the truth by snipping the figures</title>
		<link>http://www.pcpro.co.uk/blogs/2009/04/15/stretching-the-truth-by-snipping-the-figures/</link>
		<comments>http://www.pcpro.co.uk/blogs/2009/04/15/stretching-the-truth-by-snipping-the-figures/#comments</comments>
		<pubDate>Wed, 15 Apr 2009 10:33:02 +0000</pubDate>
		<dc:creator>Darien Graham-Smith</dc:creator>
				<category><![CDATA[Hardware]]></category>
		<category><![CDATA[Rant]]></category>
		<category><![CDATA[damned lies]]></category>
		<category><![CDATA[figures]]></category>
		<category><![CDATA[graphs]]></category>
		<category><![CDATA[lies]]></category>
		<category><![CDATA[pr]]></category>
		<category><![CDATA[spin]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://www.pcpro.co.uk/blogs/?p=5416</guid>
		<description><![CDATA[Here’s something that winds me up. This is a graph that was published to accompany a high-profile hardware launch last year. I won’t name names, but you can probably guess who produced it and what they were trying to show:

As you can see, across various tests the red bar is three, four, even six times [...]]]></description>
			<content:encoded><![CDATA[<p>Here’s something that winds me up. This is a graph that was published to accompany a high-profile hardware launch last year. I won’t name names, but you can probably guess who produced it and what they were trying to show:</p>
<p><a href="http://www.pcpro.co.uk/blogs/wp-content/uploads/2009/04/slide1.png"><img class="aligncenter size-full wp-image-5417" src="http://www.pcpro.co.uk/blogs/wp-content/uploads/2009/04/slide1s.png" alt="" /></a></p>
<p>As you can see, across various tests the red bar is three, four, even six times as tall as the green one. But hold on — because that’s <em>not</em> an accurate reflection of relative performance.<span id="more-5416"></span></p>
<p>You’ve probably already spotted the reason why: the Y axis doesn’t start at zero! Instead, it originates at a rather arbitrary 0.8, greatly exaggerating the difference in scale between the green and red bars. A more neutral representation of the same figures would see the red team still win, but by a decidedly less jaw-dropping margin:</p>
<p><a href="http://www.pcpro.co.uk/blogs/wp-content/uploads/2009/04/slide2.png"><img class="aligncenter size-full wp-image-5417" src="http://www.pcpro.co.uk/blogs/wp-content/uploads/2009/04/slide2s.png" alt="" /></a></p>
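<p>The arithmetic behind the trick is simple enough to sketch in Python. The two scores below are hypothetical, chosen only to illustrate the effect and not read off the graph in question; only the 0.8 axis origin comes from the graph itself, and the function name is made up for the example.</p>

```python
def bar_height_ratio(score_a, score_b, axis_start=0.0):
    """How many times taller bar A is drawn than bar B when the axis starts at axis_start."""
    return (score_a - axis_start) / (score_b - axis_start)


# Hypothetical scores, NOT taken from the vendor's graph.
green, red = 0.9, 1.2

if __name__ == "__main__":
    print("Axis from 0.8:", round(bar_height_ratio(red, green, axis_start=0.8), 2))
    print("Axis from 0:  ", round(bar_height_ratio(red, green), 2))
```

<p>With these made-up numbers the red bar is drawn about four times as tall as the green one, even though the underlying score is only about a third higher.</p>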
<p>Of course, skipping over part of an axis can sometimes be justified. If you’re charting small changes in large numbers, it makes sense to zoom in a little, just for the sake of clarity. But here the graph isn’t intended to illustrate a trend: it’s supposed to convey, at a glance, just how much bigger one set of numbers is than another. And that’s precisely what it doesn’t do.</p>
<p>Don’t think I’m picking on any one company here: this type of spin is part and parcel of marketing, in the IT business and beyond. And to be honest, I rather enjoy the mental work-out of decoding official PR messages to get to the truth. It just irks me that they think we’re that gullible.</p>
<p>What’s the most shameless marketing claim you’ve come across?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pcpro.co.uk/blogs/2009/04/15/stretching-the-truth-by-snipping-the-figures/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

