To: neolib who wrote (2459) 7/18/2007 12:35:29 PM
From: KLP

How to Understand Statistics....bbc.co.uk

Lies, damn lies and statistics. - Mark Twain

Things to Look Out For

47.3% of all statistics are made up on the spot. - Steven Wright

Where did the data come from? Who ran the survey? Do they have an ulterior motive for having the result go one way?

How was the data collected? What questions were asked? How did they ask them? Who was asked?

Be wary of comparisons. Two things happening at the same time are not necessarily related, though statistics can be used to suggest that they are. This trick is used a lot by politicians wanting to show that a new policy is working.

Be aware of numbers taken out of context. This is called 'cherry-picking': the analysis concentrates only on the data that supports a foregone conclusion and ignores everything else.

A survey on the effects of passive smoking, sponsored by a major tobacco manufacturer, is hardly likely to be impartial, but on the other hand neither is one carried out by a medical firm with a vested interest in promoting health products.

If a survey on road accidents claims that cars with brand X tyres were less likely to have an accident, check who took part. The brand X tyres may be new, and only fitted to new cars, which are less likely to be in accidents anyway.

Check the area covered by a survey linking nuclear power plants to cancer. The survey may have excluded sufferers who fall outside a certain area, or have excluded perfectly healthy people living inside the area.

Do not be fooled by graphs. The scale can be manipulated to make a perfectly harmless bar chart look worrying.

Be wary of the use of colours. A certain chewing gum company wanted to show that chewing gum increases saliva. The chart showed the increase in danger to the gums after eating in red and safe time after chewing in blue.
However, the chart showed that the act of chewing would have to go on for 30 minutes to take the line out of the danger zone. The curve was just coloured in a clever way to make it look like the effect was faster.

Perhaps the most important thing to check for is sample size and margin of error. It is often the case that with small samples, a change in one sample or one data item can completely change the results. Small samples can sometimes be the only way to get the analysis done, but generally the bigger the sample size, the more accurate the results are and the less likely a single error in sampling will affect the analysis. For example, people will go on about how 95% of children passed their exams at such a school and 92% of children passed their exams at a different one, but the sample sizes are not actually big enough for the difference to be statistically significant: in a year group of 100, a difference of three percentage points is a difference of three students, which is too few to mean anything.

The Problem with Statistics

The main problem with statistics is that people like favourable numbers to back up a decision. For example, when choosing an Internet provider, most people will choose the one with the most customers. But that statistic does not tell you other useful things like what their customer turnover might be, what their connection reliability is, what the mean time taken to answer a technical fault call is, and so on. People will simply make the assumption that a lot of customers means that the company should be all right. Generally this is true, but there are companies which work by having a large body of customers, providing bad service and making it hard for people to cancel their agreements. Just because a company is the most popular does not automatically mean it is the best.

Common sense can cloud statistical results. For instance, a technology firm discovered that 40% of all sick days were taken on a Friday or a Monday.
They immediately clamped down on sick leave before they realised their mistake. Forty per cent represents two days out of a five-day working week and therefore is a normal spread, rather than a reflection of swathes of feckless opportunists trying to extend their weekends.

Many probability calculations rely on the events involved being independent of each other, such as dice rolls or coin flips. If they are not independent, the simple maths stops working and the answers stop making sense. However, a lot of statistics are worked out at a distance from the core events, so working out if the results are valid can be next to impossible. A related mistake is made by the gambler who thinks his luck must change soon because he couldn't continue to have bad luck all night. This is wrong; each roll is independent of the last, and there's nothing to say the dice should start rolling your way based on previous behaviour.

Lots more at link: http://www.bbc.co.uk/dna/h2g2/A1091350
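
Two of the examples above can be checked with a few lines of arithmetic. What follows is only a rough sketch in Python, not anything from the BBC article. It assumes both schools have a year group of 100 (the article gives that figure for only one), uses a standard two-proportion z-test with a normal approximation (which is rough when pass rates are this high), and assumes sick days are equally likely on each working day. The function names are made up for illustration.

```python
import math


def two_proportion_z(passes_a, n_a, passes_b, n_b):
    """Return (z, two-sided p-value) for the gap between two pass rates."""
    p_a = passes_a / n_a
    p_b = passes_b / n_b
    # Pooled pass rate, assuming the two schools are really the same.
    pooled = (passes_a + passes_b) / (n_a + n_b)
    # Standard error of the difference under that assumption.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def expected_share(days_of_interest, working_days=5):
    """Share of sick days expected on the chosen days if every working
    day is equally likely (an assumption, not a measured fact)."""
    return days_of_interest / working_days


# School example: 95 of 100 passed versus 92 of 100 (second group size assumed).
z, p = two_proportion_z(95, 100, 92, 100)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")

# Sick-day example: Monday plus Friday out of a five-day week.
print(f"expected share on Mon/Fri = {expected_share(2):.0%}")
```

Under these assumptions the p-value comes out far above the usual 0.05 cut-off, which matches the article's point that three students out of 100 prove nothing, and the expected Monday/Friday share is exactly the 40% the firm was alarmed by.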