To: Bread Upon The Water who wrote (514) 9/2/2011 6:29:50 PM From: Nadine Carroll
BOTW, did you miss the whole Climategate document leak? East Anglia, keepers of the HADCRUT global temp dataset, fudged the numbers. Most reporters concentrated on the Climategate emails, since they couldn't read the program files. The emails had juicy bits like "I used Mike's trick to hide the decline."
But I read one long Climategate file called HarryReadMe.txt, the diary of a programmer named Harry (he's been id'd, I forget his last name), and some of Harry's code files. Harry had a job which should have been straightforward: take the published global temperature dataset and bring it up to date with the new data. That should have been a matter of loading the published files from the archive, reading the instructions, running the programs again to make sure it all still worked and you got the same numbers as before, then manipulating the new data into the right format and running the programs once more to produce the new dataset.
What Harry actually found was 11,000 files with no documentation or instructions. Damn few comments in the code, even. He had to figure out from scratch how to run the programs and what the numbers in the datasets meant. He had to laboriously try to recreate previously published results by trial and error. It took him months. He would note triumphantly, 'getting closer, results are within half a degree of published results' after successful trials. Mind you, this is a dataset where tenths of a degree are regarded as significant differences.
Having no instructions on how to merge data sources smoothly, he would note 'have no idea how to do this, must just make it up' - and you would see in his code hardwired vectors designed for the express purpose of fudging the end of one data source to meet the beginning of another.
The whole thing was just massively unprofessional from a software handling point of view (I used to program and manage software projects) and unscientific in its handling of data.
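To make the "same numbers as before" step concrete, here is a rough sketch of the kind of reproducibility check I mean. It is my own illustration, not anything from Harry's files: the function names, the 0.05 degree tolerance, and the sample numbers are all made up for the example.

```python
def max_abs_diff(published, regenerated):
    """Largest absolute difference between two equal-length series."""
    if len(published) != len(regenerated):
        raise ValueError("series lengths differ")
    return max(abs(p - r) for p, r in zip(published, regenerated))


def reproduces(published, regenerated, tolerance=0.05):
    """True if every regenerated value is within `tolerance` of the published one.

    The default tolerance (in degrees C) is an assumption chosen for this
    example; a real project would have to justify its own threshold.
    """
    return max_abs_diff(published, regenerated) <= tolerance


# Made-up anomaly values in degrees C, for illustration only (not real data).
published = [0.10, 0.25, 0.31, 0.42]
close_run = [0.11, 0.24, 0.33, 0.40]   # differs by hundredths of a degree
rough_run = [0.10, 0.25, 0.31, 0.90]   # last value is off by about half a degree

print(reproduces(published, close_run))
print(reproduces(published, rough_run))
print(reproduces(published, rough_run, tolerance=0.5))
```

The first run passes. The second fails at the tight tolerance and only passes once the tolerance is loosened to half a degree, which is exactly the problem when tenths of a degree are what you are trying to measure.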