Politics : Foreign Affairs Discussion Group

To: epicure who wrote (211582)
From: neolib
Date: 1/2/2007 7:09:18 PM
 
Once you've looked at that link & graphs, the more interesting points need to be considered. First, note this sentence by Tabarrok:

Buchanan received 0.78 percent of the vote in Palm Beach County. By comparison, he received an average of 0.46 percent of the vote in the other Florida counties.

Now let me point out that Buchanan actually got 17465 votes out of a total of 5957092 votes in the state, or 0.293%. Above, Tabarrok claims 0.46%. What the heck? Both are correct. The total population mean is in fact 0.293%, but the sample mean (view each of the 67 counties as a sample, with the samples together covering the entire population exactly once) is about 0.47 (0.466 with Palm Beach left out, which is the 0.46 figure Tabarrok quotes). The sample means have a std dev of 0.32324, which means that the population mean estimated from this exhaustive sampling (usually much smaller samples are used) is fully 0.55 std devs from the true population mean. This is rather startling given the large sample size (67) and the fact that it exhaustively samples the entire 5M+ population.
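To make the two averages concrete, here is a minimal Python sketch that recomputes both from the county data listed in this post. The variable names are illustrative, and using `statistics.stdev` (the n - 1 form) for the spread is an assumption about how the 0.32324 figure was obtained.

```python
import statistics

# (Buchanan votes, total votes) per county, copied from the data in this post.
COUNTIES = [
    (39, 2410), (10, 2505), (9, 3365), (37, 3826), (23, 3964),
    (33, 4644), (29, 4666), (90, 5174), (29, 5395), (29, 5642),
    (71, 6144), (29, 6162), (30, 6233), (27, 6808), (76, 7395),
    (36, 7805), (88, 8021), (22, 8138), (73, 8154), (46, 8587),
    (65, 8673), (43, 9853), (108, 12441), (67, 12724), (38, 14727),
    (102, 16300), (120, 18318), (89, 18508), (114, 22261), (90, 23581),
    (148, 26222), (83, 27111), (47, 33878), (127, 35149), (105, 49622),
    (311, 50319), (145, 55657), (270, 57200), (186, 57353), (248, 58805),
    (229, 60746), (112, 62013), (242, 65219), (182, 66896), (267, 70680),
    (124, 77989), (263, 85729), (289, 88611), (122, 92141), (563, 102956),
    (282, 103113), (271, 110221), (502, 116648), (194, 137634), (570, 142731),
    (305, 160942), (532, 168486), (496, 183256), (305, 184377), (570, 218395),
    (652, 264636), (446, 280125), (847, 360295), (1013, 398469), (3407, 432286),
    (788, 573396), (560, 625362),
]

buchanan_votes = sum(b for b, _ in COUNTIES)
total_votes = sum(t for _, t in COUNTIES)

# Pooled share: every voter counts once (ratio of sums).
pooled_pct = 100 * buchanan_votes / total_votes

# Unweighted county mean: every county counts once (mean of ratios),
# so a county of 2,410 voters weighs as much as one of 625,362.
county_pcts = [100 * b / t for b, t in COUNTIES]
mean_pct = statistics.mean(county_pcts)

# Spread of the county shares (sample standard deviation, n - 1).
sd_pct = statistics.stdev(county_pcts)

print(f"pooled share:          {pooled_pct:.3f}%")
print(f"mean of county shares: {mean_pct:.3f}%")
print(f"std dev of shares:     {sd_pct:.5f}")
print(f"gap in std devs:       {(mean_pct - pooled_pct) / sd_pct:.2f}")
```

The gap between the two averages comes entirely from the weighting: the many small counties with high Buchanan shares pull the unweighted mean up, while they barely move the pooled figure.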

This leads to the obvious conclusion that the samples are somehow not sampling a uniform population. On a little reflection, it becomes obvious that the correct first step of normalization is not sufficient here. The 2nd plot of Tabarrok's, which makes Palm Beach look OK, is in fact deceptive itself. The problem lies in the assumption of sampling normality and in what the x-axis of the graph conveys. In both plots shown, the x-axis merely represents sample position. It is in fact the alphabetical order of the counties. Seems harmless. In fact, it is one method of making the data look random, and random is what Tabarrok wants you to see. You could of course sort the x-axis in either ascending or descending order of the %vote, and you could also single out Palm Beach and shift its position once the data was sorted, just to make it stand out. Any such sorting is equally valid, because the x-axis does not in fact convey any information here: it is simply sample number, and any ordering of sample number is just as valid as any other. Random orderings, and there are many of them, make the data look nicely random.

Instead, you need data pairs of total vote and %vote for Buchanan. If you plot %vote for Buchanan vs total county vote, the x-axis now means something. The resulting plot is well worth looking at. I provide the raw data below in comma-delimited format, so you can plot it in Excel. The first column is the actual Buchanan vote, the second is total county vote, and the third is the %Buchanan vote for each county. I've already sorted it in ascending order of county vote, but since you will be plotting the third column against the 2nd column, the order is not important.
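For anyone who would rather not use Excel, the following is a rough Python sketch of reading the comma-delimited format into (total vote, %vote) pairs ready for plotting. Only a handful of rows from this post are embedded to keep it short, and `load_points` is a hypothetical helper name, not something from the original post.

```python
import csv
import io

# A few rows in the same format as the data in this post
# (columns: Buchanan votes, total county votes, Buchanan %).
RAW = """Buchanan Votes, Total Votes, Buchanan %
39,2410,1.618257261
10,2505,0.399201597
9,3365,0.267459138
3407,432286,0.788135632
788,573396,0.137426839
560,625362,0.089548134
"""

def load_points(text):
    """Return (total_votes, buchanan_pct) pairs, one per county row."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader)  # skip the header line
    points = []
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        votes, total, pct = int(row[0]), int(row[1]), float(row[2])
        # Sanity check: the third column should equal 100 * votes / total.
        if abs(100 * votes / total - pct) > 1e-6:
            raise ValueError(f"inconsistent row: {row}")
        points.append((total, pct))
    return points

points = load_points(RAW)
for total, pct in sorted(points):
    print(f"{total:>8} {pct:.3f}")
```

The resulting pairs can be handed to any plotting tool, with total county vote on the x-axis and %vote on the y-axis. County sizes span more than two orders of magnitude, so a logarithmic x-axis may make the small counties easier to tell apart.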

Once you do this, toss out Palm Beach (it is obvious) and compute some stats on the remaining 10 largest counties. I get a mean of 0.21347 and a std dev of 0.071175. Palm Beach clocks in at 0.788136, which is 8.07 std devs away from the mean. Please note that a linear regression, which would show the slightly negative slope with size, would make this even worse (IIRC close to 10 std devs). Then look at where all the samples with a higher %vote than Palm Beach lie. They are all clustered down near zero on the x-axis. LOL! This is one chart that I kept on my wall for a couple of years to remind myself how screwy “statistics” can be when one does not look at the details. Enjoy!
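The comparison against the other large counties can be sketched in Python as follows. This assumes Palm Beach is the row with 432286 total votes (the one showing 0.788136%), and it uses the sample standard deviation (n - 1), which appears to be the form behind the 0.071175 figure. The regression-based figure is not attempted here.

```python
import statistics

# (Buchanan votes, total votes) per county, copied from the data in this post.
COUNTIES = [
    (39, 2410), (10, 2505), (9, 3365), (37, 3826), (23, 3964),
    (33, 4644), (29, 4666), (90, 5174), (29, 5395), (29, 5642),
    (71, 6144), (29, 6162), (30, 6233), (27, 6808), (76, 7395),
    (36, 7805), (88, 8021), (22, 8138), (73, 8154), (46, 8587),
    (65, 8673), (43, 9853), (108, 12441), (67, 12724), (38, 14727),
    (102, 16300), (120, 18318), (89, 18508), (114, 22261), (90, 23581),
    (148, 26222), (83, 27111), (47, 33878), (127, 35149), (105, 49622),
    (311, 50319), (145, 55657), (270, 57200), (186, 57353), (248, 58805),
    (229, 60746), (112, 62013), (242, 65219), (182, 66896), (267, 70680),
    (124, 77989), (263, 85729), (289, 88611), (122, 92141), (563, 102956),
    (282, 103113), (271, 110221), (502, 116648), (194, 137634), (570, 142731),
    (305, 160942), (532, 168486), (496, 183256), (305, 184377), (570, 218395),
    (652, 264636), (446, 280125), (847, 360295), (1013, 398469), (3407, 432286),
    (788, 573396), (560, 625362),
]
PALM_BEACH_TOTAL = 432286  # assumption: the row with the 0.788136% share

def pct(county):
    votes, total = county
    return 100 * votes / total

palm_beach = next(c for c in COUNTIES if c[1] == PALM_BEACH_TOTAL)
pb_pct = pct(palm_beach)

# The 10 largest counties by total vote, once Palm Beach is set aside.
largest = sorted(COUNTIES, key=lambda c: c[1], reverse=True)[:11]
others = [pct(c) for c in largest if c != palm_beach]

mean_pct = statistics.mean(others)
sd_pct = statistics.stdev(others)  # sample standard deviation (n - 1)
z = (pb_pct - mean_pct) / sd_pct
print(f"mean={mean_pct:.5f} sd={sd_pct:.6f} Palm Beach={pb_pct:.6f} z={z:.2f}")

# Counties with a higher Buchanan share than Palm Beach, and how big they are.
higher = [c for c in COUNTIES if pct(c) > pb_pct]
print(len(higher), "counties exceed Palm Beach; largest has",
      max(t for _, t in higher), "total votes")
```

Run against the data in this post, the second line of output shows that every county with a higher share than Palm Beach is tiny by comparison, which is the clustering near zero on the x-axis described above.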

The data:

Buchanan Votes, Total Votes, Buchanan %
39,2410,1.618257261
10,2505,0.399201597
9,3365,0.267459138
37,3826,0.967067433
23,3964,0.580221998
33,4644,0.710594315
29,4666,0.62151736
90,5174,1.739466564
29,5395,0.537534754
29,5642,0.514002127
71,6144,1.155598958
29,6162,0.47062642
30,6233,0.481309161
27,6808,0.396592244
76,7395,1.027721433
36,7805,0.461242793
88,8021,1.09712006
22,8138,0.270336692
73,8154,0.895266127
46,8587,0.53569349
65,8673,0.749452323
43,9853,0.436415305
108,12441,0.86809742
67,12724,0.526563974
38,14727,0.25802947
102,16300,0.625766871
120,18318,0.655093351
89,18508,0.480873136
114,22261,0.512106374
90,23581,0.381663203
148,26222,0.564411563
83,27111,0.306148796
47,33878,0.138733101
127,35149,0.361318956
105,49622,0.211599694
311,50319,0.618056798
145,55657,0.260524283
270,57200,0.472027972
186,57353,0.32430736
248,58805,0.421732846
229,60746,0.376979554
112,62013,0.180607292
242,65219,0.371057514
182,66896,0.272064099
267,70680,0.377758913
124,77989,0.158996782
263,85729,0.306780669
289,88611,0.32614461
122,92141,0.132405769
563,102956,0.546835541
282,103113,0.273486369
271,110221,0.245869662
502,116648,0.430354571
194,137634,0.140953543
570,142731,0.399352628
305,160942,0.189509264
532,168486,0.315753238
496,183256,0.270659624
305,184377,0.165421934
570,218395,0.260994986
652,264636,0.246376154
446,280125,0.159214636
847,360295,0.235085139
1013,398469,0.254223039
3407,432286,0.788135632
788,573396,0.137426839
560,625362,0.089548134