Politics : Foreign Affairs Discussion Group


To: epicure who wrote (211582) 1/2/2007 5:52:24 PM
From: neolib
 
After a bit of searching I found the links I needed. First go see this:

mail-archive.com@gmu.edu/msg00569.html

One Alexander T. Tabarrok, Ph.D. has this to say:

But Palm Beach County is an unusually large county.
"Buchanan received a lot of votes in Palm Beach County because there
are a lot of voters in Palm Beach County," says economist Dr.
Alexander T. Tabarrok, research director for The Independent
Institute. When votes cast for Buchanan are calculated as a
percentage of votes cast for all presidential candidates, the results
in Palm Beach County do not appear unusual.

"The percentage of votes received by Pat Buchanan from voters
in Palm Beach County is consistent with his overall performance in
Florida," says Tabarrok.

Buchanan received 0.78 percent of the vote in Palm Beach
County. By comparison, he received an average of 0.46 percent of the
vote in the other Florida counties. (See accompanying charts.)
Although Buchanan received a larger share in Palm Beach County than
in the average Florida county he performed even better in some other
counties, such as Calhoun, where voting errors are not alleged to
have occurred. Buchanan's Palm Beach share of the vote did not depart
significantly from the average.

Some people may have mistakenly marked their ballots for
Buchanan in Palm Beach County as in other counties but there is no
evidence of this in the voting data. "The impression given by Salon,
CNN and others that Buchanan's vote in Palm Beach County was
unusually large is a statistical fraud. I am shocked and concerned
that reputable news organizations would present data in such a
misleading and naïve manner especially given the importance of clear
thinking at this time", said Tabarrok.



Look at the two plots shown in the link above. The first one, which was shown far and wide by Dems, does indeed commit the sin noted above: failure to normalize. Good for Mr Tabarrok. He then produces the second graph, normalized by total county vote, in which Palm Beach looks a good deal more normal.

However, stare at the two graphs, overlay them in particular, and ask yourself what you see. Then think about this quote from Mr Tabarrok:

But Palm Beach County is an unusually large county. "Buchanan received a lot of votes in Palm Beach County because there are a lot of voters in Palm Beach County,"

What is going on?



To: epicure who wrote (211582) 1/2/2007 7:09:18 PM
From: neolib
 
Once you've looked at that link & graphs, the more interesting points need to be considered. First, note this sentence by Tabarrok:

Buchanan received 0.78 percent of the vote in Palm Beach County. By comparison, he received an average of 0.46 percent of the vote in the other Florida counties.

Now let me point out that Buchanan actually got 17465 votes out of a total of 5957092 votes in the state, or 0.293%. Above, Tabarrok claims 0.46%. What the heck? Both are correct. The total population mean is in fact 0.293%, but the sample mean (view each of the 67 counties as a sample, the samples together covering the entire population exactly once) is about 0.47 across all 67 counties, or 0.46 with Palm Beach left out, which is Tabarrok's figure. The sample means have a std dev of 0.32324, which means that the population mean estimated from this exhaustive sampling (usually much smaller samples are used) is fully 0.55 std devs from the true population mean. This is rather startling given the large number of samples (67) and the fact that together they exhaustively cover the entire 5M+ population.
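To see how both figures can be right at once, here is a minimal Python sketch. The statewide totals are the ones quoted above. The four (Buchanan votes, total votes) rows are only the two smallest and two largest counties from the data further down, picked for illustration, and the function names are made up for this example.

```python
from statistics import mean

def pooled_pct(rows):
    """Percent computed from summed votes: every voter counts once."""
    return 100.0 * sum(b for b, _ in rows) / sum(t for _, t in rows)

def mean_of_pcts(rows):
    """Unweighted mean of per-county percents: every county counts once."""
    return mean(100.0 * b / t for b, t in rows)

# Statewide figures quoted in the post.
STATE_BUCHANAN, STATE_TOTAL = 17465, 5957092
state_pct = 100.0 * STATE_BUCHANAN / STATE_TOTAL

# Illustrative subset only: the two smallest and two largest counties.
subset = [(39, 2410), (10, 2505), (788, 573396), (560, 625362)]

if __name__ == "__main__":
    print(f"statewide:        {state_pct:.3f}%")
    print(f"subset pooled:    {pooled_pct(subset):.3f}%")
    print(f"subset mean of %: {mean_of_pcts(subset):.3f}%")
```

The two functions only agree when the per-county percent does not depend on county size. Here the tiny counties have the high percentages, so the unweighted mean gets pulled well above the pooled figure.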

This leads to the obvious conclusion that the samples are somehow not sampling a uniform population. On a little reflection, it becomes obvious that the correct first step of normalization is not sufficient here. Tabarrok's second plot, which makes Palm Beach look OK, is in fact deceptive itself. The problem lies in the assumption of sampling normalcy and in what the x-axis of the graph conveys. In both plots, the x-axis merely represents sample position: it is the alphabetical order of the counties. Seems harmless. In fact, it is one method of making the data look random, and random is what Tabarrok wants you to see. You could of course sort the x-axis in ascending or descending order of %vote, and you could even single out Palm Beach and shift its position once the data was sorted, just to make it stand out. Any such ordering is equally valid, because the x-axis does not convey any information here. It is simply sample number, and any ordering of sample numbers is as valid as any other. An arbitrary ordering, and there are many of them, makes the data look nicely random.

Instead, you need data pairs of total vote and %vote for Buchanan. If you plot %vote for Buchanan vs total county vote, the x-axis now means something. The resulting plot is well worth looking at. I provide the raw data below in comma-delimited format, so you can plot it in Excel. The first column is the actual Buchanan vote, the second is the total county vote, and the third is the %Buchanan vote for each county. I've already sorted the rows in ascending order of county vote, but since you will be plotting the third column against the second, the row order is not important.
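If you would rather not use Excel, a rough Python sketch of the same idea follows. It is only a stand-in for a real scatter plot: it parses the comma-delimited rows, recomputes the third column from the first two as a sanity check, and prints one bar per county in order of size. The `SAMPLE` text is a handful of rows copied from the data in this post, not the full table, and `load_pairs` and `text_plot` are names invented for this example.

```python
import csv
import io

# A few rows copied from the data in this post (not the full table);
# columns: Buchanan votes, total votes, Buchanan %.
SAMPLE = """\
39,2410,1.618257261
90,5174,1.739466564
108,12441,0.86809742
311,50319,0.618056798
563,102956,0.546835541
1013,398469,0.254223039
3407,432286,0.788135632
560,625362,0.089548134
"""

def load_pairs(text):
    """Return (total_votes, buchanan_pct) pairs sorted by total votes."""
    pairs = []
    for votes, total, pct in csv.reader(io.StringIO(text)):
        recomputed = 100.0 * int(votes) / int(total)
        # Sanity check: the third column should equal votes / total.
        if abs(recomputed - float(pct)) > 1e-6:
            raise ValueError(f"percent mismatch in row {votes},{total},{pct}")
        pairs.append((int(total), recomputed))
    return sorted(pairs)

def text_plot(pairs, width=50):
    """Crude text chart: one bar per county, ordered by county size."""
    top = max(pct for _, pct in pairs)
    lines = []
    for total, pct in pairs:
        bar = "#" * round(width * pct / top)
        lines.append(f"{total:>7} {pct:6.3f}% {bar}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(text_plot(load_pairs(SAMPLE)))
```

Because the rows are ordered by total vote, the bars shrink as you read down, and the 432286 row (Palm Beach) is the one that breaks the pattern.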

Once you do this, toss out Palm Beach (it is obvious) and compute some stats on the 10 largest remaining counties. I get a mean of 0.21347 and a std dev of 0.071175. Palm Beach clocks in at 0.788136, which is 8.07 std devs away from the mean. Please note that a linear regression, which would show the slightly negative slope with size, would make this even worse (IIRC close to 10 std devs). Then look at where all the samples that are > Palm Beach lie. They are all clustered down near zero on the x-axis. LOL! This is one chart that I kept on my wall for a couple of years to remind myself how screwy “statistics” can be when one does not look at the details. Enjoy!
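For anyone who wants to check the arithmetic, here is a small Python sketch of the comparison. It assumes the sample std dev (n - 1 in the denominator) is what was used, and it takes the ten percentages from the rows with the largest total votes other than Palm Beach in the data in this post. `z_score` is just a helper name for this example.

```python
from statistics import mean, stdev

# Buchanan % for the 10 largest counties other than Palm Beach,
# copied from the last rows of the data in this post.
OTHERS = [
    0.315753238, 0.270659624, 0.165421934, 0.260994986, 0.246376154,
    0.159214636, 0.235085139, 0.254223039, 0.137426839, 0.089548134,
]
PALM_BEACH = 0.788135632

def z_score(value, sample):
    """Distance of value from the sample mean, in sample std devs (n - 1)."""
    return (value - mean(sample)) / stdev(sample)

if __name__ == "__main__":
    print(f"mean    = {mean(OTHERS):.5f}")
    print(f"std dev = {stdev(OTHERS):.6f}")
    print(f"z       = {z_score(PALM_BEACH, OTHERS):.2f}")
```

This uses a flat mean as the baseline. It does not attempt the linear regression mentioned above.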

The data:

Buchanan Votes, Total Votes, Buchanan %
39,2410,1.618257261
10,2505,0.399201597
9,3365,0.267459138
37,3826,0.967067433
23,3964,0.580221998
33,4644,0.710594315
29,4666,0.62151736
90,5174,1.739466564
29,5395,0.537534754
29,5642,0.514002127
71,6144,1.155598958
29,6162,0.47062642
30,6233,0.481309161
27,6808,0.396592244
76,7395,1.027721433
36,7805,0.461242793
88,8021,1.09712006
22,8138,0.270336692
73,8154,0.895266127
46,8587,0.53569349
65,8673,0.749452323
43,9853,0.436415305
108,12441,0.86809742
67,12724,0.526563974
38,14727,0.25802947
102,16300,0.625766871
120,18318,0.655093351
89,18508,0.480873136
114,22261,0.512106374
90,23581,0.381663203
148,26222,0.564411563
83,27111,0.306148796
47,33878,0.138733101
127,35149,0.361318956
105,49622,0.211599694
311,50319,0.618056798
145,55657,0.260524283
270,57200,0.472027972
186,57353,0.32430736
248,58805,0.421732846
229,60746,0.376979554
112,62013,0.180607292
242,65219,0.371057514
182,66896,0.272064099
267,70680,0.377758913
124,77989,0.158996782
263,85729,0.306780669
289,88611,0.32614461
122,92141,0.132405769
563,102956,0.546835541
282,103113,0.273486369
271,110221,0.245869662
502,116648,0.430354571
194,137634,0.140953543
570,142731,0.399352628
305,160942,0.189509264
532,168486,0.315753238
496,183256,0.270659624
305,184377,0.165421934
570,218395,0.260994986
652,264636,0.246376154
446,280125,0.159214636
847,360295,0.235085139
1013,398469,0.254223039
3407,432286,0.788135632
788,573396,0.137426839
560,625362,0.089548134