Politics : Politics for Pros- moderated


To: greenspirit who wrote (468681), 1/30/2012 8:35:44 PM
From: FJB
 
Yahoo data scientist: It’s Romney-Christie or Gingrich-Rubio

By Derrick Harris, Jan. 30, 2012, 4:14pm PT

gigaom.com

Sen. Marco Rubio (R-FL)

According to a predictive analysis experiment by a Yahoo data scientist, U.S. voters can expect to see either a Mitt Romney-Chris Christie or a Newt Gingrich-Marco Rubio ticket to face off against Obama-Biden in this year’s presidential election. The experiment, which author David Pennock explained on Yahoo’s The Signal blog Monday morning, highlights both the strengths and weaknesses of using web data to predict human behavior.

The point of Pennock’s experiment is to determine likely vice-presidential candidates based on what the web is saying. He found there’s a 25 percent chance Romney would pick Christie, currently the governor of New Jersey, whereas there’s a 30 percent chance Gingrich would choose Florida senator Rubio. Interestingly, although Romney and Rubio are anti-correlated (i.e., Rubio’s chances of being VP go down as Romney’s chances of being president rise, and vice versa), Rubio is so popular there’s still a 22 percent chance Romney would choose him. Christie, on the other hand, sees his chances drop to a mere 5 percent if Gingrich wins the Republican nomination.
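To make the arithmetic behind those figures concrete, here is a minimal Python sketch, not Pennock's actual method. It treats the four percentages quoted above as conditional probabilities P(VP | nominee) and combines them with nomination odds to get an overall chance for each VP pick. The nomination odds (80/20) are purely hypothetical placeholders, as is the simplification that only Romney or Gingrich can win the nomination; the function and variable names are illustrative as well.

```python
# Conditional VP probabilities quoted in the article: P(VP | nominee).
P_VP_GIVEN_NOMINEE = {
    "Romney": {"Christie": 0.25, "Rubio": 0.22},
    "Gingrich": {"Christie": 0.05, "Rubio": 0.30},
}

# HYPOTHETICAL nomination probabilities, NOT from the article.
# Assumes, for illustration only, a two-candidate race.
P_NOMINEE = {"Romney": 0.80, "Gingrich": 0.20}


def marginal_vp_probability(vp, p_nominee=P_NOMINEE):
    """Law of total probability: P(vp) = sum over n of P(vp | n) * P(n)."""
    return sum(p * P_VP_GIVEN_NOMINEE[n][vp] for n, p in p_nominee.items())


def most_likely_vp(nominee):
    """Return the VP name with the highest conditional probability."""
    picks = P_VP_GIVEN_NOMINEE[nominee]
    return max(picks, key=picks.get)


if __name__ == "__main__":
    for nominee in P_VP_GIVEN_NOMINEE:
        print(nominee, "->", most_likely_vp(nominee))
    for vp in ("Christie", "Rubio"):
        print(vp, round(marginal_vp_probability(vp), 3))
```

Under these made-up nomination odds, Rubio's overall chance comes out slightly ahead of Christie's even though Christie is the most likely pick for the favored nominee, which is one way to read the "anti-correlated but still popular" observation. Different nomination odds would shift the result.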

Essentially, Pennock is using data from prediction services Intrade and PredictWise on who’s the most likely VP, and extrapolating from there to determine who might get the nod if any given candidate wins the nomination. Statistically, Pennock’s conclusions are probably accurate, but he does make sure to note they’re just the result of “a statistical correspondence, and an extrapolated one at that, not a proven cause-effect relationship.”

For example, Pennock claims the results “are based solely on data unaided–and untainted–by political intuition,” but that’s not necessarily the case. Depending on what data sources he, Intrade and PredictWise are using, political intuition could play a major role in who’s correlated with whom. If my gut, no matter how uninformed, tells me Marco Rubio would be a great vice-presidential candidate and I write it or tweet it, I’ve likely influenced the data set with my intuitions, however data-driven the analysis itself is.

Furthermore, there’s really no accounting for the human brain. Although Intrade had Sarah Palin as John McCain’s likely VP candidate on the day she was announced, its confidence varied greatly throughout the day as rumors swirled, and I’m guessing it didn’t have her rated highly this far out. Who knows what will change between now and August and whose name will start cropping up? Actually, Intrade has the chances of Palin being picked this year at 0.2 percent as I write this, but perhaps she’ll surge again over the summer.

Predictive models can be very beneficial, and I think Pennock’s analysis (as well as those from Intrade and PredictWise) is very telling about reality as it exists now. But unlike in the world of machine data, where a series of particular events might be highly suggestive of a particular outcome down the road, reality can change in a hurry when fickle humans are involved.