Monday, November 01, 2010

Nate Silver Is a Randian Libertarian, and The Latest Darling of an Incompetent National Media 



- by Doug Kahn
A lot of what appears to be progress is just so much technological rococo.
–Bill Grey

Nate loves playing with numbers, adding then dividing them, aggregating then regressing them, lining them up then projecting them, possessing them. They must be his numbers, no one else’s. Consider the probability that the 2010 election will produce a House of Representatives controlled by Republicans. Today, Nate says this is 84%, or 5-1 odds. A few days ago, it was 80%, 4-to-1 odds. Two weeks ago, it was 75%, 3-1 odds. Common sense says this is flim-flammery. If there were a person who made money by betting on the outcomes of elections, they’d be an instant celebrity. I challenge you to find me that person.
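For anyone who wants to check the odds quoted above, here is a minimal sketch of the usual conversion from a probability to "x-to-1" odds. The function name is made up for illustration, and nothing here is taken from Nate's model; it is only the textbook formula p / (1 - p).

```python
def probability_to_odds(p):
    """Convert a probability p (strictly between 0 and 1) to odds in favor, as x-to-1."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1 - p)


# The three probabilities quoted in the post.
for p in (0.75, 0.80, 0.84):
    print(f"{p:.0%} chance is about {probability_to_odds(p):.2f}-to-1")
```

By this formula 75% and 80% come out at 3-to-1 and 4-to-1, and 84% is a little over 5-to-1.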
 
Silver’s success in the world of baseball stats is probably due to the lack of intelligent competition. Baseball management used to be populated by former baseball players and rich owners and their families. Higher education is not a necessary qualification for either group. A large portion of players become professionals right out of high school. Many of those who attend college are recruited by the athletic departments, and end up at a school (like I did) where you can pass the physical science requirement by taking Introductory Astronomy. (By the way, the night sky in the Everglades due west of the University of Miami is beautiful.) 

Before Nate, no one had done an exhaustive job of crunching the compiled statistical records of baseball games from the past. Possibly he just came along when the efforts of hundreds of fanatics finally reached critical mass. Look at baseball-reference.com and find play-by-play analysis of games from as far back as the 1920s. Pitch sequence and all.  

Anyway, Nate’s work was invaluable to general managers and their assistants who wanted to use statistical analysis of recruits and trade prospects. It’s the kind of knowledge that can help a manager avoid a crucial mistake, like trying to get a sacrifice fly out of a hitter who almost always hits ground balls against the pitcher who’s on the mound.
   
Nate decided he knew how much each player contributed to each run, each win. Maybe you remember when he decided Barry Bonds was the most valuable baseball player ever? At this point I forget which side he takes on the Derek Jeter question: great taste, or less filling? Wait, that’s something else. I mean, high fielding percentage or pathetic range? 

And Nate is always, always right. Numbers don’t lie! If you don’t believe me, look back to his Most Valuable Democrat rankings of U.S. Senators and Representatives of 2009, which had Ben Nelson as the most valuable Democratic Senator. You practically have to be drooling to agree with that analysis. 

So being accustomed to Brainiac stature in the world of baseball, Nate now appears to think of himself as the Sun, and all other statisticians and political consultants revolve around him. Playing the part of lesser celestial bodies, the various moons and planetoids, are the actual pollsters. What about the individual people who answer polling questions, and the voters they are supposed to represent? Space dust blown by cosmic wind. They’ve been rendered virtually nonexistent by statistical categorization, and by the footnoted qualifier: results are likely to be within hailing distance of the actual honest-to-God truth, most of the time. I use ‘virtually’ in the old sense; I guess now you’d say that polls count ‘virtual’ voters.  

The essential uncertainty of polling nags at Nate, leads him into permutations that corrupt data beyond significance. You cannot get rid of the last caveat of every poll: this poll is accurate to within x.y%, but only 95% of the time. And this assumes the sample actually mirrors the electorate, which it doesn’t.  
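The "x.y%, 95% of the time" caveat usually comes from a standard formula. The sketch below shows the common textbook version, assuming a simple random sample, which is exactly the assumption questioned above. The sample size of 1,000 is a hypothetical figure, not one taken from any poll mentioned here.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a polled proportion.

    Assumes a simple random sample; z=1.96 is the usual 95% multiplier.
    proportion=0.5 gives the widest (most cautious) margin.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical poll of 1,000 respondents.
moe = margin_of_error(1000)
print(f"Margin of error: +/- {moe:.1%}")
```

Note what the formula leaves out: it only measures sampling noise, so it says nothing about a sample that fails to mirror the electorate.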

Nate’s predictions are self-referential, necessarily; if a baseball team is losing by a score of 10-0 in the 2nd inning, it’s more likely they can come back to win than if it’s the 7th inning. So if Democrats are ‘losing’ (in the polls), then ‘coming back’ grows more unlikely the closer you get to election day. When Nate’s analysis is off to start with, his errors accelerate out of control, particularly because he includes as factors the predictions of people like Charlie Cook and Stuart Rothenberg and Larry Sabato, who no doubt read Nate Silver’s daily output of tripe, which is partially based upon the Congressional Quarterly’s Taegan Goddard, and so on and so forth.
 
Arguably, the accuracy (in hindsight) of Nate Silver’s so-called ‘model’ is anchored in the deliberately biased partisan polling done by both sides, which ends up cancelling out, but provides stacks of extra data, smoothing out the trend provided by more reputable pollsters. This trend is necessarily choppy, messy, and bothers Nate too much to leave alone. Because it’s random variation, good buddy, not rational; any method you propose to get rid of it corrupts the truth. 
 
Alan Grayson will win on Tuesday


 That will apparently come as a shock to Nate. His numbers say it’s not possible.
 
The New York Times’ official rating of Grayson’s district is “leans Republican.” Some districts are “solid Republican [or Democrat],” and these terms better describe the reality of actual people casting actual ballots. According to Nate, though, the Republican’s chance of winning is exactly 82.3%. The national media says Grayson’s a goner. The independent expenditures against him have stopped, probably because they’re not seen as necessary.
 
So an incumbent with almost 100% name recognition, one who has done everything he can to fire up the Democratic base, whose Progressive fire must have generated a mass of loyal volunteers, in a district that is Democratic, with a well-funded ground campaign that cranked up well before early voting started, who will end up spending $5 million on his reelection campaign, is a 5-1 longshot. So says the savant’s magic eight-ball. 

When Nate feeds numbers into his ‘model’, his reality generator, that’s what it says, and Nate is a slave to his mathematical formulas. They don’t represent reality; to him, they are as real as the earth and the sky. Consider the logic, and language, of this paragraph: 
Our model projects that Republicans will win the average Congressional district by between 3 and 4 points. In recent years-- because turnout is generally heavier in Republican districts-- the aggregate House popular vote has been 3 to 4 points more Republican than the result in the vote in the average district. Thus, our model seems to imply that Republicans will win the House popular vote by 6 to 8 points, which is roughly consistent with current averages of generic ballot polls.

“Thus”??!!! Why not just say “ergo”; Nate obviously believes he has presented a mathematical proof. Actually, his basic assumption about the aggregate House popular vote is wrong, as in not factual. The winning margin for the Republican is typically higher in Republican districts, too. Relying on the existence of an “average Congressional district” or “current averages of generic ballot polls” is okay so long as you keep in mind the essential concept of error.

Republicans may roll up huge margins in the South; does this imply anything about districts in New England? Nate’s model says it does. Whenever a poll comes out that says the Republican is way ahead in, say, Alabama, Nate’s model reduces the winning probability of every Democrat everywhere in the country, even if the poll is one of Rasmussen’s confabulated specials.

Nate’s logic leads to him saying that his model is as real as real reality: “our model seems to imply” etc. The rest of us would use a verb like predict or forecast.

The Generic Ballot is a Mystery This Year

The generic ballot polling is all over the map this year, for reasons having to do with the pollsters’ differing methodologies: Gallup says the Republicans have a 15% edge, Newsweek calls it a Democratic edge of 3%, and McClatchy/Marist has it dead even. Some of it has to do with the younger electorate: will younger voters turn out, and can you accurately poll them, given their almost ubiquitous use of cell phones (and no land lines)?

There’s another developing problem with cell phone users. More and more young voters aren’t using phones to talk any more. They seldom answer incoming calls, and never answer calls from numbers they don’t recognize. Don’t bother leaving a voice message for someone like this. You have to admit, it is more respectful to send a brief text message, rather than leaving a 30-second recording that has to be listened to all the way through. Which means increasing numbers of young voters can never be polled at all.

Look at the likely voter models, and the recent criticism of them, chiefly the Gallup methodology. They’re based upon a person’s answer to the question, “are you going to vote this election?” Not one of these statisticians seems to understand that the numbers that result from this question are not as reliable as the answer to the question “who would you vote for?” It’s actually a different type of answer. We don’t base our decisions on absolute certainty; people with injuries to a particular brain region can become unable to act, they lose the ‘executive’ function. They’re unable to initiate action because they can’t proceed beyond the nagging uncertainty of consequences. The rest of us have a standard; if you can reduce uncertainty below a certain level, you act. Careful people set this much below 10%.
 
Try it yourself on a congressional race in your area. If I ask “if the election were being held today, who would you vote for,” I imagine you won’t think twice before answering ‘candidate x,’ ‘candidate y,’ or ‘I don’t know.’ If I ask ‘do you want to vote on election day,’ your answer is similarly simple. But answer this: “Will you go to your polling precinct and vote on election day?” It’s not what do you want to do (vote), nor what would you do (if you were at your precinct), but what will actually happen at a time certain in the future. It’s been demonstrated time and again that voters’ stated intentions are unreliable; I suppose you can argue that’s covered by the term likely voters, who are not certain voters. But the unreliability of this designation doesn’t show up in the pollster’s caveat “accurate to within x.y% either way, 95% of the time.”
 
How many times can you consecutively manipulate the actual sample you take (to get a final ‘topline’ number) before you end up turning out crap? If every two years I say I’m definitely going to vote, and in reality that means I end up going to vote four out of five times (80%), the math says that across any three elections in which I say it, there’s close to a 50/50 chance [that is, 1 minus .8 x .8 x .8, or .488, about 49%] that I’ll skip at least one of them. There is no “average me,” there’s no way to smooth out these curves, get rid of inconsistencies, and that’s what Nate is trying to do. 
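The arithmetic in that bracket is easy to check. This is a small sketch, assuming each election is an independent 80% shot at showing up; the function name is invented for illustration. What 1 minus .8 cubed measures is the chance of missing at least one of three elections, while the chance of missing any single one stays at 20%.

```python
def chance_of_missing_at_least_once(p_vote, elections):
    """Probability of skipping at least one of several elections,
    assuming each turnout decision is independent with probability p_vote."""
    if not 0 <= p_vote <= 1:
        raise ValueError("p_vote must be between 0 and 1")
    return 1 - p_vote ** elections


p_vote = 0.8  # votes four times out of five, as in the example above
print(f"Miss any single election: {1 - p_vote:.1%}")
print(f"Miss at least one of three: {chance_of_missing_at_least_once(p_vote, 3):.1%}")
```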

The ‘enthusiasm gap’ measures the fact that it’s more likely that teabaggers will say they’re going to vote. It also measures the increased likelihood they’ll respond rather than refuse to answer a phone survey, or respond to an online poll request. These things naturally overweight teabaggers; then when their responses are extrapolated to come up with an actual Republican turnout number (because the pollster hasn’t spent enough money to actually fill out the various quotas of interviewees), you end up with the average Republican being excessively dippy, yet pretty much guaranteed to do his or her civic duty and vote. 

Am I the only one who sees this?
 
In a way, Nate sees future occurrences as inevitabilities that he observes through a window in his mind. Such a person will necessarily come off as pontificating and pompous. When Nate stands very still, and looks very far away through that window of his, he can see himself on a distant hilltop: Nate Silver, the Legend.


4 Comments:

At 4:28 PM, Anonymous Anonymous said...

I hope you're right, because that is what I've been saying. Young people mainly use cellphones, and many of us older voters with landlines have caller ID, and we also don't answer the calls with numbers we don't recognize. Wouldn't it be nice if there is an unpolled "silent majority" out there that shows up and votes progressive, and blows all the numbers out of the water? Wednesday morning, wouldn't it be great to hear our MIA, uninformed Fourth Estate say "Boy, we didn't see that coming." Alas, if only.

 
At 4:53 PM, Anonymous me said...

I guess we'll find out tomorrow.

My own preference is that no republican or blue dog anywhere in the country would get a single vote. (But how often do I get what I want in elections?)

 
At 10:14 PM, Blogger TeddyPartridge said...

WOW

This absolutely rocks.

Thank you for popping the balloon that is Nate Silver.

 
At 1:25 PM, Anonymous Anonymous said...

Good post. I remember his call on turnout and vote margins was way off in the 2008 Democratic primaries and also recall being distinctly unimpressed by his reputation as a seer.

 
