The Data Driving Political Campaigns (Guest: David Shor)

This is a podcast episode titled, The Data Driving Political Campaigns (Guest: David Shor). The summary for this episode is: Our guest was there in one of the seminal elections in the last decade, working on the 2012 Obama campaign in the legendary “cave” doing political data science. David Shor is now the Head of Political Data Science at Civis Analytics. Civis Analytics was founded in the afterglow of Obama’s win with the backing of Eric Schmidt, the former CEO of Google. These Obama campaign alums were building on the successes in polling and data science used so successfully in that campaign.

Ben: Welcome to the Masters of Data podcast. The podcast where we talk about how data affects our businesses and our lives. And we talk to the people on the front lines of the data revolution and I'm your host, Ben Newton.
Ben: Our guest today was there in one of the seminal elections in the last couple of decades, working on the 2012 Obama campaign in the legendary Cave, doing political data science. David Shor is now the Head of Political Data Science at Civis Analytics.
Ben: Civis Analytics was founded in the afterglow of Obama's win with the backing of Eric Schmidt, the former CEO of Google. These Obama campaign alums were building on successes in polling and data science used so successfully in that campaign. I don't know about you, but I'm a political junkie, so I was super excited to do this interview. So without any further ado, let's dig in.
Ben: Welcome everybody to the Masters of Data podcast and I'm very excited to have David Shor on here, who's the head of political data science at Civis Analytics. Thank you for coming on the show, David.
David: Yeah, happy to be here.
Ben: I think I told you a little bit about this, but I am a huge political junkie and just love political news and I'm really excited to have somebody from your background on the podcast. I think this is going to be a really good discussion.
Ben: And in that vein, I'd love to hear a little bit more about how you got into the politics realm and political data science. Where did that start for you?
David: It's a funny story. This really all started back in I'd say, 2007. I was studying math in the undergrad and I thought it would be interesting to study statistics a little bit more, take a couple of electives. And I was taking econometrics at the same time that Obama won in Iowa in the very first primary.
David: And so after we had a couple of primaries going through, I actually just put together a toy data set, did like a linear regression, tried to predict how Obama was going to do as a function of like racial demographics and how democratic the place was. And I saw that it was really pretty easy and pretty predictable. This was a really interesting time, basically there were a couple of different hobbyists who were all on the internet at this point. Nate Silver went by a pseudonym called Poblano and had his own blog.
Ben: Oh, really?
David: It was really fun because back then Nate Silver had this blog, it was originally on Daily Kos and he migrated over to Blogspot under the pseudonym Poblano and there were like maybe 100, 200 people just kind of arguing and trading numbers in the comments.
David: So I had a blog where I did like 538 style stuff. There were a couple of other people doing it and I just kind of did that as a hobby for a couple of years. And then starting ... the thing that happened from there, I went to go study math in grad school, I went out to Russia and I was studying pure math.
Ben: And why Russia, by the way? I remember seeing that in your background. I thought that was a really interesting ... is this because you wanted to kind of get into a different environment? What was that like?
David: I went to college early and so when I was graduating, I was 17 and I wasn't like super sure what I wanted to do with my life. And my advisor had studied in Moscow and Russia is traditionally pretty good as far as math goes. And part of me wanted to get as far away from my parents as possible.
David: So I went out there, I studied math for a year, it was super fun. But I'd say that after a Russian winter, it was minus 20 degrees, I'm originally from Miami, that I just kind of realized I don't want to do pure math. Pure math is fun but I'm probably not smart enough to actually go ahead and pursue that as a career.
David: So in the meantime, I had been doing this blog and there was a professor at Princeton, Sam Wang, he actually is fairly well known, he has a political blog, he's been doing election forecasts since I think 2004 even. He invited me to come to his lab, so that was the first time that I ever really did any political work. I like kind of built his House forecasting model for the 2010 midterms.
David: I did a little bit of work for Nate Silver for a bit and then applied to join the Obama campaign and the rest is history. I got into the Obama campaign and that was basically my first like official paid political job.
Ben: And so tell me a little bit more about that, because I was reading a little bit of that on the background, you basically ran some sort of a forecasting system and built this kind of thing called The Golden Report. Tell me what that was like and what you were actually working on.
David: I mean I'll talk a little bit about what the Obama campaign was like. We were basically all trapped in this small room called the cave. The original fear was originally we were sitting out with everybody else. The Obama campaign was like basically the tech office, they had I think like the 16th floor or something like of an entire large office building in downtown Chicago.
David: And originally, we would just be ... like in 2011, during the whole debt ceiling crisis, if the election had been then, Barack Obama would have lost re-election basically to anyone. There were just all these data analysts like looking at the polls and crunching numbers and people would walk beside it and see, oh, we're losing.
David: And so as a result, we ended up getting shunted into basically this extended broom closet. By the end of the election, there were about 50 people just like kind of stuck inside this small windowless room, literally bolted shut for most of the day. And so I was there, I was 20 and I walked in and it turns out they didn't really have any sort of Nate Silvery election forecasting tools. They had been doing polls but they hadn't been doing anything to aggregate them, they hadn't been doing really anything to turn that into decisions.
David: And so what The Golden Report was, was basically this big multi-level Bayesian model that took into account all of the information we had available, public polls, our private polling program, the IDs we were getting on the ground and synthesized all of that to estimate our probability of winning the election, our probability of winning in every state. Basically, the covariance between all these different states. If we're losing Ohio, how likely is it that we're going to be winning Florida?
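As a rough illustration of the covariance question David raises here (if we're losing Ohio, how likely is it that we're winning Florida?), the following is a minimal Monte Carlo sketch in Python. It is not the campaign's multi-level Bayesian model. It simply assumes each state's two-way vote share is a hypothetical mean plus a shared national error plus an independent state error. The state means, the error sizes, and every function name are made up for illustration.

```python
import random

def simulate_states(state_means, national_sd=0.02, state_sd=0.02,
                    n_sims=20000, seed=0):
    """Draw simulated two-way vote shares for each state.

    Every simulation adds one shared national error to all states, which is
    what makes state outcomes correlated, plus an independent state error.
    All inputs are hypothetical.
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        shared = rng.gauss(0.0, national_sd)
        sims.append({state: mean + shared + rng.gauss(0.0, state_sd)
                     for state, mean in state_means.items()})
    return sims

def win_prob(sims, state):
    """Share of simulations in which the candidate tops 50% in `state`."""
    return sum(s[state] > 0.5 for s in sims) / len(sims)

def win_prob_given_loss(sims, state, lost_state):
    """Win probability in `state`, restricted to simulations that lose `lost_state`."""
    losing = [s for s in sims if s[lost_state] <= 0.5]
    return sum(s[state] > 0.5 for s in losing) / len(losing)

# Hypothetical polling averages, not real 2012 numbers.
state_means = {"Ohio": 0.515, "Florida": 0.505}
sims = simulate_states(state_means)

p_florida = win_prob(sims, "Florida")
p_florida_given_ohio_loss = win_prob_given_loss(sims, "Florida", "Ohio")
print(f"P(win Florida)             = {p_florida:.2f}")
print(f"P(win Florida | lose Ohio) = {p_florida_given_ohio_loss:.2f}")
```

Because the shared error moves both states together, the conditional probability comes out lower than the unconditional one, which is the practical meaning of the covariance between states.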
David: And this was powered by I think probably the largest polling program that any political campaign has ever done. Since that I haven't seen anything like it. I think in total we polled about a million and a half people in 2012. By the end of the election we were doing something like 1,000 people per battleground state per day, which I got pretty spoiled on.
David: In terms of what the report actually did, it was super nerve-wracking because basically what I would do is we would get all the data ... computers were a lot slower back then and so we would get all of the data like midnight from the vendors and then we would put it into our model. We would run it, it would finish by about 10:00 a.m. and then we would ... I'd screenshot it and I'd put it in an email and then that email would be sent to ... they were pretty secretive about it, pretty select group of campaign staff. I think for most of the year it only went out to like five or six people. I think the list was eventually expanded to like 11.
David: Barack Obama was on that list, so it was like kind of a crazy thing that as a 20 year old, I was just writing
Ben: That's awesome.
David: It's like a fun thing because I work in politics people go, oh, does Barack Obama know who you are? And answer is no, but he's seen a lot of slides that I've made. He has no idea who I am though.
David: And so anyway, this thing was used as an input for basically every single decision that the campaign made, whether it was ... and this was just generally a microcosm for what the analytics team did, this was really the first time I think any political campaign that basically every single decision that got made first went through the analytics team.
David: And that goes for like big questions, like how much money do you invest in a state or how much are we spending on a mail versus television versus digital? But it even went down to stuff like we were determining Barack Obama's schedule on an hour-by-hour basis, where we were saying first you got to go to Green Bay, Wisconsin and then you got to go to whatever.
David: And then it ended up, part of this is chance, the polls aren't always right, but in 2012 we ended up, I think the overall national result, we were off by less than a tenth of a percent. I think we said the answer would be, the national two way would be 51.92 and the answer was like 51.96 or something ridiculous like that. Most of that is luck.
David: And we got every battleground state down to within a point. It ended up being pretty accurate. The boring answer for why is because we called so many people, it was basically no sampling error. But the stats were cool too. That's basically what The Golden Report was. As a 20 year old it was like kind of a nerve-wracking thing to be doing, but it was a lot of fun.
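The "no sampling error" point can be made concrete with the textbook margin-of-error formula for a proportion from a simple random sample. This is a sketch under that assumption only: the 30-day pooling figure is hypothetical, and the formula says nothing about non-sampling errors such as who chooses to respond.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1.0 - p) / n)

# 1,000 interviews is the per-state daily figure mentioned in the interview;
# pooling 30 days of them is a hypothetical for illustration.
one_day = margin_of_error(0.5, 1_000)
one_month = margin_of_error(0.5, 30_000)

print(f"1,000 interviews:  +/- {one_day:.1%}")
print(f"30,000 interviews: +/- {one_month:.1%}")
```

The margin shrinks with the square root of the sample size, so 30 times the interviews cuts it by a factor of about 5.5.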
Ben: Amazing experience. And one thing I know when I was reading up a little bit on this is that and like I said, I find this stuff fascinating. There was one term that kind of came out where he's talking about a persuadability score. So what was that? Because I mean this seemed like that was pretty important to decide where Obama would actually spend his time.
David: It's worth talking about what analytics and data science does in a political campaign. And so the biggest thing that we built in the Obama campaign was that we had these scores, support scores, turnout scores, persuasion scores. And the way they explained it to me on my first day is that we have a really unromantic view of political campaigns, which is that a lot of the decisions that political campaigns make is basically list making. You have a list of voters you have to turn out, you have a list of voters you have to persuade, you have a list of voters you have to raise money from.
David: And one of the most important things that you could do as a data scientist at a political campaign is to more efficiently rank order this list. Because you only have so many volunteers and you only have so much money.
David: And so basically what we would do is we had a big voter file of every single person in the country, it's all a matter of public record, it's not as creepy as it sounds. And we would produce these supervised learning models from our polling where we would say, what's the probability that you will vote for Barack Obama? Or what's the probability that you'll turn out to vote in this election? Or in terms of what a persuasion score is, if we send someone to knock on your door and they read this script about how the economy is getting better, how many incremental votes will that generate?
David: And so 2012 was the first campaign where we had actually built a persuasion score. Operationally it was like a pretty crazy thing, we did this giant field experiment involving hundreds of thousands of volunteers back in the spring. And the basic idea there is that you randomly pick a sample, you send volunteers to go knock on their door, you have some other group where the volunteers don't knock on their door and then you survey them both afterward. And then you kind of compare and model out the difference.
David: And it was a pretty important thing just because it turns out that some people are much more persuadable than others. And something that we found is that for a pretty giant chunk of the population, if you go and send a stranger to their door to argue about politics, it will make them less likely to support you. There's actually a paper from the 2008 Obama campaign that showed that their persuasion program ended up costing them votes. So being able to model that out and screen out the people who don't want to talk to you was a pretty big operational boon for what we were doing.
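The experiment David describes (a randomly chosen group gets a door knock, a control group does not, both are surveyed afterward) can be sketched as a simple difference in support rates per voter segment. The campaign's persuasion scores came from models rather than raw segment averages, so treat this as a toy version. The segments, counts, and function names below are invented.

```python
from collections import defaultdict

def uplift_by_segment(records):
    """records: iterable of (segment, treated, supports) tuples.

    Returns, per segment, the support rate among treated voters minus
    the support rate among control voters.
    """
    # counts[segment][treated] = [number supporting, number surveyed]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, supports in records:
        cell = counts[segment][treated]
        cell[0] += int(supports)
        cell[1] += 1
    uplift = {}
    for segment, arms in counts.items():
        (t_yes, t_n), (c_yes, c_n) = arms[True], arms[False]
        if t_n and c_n:  # need both arms to compare
            uplift[segment] = t_yes / t_n - c_yes / c_n
    return uplift

def make_group(segment, treated, n_support, n_total):
    """Helper that fabricates survey rows for the example."""
    return [(segment, treated, i < n_support) for i in range(n_total)]

# Invented data: segment A warms to a door knock, segment B cools.
records = (
    make_group("A", True, 60, 100) + make_group("A", False, 50, 100)
    + make_group("B", True, 40, 100) + make_group("B", False, 50, 100)
)

uplift = uplift_by_segment(records)
# Rank segments by uplift and drop those where contact costs votes.
targets = [seg for seg, u in sorted(uplift.items(), key=lambda kv: -kv[1]) if u > 0]
print(uplift, targets)
```

Screening out segments with negative uplift is the "list making" step: volunteers are only sent to doors where contact is expected to add votes.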
Ben: That's fascinating. It all makes sense, now you say it I can think of situations where particularly local politicians have come up and ... and I was like, okay, I like you a little less now.
David: We also don't want to waste your time knocking on the doors of people who definitely already support you and are definitely going to vote, which is a big chunk of the population. Most people aren't that persuadable.
Ben: That makes sense and so 2012, everything went great. Obama won. You're kind of wrapping up the operation, what'd you do then?
David: It was kind of surreal. I just remember the morning after the election, I get this text from my boss who's like you need to come to the office right now. I think, oh, fuck, there must be a recount, there must be some kind of emergency. I run over there.
David: I take, it's the day after the election, my only clean pair of clothes, so it's like gym shorts and like some XXL Boilermakers for Obama shirt. And I run into the office and Barack Obama is there. So it was just super surreal. My only photo with the president has me looking unshowered in gym shorts. I think it's like so bad that my mom won't show it to people, that's how bad a photo it is.
David: But the person who was also there, less important than Obama, it was after Obama left, I realized, Eric Schmidt is just kind of walking around the office, which was also super weird. He had heard basically about the kind of work that the Obama campaign had been doing from an analytics perspective and there was an idea that this isn't something that had been done before and that this kind of work should be preserved.
David: So he basically provided some initial seed funding and so me and I'd say maybe 10 other people who worked on the Obama campaign started a company I think in January of 2013, immediately after the election. Pretty crazy story, there were 10 of them back then, there's like 170 of us now.
David: That was basically the start of Civis Analytics, which is where I work right now.
Ben: That's a pretty cool origin story. Because I guess at that point Eric Schmidt, he was still on the board with Google I guess. So he's just walking around the office and offers to give you money to start a new company. That's pretty amazing.
David: I mean I was 21 at that point, it was pretty weird, but yeah, it was a good start, we were super lucky. Since then political is actually only a small part of my business. I'd say, I don't have an exact number but probably like 10%, 20% of our revenue.
David: But it's been pretty wild, we work with a pretty big variety of for-profit companies, nonprofits, you can see on our website, but I'd say basically we work with a pretty wide range of industries, ranging from movie studios to healthcare to telecommunications. It's pretty interesting.
Ben: I noticed that, I think you guys work with a pretty broad set of customers. And so you've been running Civis Analytics for a few years and I think in particular I was really interested to hear, come around 2016, we actually had a previous episode where we talked to a professor of political science, to talk a little bit about the 2016 election.
Ben: I'd be really interested in hearing your perspective particularly from the kind of data science perspective about what do you think happened there? Because I mean the overall impression is that basically everybody got it wrong, nobody knew what was going on. And that seemed like it was a pretty big about-face from what ... it seemed like in 2012, what you guys did with the Obama campaign was just so spot-on.
Ben: So what changed. Did anything change? I mean what was your perspective of all that?
David: Just to be clear, our polls were just as wrong as everyone else's, maybe a little bit more accurate. But I'd say the big story of 2016 was that every single pollster online or live calls or cell phones or internal versus external, basically everybody got the election wrong. And right after 2016, we definitely did a lot of soul-searching because from our perspective, it wasn't just that the numbers were wrong. When people talk about this, they'll say, oh, well, the national error was really only a point or two, that was basically what you'd expect on average.
David: The thing for us is that it wasn't just that the polls were wrong, it's that the polls were wrong in a way that led our campaigns to make bad decisions. So as a result of the polling, the Clinton campaign under invested in states like Wisconsin or Michigan. Because of the nature of why the polls were wrong and we'll talk about that in a second, the message pollsters ended up pushing Clinton to adopt overly cosmopolitan messages that probably ended up having backlash.
David: The thing that was the worst isn't just that the prediction was wrong, but that all of the advice, people use polling for a lot of stuff, not just to predict things and basically all of that advice was either muddled or pushed Democratic campaigns to do basically the opposite of what they should have done.
David: The big thing for us was why were polls wrong? And we think it really comes down to one simple thing, which is that people who take surveys are really weird. So when I started working in polling in 2012, we basically, about 12% of the people that we called would pick up the phone. And obviously, people don't pick up the phone at random. In general, older people are more likely to pick up the phone than younger people, women are more likely to pick up the phone than men. White people are more likely to pick up the phone than non-white people. But we can statistically adjust for all of that stuff.
David: Going up to 2016, we saw a decline in response rates from about 12% to about 0.8% and so when you're at 12%, you can do a bunch of statistical adjustments. You can ask about age and gender and number of people in the household and you can join to a voter file and try to correct for information like party registration. But when you're at 0.8%, you basically can't do anything. Polling fundamentally is built on this assumption that the people who are answering your surveys are statistically exchangeable with people who aren't answering your surveys, once you control for like a lot of different demographics.
David: And what we saw in 2016 was that that was no longer true. And we think that this particular thing in a technical sense, it's called survey non-response bias. Since then, we've done a lot of analysis and work and we really think that survey non-response bias was the big reason why the polls were wrong.
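The kind of statistical adjustment David mentions for age, gender and so on can be illustrated with basic post-stratification: compute support within each demographic cell, then weight the cells by their known population shares. This is a minimal sketch with invented cells and numbers, not Civis's method. It only works under the exchangeability assumption described above, that within each cell respondents look like non-respondents.

```python
from collections import defaultdict

def raw_mean(sample):
    """Unadjusted support rate; sample is a list of (cell, supports) pairs."""
    return sum(y for _, y in sample) / len(sample)

def poststratified_mean(sample, population_shares):
    """Weight each cell's support rate by its population share.

    Assumes every cell in population_shares has at least one respondent;
    an empty cell would raise ZeroDivisionError.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cell, y in sample:
        sums[cell] += y
        counts[cell] += 1
    return sum(share * sums[cell] / counts[cell]
               for cell, share in population_shares.items())

# Invented example: older people answer more often and support less.
sample = (
    [("older", 1)] * 32 + [("older", 0)] * 48      # 80 respondents, 40% support
    + [("younger", 1)] * 12 + [("younger", 0)] * 8  # 20 respondents, 60% support
)
population_shares = {"older": 0.5, "younger": 0.5}  # hypothetical true shares

raw = raw_mean(sample)
adjusted = poststratified_mean(sample, population_shares)
print(f"raw = {raw:.2f}, adjusted = {adjusted:.2f}")
```

If the people who answer differ from those who don't even inside a cell, no reweighting on these cells can recover the truth, which is the failure David describes for 2016.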
David: I could like just walk a little bit through some other reasons that people have talked about. So after the election, a lot of people talked about the idea that maybe there was the shy Trump voter, that people were afraid to tell someone on the phone that they voted for Trump. We did a lot of research on that and we've looked on web, it basically seems like that's not a thing.
David: There was this idea that maybe there was a bunch of late movement, that wasn't captured. We had been polling I'd say thousands of people a day at this point. We're pretty sure that there wasn't late movement and then people talked about turnout, maybe this idea that all of these non-college educated white Trump voters who normally wouldn't vote, came out and voted for Trump.
David: Whether or not you vote is a matter of public record and as far as we can tell from the voter file administrative data, that's also not true. Turnout was actually a little bit better than everyone expected in terms of what percentage of the electorate is Democratic. So it really does boil down to this non-response bias issue, which is that the people who were picking up the phone just weren't the same as the people who weren't.
David: And so we've done additional research on exactly why. It's not enough to say that people are weird and basically what we found is that there's this concept of social trust, in the General Social Survey. There's a question which is, do you generally trust your neighbor, do you think that people can be trusted or do you think that people should keep to themselves?
David: And what we find is the actual true percentage of people who answer that people can be trusted to this question and this comes from the GSS, which is like a really high-quality government survey, where they pay people like $100 and send people to your door. They have a response rate of about 80%. The true value for this question is about 30% or so.
David: But when you do a telephone survey, in 2016, we found that even when we did all the statistical adjustments we could, about 50% of phone respondents answered that they think that people can generally be trusted. And so the idea here is basically that people who trust the system, people who trust their neighbors are much more likely to answer phone surveys from strangers than people who don't.
David: That's pretty plausible, that makes a lot of sense. This bias has always been here. People who didn't trust their neighbors have basically always not been answering phones. But it wasn't a problem in 2012 because in a sense, the 2012 election was a referendum on whether or not we should have universal healthcare. And so people who trusted their neighbors and people who didn't trust their neighbors basically voted at the same rates for Obama vs. Romney.
David: But the 2016 election was really different. 2016 was in a lot of ways a referendum on how much do you trust the system and how much do you trust society and how much do you think that your neighbors can be trusted? And basically there was this giant group of voters who voted for Obama in 2012 and then switched over to Trump in 2016. They were disproportionally white, disproportionally not college-educated. And that particular group had low levels of social trust and didn't pick up the phone.
Ben: Interesting.
David: And that's basically our sense of why 2016 was wrong.
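The mechanism can be worked through with simple arithmetic. The sketch below takes the two figures from the interview (about 30% of people say others can be trusted, but about 50% of phone respondents do) and combines them with hypothetical response rates and support levels to show why the same bias was harmless in one election and costly in another. Only the 30% and 50% figures come from the conversation. Everything else, including the function name, is assumed.

```python
def poll_vs_truth(trust_share, resp_rate_trust, resp_rate_distrust,
                  support_trust, support_distrust):
    """Return (true support, expected poll support, trust share among respondents)."""
    truth = trust_share * support_trust + (1 - trust_share) * support_distrust
    resp_trust = trust_share * resp_rate_trust
    resp_distrust = (1 - trust_share) * resp_rate_distrust
    trust_among_resp = resp_trust / (resp_trust + resp_distrust)
    poll = (trust_among_resp * support_trust
            + (1 - trust_among_resp) * support_distrust)
    return truth, poll, trust_among_resp

# Hypothetical response rates chosen so that 30% trusting in the population
# becomes 50% trusting among respondents.
rates = dict(trust_share=0.30, resp_rate_trust=0.014, resp_rate_distrust=0.006)

# "2012-like": trust is unrelated to vote choice, so the bias does no harm.
truth_a, poll_a, trust_resp = poll_vs_truth(
    support_trust=0.52, support_distrust=0.52, **rates)

# "2016-like": low-trust voters support the candidate less, so the poll overstates.
truth_b, poll_b, _ = poll_vs_truth(
    support_trust=0.60, support_distrust=0.45, **rates)

print(f"trusting share among respondents: {trust_resp:.2f}")
print(f"2012-like: truth {truth_a:.3f}, poll {poll_a:.3f}")
print(f"2016-like: truth {truth_b:.3f}, poll {poll_b:.3f}")
```

In the second scenario the poll overstates support by about three points even though the response pattern is identical. The bias only bites once the under-represented trait becomes correlated with the vote.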
Ben: That's fascinating. I hadn't heard that explained exactly like that before and it actually makes a lot of sense, because fewer people are answering their phones overall and you would have naturally under counted the kind of people that would be voting for Trump. But if that's the case, what do political campaigns do now?
Ben: How are you advising people to adjust to this? How are you adjusting to this, with this kind of new reality?
David: So we basically came out of 2016 and totally revamped our measurement program. We no longer use phones, we conduct literally all of our surveys online. The more important thing is this idea that we're never going to find an unbiased source of data. It's no longer true that you can go and call people and get like a reasonable sample.
David: And so as a result, we have invested a lot in doing much more sophisticated statistical adjustments. This is where the data science comes in. Whenever I talk to my data science friends, like I work in polling, everybody kind of rolls their eyes, but actually ... and I'm not exaggerating here, every time we do a poll we either build or use literally hundreds of supervised learning models. Because basically, the universe of ways in which survey response can be weird is really large and it's a high dimensional statistics problem.
David: We've invested a lot in being able to collect biased data and collect or survey respondents from a biased source and collect enough information on them, so that we can at scale statistically correct for whatever biases they have that are relevant for what we're trying to study. That was a big investment, like I'd say my team has like six data scientists working on this, cumulatively our entire survey operation, I'd say over the last year probably something like 15 data scientists and 20 software engineers have worked on this stuff at one point or another, it's actually pretty hard.
David: That's been the big thing we've done differently is we kind of think that the era of polling where you just have one political science guy who goes and advises for a campaign and hires like seven interns and works with some phone vendor, we think like those days are done and measurement is a lot harder now than it used to be. And that it's much more of a data science problem than it used to be.
Ben: In particular, you're talking about that, the response rate going down from 12% to what did you say? 0.8? Have you guys found ways to increase the response rate at all?
David: Our general thing is that we don't use phones anymore, basically the cost of a phone survey is proportional to one divided by the response rate. And so as you get lower, your costs increase literally exponentially. At this point, we work entirely online and online web panels have their own problems. It turns out the people on those are super weird too in a lot of different ways, but the nice thing about it is you don't have to deal with phone vendors at 2:00 a.m. in the morning. You can automatically script things.
David: The big thing for us is we've kind of given up on this statistical ideal of being able to pull true random samples from the population and we focus a lot more on just collecting as much data on these people as we can. And then doing adjustment, statistical adjustments in the smartest way.
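The cost relationship David states (phone survey cost proportional to one divided by the response rate) is easy to check with the two response rates quoted in the interview. This sketch counts only dials per completed interview and assumes, hypothetically, that cost per dial stays constant.

```python
def dials_per_complete(response_rate):
    """Expected number of calls needed for one completed interview."""
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    return 1.0 / response_rate

rate_2012 = 0.12    # about 12%, as quoted in the interview
rate_2016 = 0.008   # about 0.8%, as quoted in the interview

dials_2012 = dials_per_complete(rate_2012)
dials_2016 = dials_per_complete(rate_2016)
cost_multiplier = dials_2016 / dials_2012

print(f"2012: about {dials_2012:.0f} dials per complete")
print(f"2016: about {dials_2016:.0f} dials per complete")
print(f"cost multiplier: about {cost_multiplier:.0f}x")
```

Strictly speaking this is inverse-proportional rather than exponential growth, but the effect is the same in practice: each further drop in the response rate makes a phone interview sharply more expensive.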
Ben: Do you have any sense for how that's working out so far? I mean do you kind of have to wait to see how it goes through in this election cycle or have you already been able to kind of prove out the methods beforehand?
David: I mean this is a big thing, so obviously I'm always filled with existential terror about these things because surveys are really hard and it's totally possible that we end up getting 2018 wrong. But basically what we've been doing is there has been a variety of elections that we've been involved in, whether it's Virginia or Alabama or any of the various House specials or state legislative specials or the primaries and we've been basically testing our new techniques on these as it's gone and we've done really well. I think we actually had a public-facing release on our Alabama work. But we did an enormous amount of polling there and I think ended up getting the result down to within half a point.
David: Something that was really fun about the Alabama result is I think we had been predicting a narrow loss, something like I think 49.8 was our final forecast. And so on election day, I remember thinking, oh, I really hope that our forecast is wrong but not so wrong because I want to win and I hope we get about 50, but not so wrong that it would make our methods look bad. If we got 52% or 53% that would be bad, so you want the election to be close.
David: But you don't want it to be too close because if it's like 50.1, then I'm going to spend the next two weeks of my life doing recounts. I was like I want the polls to be wrong but I want us to get something like 50.4, 50.5 and final margin was 50.5, so it was like our polls weren't wrong, there wasn't a recount and we were able to celebrate on election night and we won. So it was like a perfect storm as far as that one.
David: We've done a bunch of in cycle validations so far and we're feeling good, but it's something that we constantly monitor because last cycle it was social trust and non-college educated whites, but this cycle it could be that we're not properly capturing youth engagement. The number of ways in which polls can go wrong is really big and so we're always trying not to fight the last battle.
Ben: You say that, I mean do you have a sense for how you can detect those, where you might be off before the results come in and you have to do another round of soul-searching? Is there a way to be more real-time about correcting?
David: I mean the big thing for us is collecting as much data as possible and kind of using machine learning to get a sense of what data is both correlated with the outcome we care about and correlated with response. And that statistically is a high dimensional stats problem, it's not trivial. It involves also just trying to collect as much as you can while staying within the boundaries of ethics and privacy.
David: But yeah, it's a hard problem and something that we invest a lot of resources in.
Ben: You talked a little bit about collecting a lot of data, particularly it seems like some of the issues around data science kind of emerged into public view and into public consciousness around politics, because with all the stuff that happened in 2016. I mean coming from your side, I mean where does kind of the ethics of data science in what you do kind of come into your view? Where do you worry about that, is that something that really is kind of top of mind?
David: It's absolutely top of mind in everything we do. Our biggest fear is basically that we don't want to be the next Cambridge Analytica. The big thing for us is that we don't think that ... in general, it's not necessary to collect people's Facebook likes or any of this unethical stuff. Most of the data that we work with is a matter of public record, whether or not you voted in the last election is a matter of public record or in most states, whether or not you're a registered Democrat, which is a voluntary thing. You fill it out in a form, is a matter of public record.
David: In general, we've always taken the position that it's not worth it to push to try to be creepy, because at the end of the day, making your targeting 1%, if you want to be cynical about it, making your targeting 1% better isn't worth the news story. And it's a distraction from the things in campaigns that are really important.
David: At the end of the day, we're trying to figure out should you focus on economic issues or does it make sense to run on gun control in this state? And you don't need super-creepy information to answer those kind of strategic questions.
Ben: It does make a lot of sense. So kind of wrapping up to a certain extent, one thing that I was definitely interested in knowing about, so you talked about you're building out this team, you guys have built a big operation from just a few people to a couple hundred people. What are the kind of skill sets that you're looking at to build an operation like this? So you build the operations to be able to grow like you guys have done, I mean is it hard to find those skill sets? I mean what kind of skill sets are you looking for?
David: It's really hard. The problem that we have is that we're looking for smart, quantitatively minded people who can simultaneously kind of answer client needs, but also have a decent amount of technical depth. A problem that we have is someone might come out of a PhD in physics and want to spend all of their time building supervised learning models. But it's not enough to know how to build models, you also have to have the common sense to think about what models would actually be useful for people to use or what are the problems that I should actually be solving?
David: Especially since, while we're 170 people, we're not Google, we can't afford to just have giant R&D teams for R&D's sake. That's been the big thing for us is finding people who could be consultants but also possess the technical skills to go and build models and use technical tools in a smart way.
Ben: That definitely rings true because I had an interview with a guy named Christian Madsbjerg and one of the things he's really big about is bringing the liberal arts and the harder sciences and computer science together. And it seems to me like that's part of what you're getting at: not only do you understand the computer science, do you understand the technical part of your job, but can you actually communicate it, can you actually understand what clients need and can you actually get ... I don't know, like the human part of it. Can you communicate, can you write? Does that sound right to you?
David: I mean that sounds right and then there's also the fun thing of this is politics and knowing history and having broader political knowledge is really useful when you're actually doing this work. There are a lot of times where we'll push a model out and we will almost send it to clients and someone in the office who knows a lot about politics will look at this and go, no, this is crazy. There has to be a bug because like you think there's a bunch of Democrats in northern Florida, that's wrong.
David: And so it's actually the thing I find really exciting is that we're doing a lot of really interesting synthesis between history and political science and sociology, learning about what persuades people and what doesn't. But doing it in a way that's quantitatively challenging and interesting.
David: I think there's literally not a single data science buzzword that isn't like legitimately used in some way on our team. Whether it's like adversarial neural networks or deep learning or factorization machines, like we actually use all of this stuff, because these problems are really hard.
David: But at the same time, you'll have like a team meeting where you're talking about generators and neural networks in one breath and then suddenly be talking about the politics of northwestern Arkansas in the 1970s in the next breath. They actually seamlessly integrate together really well.
David: And so it's really exciting, it's really interesting. I think that there are a lot of liberal arts minded quantitative people who are out there and want jobs like this and it's just a matter of letting them know that this option is there.
Ben: I tell you, that does sound like a lot of fun. You are in the political realm, you know what to look at and I'm sure you watch this stuff like a hawk day in and day out. What are you looking at and what's interesting to you in the election this year that you don't think necessarily everybody else is paying attention to?
David: People usually look at election results in this one-dimensional way, which is are we going to do well or are we going to do badly? Basically what's our popular vote total going to be? There's a lot of signal there, whether it's the polling or whether it's the special elections we've gotten or whether it's the primary turnout that's shown up so far, that points to Democrats kind of broadly having a good year.
David: But something that I think about a lot, and which I think there's definitely conflicting signals for, is that it's not enough just to see how well you do nationally. Hillary Clinton won the popular vote. She's not president right now.
David: The basic thing is that from a national perspective, 2012 and 2016 weren't really that different from each other. Hillary got 51.1%, Obama got 51.9%, these aren't big differences in the scheme of things. But there was a massive change on a state level where educated white people swung toward Clinton and non-college-educated white people swung heavily toward Trump.
David: And the problem with this kind of shifting coalition where Democrats rely more on votes from college-educated white people in cities, is that this is a coalition that is pretty antithetical to like the structure of how the American government is set up. Basically, this is a coalition that the system is structurally biased against: in the electoral college, to a lesser extent in the House, and definitely in the Senate.
David: So the big question to me isn't just how well we do, like maybe we get 54% of the vote, but how much is the coalition in 2018 going to resemble Obama's coalition, where we did fairly badly among rich whites but did very well among Midwestern non-college-educated whites, or Clinton's coalition, which was kind of more heavily focused on doing well in affluent suburbs.
David: And to the extent that it's the latter, then we'll run into a lot of problems in terms of outcomes. Like moving forward, if you flash forward 10 years from now, it's really hard to imagine us keeping the Senate, for example, just because there are so many rural states that are overwhelmingly white and have low education levels that are overrepresented in the Senate.
David: That's the big thing for me is thinking about what the nature of the coalitions are going to be, not just the overall vote share.
Ben: That actually makes a lot of sense. David, this has been a fascinating discussion. I really appreciate you taking the time to talk us through this and maybe we will be able to catch up to you again after the election to see how things went. I appreciate your time.
David: Perfect, sounds great.
Outro: Masters of Data is brought to you by Sumo Logic. Sumo Logic is a cloud native, machine data analytics platform delivering real-time continuous intelligence as a service to build, run and secure modern applications. Sumo Logic empowers the people who power modern business. For more information, go to
Outro: For more on Masters of Data, go to and subscribe and spread the word by rating us on iTunes or your favorite podcast app.