Over the past few years, I've read two books that have kept me up at night.
One of them is called Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming. The subtitle says it all; it's a stunningly depressing true story about how a small group of scientists waged a campaign against science itself, with remarkable success.
The other is called Give Us the Ballot: The Modern Struggle for Voting Rights in America. This one's a stunningly depressing true story about how a small group of racists waged a campaign against voting rights for historically disenfranchised groups, again with remarkable success.
While these topics may sound like they fall under the umbrellas of science, history, or civics, there are plenty of relevant mathematical conversations here, too. In terms of climate change, I wrote a story last year on general mathematical models of natural resource depletion and its effects on populations. In this story, I'd like to turn a mathematical lens on the topic of voting, and slap a coat of paint on a bunch of tabular data.
Let's explore results from the past decade of United States elections. Every two years, we elect our representatives in Congress, along with potentially a senator or a president. Months after the election is complete, the U.S. Election Assistance Commission (EAC) releases survey data on the election. This data includes information on things like registration, voter participation, and the number of precincts and poll workers. (More information can be found on their website.)
Along with the data, the EAC provides a report summarizing its findings. But with a decade of past results to compare against, I was curious about visualizing their findings across time. How have election patterns changed since 2008, if at all? To start answering this question, I pulled out the information I was curious about, and consolidated it by year.
On their own, unfortunately, numbers aren't the greatest storytellers. Let's try to frame all of these values into a more cohesive narrative.
(Note that results for the 2018 election aren't available yet. I'll add that data in once it's released.)
Let's start with the fundamentals: who's registered, who's voting, and who isn't voting despite eligibility? Using the map below, you can explore how different states compare across every election since 2008. There are five different statistics to choose from:
Active Registered Voters. How many active registered voters did the state report for that year?
Election Participants. How many ballots were received? Note that in some jurisdictions this count can include rejected provisional ballots. Think of it as a slight overestimate of the number of people who successfully voted.
Eligible Voters. Of all of the folks in the state, how many were eligible to vote (at least 18 years old, citizens, and no criminal history that would deprive them of their right to vote)? These numbers are taken from the United States Elections Project.
Registration Saturation. What percentage of eligible voters are active registered voters? Note that while this should be a value between 0% and 100%, because we're dealing with estimates, there are certain data points with values greater than 100%. For example, Kentucky and Maine in 2016 both have saturation rates a bit above 100%.
Election Turnout. What percentage of eligible voters actually participated in the election?
Nearly all of the data for these statistics comes from EAC reporting, though there are some inconsistencies requiring alternative sources. In particular, the 2016 New York data is from here, and the 2014 Alabama data is from here.
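To make the two percentage statistics concrete, here's a minimal Python sketch of how they could be computed from the three raw counts. The function names and the sample numbers are made up for illustration (they aren't EAC figures, and this isn't the code behind the charts); the only thing taken from the definitions above is that both percentages use eligible voters as the denominator.

```python
def percentage(numerator, denominator):
    """Return numerator / denominator as a percentage, or None if undefined."""
    if numerator is None or not denominator:
        return None
    return 100.0 * numerator / denominator


def registration_saturation(active_registered, eligible):
    """Percentage of eligible voters who are active registered voters."""
    return percentage(active_registered, eligible)


def election_turnout(participants, eligible):
    """Percentage of eligible voters who participated in the election."""
    return percentage(participants, eligible)


# Hypothetical figures for a made-up state, not real EAC data.
example = {
    "active_registered": 3_100_000,
    "participants": 2_000_000,
    "eligible": 3_000_000,
}

saturation = registration_saturation(example["active_registered"], example["eligible"])
turnout = election_turnout(example["participants"], example["eligible"])

# Saturation can exceed 100% when the two estimates disagree.
print(f"saturation: {saturation:.1f}%")
print(f"turnout: {turnout:.1f}%")
```

Because the numerator and denominator come from different sources, nothing in the arithmetic caps saturation at 100%, which is why the made-up state above lands a little over it.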
Here's how the country has looked through the lens of each of these statistics since 2008:
Figure 1: Map visualization of population data by year and state.
For a clearer picture of the trend across years within a state, here's a second representation of the same data:
Figure 2: Line representation of population data by year for a fixed state.
A few observations:
As you might expect, presidential election years correspond to substantial boosts in registration, participation, and overall turnout. For example, the graph of nearly every state's election participant figures shows a distinctive zigzag shape, sloping up during presidential election years and down for the midterms.
In terms of average turnout across years, three of the top five states are neighbors (Minnesota, Wisconsin, and Iowa). The bottom five performers are more geographically distributed: the worst offenders are Mississippi, Texas, West Virginia, Utah, and Hawaii.
Here's a table summarizing the findings. You can check out the top or bottom performers for either registration saturation or voter turnout. Note that North Dakota has no data on registered voters dating back to 2008, which is why their saturation number is 0%.
(Interactive table; columns: State, Average Saturation, Average Turnout.)
Table 1: Average registration and turnout percentages by state.
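For the curious, here's a rough Python sketch of how a table like this could be built: average each state's percentage across election years, then sort to pull out the top or bottom performers. The data and function names below are hypothetical. One deliberate difference from the table above: a state with no data comes out as None and is left out of the ranking, rather than showing up as 0%.

```python
from statistics import mean

# Hypothetical turnout percentages keyed by state and election year (not real data).
turnout_by_state = {
    "State A": {2008: 70.0, 2010: 50.0, 2012: 68.0},
    "State B": {2008: 55.0, 2010: 35.0, 2012: 54.0},
    "State C": {2008: 62.0, 2010: None, 2012: 60.0},  # one year missing
}


def average_by_state(values_by_state):
    """Average each state's values across years, skipping missing (None) entries."""
    averages = {}
    for state, by_year in values_by_state.items():
        known = [v for v in by_year.values() if v is not None]
        # A state with no data at all gets None rather than a misleading 0%.
        averages[state] = mean(known) if known else None
    return averages


def rank_states(averages, n=5, top=True):
    """Return the n highest (or lowest) states by average, ignoring states with no data."""
    ranked = sorted(
        ((state, avg) for state, avg in averages.items() if avg is not None),
        key=lambda pair: pair[1],
        reverse=top,
    )
    return ranked[:n]


averages = average_by_state(turnout_by_state)
print(rank_states(averages, n=2, top=True))   # highest average turnout first
print(rank_states(averages, n=2, top=False))  # lowest average turnout first
```

The same two functions would work for registration saturation; only the input dictionary changes.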
For more on the percentages, here are histograms highlighting the distribution of registration saturation and turnout by year. Note, once again, the impact of a presidential election:
Figure 3: Bar graph of registration and turnout by state.
(The careful reader may have noticed that some states on both sides of the political spectrum have higher registration numbers than estimated eligible voters. Nothing nefarious is going on here; it typically boils down to inconsistencies in the methodology for coming up with these numbers. For more on this, see Snopes. Tl;dr: counting is hard.)
What I like about the EAC data is that it doesn't just track election participants. It also tracks election workers. While the data isn't as readily available, many jurisdictions report how many poll workers they had, and some even go so far as to report information on the ages of poll workers, or how difficult it was to find them.
Let's start with the fundamentals. Here are our map and line charts revisited, but with the focus on poll workers instead of voters. The statistics you can explore are:
% of Jurisdictions Reporting. What percentage of jurisdictions within a state reported counts of poll workers on election day? Note that among those reporting, an even smaller number provided information on the ages of their poll workers or the difficulty in finding poll workers.
Poll Workers. How many poll workers worked on election day? Since not every jurisdiction reports these numbers, the counts you see are a lower bound on the number of workers.
Polling Places. How many polling places were there on election day? Since not every jurisdiction reports these numbers, the counts you see are a lower bound on the number of polling places.
Poll Workers per Polling Place. On average, how many poll workers were there for each polling place?
Poll Workers per 1,000 Election Participants. How many poll workers were there as a fraction of the population? (Here we consider only election participants from jurisdictions that reported poll worker information.)
Polling Places per 1,000 Election Participants. How many polling places were there as a fraction of the population? (Here we consider only election participants from jurisdictions that reported polling place information.)
Average Difficulty of Finding Poll Workers. All jurisdictions are asked to rate the difficulty of finding poll workers for a given election on a scale of 1 (very easy) to 5 (very difficult). This shows an average difficulty score across the state, taken from jurisdictions that reported a difficulty.
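Here's a minimal Python sketch of how these statistics could be rolled up from jurisdiction-level records into a single row for a state. The record layout, field names, and numbers are all hypothetical. I'm also assuming that poll workers per polling place only uses jurisdictions that reported both counts, and that the difficulty score is an unweighted average; the definitions above don't pin either of those down.

```python
# Hypothetical jurisdiction records for one state and year (not real EAC data).
# None marks a value the jurisdiction did not report.
jurisdictions = [
    {"participants": 12_000, "poll_workers": 90, "polling_places": 15, "difficulty": 4},
    {"participants": 8_000, "poll_workers": 50, "polling_places": 10, "difficulty": None},
    {"participants": 5_000, "poll_workers": None, "polling_places": 5, "difficulty": None},
    {"participants": 20_000, "poll_workers": None, "polling_places": None, "difficulty": 2},
]


def ratio(numerator, denominator, scale=1.0):
    """Return scale * numerator / denominator, or None when the denominator is zero."""
    return scale * numerator / denominator if denominator else None


def poll_worker_metrics(records):
    """Aggregate jurisdiction-level records into state-level poll worker statistics."""
    with_workers = [r for r in records if r["poll_workers"] is not None]
    with_places = [r for r in records if r["polling_places"] is not None]
    # Assumption: the per-place ratio only uses jurisdictions that reported both counts.
    with_both = [r for r in with_workers if r["polling_places"] is not None]
    difficulties = [r["difficulty"] for r in records if r["difficulty"] is not None]

    workers = sum(r["poll_workers"] for r in with_workers)
    places = sum(r["polling_places"] for r in with_places)

    return {
        "pct_reporting": ratio(len(with_workers), len(records), scale=100.0),
        "poll_workers": workers,    # lower bound: non-reporting jurisdictions are missing
        "polling_places": places,   # lower bound, for the same reason
        "workers_per_place": ratio(
            sum(r["poll_workers"] for r in with_both),
            sum(r["polling_places"] for r in with_both),
        ),
        # Per-1,000 figures only count participants from jurisdictions that reported.
        "workers_per_1000": ratio(
            workers, sum(r["participants"] for r in with_workers), scale=1000.0
        ),
        "places_per_1000": ratio(
            places, sum(r["participants"] for r in with_places), scale=1000.0
        ),
        "avg_difficulty": ratio(sum(difficulties), len(difficulties)),
    }


metrics = poll_worker_metrics(jurisdictions)
for name, value in metrics.items():
    print(name, value)
```

Restricting each denominator to the jurisdictions that actually reported keeps a state with patchy reporting from looking artificially understaffed.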
By each of these metrics, here's how the country has changed over time:
Figure 4: Map visualization of poll worker data by year and state.
Just like before, we can also dig into trends across years for a single state:
Figure 5: Line representation of poll worker data by year for a fixed state.
What do you notice by exploring these graphs? Here are some things I noticed:
There's an interesting association between the average number of poll workers per polling location and the political leanings of voters within the state. Among the five states with the lowest ratio of poll workers to polling places (Oklahoma, Alabama, Mississippi, Alaska, Pennsylvania), four of them have traditionally leaned Republican. Similarly, among the five states with the highest ratio (Oregon, Maryland, New York, Virginia, Hawaii), four have traditionally leaned Democratic.
It should be noted that Oregon is a bit of an outlier, since it has a robust vote by mail system in place. This has the effect of skewing the poll worker per polling place ratio quite high.
The number of polling places per 1,000 participants sometimes says more about the mechanics of that state's voting system than anything else. The three states with the lowest ratio (Oregon, Washington, Colorado) are also the three with the most robust vote by mail programs. On the other hand, of the four states with the highest ratio (Utah, West Virginia, Mississippi, Pennsylvania), only Utah and West Virginia offer some form of early voting.
With fewer participants in off-years, the per capita statistics (poll workers and polling places per 1,000 participants) tend to see a boost during the midterms.
Across all states, reported difficulty of finding workers has trended slightly upward, from about 3.07 in 2008 to 3.35 in 2016. The total number of jurisdictions reporting difficulty also went down substantially between 2014 and 2016 (6,444 to 3,183).
In addition to reporting on the number of poll workers and the difficulty in finding them, some jurisdictions also report statistics on the age distribution of poll workers. Below is a breakdown of those age groups by state and year. (Note that no states reported this information for 2008.)
Figure 6: Pie chart of poll worker ages, by state and year.
In the aggregate, distribution of poll worker ages has been relatively steady across elections. In terms of youth participation, some states do better than others. California leads the pack with poll workers under the age of 18; more than 20% of reported workers for the 2012 election were minors.
One other dimension that we haven't considered yet is the political leanings of the population. Are there any trends among states that lean Republican or Democratic?
To answer this question, we need data on the political affiliations of state electorates. Election results themselves provide some of this information, but this data can be somewhat problematic. For example, for Congressional races some districts are so heavily Republican or Democratic that candidates run unopposed, making it impossible to suss out party affiliation from the final vote tallies.
In order to get a more complete picture, I pulled data from Gallup surveys on party affiliation by state for each election year. These surveys record the percentage of respondents who identify as Republican or lean Republican, as well as the percentage who identify as Democratic or lean Democratic.
Below is another bar chart. Each bar corresponds to a state (including Washington, D.C.). The more blue the bar, the more the state leans Democratic; the more red the bar, the more Republican. You can adjust the election year as well as select one of several statistics to see how party affiliation does (or doesn't) play a role.
Figure 7: Exploring the influence of party affiliation on other statistics.
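One way to turn the two survey percentages into a bar color is to take their difference and blend between red and blue. The Python sketch below is only an illustration of that idea: the function names, the 30-point cap, and the linear red-to-blue blend are my assumptions here, not a description of the exact color scale used in the chart.

```python
def party_lean(dem_pct, rep_pct):
    """Democratic minus Republican share, in percentage points (positive = leans Democratic)."""
    return dem_pct - rep_pct


def lean_to_hex(lean, max_margin=30.0):
    """Map a lean to a hex color: pure red at -max_margin, pure blue at +max_margin."""
    # Clamp so that extreme margins do not overflow the color range.
    clamped = max(-max_margin, min(max_margin, lean))
    # t runs from 0.0 (most Republican) to 1.0 (most Democratic).
    t = (clamped + max_margin) / (2 * max_margin)
    red = round(255 * (1 - t))
    blue = round(255 * t)
    return f"#{red:02x}00{blue:02x}"


# Hypothetical survey shares for three made-up states (not real Gallup data).
samples = {"State A": (52.0, 36.0), "State B": (41.0, 44.0), "State C": (30.0, 62.0)}

for state, (dem, rep) in samples.items():
    lean = party_lean(dem, rep)
    print(state, lean, lean_to_hex(lean))
```

A state with an even split lands on purple, and anything beyond the cap is drawn as fully red or fully blue.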
Here are some observations:
As has been well documented, Democratic party affiliation suffered huge setbacks between 2008 and 2016. Democrats lost over a dozen governorships and hundreds of state legislature seats between 2008 and 2016, as well as losing majorities in the House and a supermajority in the Senate. These losses are reflected in the party affiliation graphs for Democrats and Republicans between 2008 and 2016.
Midterm elections have substantially lower turnouts than presidential elections. Within a given election, it looks like turnout tends to be higher in states that are more likely to be tossup states. In other words, states that are solid blue or solid red tend to have lower turnouts.
While young people overall are much less likely to work at the polls, blue states are more likely to have young poll workers than red states.
What other things do you notice?
Looking at election data over time can provide us with a pulse check on the health of our grand experiment. The EAC data provides a great way to highlight common election statistics like turnout. It's also able to shine a light on the health of our elections by measuring how well they are staffed, and how many polling places we provide to our citizens. Unfortunately, these latter metrics appear to be headed in the wrong direction: poll workers are getting harder to find, on average, and reporting on poll workers has decreased.
It's a sad fact that while many states are taking concrete steps towards making voting easier and more accessible, other states are adopting proposals so regressive they could have been written by Jim Crow legislators more than a century ago. In Florida, for example, legislators are trying to roll back efforts to restore voting rights for felons by enacting a modern-day poll tax. In Tennessee, efforts are underway to criminalize common errors in voter registration drives. These efforts, though upsetting, are not that surprising when understood in the larger context of the fight for voting rights detailed so well in Give Us the Ballot.
We're fortunate that organizations continue to collect data on both the results and the general health of our elections. I've taken a stab at slicing the data in some ways I found interesting, but I don't have a monopoly on staring at numbers. If you're interested in digging deeper, you can find the raw data here.