# Electoral vote distributions are polynomials

Published on

In his article Electoral vote distributions are Monoids, Gabriel Gonzalez poses and answers the following question based on 538’s data:

what would be Hillary’s chance of winning if each state’s probability of winning was truly independent of one another?

To answer the question, Gabriel devises a divide-and-conquer algorithm. He computes probability distributions over vote counts in subsets of all states and then combines them. He also observes that vote distributions form a monoid.

Here I want to share an algebraic perspective on vote counting and show why distributions form a monoid.

Let $p_i$ be the probability of Hillary’s victory in state $i$, and $n_i$ be the number of electoral college votes for that state, where $i=1,\ldots,N$, and $N$ is the total number of states (and districts; see Gabriel’s post for details).

Then a vote distribution is a collection of probabilities $q_k$ that Hillary will get exactly $k$ votes:

$\newcommand{\p}{\mathrm{Pr}\{#1\}} \begin{equation} q_k = \p{\text{number of votes for H.Clinton} = k},\;k=1,\ldots,\sum_{i=1}^N n_i. \end{equation}$

Consider the following polynomial:

$Q(x)=\prod_{i=1}^N\left(p_i x^{n_i}+(1-p_i)\right).$

This is a product of $N$ brackets, one for each state. If we expanded it, we would get $2^N$ terms. Every such term takes either $p_i x^{n_i}$ or $1-p_i$ from each bracket and multiplies them. Every such term corresponds to a particular election outcome: if Hillary won in a particular state, take $p_i x^{n_i}$ from the corresponding bracket; otherwise, take $1-p_i$.

For example, if an election outcome means that Hillary won in states $1,4,\ldots$ and lost in states $2,3,\ldots$, then the corresponding term is

$p_1 x^{n_1}(1-p_2)(1-p_3)p_4 x^{n_4}\ldots=p_1(1-p_2)(1-p_3)p_4\ldots x^{n_1+n_4+\ldots}.$

Notice that $p_1(1-p_2)(1-p_3)p_4\ldots$ is exactly the probability of the outcome (under the independence assumption) and $n_1+n_4+\ldots$ is the number of votes for Hillary under that outcome.

Since the power of $x$ in each term is the number of votes for Hillary, outcomes that result in the same number of votes, say $k$, correspond to like terms. If we combine them, their probabilities (terms’ coefficients) will add up. To what? To $q_k$, the total probability of Hillary getting $k$ votes.

Therefore,

$Q(x) = \sum_{k}q_kx^k.$

Deriving the final vote distribution $q_k$ from $p_i$ and $n_i$ is just expanding and reducing $Q(x)$ from $\prod_{i=1}^N\left(p_i x^{n_i}+(1-p_i)\right)$ to $\sum_{k}q_kx^k$.

As Gabriel notes, doing this in the direct way would be inefficient. His divide-and-conquer approach directly translates to expanding $Q(x)$: divide all brackets into two groups, recursively expand the groups, combine the results.

Under this formulation, it becomes obvious that vote distributions form a proper monoid: it is just a monoid of polynomials under multiplication.