Published on October 28, 2016; tags: Mathematics, Probability

In his article Electoral vote distributions are Monoids, Gabriel Gonzalez poses and answers the following question based on 538’s data:

what would be Hillary’s chance of winning if each state’s probability of winning was truly independent of one another?

To answer the question, Gabriel devises a divide-and-conquer algorithm. He computes probability distributions over vote counts in subsets of all states and then combines them. He also observes that vote distributions form a monoid.

Here I want to share an algebraic perspective on vote counting and show *why* distributions form a monoid.

Let \(p_i\) be the probability of Hillary’s victory in state \(i\), and \(n_i\) be the number of electoral college votes for that state, where \(i=1,\ldots,N\), and \(N\) is the total number of states (and districts; see Gabriel’s post for details).

Then a vote distribution is a collection of probabilities \(q_k\) that Hillary will get exactly \(k\) votes:

\[ \newcommand{\p}[1]{\mathrm{Pr}\{#1\}} \begin{equation} q_k = \p{\text{number of votes for H.Clinton} = k},\;k=1,\ldots,\sum_{i=1}^N n_i. \end{equation} \]

Consider the following polynomial:

\[Q(x)=\prod_{i=1}^N\left(p_i x^{n_i}+(1-p_i)\right).\]

This is a product of \(N\) brackets, one for each state. If we expanded it, we would get \(2^N\) terms. Every such term takes either \(p_i x^{n_i}\) or \(1-p_i\) from each bracket and multiplies them. Every such term corresponds to a particular election outcome: if Hillary won in a particular state, take \(p_i x^{n_i}\) from the corresponding bracket; otherwise, take \(1-p_i\).

For example, if an election outcome means that Hillary won in states \(1,4,\ldots\) and lost in states \(2,3,\ldots\), then the corresponding term is

\[ p_1 x^{n_1}(1-p_2)(1-p_3)p_4 x^{n_4}\ldots=p_1(1-p_2)(1-p_3)p_4\ldots x^{n_1+n_4+\ldots}. \]

Notice that \(p_1(1-p_2)(1-p_3)p_4\ldots\) is exactly the probability of the outcome (under the independence assumption) and \(n_1+n_4+\ldots\) is the number of votes for Hillary under that outcome.

Since the power of \(x\) in each term is the number of votes for Hillary, outcomes that result in the same number of votes, say \(k\), correspond to like terms. If we combine them, their probabilities (terms’ coefficients) will add up. To what? To \(q_k\), the total probability of Hillary getting \(k\) votes.

Therefore,

\[Q(x) = \sum_{k}q_kx^k.\]

Deriving the final vote distribution \(q_k\) from \(p_i\) and \(n_i\) is just expanding and reducing \(Q(x)\) from \(\prod_{i=1}^N\left(p_i x^{n_i}+(1-p_i)\right)\) to \(\sum_{k}q_kx^k\).

As Gabriel notes, doing this in the direct way would be inefficient. His divide-and-conquer approach directly translates to expanding \(Q(x)\): divide all brackets into two groups, recursively expand the groups, combine the results.

Under this formulation, it becomes obvious that vote distributions form a proper monoid: it is just a monoid of polynomials under multiplication.