# Surprising reciprocity


I have two correlated random variables, $X$ and $Y$, with zero mean and equal variance. I tell you that the best way to predict $Y$ based on the knowledge of $X$ is $y = a x$. Now, you tell me, what is the best way to predict $X$ based on $Y$?

Your intuition might tell you that if $y = ax$, then $x = y/a$. This is correct most of the time… but not here. The right answer will surprise you.

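So what is the best way to predict $Y$ based on $X$ and vice versa? Here "best" means the linear predictor $y = ax$ with the smallest mean squared error. Let’s find the $a$ that minimizes $E[(Y-aX)^2]$, using the fact that both variables have zero mean (so $E[X^2]=\mathrm{Var}(X)$ and $E[XY]=\mathrm{Cov}(X,Y)$) and equal variance: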

$E[(Y-aX)^2] = E[Y^2-2aXY+a^2X^2]=(1+a^2)\mathrm{Var}(X)-2a\mathrm{Cov}(X,Y);$

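$\frac{\partial}{\partial a}E[(Y-aX)^2] = 2a\mathrm{Var}(X)-2\mathrm{Cov}(X,Y) = 0;$

$a=\frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)}=\frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}}=\mathrm{Corr}(X,Y),$

where the middle step holds because $\mathrm{Var}(X)=\mathrm{Var}(Y)$.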

Notice that the answer, the (Pearson) correlation coefficient, is symmetric w.r.t. $X$ and $Y$. Thus it will be the same whether we want to predict $Y$ based on $X$ or $X$ based on $Y$!
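A quick simulation can make the symmetry concrete. The sketch below is only an illustration under stated assumptions: the Gaussian model, the correlation of 0.6, the sample size, and the helper names `simulate`, `best_slope`, and `mse` are all made up for this example. It fits the no-intercept least-squares slope in both directions and compares the fitted reverse predictor with the naive inversion $x = y/a$.

```python
import random


def simulate(rho, n=50_000, seed=0):
    """Draw n pairs (x, y) with zero mean, unit variance, and correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # y has variance rho^2 + (1 - rho^2) = 1 and Cov(x, y) = rho.
        y = rho * x + (1.0 - rho**2) ** 0.5 * noise
        xs.append(x)
        ys.append(y)
    return xs, ys


def best_slope(inputs, targets):
    """Slope a that minimizes the mean of (target - a * input)^2, no intercept."""
    sxy = sum(u * v for u, v in zip(inputs, targets))
    sxx = sum(u * u for u in inputs)
    return sxy / sxx


def mse(inputs, targets, a):
    """Mean squared error of predicting target as a * input."""
    return sum((v - a * u) ** 2 for u, v in zip(inputs, targets)) / len(inputs)


xs, ys = simulate(rho=0.6)

a_y_from_x = best_slope(xs, ys)  # predict Y from X
a_x_from_y = best_slope(ys, xs)  # predict X from Y

mse_best = mse(ys, xs, a_x_from_y)         # x = a * y with the fitted slope
mse_naive = mse(ys, xs, 1.0 / a_y_from_x)  # x = y / a, the "intuitive" inversion

print(f"slope Y from X: {a_y_from_x:.3f}")
print(f"slope X from Y: {a_x_from_y:.3f}")
print(f"MSE, fitted reverse slope: {mse_best:.3f}")
print(f"MSE, naive inversion:      {mse_naive:.3f}")
```

Both slopes should come out close to 0.6, and the naive inversion should show a clearly larger error than the fitted reverse slope.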

How to make sense of this? It may help to consider a couple of special cases first.

First, suppose that $X$ and $Y$ are perfectly positively correlated and you’re trying to predict $Y$ based on $X$. With zero means and equal variances, $Y$ is then simply equal to $X$, so just use its value as it is ($a=1$).

Now, suppose that $X$ and $Y$ are uncorrelated. Knowing the value of $X$ doesn’t tell you anything about the value of $Y$ (as far as linear relationships go). The best predictor you have for $Y$ is its mean, $0$.

Finally, suppose that $X$ and $Y$ are somewhat correlated. The correlation coefficient is the degree to which we should trust the value of $X$ when predicting $Y$ versus sticking to $0$ as a conservative estimate.
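The three cases can be seen side by side in a tiny sketch. The function name `predict` and the observed value of 2.0 are hypothetical, chosen only for illustration; the function assumes the setup of this post (zero means, equal variances).

```python
def predict(observed, corr):
    """Predict one variable from the other, assuming zero means and equal variances.

    The prediction scales the observed value by the correlation, which pulls
    it toward the mean (0) when the correlation is weak.
    """
    return corr * observed


for corr in (1.0, 0.6, 0.0):
    print(f"corr={corr}: predict {predict(2.0, corr)}")
```

With an observed value of 2.0 this prints 2.0, 1.2, and 0.0: full trust, partial trust, and no trust in the observation. The same function works in either direction, since the coefficient is the same for predicting $Y$ from $X$ and $X$ from $Y$.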

This is the key idea: think about $a$ in $y=ax$ not as a constant of proportionality, but as a degree of “trust”.