# Is differentiability arbitrary?

A function $$f$$ is called differentiable at point $$x_0$$ if it can be approximated by a linear function near $$x_0$$.

More formally, $$f$$ is differentiable at $$x_0$$ iff for some number $$A$$ and for all $$x$$

$f(x)=f(x_0)+A\cdot(x-x_0) + o(x-x_0).$

Here $$f(x_0)+A\cdot(x-x_0)$$ is the linear function of $$x$$ that approximates $$f(x)$$ near $$x_0$$, and $$o(x-x_0)$$ is the approximation error — a function of $$x$$ such that $$\frac{o(x-x_0)}{x-x_0}$$ tends to 0 when $$x-x_0$$ tends to 0.
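Numerically, the definition says that the remainder divided by $$x-x_0$$ must shrink as $$x$$ approaches $$x_0$$. Here is a minimal sketch of that check; the choice $$f=\exp$$, $$x_0=0$$, $$A=1$$ is an illustrative assumption, not part of the definition:

```python
import math

def remainder_ratio(f, x0, A, h):
    """o(h) / h for the candidate linear approximation f(x0) + A * h."""
    return (f(x0 + h) - f(x0) - A * h) / h

# Illustrative choice: f = exp at x0 = 0, where the right slope is A = 1.
ratios = [remainder_ratio(math.exp, 0.0, 1.0, 10.0**-k) for k in range(1, 6)]
# The ratios shrink toward 0, witnessing differentiability of exp at 0 with A = 1.
```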

There is something arbitrary about this definition. Why are we approximating $$f$$ by a linear function, and not by the square function $$x^2$$, or the square root function $$\sqrt{x}$$, or the sine function $$\sin x$$?

In my experience, this is almost never explained by high school teachers or university professors who introduce differentiability to students. At best they may say that linear functions play a very important role in mathematics which, while true, is just begging the question.

## Approximating by the square function

If a function can be approximated by the square function, i.e.

$f(x)=f(x_0)+B\cdot(x-x_0)^2 + o((x-x_0)^2),$

then it can also be approximated by a linear function, namely the constant function $$f(x_0)$$. This is because $$(x-x_0)^2=o(x-x_0)$$, so the whole term $$B\cdot(x-x_0)^2 + o((x-x_0)^2)$$ can be put into the approximation error $$o(x-x_0)$$ in the usual definition, and $$A$$ can be set to zero.
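A hedged numeric illustration: $$f=\cos$$ at $$x_0=0$$ satisfies $$\cos h = 1 - h^2/2 + o(h^2)$$, i.e. $$B=-1/2$$, and its derivative at 0 is accordingly 0 (the choice of $$\cos$$ is an illustrative assumption):

```python
import math

# cos at x0 = 0: quadratic approximation with B = -1/2, hence zero derivative.
hs = [10.0**-k for k in range(1, 6)]
linear_ratios = [(math.cos(h) - 1.0) / h for h in hs]        # tends to 0: the derivative is 0
quadratic_ratios = [(math.cos(h) - 1.0) / h**2 for h in hs]  # tends to B = -1/2
```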

Conversely, if a function is differentiable and its derivative at $$x_0$$ is zero, it need not admit a quadratic approximation: $$f(x)=|x|^{3/2}$$ at $$x_0=0$$ is a counterexample, since $$|x|^{3/2}/x^2\to\infty$$, so no finite $$B$$ works. Quadratic approximability thus describes a strict subclass of the functions whose derivative vanishes at $$x_0$$.

## Approximating by the square root function

Consider a function that can be approximated by the square root function:

$f(x)=f(x_0)+B\cdot\sqrt{x-x_0} + o(\sqrt{x-x_0}).$

An example would be the square root function itself, $$f(x)=\sqrt{x}$$, at $$x_0=0$$.
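As a quick numeric sanity check, $$f(x)=\sqrt x$$ satisfies the square-root approximation at $$x_0=0$$ with $$B=1$$ and zero error, while its ordinary difference quotient blows up (a sketch, with illustrative step sizes):

```python
import math

# f(x) = sqrt(x) at x0 = 0: the square-root approximation holds exactly with
# B = 1, while the ordinary difference quotient diverges.
hs = [10.0**-k for k in range(1, 6)]
sqrt_remainders = [(math.sqrt(h) - 1.0 * math.sqrt(h)) / math.sqrt(h) for h in hs]  # all exactly 0
diff_quotients = [math.sqrt(h) / h for h in hs]  # grows like 1/sqrt(h): not differentiable at 0
```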

But is this property “sustainable”? It can hold at a single point $$x_0$$, but can it hold for all points in an interval in the same way as a function can be differentiable at all points of an interval?

Suppose that

$f(x)=f(x_0)+B(x_0)\cdot\sqrt{x-x_0} + o(\sqrt{x-x_0})$

for all $$x_0$$ in $$[a,b]$$. Pick an integer $$M>0$$ and divide $$[a,b]$$ into $$M$$ equal sub-intervals $\left[a+(b-a)\cdot \frac{m}{M}, a+(b-a)\cdot \frac{m+1}M\right],$ $$0\leq m\leq M-1$$.
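The partition and the telescoping sum that follows can be sketched in code; the choices $$f=\sin$$, $$a=0$$, $$b=1$$, $$M=1000$$ below are illustrative assumptions:

```python
import math

def telescoping_sum(f, a, b, M):
    """Sum of consecutive differences of f over the equal partition of [a, b];
    it telescopes to f(b) - f(a) for any M."""
    xs = [a + (b - a) * m / M for m in range(M + 1)]
    return sum(f(xs[m + 1]) - f(xs[m]) for m in range(M))

total = telescoping_sum(math.sin, 0.0, 1.0, 1000)
# Up to rounding, total equals sin(1) - sin(0), whatever M we pick.
```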

Then

$\begin{split} f(b)-f(a) & =\sum_{m=0}^{M-1} f\left(a+(b-a)\cdot \frac{m+1}M\right)-f\left(a+(b-a)\cdot \frac{m}{M}\mathstrut\right) \\ & =\sum_{m=0}^{M-1} \left[B\left(a+(b-a)\cdot \frac{m}M\right)\sqrt{\frac{b-a}M} + o\left(\frac{1}{\sqrt M}\right)\right]. \end{split}$

You can already notice that something is strange. Suppose that $$B(x)$$ does not depend on $$x$$, $$B(x)\equiv B\neq 0$$. Then the main term on the right-hand side is $$B\sqrt{M(b-a)}$$, which goes to infinity as $$M$$ increases, while the accumulated error is only $$o(\sqrt M)$$ (assuming the per-point errors are uniformly small); the total cannot possibly remain equal to $$f(b)-f(a)$$.
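The divergence is easy to see numerically. A minimal sketch of the main term, assuming the illustrative values $$a=0$$, $$b=1$$ and a constant $$B\equiv 1$$ (not taken from any particular $$f$$):

```python
import math

# Main term of the sum with constant B(x) ≡ B on [a, b].
a, b, B = 0.0, 1.0, 1.0
sums = []
for M in (10, 1000, 100000):
    step = (b - a) / M  # width of each sub-interval
    sums.append(sum(B * math.sqrt(step) for _ in range(M)))  # = B * sqrt(M * (b - a))
# The sums grow like sqrt(M) instead of settling at any fixed f(b) - f(a).
```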

More generally (but for the same reason), there can’t be any point on $$[a,b]$$ where $$B$$ is continuous and not equal to 0, or any sub-interval of $$[a,b]$$ where $$B(x)>\epsilon>0$$.

So this approximation is rather weird unless $$B(x)\equiv 0$$.

## Approximating by the sine function

What happens if we try to approximate by, say, $$\sin x$$?

$f(x)=f(x_0)+B\cdot\sin(x-x_0) + o(\sin(x-x_0)).$

To be careful, we need to restrict this approximation to $$x$$ in some neighborhood of $$x_0$$. We didn’t have to do this before, because when $$x$$ is not close to $$x_0$$, the condition $$o((x-x_0)^{\alpha})$$ imposes no constraint at all. Here, however, $$\sin(x-x_0)$$ can be small even when $$x-x_0$$ is not (for instance, near $$x-x_0=\pi$$), so without the restriction the definition would impose unintended constraints far from $$x_0$$.

We know that $$\sin x$$ is itself differentiable in the usual sense, its derivative at 0 being 1:

$\sin (x-x_0)=(x-x_0) + o(x-x_0).$

Therefore,

$f(x)=f(x_0)+B\cdot(x-x_0) + o(x-x_0),$

i.e. $$f$$ is differentiable in the usual sense as well, and vice versa.
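A hedged numerical check of this equivalence, with the illustrative choice $$f=\exp$$, $$x_0=0$$, $$B=1$$: the sine-based and linear-based remainder ratios vanish together as $$h\to 0$$.

```python
import math

# f = exp at x0 = 0 with B = 1: both remainder ratios shrink together,
# matching the equivalence of the sine-based and linear approximations.
hs = [10.0**-k for k in range(1, 6)]
sine_ratios = [(math.exp(h) - 1.0 - math.sin(h)) / math.sin(h) for h in hs]
linear_ratios = [(math.exp(h) - 1.0 - h) / h for h in hs]
```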

## Conclusion

Generalizing from these examples, let’s say that we approximate a function $$f$$ with a function $$g$$ in the neighborhood of $$x_0$$:

$f(x)=f(x_0)+B\cdot g(x-x_0) + o(g(x-x_0)),$

where $$g(0)=0$$ so that $$o(g(x-x_0))$$ has the usual meaning.

Then:

1. If $$g$$ is itself differentiable at 0 and $$g'(0)\neq 0$$ (as in $$g(x)=\sin x$$), we get the usual class of differentiable functions.
2. If $$g$$ is differentiable at 0 and $$g'(0)=0$$ (as in $$g(x)=x^2$$), then we get a subclass of differentiable functions for which the derivative at $$x_0$$ is 0.
3. If $$g$$ itself is not differentiable at 0 (as in $$g(x)=\sqrt x$$), then this property will either hold only at isolated points or describe some rather strange (non-smooth) functions.
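The first two cases can be poked at numerically. The sketch below estimates $$B$$ by a finite quotient; the functions and step sizes are illustrative assumptions, not part of the argument above:

```python
import math

def approx_coefficient(f, g, x0, h):
    """Finite-h estimate of B in f(x) = f(x0) + B * g(x - x0) + o(g(x - x0))."""
    return (f(x0 + h) - f(x0)) / g(h)

# Case 1: g = sin, with g'(0) = 1; recovers the usual derivative of exp at 0.
b_sin = approx_coefficient(math.exp, math.sin, 0.0, 1e-6)
# Case 2: g(x) = x**2, with g'(0) = 0; works for cos at 0, giving B = -1/2.
b_sq = approx_coefficient(math.cos, lambda x: x * x, 0.0, 1e-4)
```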

This doesn’t quite tell us why we pick $$x$$ instead of $$\sin x$$ — the answer would be some hand-wavy simplicity arguments — but at least it reassures us that we are not missing much by focusing on linear approximations.