
Recall that last time we created the graph of the average value of $\tau$ up to $n$, for bigger and bigger $n$. We showed there that $$\tau(n)=O(\sqrt{n})\; .$$ However, we might also want to know what the average value of $\tau$ is, not just what it's less than! Here, it seems that it's hard to find the 'right' value of $C$ so that the average value would be the same order as $\sqrt{n}$.
Trying $x^{1/3}$ doesn't seem to make matters any better. In fact, one can show that $\tau(n)=O(\sqrt[3]{n})$ as well. Here are the steps one might take; we leave fleshing out the details as an exercise.
So where does it go? To answer this, we will look at a very different graph!

The fundamental observation is that $\tau(n)$ is precisely the number of positive lattice points $(x,y)$ such that $xy=n$.
To be more in line with our previous notation, $\tau(n)$ is exactly given by the number of points $\left(d,\frac{n}{d}\right)$ such that $d\cdot\frac{n}{d}=n$.
So $\sum_{k=1}^n \tau(k)$ is the number of lattice points on or under the hyperbola $y=n/x$.
This is a completely different way of thinking of the divisor function! We can see it for various sizes below.
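Since this worksheet runs on Sage (Python underneath), this observation is easy to check by brute force. The helper names `tau` and `lattice_count` below are our own, not from the text:

```python
# Check that sum_{k<=n} tau(k) equals the number of positive lattice
# points on or under the hyperbola y = n/x.
def tau(k):
    # number of positive divisors of k
    return sum(1 for d in range(1, k + 1) if k % d == 0)

def lattice_count(n):
    # for each x, the points (x, y) with 1 <= y <= n/x number floor(n/x)
    return sum(n // x for x in range(1, n + 1))

for n in (6, 10, 50):
    assert sum(tau(k) for k in range(1, n + 1)) == lattice_count(n)
```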
So what we will do is try to look at the lattice points as approximating (you guessed it) an area! Just like with the sum of squares function.
For each lattice point involved in $\sum_{k=1}^n \tau(k)$, we put a unit square to the lower right. We are basically interpreting the lattice points as two different sums.
The area of the squares can then be thought of as a Riemann-type sum or as our summation of $\tau$.
It should be clear that the area is "about" $$\int_1^n \frac{n}{x}\,dx=n\ln(x)\biggr\vert_1^n=n\ln(n)-n\ln(1)=n\ln(n)\; .$$ Why is this actually a good estimate, though? The answer is in the error!
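A quick numerical sanity check of this estimate (our own code, not from the text) compares the actual sum against $n\ln(n)$:

```python
import math

def tau(k):
    # number of positive divisors of k
    return sum(1 for d in range(1, k + 1) if k % d == 0)

n = 1000
total = sum(tau(k) for k in range(1, n + 1))
estimate = n * math.log(n)
# the sum and n*ln(n) agree to leading order; the gap is the error studied next
print(total, round(estimate))
```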
Look at the shaded difference between the area under the curve (which is $n\ln(n)$) and the area of the red squares (which is the sum of all the $\tau$ values).
That means:
We can verify this graphically by plotting the average value against $\ln(n)$:

Looking good! There does seem to be some predictable error. What might it be?
Keeping $x=0$ in view, the error seems to be somewhat less than $0.2$, although it clearly bounces around. By zooming in, we see it settling roughly between $0.15$ and $0.16$ as $x$ gets large. So will this give us something more precise?
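We can compute this error directly for a moderately large $n$; this is a sketch of ours, with the naive `tau` helper assumed:

```python
import math

def tau(k):
    # number of positive divisors of k
    return sum(1 for d in range(1, k + 1) if k % d == 0)

n = 3000
avg = sum(tau(k) for k in range(1, n + 1)) / n
print(avg - math.log(n))  # the error the graphic shows
```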
To answer this, we will try one more geometric trick.
Notice that we have now divided the lattice points evenly into three parts.
So $$\sum_{k=1}^n \tau(k)=\sum_{d\leq \sqrt{n}} (\lfloor n/d\rfloor-d)+\sum_{d\leq \sqrt{n}} (\lfloor n/d\rfloor-d) +\lfloor\sqrt{n}\rfloor\leq 2\sum_{d\leq \sqrt{n}} (n/d-d) +\sqrt{n}$$ where the error involved is certainly less than one for each $d$, so the total error is at most $2\sqrt{n}+1=O(\sqrt{n})$.
Thus we can rewrite this, using a well-known identity from Transitions to Higher Math, as $$\sum_{k=1}^n \tau(k)=2n\sum_{d\leq \sqrt{n}}\frac{1}{d}-2\sum_{d\leq \sqrt{n}}d+O(\sqrt{n})= 2n\sum_{d\leq \sqrt{n}}\frac{1}{d}-2\left(\frac{\lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor+1)}{2}\right)+O(\sqrt{n})\, .$$ The difference between $\left(\frac{\lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor+1)}{2}\right)$ and $\frac{n}{2}$ is once again far less than $O(\sqrt{n})$ (and negative to boot), so we finally get that $$\sum_{k=1}^n \tau(k)=2n\sum_{d\leq \sqrt{n}}\frac{1}{d}-n+O(\sqrt{n})\Rightarrow \frac{1}{n}\sum_{k=1}^n \tau(k)=2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1+O(1/\sqrt{n})\; .$$
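The symmetric decomposition of the lattice points is exact, and we can verify it numerically. This is a sketch of ours (helper names assumed), using `math.isqrt` for $\lfloor\sqrt{n}\rfloor$:

```python
import math

def tau(k):
    # number of positive divisors of k
    return sum(1 for d in range(1, k + 1) if k % d == 0)

def divisor_sum_by_symmetry(n):
    # 2 * sum_{d <= sqrt(n)} (floor(n/d) - d) + floor(sqrt(n))
    s = math.isqrt(n)
    return 2 * sum((n // d) - d for d in range(1, s + 1)) + s

for n in (6, 100, 999):
    assert divisor_sum_by_symmetry(n) == sum(tau(k) for k in range(1, n + 1))
```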
We're almost at the end of the story! Now, we just need to relate this sum to $\ln(n)$. (Remember, we are already pretty well convinced that $\ln(n)$ is close to the average value of $\tau$; the problem is what the error is.)
This graphic shows the exact difference between $\sum_{k=1}^{m-1} \frac{1}{k}$ and $\ln(m)$. Clearly, even as $m\to\infty$, the total area is simply the sum of a bunch of nearly-triangles, each of width exactly one and with no two overlapping in height (the same idea as before), with total height less than $1$. So the difference between $\sum_{k=1}^{m-1} \frac{1}{k}$ and $\ln(m)$ will be finite as $m\to\infty$.
This number is very important! It is called $\gamma$, or the Euler–Mascheroni constant.
Definition: $$\gamma=\lim_{m\to\infty}\left(\sum_{k=1}^{m-1} \frac{1}{k}-\ln(m)\right)$$
Notice that the 'missing' part of the area (since we can't actually view all the way out to infinity) must be less than $1/m$, since it lies below all the pieces we can see in the graphic for any given $m$. So $\gamma$ is within $O(1/m)$ of any given finite quantity $\sum_{k=1}^{m-1} \frac{1}{k}-\ln(m)$.
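Estimating $\gamma$ straight from the definition is a one-liner; the helper name here is our own:

```python
import math

def gamma_estimate(m):
    # sum_{k=1}^{m-1} 1/k - ln(m); converges to gamma as m grows
    return sum(1 / k for k in range(1, m)) - math.log(m)

print(gamma_estimate(10**6))  # approaches 0.5772...
```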
Now we put it all together! We know from above that $$\frac{1}{n}\sum_{k=1}^n \tau(k)=2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1+O(1/\sqrt{n})\, .$$ Further, we can now substitute in the following for $\sum_{d\leq \sqrt{n}}\frac{1}{d}$: $$\sum_{d\leq \sqrt{n}}\frac{1}{d}= \ln(\sqrt{n})+\gamma+O(1/\sqrt{n})\; .$$ Once we do that, and take advantage of the log fact $2\ln(z)=\ln\left(z^2\right)$, we get $$\frac{1}{n}\sum_{k=1}^n \tau(k)= \ln(n)+(2\gamma-1)+O(1/\sqrt{n})\, .$$ That is exactly the asymptote and type of error that I have depicted below!
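As a final check of this formula (a sketch of ours, hard-coding the known decimal value of $\gamma$):

```python
import math

GAMMA = 0.5772156649015329  # Euler–Mascheroni constant

def tau(k):
    # number of positive divisors of k
    return sum(1 for d in range(1, k + 1) if k % d == 0)

n = 2000
avg = sum(tau(k) for k in range(1, n + 1)) / n
prediction = math.log(n) + 2 * GAMMA - 1
print(avg, prediction)  # the two should agree to within O(1/sqrt(n))
```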
Note that it's not hard to prove that the average value of $\tau$ grows at least as fast as $\ln(n)$, so this is a fairly sharp result. (It turns out that it's even possible to show that the error in the average is in fact $O(1/\sqrt[3]{x})$, but is not $O(1/\sqrt[4]{x})$.)
Could this conceivably be used for $\sigma=\sigma_1$? The answer is YES! Consider the following rewrite of the sum of sigmas, which are themselves the sum of divisors: $$\sum_{n\leq x}\sigma(n)=\sum_{n\leq x}\sum_{q\mid n} q = \sum_{q,d\text{ such that }qd\leq x} q = \sum_{d\leq x}\sum_{q\leq \frac{x}{d}} q\, .$$ So we have changed from a sum of sums of divisors (which might not be consecutive, and makes $\sigma$ annoying to compute) to a sum of sums of consecutive integers.
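Both sides of this rewrite are easy to compute directly; the `sigma` helper below is our own naming:

```python
# Check sum_{n<=x} sigma(n) = sum_{d<=x} sum_{q<=x/d} q.
def sigma(n):
    # sum of the positive divisors of n
    return sum(d for d in range(1, n + 1) if n % d == 0)

x = 60
left = sum(sigma(n) for n in range(1, x + 1))
right = sum(q for d in range(1, x + 1) for q in range(1, x // d + 1))
assert left == right
```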
We can think about this graphically again. Instead of comparing points on a hyperbola with points in columns or rows, though, we will compare numbers at points on a hyperbola with numbers at points in rows. We can think of it as summing up a weighted set of points. The picture below tells it all.
In the first one that shows up, we see that $$\sum_{k=1}^6\sigma(k)=1+(1+2)+(1+3)+(1+2+4)+(1+5)+(1+2+3+6)=$$ $$(1+2+3+4+5+6)+(1+2+3)+(1+2)+1+1+1\; ,$$ which means we can think of it as a sum of sums from $1$ to the length of each row. (Both sides come to $33$.) Now let's note three things.
So if we combine the information above with the formula, we get $$\sum_{n\leq x}\sigma(n)=\sum_{d\leq x}\sum_{q\leq \frac{x}{d}}q= \sum_{d\leq x}\left[\frac{1}{2}\left\lfloor\frac{x}{d}\right\rfloor^2+\frac{1}{2}\left\lfloor\frac{x}{d}\right\rfloor\right]=\sum_{d\leq x}\left[\frac{1}{2}\left(\frac{x}{d}\right)^2+\frac{1}{2}\left(\frac{x}{d}\right)+O\left(\frac{x}{d}\right)\right]\, .$$
But this is actually possible to analyze! First, some order calculations.
Next, let's get more information about $\sum_{d\leq x}\left[\frac{1}{2}\left(\frac{x}{d}\right)^2\right]$.
Thus the whole crazy double sum can be approximated as follows, quite accurately: $$\sum_{n\leq x}\sigma(n)= \frac{x^2}{2}\sum_{d\leq x}\left(\frac{1}{d^2}\right)+\frac{x}{2}\sum_{d\leq x}\frac{1}{d}+O(x\ln(x))$$ $$=\frac{x^2}{2}\left(\sum_{d=1}^{\infty}\left(\frac{1}{d^2}\right)-\frac{1}{x}+O(1/x^2)\right)+O(x\ln(x))=\frac{x^2}{2}\sum_{d=1}^{\infty}\left(\frac{1}{d^2}\right)-\frac{x}{2}+O(x\ln(x))\, .$$
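The key step here is that the tail $\sum_{d> x}\frac{1}{d^2}$ is about $\frac{1}{x}$, which is what lets us trade the truncated sum for the full infinite series. A quick numerical check (truncating the "infinite" tail at $10^6$, a choice of ours):

```python
# The tail sum_{d > x} 1/d^2 is approximately 1/x.
x = 1000
tail = sum(1 / d**2 for d in range(x + 1, 10**6 + 1))
print(tail, 1 / x)
```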
And the average value of $\sigma$ must be this divided by $x$, namely $$\frac{1}{x}\sum_{n\leq x}\sigma(n)\text{ is }\frac{x}{2}\sum_{d=1}^{\infty}\frac{1}{d^2}+O(\ln(x))\, .$$ Since we know that the series converges, this means the average value of $\sigma$ increases quite linearly, with an error (at most) increasing logarithmically! This might be a shock: that one could actually get something fairly accurate like this relatively easily, using calculus ideas like improper integrals and (implicitly) the integral test for infinite series. But check out the data!
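We can test the linear growth numerically; in this sketch of ours, the slope $\frac{1}{2}\sum\frac{1}{d^2}$ is approximated by truncating the series at $10^5$ terms (an arbitrary choice):

```python
def sigma(n):
    # sum of the positive divisors of n
    return sum(d for d in range(1, n + 1) if n % d == 0)

x = 1000
avg = sum(sigma(n) for n in range(1, x + 1)) / x
slope = 0.5 * sum(1 / d**2 for d in range(1, 10**5 + 1))
print(avg, slope * x)  # nearly linear, as predicted
```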

Of course, one might ask what the slope of this line is! It would have to be $m=\frac{1}{2}\sum_{d=1}^{\infty}\frac{1}{d^2}$. Who remembers this from Calculus II (possibly) or History of Math (definitely)?
This is the Basel problem, famously solved by Euler: the sum is $\frac{\pi^2}{6}$, so the slope is $\frac{\pi^2}{12}$. Amazing!
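The partial sums approach $\frac{\pi^2}{6}$ quite visibly:

```python
import math

# Partial sums of the Basel series converge to pi^2/6.
partial = sum(1 / d**2 for d in range(1, 10**5 + 1))
print(partial, math.pi**2 / 6)
```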
We will see these $\sum_{d\leq x}\frac{1}{d^k}$ again soon.
We can even do something similar for the Euler $\phi$ function. See the exercises.
Homework:
