MAT 338 Day 35 2011

2409 days ago by kcrisman

def L(n): ls = [] out = 0 for i in range(1,n+1): out += sigma(i,0) ls.append((i,out/i)) return ls P = line(L(1000000)) 
       
@interact def _(a=.02,n=2): show(P+plot(a*x^(1/n),(x,1,10^6),color='red',linestyle='--')) html("Blue is the average value of $\\tau$$") html("Red is $%sx^{1/%s}$"%(a,n)) 
       

Click to the left again to hide and once more to show the dynamic interactive window

Recall that last time we created the graph of the average value of $\tau$ up to $n$, for bigger and bigger $n$.  We showed there that $$\tau(n)=O(\sqrt{n})\; .$$ However, we might also want to know what the average value of $\tau$ is, not just what it's less than!  Here, it seems that it's hard to find the 'right' value of $C$ so that the average value would be the same order as $\sqrt{n}$. 

Trying $x^{1/3}$ doesn't seem to make matters any better.  In fact, one can show that $\tau(n)=O(\sqrt[3]{n})$ as well.  Here are the steps one might take.  We make fleshing out the details an exercise.

  • Note that $\tau$ is multiplicative.
  • For a given prime $p$, note that $\tau\left(p^x\right)=x+1$ grows much more slowly than $\left(p^x\right)^{1/3}=p^{x/3}$, which is exponential in $x$.  
    • What value do each of these have at $x=0$?
    • Take derivatives of both functions at $x=0$ to show that the growth statement is definitely true for $p\geq 23$.
    • Show that for each prime $p$ less than $23$ there is an $x_p$ such that the growth statement is true after $x_p$.
  • Put these pieces of information together to show that $\tau$ is $O\left(x^{1/3}\right)$.

So where does it go?  To answer this, we will look at a very different graph!

n=10 viewsize=n+1 var('x,y') g(x)=1/x P=Graphics() P += contour_plot(x*y,(x,0,viewsize),(y,0,viewsize),cmap=['black'],fill=False,contours=[2,3,8],labels=True,label_inline=True,label_inline_spacing=5,label_fmt="$n=%d$") P += plot(n*g,(x,0,n+1)) grid_pts = [[i,j] for i in [1..viewsize] for j in [1..viewsize]] P += points(grid_pts,rgbcolor=(0,0,0),pointsize=2) lattice_pts = [coords for coords in grid_pts if (coords[0]*coords[1]<=n)] P += points(lattice_pts, rgbcolor = (0,0,1),pointsize=20) show(P,ymax=viewsize,aspect_ratio=1) 
       

The fundamental observation is that $\tau(n)$ is precisely the number of positive lattice points $(x,y)$ such that $xy=n$.

To be more in line with our previous notation, $\tau(n)$ is exactly given by the number of points $\left(d,\frac{n}{d}\right)$ - so that $d\frac{n}{d}=n$.

So $\sum_{k=1}^n \tau(k)$ is the number of lattice points on or under the hyperbola $y=n/x$.  

This is a completely different way of thinking of the divisor function!  We can see it for various sizes below.

@interact def _(n=(15,range(2,50))): viewsize=n+1 g(x)=1/x P=Graphics() P += plot(n*g,(x,0,n+1)) P += plot(2*g,(x,0,n+1),linestyle="--") if n>7: P += plot((n-5)*g,(x,0,n+1),linestyle="--") grid_pts = [[i,j] for i in [1..viewsize] for j in [1..viewsize]] P += points(grid_pts,rgbcolor=(0,0,0),pointsize=2) lattice_pts = [coords for coords in grid_pts if (coords[0]*coords[1]<=n)] P += points(lattice_pts, rgbcolor = (0,0,1),pointsize=20) # P += plot(x,(x,0,viewsize),linestyle="--",rgbcolor=(0,0,0)) show(P,ymax=viewsize,aspect_ratio=1) 
       

Click to the left again to hide and once more to show the dynamic interactive window

So what we will do is try to look at the lattice points as approximating - you guessed it - an area!  Just like with the sum of squares function.

@interact def _(n=(8,range(2,25))): viewsize=n+1 g(x)=1/x P=Graphics() P += plot(n*g,(x,0,n+1)) P += plot(n*g,(x,1,n),fill=True,fillalpha=.3) grid_pts = [[i,j] for i in [1..viewsize] for j in [1..viewsize]] P += points(grid_pts,rgbcolor=(0,0,0),pointsize=2) lattice_pts = [coords for coords in grid_pts if (coords[0]*coords[1]<=n)] P += points(lattice_pts, rgbcolor = (0,0,1),pointsize=20) # P += plot(x,(x,0,viewsize),linestyle="--",rgbcolor=(0,0,0)) squares=[line([[k,l],[k+1,l],[k+1,l-1],[k,l-1],[k,l]],rgbcolor=(1,0,0)) for [k,l] in lattice_pts] for object in squares: P += object show(P,ymax=viewsize,aspect_ratio=1) 
       

Click to the left again to hide and once more to show the dynamic interactive window

For each lattice point involved in $\sum_{k=1}^n \tau(k)$, we put a unit square to the lower right.  We are basically interpreting the lattice points as two different sums.

  • We can think of it as $\sum_{k=1}^n \tau(k)$ - adding up the lattice points along each hyperbola.
  • We can think of it as $\sum_{j=1}^n \left\lfloor\frac{n}{k}\right\rfloor$ - adding up the lattice points in each vertical column.

The area of the squares can then be thought of as a Riemann-type sum or as our summation of $\tau$.

It should be clear that the area is "about" $$\int_1^n \frac{n}{x}dx=n\ln(x)\biggr\vert_1^n=n\ln(n)-n\ln(1)=n\ln(n)\; .$$  Why is this actually a good estimate, though?  The answer is in the error!

@interact def _(n=(8,range(2,25))): viewsize=n+1 g(x)=1/x P=Graphics() P += plot(n*g,(x,1,n)) P += plot(Piecewise([[(j,j+1),floor(n/j)] for j in [1..n-1]]),fill=n/x,fillalpha=.3,linestyle='')+plot(1,(x,n,n+1),fill=True,fillalpha=.3,linestyle='') grid_pts = [[i,j] for i in [1..viewsize] for j in [1..viewsize]] P += points(grid_pts,rgbcolor=(0,0,0),pointsize=2) lattice_pts = [coords for coords in grid_pts if (coords[0]*coords[1]<=n)] P += points(lattice_pts, rgbcolor = (0,0,1),pointsize=20) squares=[line([[k,l],[k+1,l],[k+1,l-1],[k,l-1],[k,l]],rgbcolor=(1,0,0)) for [k,l] in lattice_pts] for object in squares: P += object show(P,ymax=viewsize,aspect_ratio=1) html("Error between $\\tau(%s)$ and $%s\ln(%s)$"%(n,n,n)) 
       

Click to the left again to hide and once more to show the dynamic interactive window

Look at the shaded difference between the area under the curve (which is $n\ln(n)$) and the area of the red squares (which is the sum of all the $\tau$ values).

  • All the areas where the red squares are above the hyperbola add up to less than $n$, because they are all 1 in width or less, and do not intersect vertically (they stack, as it were).
  • Similarly, all the areas where the hyperbola is higher add up to less than $n$, because they are all 1 in height or less, and are horizontally non-intersecting.
  • (Actually, we would expect they would cancel quite a bit... and they do, as we will see.  We don't need that yet.)

That means:

  • The error $\sum_{k=1}^n \tau(k)-n\ln(n)$ is a number less than $n$ minus a (different) number less than $n$.
  • So the error is certainly $O(n)$ (less than some multiple of $n$ as $n$ gets huge).
  • So, the error in the average is less than some constant as $n$ gets huge! I.e., $$\frac{1}{n}\sum_{k=1}^n \tau(k)-\ln(n)=O(1)$$

We can verify this graphically by plotting the average value against $\ln(n)$:

line(L(10000))+plot(ln(x),(x,1,10000),color='red',linestyle='--') 
       

Looking' good!  There does seem to be some predictable error.  What might it be?

@interact def _(pts=range_slider(0,5000,50,(0,200))): show(point([(a,b-ln(a)) for a,b in L(pts[1])[pts[0]:]],pointsize=3,rgbcolor=(0,0,0))) 
       

Click to the left again to hide and once more to show the dynamic interactive window

Keeping $x=0$ in view, it seems to be somewhat less than $0.2$, although the error clearly bounces around.   By zooming in, we see the error bouncing around roughly between $0.15$ and $0.16$, more or less, as $x$ gets large.  So will this give us something more precise?

To answer this, we will try one more geometric trick.

@interact def _(n=(8,range(2,25))): viewsize=n+1 g(x)=1/x P=Graphics() P += plot(n*g,(x,0,n+1)) P += plot(2*g,(x,0,n+1),linestyle="--") if n>7: P += plot((n-5)*g,(x,0,n+1),linestyle="--") grid_pts = [[i,j] for i in [1..viewsize] for j in [1..viewsize]] P += points(grid_pts,rgbcolor=(0,0,0),pointsize=2) lattice_pts = [coords for coords in grid_pts if (coords[0]*coords[1]<=n)] P += points(lattice_pts, rgbcolor = (0,0,1),pointsize=20) P += plot(x,(x,0,viewsize),linestyle="--",rgbcolor=(0,0,0)) show(P,ymax=viewsize,aspect_ratio=1) 
       

Click to the left again to hide and once more to show the dynamic interactive window

Notice we have now divided the lattice points up evenly in three parts.

  • The ones on the line $y=x$.  
    • There are exactly $\lfloor\sqrt{n}\rfloor\leq \sqrt{n}$ of these.  
  • The lattice points above the line and below the hyperbola.  
    • At each integer $y$-value $d$ up to $y=\sqrt{n}$, there are are $\lfloor n/d\rfloor-d$ such points.
  • The lattice points to the right of the line and below the hyperbola. 
    • At each integer $x$-value $d$ up to $x=\sqrt{n}$, there are are $\lfloor n/d\rfloor-d$ such points.

So $$\sum_{k=1}^n \tau(k)=\sum_{d\leq \sqrt{n}} (\lfloor n/d\rfloor-d)+\sum_{d\leq \sqrt{n}} (\lfloor n/d\rfloor-d) +\lfloor\sqrt{n}\rfloor\leq 2\sum_{d\leq \sqrt{n}} (n/d-d) +\sqrt{n}$$ where the error involved is certainly less than one for each $d$, so the total error is at most $2\sqrt{n}+1=O(\sqrt{n})$.

Thus we can rewrite this, using a well-known identity from Transitions to Higher Math, as $$\sum_{k=1}^n \tau(k)=2n\sum_{d\leq \sqrt{n}}\frac{1}{d}-2\sum_{d\leq \sqrt{n}}d+O(\sqrt{n})= 2n\sum_{d\leq \sqrt{n}}\frac{1}{d}-2\left(\frac{\lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor+1)}{2}\right)+O(\sqrt{n})\, .$$   The difference between $\left(\frac{\lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor+1)}{2}\right)$ and $\frac{n}{2}$ is once again far less than $O(\sqrt{n})$ (and negative to boot), so we finally get that $$\sum_{k=1}^n \tau(k)=2n\sum_{d\leq \sqrt{n}}\frac{1}{d}-n+O(\sqrt{n})\Rightarrow \frac{1}{n}\sum_{k=1}^n \tau(k)=2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1+O(1/\sqrt{n})\; .$$

We're almost at the end of the story!  Now, we just need to relate this sum to $\ln(n)$.  (Remember, we already are pretty well convinced that $\ln(n)$ is close to the average value of $\tau$; the problem is what the error is.)

@interact def _(n=(8,range(2,25))): viewsize=n+1 P=Graphics() P += plot(1/x,(x,1,n)) P += plot(Piecewise([[(j,j+1),1/j] for j in [1..n-1]]),fill=1/x,linestyle='') show(P) 
       

Click to the left again to hide and once more to show the dynamic interactive window

This graphic shows the exact difference between $\sum_{k=1}^{m-1} \frac{1}{k}$ and $\ln(m)$.  Clearly, even as $m\to\infty$, the total area is simply the sum of a bunch of nearly-triangles with width exactly one and no intersection of height (again this idea), with total height less than $1$.  So the difference between $\sum_{k=1}^{m-1} \frac{1}{k}$ and $\ln(m)$ will be finite as $m\to\infty$.  

This number is very important!  It is called $\gamma$, or the Euler-Mascheroni constant.  

Definition:  $$\gamma=\lim_{m\to\infty}\left(\sum_{k=1}^{m-1} \frac{1}{k}-\ln(m)\right)$$

Notice that the 'missing' part of the area (since we can't actually view all the way out to infinity) must be less than $1/m$, since it will be the part lower than all the pieces we can see in the graphic for any given $m$.  So $\gamma$ is within $O(1/n)$ of any given amount finite $\sum_{k=1}^{m-1} \frac{1}{k}-\ln(m)$.

Now we put it all together!  We know from above that $$\frac{1}{n}\sum_{k=1}^n \tau(k)=2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1+O(1/\sqrt{n})\, .$$  Further, we can now substitute in the following for $\sum_{d\leq \sqrt{n}}\frac{1}{d}$; $$\sum_{d\leq \sqrt{n}}\frac{1}{d}= \ln(\sqrt{n})+\gamma+O(1/\sqrt{n})\; .$$ Once we do that, and take advantage of the log fact $2\ln(z)=\ln\left(z^2\right)$, we get $$\frac{1}{n}\sum_{k=1}^n \tau(k)= \ln(n)+(2\gamma-1)+O(1/\sqrt{n})\, .$$ That is exactly the asymptote and type of error that I have depicted below!

@interact def _(pts=range_slider(0,5000,50,(0,200))): show(point([(a,b-ln(a)) for a,b in L(pts[1])[pts[0]:]],pointsize=3,rgbcolor=(0,0,0))+plot(2*euler_gamma-1,(x,0,pts[1]))+plot(2*euler_gamma-1+.5/sqrt(x),(x,0,pts[1]),color='red',linestyle='--'),xmin=pts[0],xmax=pts[1],ymax=2*euler_gamma-1+.5/sqrt(pts[0]+1)) 
       

Click to the left again to hide and once more to show the dynamic interactive window

Note that it's not hard to prove that $\tau$ grows at least as fast as $\ln(n)$, so this is a fairly sharp result.  (It turns out that it's even possible to show that the error in the average is in fact $O(1/\sqrt[3]{x})$, but is not $O(1/\sqrt[4]{x})$.)

 

Could this conceivably be used for $\sigma=\sigma_1$?  The answer is YES!  Consider the following rewrite of the sum of sigmas, which are themselves the sum of divisors:  $$\sum_{n\leq x}\sigma(n)=\sum_{n\leq x}\sum_{q\mid n} q = \sum_{q,d\text{ such that }qd\leq x} q = \sum_{d\leq x}\sum_{q\leq \frac{x}{d}} q\, .$$  So we have changed from a sum of sums of divisors (which might not be consecutive, and makes $\sigma$ annoying to compute) to a sum of sums of consecutive integers.

We can think about this graphically again.  Instead of comparing points on a hyperbola with points in columns or rows, though, we will compare numbers at points on a hyperbola with numbers at points in rows.  We can think of it as summing up a weighted set of points.  The picture below tells it all.

@interact def _(n=(6,range(2,50))): viewsize=n+1 g(x)=1/x P=Graphics() P += plot(n*g,(x,0,n+1)) grid_pts = [[i,j] for i in [1..viewsize] for j in [1..viewsize]] P += points(grid_pts,rgbcolor=(0,0,0),pointsize=2) lattice_pts = [coords for coords in grid_pts if (coords[0]*coords[1]<=n)] for thing in lattice_pts: P += text(thing[0],thing,rgbcolor=(0,0,0)) show(P,ymax=viewsize,aspect_ratio=1) 
       

Click to the left again to hide and once more to show the dynamic interactive window

In the first one that shows up, we see that $$\sum_{k=1}^6\sigma(k)=1+(1+2)+(1+3)+(1+2+4)+(1+5)+(1+2+3+6)=$$ $$(1+2+3+4+5+6)+(1+2+3)+(1+2)+1\; ,$$ which means we can think of it as a sum of sums from $1$ to the length of each row.  Now let's note three things.

  • Each row is, of course, $\left\lfloor\frac{n}{k}\right\rfloor$ in length, as with $\tau$.  
  • Adding up the first $j$ integers from one to $j$ is of course $$\frac{j(j+1)}{2}=\frac{j^2}{2}+\frac{j}{2}\; ,$$ which we used above.
  • The most wrong $\frac{\lfloor x\rfloor(\lfloor x\rfloor +1)}{2}$ can be from $\frac{x(x+1)}{2}$ is $j+1=O(j)$ (this is simple algebra).

So if we combine the information above with the formula, we get $$\sum_{n\leq x}\sigma(n)=\sum_{d\leq x}\sum_{q\leq \frac{x}{d}}q= \sum_{d\leq x}\left[\frac{1}{2}\left\lfloor\frac{x}{d}\right\rfloor^2+\frac{1}{2}\left\lfloor\frac{x}{d}\right\rfloor\right]=\sum_{d\leq x}\left[\frac{1}{2}\left(\frac{x}{d}\right)^2+\frac{1}{2}\left(\frac{x}{d}\right)+O\left(\frac{x}{d}\right)\right]\, .$$ 

But this is actually possible to analyze!  First, some order calculations.

  • We already saw that $\sum_{d\leq x}\frac{1}{d}= \ln(x)+O(1)$, so $$\sum_{d\leq x}\frac{1}{2}\left(\frac{x}{d}\right)=\frac{1}{2}O(x\ln(x))=O(x\ln(x))\; .$$
  • Also, $\sum_{d\leq x}O\left(\frac{x}{d}\right)$ must be $$O\left(x\sum_{d\leq x}\frac{1}{d}\right)=O(x\ln(x))\; .$$

Next, let's get more information about $\sum_{d\leq x}\left[\frac{1}{2}\left(\frac{x}{d}\right)^2\right]$.

  • Recall that the (convergent) improper integral $\int_x^{\infty}\frac{dy}{y^2}$ approximates $\sum_{d>x}\frac{1}{d^2}$.
  • Since both converge, and by the same pictures as above, the error is certainly $O(1/x^2)$.
  • Then I can rewrite things as  $$\sum_{d\leq x}\frac{1}{d^2}=\sum_{d=1}^{\infty}\frac{1}{d^2}-\sum_{d>x}\frac{1}{d^2}= \sum_{d=1}^{\infty}\frac{1}{d^2}-\int_{x}^{\infty}\frac{1}{y^2}dy+O(1/x^2)=\sum_{d=1}^{\infty}\left(\frac{1}{d^2}\right)-\frac{1}{x}+O(1/x^2)\, .$$  

Thus the whole crazy double sum can be approximated as follows, quite accurately: $$\sum_{n\leq x}\sigma(n)= \frac{x^2}{2}\sum_{d\leq x}\left(\frac{1}{d^2}\right)+\frac{x}{2}\sum_{d\leq x}\frac{1}{d}+O(x\ln(x))$$ $$=\frac{x^2}{2}\left(\sum_{d=1}^{\infty}\left(\frac{1}{d^2}\right)-\frac{1}{x}+O(1/x^2)\right)+O(x\ln(x))=\frac{x^2}{2}\sum_{d=1}^{\infty}\left(\frac{1}{d^2}\right)-\frac{x}{2}+O(x\ln(x))\, .$$

And the average value of $\sigma$ must be this divided by $x$, namely $$\frac{1}{x}\sum_{n\leq x}\sigma(n)\text{ is }\frac{x}{2}\sum_{d=1}^{\infty}\frac{1}{d^2}+O(\ln(x))\, .$$  Since we know that the series converges, this means the average value of $\sigma$ increases quite linearly, with an error (at most) increasing logarithmically!  This might be a shock - that one could actually get something fairly accurate like this relatively easily using calculus ideas like improper integrals and (implicitly) the integral test for infinite series.  But check out the data!

def M(n): ls = [] out = 0 for i in range(1,n+1): out += sigma(i) ls.append((i,out/i)) return ls 
       
@interact def _(j = [10,100,1000,10000]): show(line(M(j))) 
       

Click to the left again to hide and once more to show the dynamic interactive window

Of course, one might ask what the slope of this line is!  It would have to be $m=\frac{1}{2}\sum_{k=1}^{\infty}\frac{1}{d^2}$.  Who remembers this from Calculus II (possibly) or History of Math (definitely)? 

@interact def _(j = [10,100,1000,10000]): show(line(M(j))+plot(x*zeta(2)/2,(x,0,j),color='black',linestyle="--")) 
       

Click to the left again to hide and once more to show the dynamic interactive window

This is Euler's solution to the Basel problem, which is $\frac{\pi^2}{6}$, so the slope is $\frac{\pi^2}{12}$.   Amazing!  

We will see these $\sum_{d\leq x}\frac{1}{d^k}$ again soon.

We can even do something similar for the Euler $\phi$ function.  See the exercises.

Homework:

  1. Show that $\sigma(n)$ is $O(n^2)$ (compare to the sum of all integers).
  2. Use the formula for the sum of the first $n$ perfect squares (from Transitions/Discrete or Calculus; it was probably in both for you) and the previous exercise to show that the average value of $\sigma(n)$ is Big Oh of $n^2$.  (This can be loosey-goosey.)
  3. Find a formula for the average value of the $u$ and $N$ functions on page 145 in the book (up through $n$).
  4. Finish the details of the proof that $\tau$ is $O(\sqrt[3]{x})$
  5. Show that $\tau(n)$ is not $O(1)$.  (Hint: that means there is no constant $C$ such that $\tau(n)\leq C$ always.)
  6. Why would it not contradict our theorem above that $\frac{1}{n}\sum_{k=1}^n\tau(k)=O(\ln(n))$ to say that $\tau(n)$ is not $O(\ln(n))$?
  7. Show that $\tau(n)$ is not $O(\ln(n))$.  (Hint: look at numbers of the form $6^k$, and compare $\tau$ of these to any given multiple of the natural logarithm using calculus.)
  8. Find absolute bounds for $\phi(n)$ (simple polynomial or log formulas in terms of $n$).
  9. Use data, graphs, whatever to conjecture what type of growth the average value of $\phi$ has up to $n$ - is it logarithmic, linear, quadratic, exponential, something else?  Bonus if you find a coefficient for the growth!