# MAT 338 Day 38 2011

## 2396 days ago by kcrisman

Today, we finish up exploring the wacky and wonderful world of primes, and start heading back toward other arithmetic functions, with an eye toward developing the language we'll need to explore $\pi(x)$ more rigorously.

First, there is an interesting question that no one picked up on last time.  We proved that there are infinitely many primes of the forms $4k+1$ and $4k+3$, but then proceeded to do prime races for several other such forms.   Is it legitimate to do so?

The answer is yes, as proved in this major theorem that introduced limiting and calculus methods to the study of number theory.

Theorem (Dirichlet): If $gcd(a,b)=1$, then there are infinitely many primes of the form $ax+b$ for $x$ an integer.

We call this the theorem on primes in an arithmetic progression.  That is, $ax+b$ defines a progression of numbers separated always by $a$, and this says there are infinitely many primes in any such progression that makes sense in terms of relative primeness.  It is a weak version of a prime race; it just says that it makes sense to do them, though (as we saw) there is much more information one can glean from them.

@interact def _(a=8,b=7,n=100): if gcd(a,b)!=1: html("Oops! The progression won't have many primes if") html("$a$ and $b$ aren't relatively prime!") else: html("Primes of the form $%sx+%s$ up to $%s$:"%(a,b,n)) for x in prime_range(n): if x%a==b: print x

## Click to the left again to hide and once more to show the dynamic interactive window

Obviously, we have already proved this for $a=4$.  It is easy to prove for $a=2$!

It is also possible to prove this for $b=1$ without developing bigger tools, but it involves some details about polynomial factorization that are not worth spending time on now.

We can also look at the opposite question - not whether primes exist in a given arithmetic progression, but whether there are arithmetic progressions made of solely of primes!  $$\text{Can you get a sequence }ak+b,k=0,1,2,3,\ldots n\text{, to all be prime?}$$

It's easy to find short arithmetic progressions in the primes.  We say such a progression has length $n+1$ in the above notation.

• 3, 5, 7 is an arithmetic progression of length 3, where $a=2$.
• 41, 47, 53, and 59 is an arithmetic progression of length 4, where $a=6$.

Longer ones get harder to find.  Can you find a progression of length 5?  (Hint: there is a small one where the numbers differ by 110!)

The question can be raised whether one can find arbitrarily long such sequences in the primes.

The answer is yes!  That is a theorem of Ben Green and Terry Tao, which was a significant piece of Tao's 2006 Fields Medal (though he probably would have won it even without this, remarkable as it may seem).   The highest ones are listed at this website.

How do people find such lists?  For that, we need a new notation.

Notation/Definition: For a prime $p$, we call the primorial the number $$p\#=\prod_{q\leq p,\; q\text{ prime}} q$$ where yes, the 'p sharp' denotes $p$ primorial.

Then one usually finds such lists by this method:

• First compute a large set of primes $a\cdot q\#+1$ for fixed $q$.
• Find arithmetic progressions among the $a$ values (not the $a\cdot q\# +1$ guys).
• If you find $a$ values in progression $k+\ell\cdot n$, then...
• You've found the progression of primes $(k\cdot q\#+1)+(\ell\cdot q\#) n$

Great!  But how might one prove these can go as long as possible?

That might seem mysterious, so here is the gist of the approach to it.  Remember how there seem to be fewer primes the further out we go, also in an arithmetic sequence (e.g. prime mod 4 or mod 8)?  That isn't a coincidence.

There is a technical way to measure this: $$\lim_{n\to\infty}\frac{\pi(n)}{n}=0\; .$$ This follows from the estimate we proved of Chebyshev's last time, and is called having zero density.

If the limit were positive, then it would be called positive density.   We can see this with specific numbers:

• $\pi(100)/100 = 1/4=0.25$
• $\pi(200)/200=0.23$
• $\pi(1000)/1000=0.168$, or under 17%.

Now, if you have a collection of numbers which has positive density, it is a theorem from 1974 (Endre Szemeredi) that you can get arithmetic progressions of arbitrary length in such sets.  Of course, it looks like the primes are approaching zero density.

But Green and Tao managed to show this still works for the primes!  It won't be true for any old set; but somehow, although there are not many primes, there are just enough for it to work.

difference=23681770*2*3*5*7*11*13*17*19*23 start=43142746595714191 for n in [0..25]: print start+n*difference,is_prime(start+n*difference)
 43142746595714191 True 48425980631694091 True 53709214667673991 True 58992448703653891 True 64275682739633791 True 69558916775613691 True 74842150811593591 True 80125384847573491 True 85408618883553391 True 90691852919533291 True 95975086955513191 True 101258320991493091 True 106541555027472991 True 111824789063452891 True 117108023099432791 True 122391257135412691 True 127674491171392591 True 132957725207372491 True 138240959243352391 True 143524193279332291 True 148807427315312191 True 154090661351292091 True 159373895387271991 True 164657129423251891 True 169940363459231791 True 175223597495211691 True 43142746595714191 True 48425980631694091 True 53709214667673991 True 58992448703653891 True 64275682739633791 True 69558916775613691 True 74842150811593591 True 80125384847573491 True 85408618883553391 True 90691852919533291 True 95975086955513191 True 101258320991493091 True 106541555027472991 True 111824789063452891 True 117108023099432791 True 122391257135412691 True 127674491171392591 True 132957725207372491 True 138240959243352391 True 143524193279332291 True 148807427315312191 True 154090661351292091 True 159373895387271991 True 164657129423251891 True 169940363459231791 True 175223597495211691 True

This example was found just over a year ago, on April 10, 2010!

Before we go on, I have to mention one of the best web sites about primes.  This is the Prime Pages, hosted at the University of Tennessee, Martin.

It's just amazingly full of useful information, but also quite user-friendly and usable for a large variety of backgrounds.  In particular, the top twenty page has links to the top twenty of just about every prime type you can imagine - a cornucopia of information.  I really like, for instance, the prediction of when the first billion digit prime will surface.

Notice that for MANY of these types, we don't know if there are finitely many or not! (E.g. Germain, Mersenne, repunit, etc.)   And that brings us to our next topic for today - conjectures for how often certain types of primes might appear.

Let's pick up from the arithmetic progressions.

• Dirichlet's Theorem says that any given (legitimate) progression has infinitely many primes.
• Green and Tao's Theorem says that you can find a progression consisting only of primes of any given length (though not infinite).

So a next natural question is whether you can say anything about the constants involved in these progression $ax+b$.  Since $b$ is pretty arbitrary, we would focus on $a$.

• Find some primes that look like $2x+b$ for some $b$ and several consecutive $x$.
• How many $x$ can you do?
• How about for $3x+b$?
• What about $4x+b$?
• Are the primes you get in these cases ever consecutive?

Hopefully it's pretty clear that you can't do every possible one!  Why?

Thinking about this and the Sieve of Eratosthenes led the French mathematician Alphonse de Polignac to the following:

Conjecture: Every even number is the difference between consecutive primes in infinitely many ways.

We have no proof of this.  In fact, even the most basic case is one of the most celebrated questions in number theory:

Twin prime conjecture:  There are infinitely many consecutive odd prime numbers.

Such pairs are called twin primes, of course - $$p\text{ and }q\text{ such that }p+2=q\;$$

def twin_primes_upto(n): v = prime_range(n+1) L = [] counter = 0 for i in range(len(v)-1): if v[i+1]-v[i]==2: counter += 1 L.append((v[i+1],counter)) return L

There are lots of twin primes.  This cell computes pairs of the second of a twin prime pair, and which twin prime pair it is - so that the pair 17 and 19 is the fourth pair, for example.

twin_primes_upto(100)
 [(5, 1), (7, 2), (13, 3), (19, 4), (31, 5), (43, 6), (61, 7), (73, 8)] [(5, 1), (7, 2), (13, 3), (19, 4), (31, 5), (43, 6), (61, 7), (73, 8)]

But infinitely many?

var('t') plot_step_function(twin_primes_upto(1000000),legend_label='twin prime')+plot(2*twinprime*x/log(x)^2,(x,1,1000000),color='black',legend_label='$C_2 x/log(x)$')+plot(lambda t: 2*twinprime*numerical_integral(1/log(x)^2,2,t)[0],(t,1,1000000),color='red',legend_label='$C_2 Li_2(x)$')

You can see above that it's certainly possible to approximate the twin prime counting function in a similar way to how we approximated the prime counting function $\pi$.  The constant will be explained below.

In fact, there is a nice little rule of thumb that gets us to these things.  Notice this is very much not a proof!

• First, one might want to estimate how many primes there are up to a certain point to start, but using a different idea than just looking at tables!
• Well, one can say the following:
• About half the numbers less than $n$ are not divisible by 2.
• About 2/3 the numbers less than $n$ are not divisible by 3.
• About 4/5 the numbers less than $n$ are not divisible by 5.
• Etc. for each prime less than $\sqrt{n}\ldots$
• In fact, you might even expect that $$\prod_{p<\sqrt{x}}\left(1-\frac{1}{p}\right)$$ is a good approximation of the probability that a given number $x$ is prime.
• (But it isn't.  In fact, this product turns out to be asymptotic to $2e^{-\gamma}/\ln(x)$, for very good reasons.)
• Still, this might help us make ideas for how many twin primes there are.
• For instance, one might take into account that it's not really a probability.  After all, if $p>2$ is prime, then with one hundred percent probability the next number is not prime!  And for $p$ and $p+2$ to be both prime, they must also both be odd - so if $p$ is odd, then $p+2$ is much more likely than a random number to be prime.
• So we do the following analysis instead.
• Although one would expect for 1/4 of all pairs separated by two to both be odd, $n+2$ has the same parity as $n$ so we should expect 1/2 the pairs to both be odd.
• The chances that $n$ and $n+2$ are both not divisible by three is 1/3 (what form must $n$ have for this to happen?).
• The chances that $n$ and $n+2$ are both not divisible by five is 3/5 (what residues modulo five must $n$ avoid?).
• And so forth.
• So we might expect that $$\frac{1}{2}\prod_{p<\sqrt{x},p>2}\left(1-\frac{2}{p}\right)$$ is a good approximation of the probability that a given pair is prime.
• Now we do some algebra to turn this into something that looks like something with logs:  $$\left(1-\frac{2}{p}\right)=\left(1-\frac{1}{(p-1)^2}\right)\left(1-\frac{1}{p}\right)^2$$
• So we could approximate the number of twin primes less than $x$ by $$\frac{1}{2}\prod_{p<\sqrt{x},p>2}\left(1-\frac{1}{(p-1)^2}\right)\prod_{p\text{ prime}}\left(1-\frac{1}{p}\right)^2$$
• And since we already know the right-hand part is more or less the square of the number of primes, we come up with a reasonable suggestion that we should approximate by $$\frac{1}{2}\prod_{p<\sqrt{x},p>2}\left(1-\frac{1}{(p-1)^2}\right)\left(\frac{x}{\ln(x)}\right)^2$$

A little further massaging of this would involve the logarithmic integral equivalent, which involves $\int_2^x dt/(\ln(t))^2$.  The graph above shows how good it gets.

Incidentally, the reason why this works is because $$C_2=2\prod_{p>2}\left( 1-\frac{1}{p-1}^2\right)\, .$$ is finite; it is called the twin prime constant.

This also leads to a conjecture of Hardy and Littlewood - that the number of ways to write an even number $2k$ as a sum of primes is asymptotic to the same thing!  This would provide a very overwhelming proof of the so-called Goldbach Conjecture, that any even number can be written in at least one way as a sum of two primes.

2*twinprime.n()
 1.32032363169374 1.32032363169374

Computing this constant to arbitrary precision led to the discovery of the infamous Pentium chip bug.

Incidentally, there is some inconsistency in the literature about whether the 2 in front is part of the twin prime constant or not.

There is another constant affiliated with twin primes.

brun.n(digits=5)
 1.9022 1.9022

Although there may be infinitely many of them, the sum of their reciprocals $$\sum_{p,p+2 \text{ both prime}} \frac{1}{p}+\frac{1}{p+2}$$ is actually a constant, which at the very least means they must be pretty rare. This is called Brun's constant, and is above.

There are many other such heuristics:

• Since the chance that $n$ and $2n+1$ are both not divisible by a prime $p$ is basically the same as that $n$ and $n+2$ are both not divisible by $p$, it turns out that Germain primes may also be distributed in the same way as twin primes.
• Using similar ideas, one can get a heuristic that Mersenne primes are distributed as $$e^{\gamma}\ln(\ln(x))/ln(2)$$  This is known as Wagstaff's conjecture.
• Bizarrely, one can use the same idea to get a heuristic for factorial primes - primes of the form $n! \pm 1$, like 5, 7, 23, and 719.  It's conjectured that there are $e^{\gamma}\ln(n)$ such primes less than $n$
• These even seem to apply to the so-called primorial primes - primes of the form $p\#\pm 1$, like 3, 5, 7, 29, 31, 211, etc.  It's truly weird, yet also cool.

Finally, you should know that it is not necessary to actually find all the first $n$ primes (even of a particular type) to compute how many there are, at least not always.  If we define $\phi(n,a)$ to be the number of positive integers less than $n$ which are not divisible by any of the first $a$ primes, then it is possible to develop the recursive formula $$\phi(n,a)=\phi(n,a-1)-\phi\left(\left\lfloor\frac{n}{p_a}\right\rfloor,a-1\right)\, ,$$ which allows us to sort of use induction to compute $\phi(n,a)$ without having to use much space.  Then it turns out that $$\pi(n)=\pi(\sqrt{n})+\phi(n,\pi(\sqrt{n}))-1\, ,$$ which is also not hard to prove (by a counting argument).  That is the typical way to count $\pi$ without actually counting primes, and with some speedups is quite efficient.

This is also how one finds the $n$th prime - you use an approximation to the $n$th prime like $n\ln(n)$ and then check $\pi(n)$ near there to figure out exactly where it is.

time nth_prime(10^7)
 179424673 Time: CPU 0.72 s, Wall: 0.72 s 179424673 Time: CPU 0.72 s, Wall: 0.72 s

Homework:

1. Show that $\sigma(n)$ is $O(n^2)$ (compare to the sum of all integers).
2. Find a formula for the average value of the $u$ and $N$ functions we discussed earlier (up through $n$).
3. Find absolute bounds for $\phi(n)$ (formulas in terms of $n$).
4. Why would it not contradict our theorem that $\frac{1}{n}\sum_{k=1}^n\tau(k)=O(\ln(n))$ to say that $\tau(n)$ is not $O(\ln(n))$?
5. Show that $\tau(n)$ is not $O(\ln(n))$.  (Hint: look at numbers of the form $6^k$, and compare $\tau$ of these to any given multiple of the natural logarithm using calculus.)
6. Come up with two functions $f(x)$ and $g(x)$ that both go to infinity as $x\to\infty$, but that switch the lead infinitely often and $f$ and $g$ are asymptotic.
7. Show that the two limits in the prime number theorem are really equivalent.  That is, show that if $\lim \pi(x)/Li(x)=1$, then the other limit is 1, and vice versa.
8. Find an arbitrarily long sequence of consecutive composite numbers.  (Hint: factorials.)
9. Use the piece of Chebyshev's theorem that we proved to show that $\lim_{x\to\infty}\pi(x)/x=0$.
10. Verify that the derivative of $\frac{\ln(n)}{\ln(2n+1)}$ is positive for $n>1000$.
11. Find an arithmetic progression of length five with 110 between primes.
12. Prove that there can be only one set of 'triple primes' - that is, three consecutive odd primes.
13. Find the value of $23\#$.
14. Show that $\left(1-\frac{2}{p}\right)=\left(1-\frac{1}{(p-1)^2}\right)\left(1-\frac{1}{p}\right)^2$.
15. Let $D(N)=\prod_{p<N}\left(1-\frac{1}{p}\right)$.  Compute $D(N)$ by hand for all $N$ between 10 and 20.  What patterns do you notice in the denominators?  The numerators?