Today, we finish up exploring the wacky and wonderful world of primes, and start heading back toward other arithmetic functions, with an eye toward developing the language we'll need to explore $\pi(x)$ more rigorously.
First, there is an interesting question that no one picked up on last time. We proved that there are infinitely many primes of the forms $4k+1$ and $4k+3$, but then proceeded to do prime races for several other such forms. Is it legitimate to do so?
The answer is yes, as proved in this major theorem that introduced limiting and calculus methods to the study of number theory.
Theorem (Dirichlet): If $gcd(a,b)=1$, then there are infinitely many primes of the form $ax+b$ for $x$ an integer.
We call this the theorem on primes in an arithmetic progression. That is, $ax+b$ defines a progression of numbers separated always by $a$, and this says there are infinitely many primes in any such progression that makes sense in terms of relative primeness. It is a weak version of a prime race; it just says that it makes sense to do them, though (as we saw) there is much more information one can glean from them.
Click to the left again to hide and once more to show the dynamic interactive window 
Obviously, we have already proved this for $a=4$. It is easy to prove for $a=2$!
It is also possible to prove this for $b=1$ without developing bigger tools, but it involves some details about polynomial factorization that are not worth spending time on now.
We can also look at the opposite question  not whether primes exist in a given arithmetic progression, but whether there are arithmetic progressions made of solely of primes! $$\text{Can you get a sequence }ak+b,k=0,1,2,3,\ldots n\text{, to all be prime?}$$
It's easy to find short arithmetic progressions in the primes. We say such a progression has length $n+1$ in the above notation.
Longer ones get harder to find. Can you find a progression of length 5? (Hint: there is a small one where the numbers differ by 110!)
The question can be raised whether one can find arbitrarily long such sequences in the primes.
The answer is yes! That is a theorem of Ben Green and Terry Tao, which was a significant piece of Tao's 2006 Fields Medal (though he probably would have won it even without this, remarkable as it may seem). The highest ones are listed at this website.
How do people find such lists? For that, we need a new notation.
Notation/Definition: For a prime $p$, we call the primorial the number $$p\#=\prod_{q\leq p,\; q\text{ prime}} q$$ where yes, the 'p sharp' denotes $p$ primorial.
Then one usually finds such lists by this method:
Great! But how might one prove these can go as long as possible?
That might seem mysterious, so here is the gist of the approach to it. Remember how there seem to be fewer primes the further out we go, also in an arithmetic sequence (e.g. prime mod 4 or mod 8)? That isn't a coincidence.
There is a technical way to measure this: $$\lim_{n\to\infty}\frac{\pi(n)}{n}=0\; .$$ This follows from the estimate we proved of Chebyshev's last time, and is called having zero density.
If the limit were positive, then it would be called positive density. We can see this with specific numbers:
Now, if you have a collection of numbers which has positive density, it is a theorem from 1974 (Endre Szemeredi) that you can get arithmetic progressions of arbitrary length in such sets. Of course, it looks like the primes are approaching zero density.
But Green and Tao managed to show this still works for the primes! It won't be true for any old set; but somehow, although there are not many primes, there are just enough for it to work.
43142746595714191 True 48425980631694091 True 53709214667673991 True 58992448703653891 True 64275682739633791 True 69558916775613691 True 74842150811593591 True 80125384847573491 True 85408618883553391 True 90691852919533291 True 95975086955513191 True 101258320991493091 True 106541555027472991 True 111824789063452891 True 117108023099432791 True 122391257135412691 True 127674491171392591 True 132957725207372491 True 138240959243352391 True 143524193279332291 True 148807427315312191 True 154090661351292091 True 159373895387271991 True 164657129423251891 True 169940363459231791 True 175223597495211691 True 43142746595714191 True 48425980631694091 True 53709214667673991 True 58992448703653891 True 64275682739633791 True 69558916775613691 True 74842150811593591 True 80125384847573491 True 85408618883553391 True 90691852919533291 True 95975086955513191 True 101258320991493091 True 106541555027472991 True 111824789063452891 True 117108023099432791 True 122391257135412691 True 127674491171392591 True 132957725207372491 True 138240959243352391 True 143524193279332291 True 148807427315312191 True 154090661351292091 True 159373895387271991 True 164657129423251891 True 169940363459231791 True 175223597495211691 True 
This example was found just over a year ago, on April 10, 2010!
Before we go on, I have to mention one of the best web sites about primes. This is the Prime Pages, hosted at the University of Tennessee, Martin.
It's just amazingly full of useful information, but also quite userfriendly and usable for a large variety of backgrounds. In particular, the top twenty page has links to the top twenty of just about every prime type you can imagine  a cornucopia of information. I really like, for instance, the prediction of when the first billion digit prime will surface.
Notice that for MANY of these types, we don't know if there are finitely many or not! (E.g. Germain, Mersenne, repunit, etc.) And that brings us to our next topic for today  conjectures for how often certain types of primes might appear.
Let's pick up from the arithmetic progressions.
So a next natural question is whether you can say anything about the constants involved in these progression $ax+b$. Since $b$ is pretty arbitrary, we would focus on $a$.
Hopefully it's pretty clear that you can't do every possible one! Why?
Thinking about this and the Sieve of Eratosthenes led the French mathematician Alphonse de Polignac to the following:
Conjecture: Every even number is the difference between consecutive primes in infinitely many ways.
We have no proof of this. In fact, even the most basic case is one of the most celebrated questions in number theory:
Twin prime conjecture: There are infinitely many consecutive odd prime numbers.
Such pairs are called twin primes, of course  $$p\text{ and }q\text{ such that }p+2=q\; $$

There are lots of twin primes. This cell computes pairs of the second of a twin prime pair, and which twin prime pair it is  so that the pair 17 and 19 is the fourth pair, for example.
[(5, 1), (7, 2), (13, 3), (19, 4), (31, 5), (43, 6), (61, 7), (73, 8)] [(5, 1), (7, 2), (13, 3), (19, 4), (31, 5), (43, 6), (61, 7), (73, 8)] 
But infinitely many?

You can see above that it's certainly possible to approximate the twin prime counting function in a similar way to how we approximated the prime counting function $\pi$. The constant will be explained below.
In fact, there is a nice little rule of thumb that gets us to these things. Notice this is very much not a proof!
A little further massaging of this would involve the logarithmic integral equivalent, which involves $\int_2^x dt/(\ln(t))^2$. The graph above shows how good it gets.
Incidentally, the reason why this works is because $$C_2=2\prod_{p>2}\left( 1\frac{1}{p1}^2\right)\, .$$ is finite; it is called the twin prime constant.
This also leads to a conjecture of Hardy and Littlewood  that the number of ways to write an even number $2k$ as a sum of primes is asymptotic to the same thing! This would provide a very overwhelming proof of the socalled Goldbach Conjecture, that any even number can be written in at least one way as a sum of two primes.
1.32032363169374 1.32032363169374 
Computing this constant to arbitrary precision led to the discovery of the infamous Pentium chip bug.
Incidentally, there is some inconsistency in the literature about whether the 2 in front is part of the twin prime constant or not.
There is another constant affiliated with twin primes.
1.9022 1.9022 
Although there may be infinitely many of them, the sum of their reciprocals $$\sum_{p,p+2 \text{ both prime}} \frac{1}{p}+\frac{1}{p+2}$$ is actually a constant, which at the very least means they must be pretty rare. This is called Brun's constant, and is above.
There are many other such heuristics:
Finally, you should know that it is not necessary to actually find all the first $n$ primes (even of a particular type) to compute how many there are, at least not always. If we define $\phi(n,a)$ to be the number of positive integers less than $n$ which are not divisible by any of the first $a$ primes, then it is possible to develop the recursive formula $$\phi(n,a)=\phi(n,a1)\phi\left(\left\lfloor\frac{n}{p_a}\right\rfloor,a1\right)\, ,$$ which allows us to sort of use induction to compute $\phi(n,a)$ without having to use much space. Then it turns out that $$\pi(n)=\pi(\sqrt{n})+\phi(n,\pi(\sqrt{n}))1\, ,$$ which is also not hard to prove (by a counting argument). That is the typical way to count $\pi$ without actually counting primes, and with some speedups is quite efficient.
This is also how one finds the $n$th prime  you use an approximation to the $n$th prime like $n\ln(n)$ and then check $\pi(n)$ near there to figure out exactly where it is.
179424673 Time: CPU 0.72 s, Wall: 0.72 s 179424673 Time: CPU 0.72 s, Wall: 0.72 s 

Homework:
