MAT 338 Day 18 2011

2457 days ago by kcrisman

Proofs of two things you had to do and one you didn't.

Exercise you didn't have to hand in:

If $a$ is an odd primitive root modulo $p^e$ (with odd $p$, of course), then $a$ is also a primitive root of $2p^e$. 

  • First, $a$ is certainly still a unit, because if $gcd(a,p^e)=1$ and $a$ is odd (not divisible by 2), then $a$ still has no common factors with $2p^e$.
  • Now let's suppose that $$a^k\equiv 1\text{ mod }\left(2p^3\right)$$  In that case, $$2p^e \mid \left(a^k-1\right)\Rightarrow p^e \mid \left(a^k-1\right)$$ But the only powers for which this is true are multiples of $\phi(p^e)$, so the smallest $k$ for which this is true must still be $\phi(p^e)$.
  • Since we know there are only $$\phi(2p^e)=\phi(2)\phi(p^e)=1\cdot \phi(p^e)$$ units available for powers of $a$, it can't have a higher order.

First Lemma:

If $a$ has order $d$ modulo $n$ and $gcd(b,d)=1$, then $a^b$ also has order $d$ modulo $n$.  

  • We know that the $d$ powers of $a$ $$a^1,a^2,a^3,\ldots ,a^{d-1},a^d\equiv 1$$ are all different, and only the last is 1.  
  • So let's look at the $d$ powers of $a^b$ $$\left(a^b\right)^1, \left(a^b\right)^2, \left(a^b\right)^3,\ldots ,\left(a^b\right)^{d-1},\left(a^b\right)^d$$ which are thus just powers of $a$ of powers given by multiples of $b$ $$a^{1b},a^{2b},a^{3b},\ldots,a^{(d-1)b},a^{db}$$
  • So we really just care whether the $1b,2b,3b,\ldots,(d-1)b,db$ powers of $a$ reach a number divisible by $d$ before $db$.
  • But from an earlier lemma, we know that multiples of $b$ modulo $d$ are all mutually not congruent if $b$ and $d$ share no divisors.  So none of those multiples of $b$ is also a multiple of $d$.
  • So each $a^b$ has $d$ powers, all of which are different (and $\left(a^b\right)^d\equiv 1$), which is the definition of having order $d$.

As a comment, this means that if you have an element of order $d$, you automatically get at least $\phi(d)$ of them, regardless of whether there is a primitive root or even if your modulus is prime!

Second Lemma:

Suppose $d\mid (p-1)$ for a prime $p$.  Then there are no more than $\phi(d)$ elements of order $d$ in $U_p$.

  • If there are no elements of order $d$, great.
  • Suppose there is at least one such element $a$.  Then it satisfies the congruence $$x^d\equiv 1\text{ mod }(p)\text{ which is }x^d-1\equiv 0\text{ mod }(p)$$
  • By the Lagrange's Theorem about polynomials (see Day 10), there are at most $d$ solutions to this (since $p$ is prime).
  • But any power of $a$ is automatically a solution to this, since $$\left(a^b\right)^d=\left(a^d\right)^b\equiv 1$$, so the $d$ powers of $a$ are all the possible solutions.
  • It is enough to show that $a^b$ for $gcd(b,d)\neq 1$ doesn't have order $d$ to prove the lemma, because there are only $\phi(d)$ things for which $gcd(b,d)=1$, and we want to restrict to at most $\phi(d)$ possible elements of order $d$.
  • So let's look at $a^b$ in the situation that $gcd(b,d)=k\neq 1$.
    • In that case, we can say (definition of divisibility) that $b=kc$ and $d=ke$, for some integers $1<c,e<d$.
    • Now let's write $a^b$ in a form which makes it clear it has order at most $e$, which is less than $d$.  $$a^b=a^{kc}=\left(a^k\right)^c=\left(a^{d/e}\right)^c$$
    • If we take this to the $e$ power, we get $$\left(a^b\right)^e=\left(a^{d/e}\right)^{ce}=\left(a^{(e/e)d}\right)^c=\left(a^d\right)^c$$ which is certainly congruent to 1, and so $a^b$ definitely does not have order $d$.

Back to cryptography!

Remember, we begin by doing this cell, which enables us to turn words into numbers.  But really, what we care about are the numbers, because there are lots of other ways to turn words into numbers.  So just use this if you want a quick number for a message.

def encode(s): # Input must be in quotes! s = str(s).upper() return sum((ord(s[i])-64)*26^i for i in range(len(s))) def decode(n): n = Integer(n) list = [] while n != 0: if n%26==0: list.append(chr(64+26)) n -= 1 else: list.append(chr(n%26+64)) n //=26 return ''.join(list) 
       

We can use the following cell to do Diffie-Hellman by just changing $p$ and $e$ and our secret message, as long as the message is not too long compared to $p$.

@interact def _(message='hi',p=991,e=677): secret=encode(message) if is_prime(p) and gcd(p,e)==1 and p>secret: e=677 # hopefully coprime to p-1 code=mod(secret,p)^e try: f=mod(e,p-1)^-1 except: html("Looks like $%s$ is not coprime to the prime we chose, $%s$"%(e,p)) html("My encoded message is $%s$"%secret) html("A big prime bigger than that is $%s$"%p) html("And I chose exponent $%s$"%e) html("The encrypted message is $%s$"%code) html("The inverse of $%s$ is $%s$"%(e,f)) html("And the decrypted message turns out to be:") print ''.join(decode(code^f)) elif not is_prime(p): html("Pick a prime $p$!") elif p <= secret: html("Make sure your prime is bigger than your secret, $%s$"%secret) else: html("Make sure that $gcd(p,e)=1$!") 
       

Click to the left again to hide and once more to show the dynamic interactive window

If you for some reason didn't pick a big enough prime, you can use the following command to find one!

next_prime(11058) 
       
11059
11059

Remember, the key that makes it all work is that (thanks to Euler's Theorem/Fermat's Little Theorem) exponents of congruences mod $n$ live in the world of congruences mod $\phi(n)$, as long as they are numbers coprime to $\phi(n)$.  That's why $gcd(e,p-1)=1$ is important.  Here's how that can go wrong.

message='hi' # needs to be in quotes secret=encode(message) p=991 # needs to be bigger than secret e=2 # NOT coprime to p-1 code=mod(secret,p)^e code 
       
95
95

So far, so good, but what happens when we try to decrypt?

f=mod(e,p-1)^-1 message;secret;code;decode(code^f) # prints all the steps 
       
Traceback (click to the left of this block for traceback)
...
RuntimeError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "_sage_input_9.py", line 10, in <module>
    exec compile(u'open("___code___.py","w").write("# -*- coding: utf-8 -*-\\n" + _support_.preparse_worksheet_cell(base64.b64decode("Zj1tb2QoZSxwLTEpXi0xIAptZXNzYWdlO3NlY3JldDtjb2RlO2RlY29kZShjb2RlXmYpICMgcHJpbnRzIGFsbCB0aGUgc3RlcHM="),globals())+"\\n"); execfile(os.path.abspath("___code___.py"))
  File "", line 1, in <module>
    
  File "/private/var/folders/Yy/YytEJm5VEB0+pBRD7JNLe++++TQ/-Tmp-/tmpgzd9nw/___code___.py", line 3, in <module>
    f=mod(e,p-_sage_const_1 )**-_sage_const_1  
RuntimeError

It turns out not even to be possible to go backwards.

An interesting application of this is called key exchange.  Here is the basic concept.

Two people trying to pass information (universally called Alice and Bob) want to decide on a secret key for using some encryption routine.  Unfortunately, they know that someone may be listening in on their decision.   Instead of trying to send a secret key one has chosen, they try to create a secret key using (essentially) public means.  

  • First, Alice and Bob jointly pick a big prime $p$ and a base for exponentiation $g$, presumably with $1<g<p$.  
  • Now, they each secretly choose an exponent; maybe Alice chooses $m$ and Bob chooses $n$.  
  • Then each of them exponentiates $g$ to their secret power!  
  • Then they pass off these numbers to each other.

Example: 

Alice and Bob pick $p=991$ and $g=55$, and then (separately) pick $m=130$ and $n=123$.  Then they compute the powers.

p=991 g=mod(55,p) m=130 n=123 Alice_does=g^m Bob_does=g^n Alice_does;Bob_does 
       
722
114
722
114

Okay, now what Alice and Bob do with this (possibly spied upon) information of $g^n$ and $g^m$ and $g$ and $p$ is to exponentiate the other person's number with their own power.  Like this:

Bob_does^m;Alice_does^n 
       
877
877
877
877

And now they have a secret key ($g^{mn}=g^{nm}$) they can easily compute but which a spy in the middle cannot.  Feel free to try this with your own numbers you pick!

@interact def _(p=(991,prime_range(1000)),g=55,m=130,n=123): g=mod(g,p) html("If you jointly picked $p=%s$ and base $g=%s$"%(p,g)) html("Then separately picked secret powers $m=%s$ and $n=%s$"%(m,n)) html("Your publicly traded info would be $%s^{%s}\equiv %s$ and $%s^{%s}\equiv %s$"%(g,m,g^m,g,n,g^n)) html("But the secret joint key would be $%s^{%s\cdot %s}\equiv %s$"%(g,m,n,g^(m*n))) 
       

Click to the left again to hide and once more to show the dynamic interactive window

This gives an encryption key useful to both Alice and Bob, to protect from any potential Eve who might be listening in.  (That's Eve for eavesdropping, believe it or not - also a universal person in these stories.)  

On the down side, if Eve not only is listening, but actually has access to Alice and Bob's transmissions and can change them, she can actually add her own exponent $\ell$ to the game, so that she pretends to have secret key $g^{m\ell}$ with Alice and secret key $g^{n\ell}$ with Bob.  Both of their keys' security is now compromised.  Such a situation is known as a "Man in the Middle" attack.  

RSA Public Key

In order to deal with all these issues, we will now introduce the most famous public-key system.  Recall that this is one where we want an encryption key that is easy for anybody at all to use, but is very difficult to undo unless you know the secret.  (Sometimes this is called a trapdoor system, because it's easy to fall in but it's hard to get back out unless you know where the secret passageway is!)

The idea is to make Diffie-Hellman, which technically only relies upon Fermat's Little Theorem and primes, into a system which involves Euler's Theorem, but not so heavily as to make the computation too expensive.  It turns out that the easiest way to do that is to choose a large integer $n$ with only two prime factors, instead of one large prime $p$ as we did before.  For instance:

p=89 q=97 n=p*q html("Multiply the primes $%s$ and $%s$ to get our modulus $%s$"%(p,q,n)) 
       
Multiply the primes 89 and 97 to get our modulus 8633
Multiply the primes 89 and 97 to get our modulus 8633

Now, exponents here live in the world of $\phi(n)$.  We can easily compute this using our magical formula (so that $\phi(n)=(p-1)(q-1)$).  So the computations are going to be easy for us.  

But they will not be so easy to compute without our magical formula, and to use our magical formula we need to have the prime decomposition of $n$.  At least that's what people currently believe; if it isn't true, we are in deep trouble security-wise, as we will see on Wednesday.

In particular, for reasonably large $n$, that means $\phi(n)$ is essentially secret to anyone who isn't tough enough to factor $n$.  Remember that it took until Euler to factor $2^{32}+1$ as follows:

2^32+1;factor(2^32+1);nth_prime(116) 
       
4294967297
641 * 6700417
641
4294967297
641 * 6700417
641

Hence $n=2^{32}+1$ wouldn't have been a bad $n$ to choose in the 1700s, since it would take a LOT of the sieve of Eratosthenes to get to the one hundred sixteenth prime!

That's the preliminaries.  From now on, we do exactly the same thing as before, choosing an $e$ coprime to $\phi(n)$, etc.  This time, though, instead of keeping $e$ secret, we let anybody know it (along with $n$, which we have to let people know anyway).  

Example:

With the same primes, let's choose $e=71$, because that is coprime to $\phi(89\cdot 97)=\phi(89)\phi(97)=88\cdot 96=8448$.

p=89 q=97 n=p*q phi=euler_phi(n) e=71 html("Multiply the primes $%s$ and $%s$ to get our modulus $%s$"%(p,q,n)) html("Are $e=%s$ and $\phi(%s)=%s$ coprime?"%(e,n,phi)) gcd(e,phi)==1 
       
Multiply the primes 89 and 97 to get our modulus 8633
Are e=71 and \phi(8633)=8448 coprime?
True
Multiply the primes 89 and 97 to get our modulus 8633
Are e=71 and \phi(8633)=8448 coprime?
True

And now we compute an inverse mod $\phi(n)$ just as before, which will be (as before) our decryption key.  Since we are able to compute $\phi(n)$, it isn't hard to get an inverse for $e$; if you only knew $n$, though, it would be very hard to do this.

f=mod(e,phi)^-1;f 
       
119
119

Now, just like with D-H, I raise my message (number) to the power $e$ to encrypt, and raise to the power $f$ to decrypt an encrypted message.  Here are all the steps together!

@interact def _(message='hi',p=89,q=97,e=71): secret=encode(message) n = p*q phi = (p-1)*(q-1) if gcd(n,e)==1 and n>secret: code=mod(secret,n)^e try: f=mod(e,phi)^-1 html("My encoded message is $%s$"%secret) html("A big product of primes bigger than that is $pq=%s\cdot%s=%s$"%(p,q,n)) html("(which means my secret $\phi(n)=\phi(%s\cdot %s)=(%s-1)(%s-1)$ is $%s$)"%(p,q,p,q,phi)) html("And I chose exponent $%s$"%e) html("The encrypted message is $%s^{%s}\equiv%s$"%(secret,e,code)) html("The inverse of $%s$ modulo $%s$ is $%s$"%(e,phi,f)) html("And the decrypted message turns out to be:") print ''.join(decode(code^f)) except: html("Looks like $%s$ is not coprime to $\phi(%s)=%s$"%(e,n,phi)) elif gcd(phi,e)!=1: html("Make sure that $gcd(\phi(n),e)=1$!") elif n <= secret: html("My encoded message is $%s$"%secret) html("Make sure that $pq=%s\cdot %s=%s$ is bigger than your secret"%(p,q,n)) 
       

Click to the left again to hide and once more to show the dynamic interactive window

Why RSA works:

But now we have an encryption method where anyone can encrypt.  The modulus $n$ (not written as $pq$) and $e$ are both published, and anyone who wants to send a message of length $n$ or less just exponentiates.   You just have to be sure that $\phi(n)$ and $e$ are coprime for it to be defined properly.

In order to encrypt a message $x$ via RSA with public key $(n,e)$, you do $$x^e\text{ mod }(n)\; ;$$ in order for the owner of the key to decrypt a message $m$, they do $$m^{e^{-1}}=m^f\text{ mod }(n)\; .$$  Since $$ef\equiv 1\text{ mod }(\phi(n))\; ,$$ $ef=k\phi(n)+1$ for some integer $k$.  Hence $$(x^e)^f=x^{ef}=x^{k\phi(n)+1}=(x^{\phi(n)})^k x^1\equiv 1^k x\equiv x\text{ mod }(n)$$ and it all works out, we recover the original message.

But if someone nefarious were to try to decrypt this, they would need access to $f$ somehow, or something equivalent to it mathematically.  That would mean solving $$ef\equiv 1\text{ mod}(\phi(n))$$ for $f$ without actually knowing what $\phi(n)$ is!  Naturally, that is pretty easy to compute in the cases above.  

Example:

But in real life?

p=next_prime(randrange(2^50)) q=next_prime(randrange(2^50)) n=p*q # needs to be bigger than secret html("The first part of my key, $%s$, is the product of my secret primes"%n) 
       
The first part of my key, 694955728256122976170846056757, is the product of my secret primes
The first part of my key, 694955728256122976170846056757, is the product of my secret primes

The $n$ in the cell above is the product of two primes - but would you like to try to compute $\phi(n)$ by hand?  Without knowing the actual primes, it could be very difficult to figure out $\phi(n)$, which you probably need to get $f$.

Realistic examples have much larger primes than this, say 100 digits.  But let's see what would happen next in a 'real' example.

message='mathiscool' # needs to be in quotes secret=encode(message) # needs to be less than n html("My message is $%s$ numerically"%secret) 
       
My message is 68408084029415 numerically
My message is 68408084029415 numerically

Hopefully the randomness of the $p$ and $q$ I picked didn't keep $n$ from being greater than the numerical value of the message.

Now we pick the other piece of our key, $e$.  Believe it or not, it doesn't really seem to matter (though no one has proved this) what $e$ is; documentation for this widely used RSA implementation says "The modulus size will be num bits, and the public exponent will be e. Key sizes with num < 1024 should be considered insecure. The exponent is an odd number, typically 3, 17 or 65537."  

Well, I figure $17$ is easier to use than $65537$ but less obvious than $3$.  Let's check that it's coprime to the modulus of the key.

phi=euler_phi(n) e=17 # needs to be coprime to phi html("And I can check whether $e=17$ is coprime to $\phi(%s)$"%n) gcd(phi,e)==1 
       
And I can check whether e=17 is coprime to \phi(694955728256122976170846056757)
False
And I can check whether e=17 is coprime to \phi(694955728256122976170846056757)
False

If you get False above (I did once in a while during testing), then just pick a different $e$.  (Only evaluate this cell if you have to!)

e=65537 # needs to be coprime to phi html("Second try - is $e=65537$ coprime to $\phi(%s)$?"%n) gcd(phi,e)==1 
       
Second try - is e=65537 coprime to \phi(694955728256122976170846056757)?
True
Second try - is e=65537 coprime to \phi(694955728256122976170846056757)?
True

Once we have our key, away we go!

code=mod(secret,n)^e html("My encoded message is $%s$"%secret) html("A big product of primes bigger than that is $n=%s$"%n) html("And I chose exponent $%s$"%e) html("The encrypted message is $%s^{%s}\equiv %s$"%(secret,e,code)) 
       
A big product of primes bigger than that is n=694955728256122976170846056757
And I chose exponent 65537
The encrypted message is 68408084029415^{65537}\equiv 174251293767624831600040592853
A big product of primes bigger than that is n=694955728256122976170846056757
And I chose exponent 65537
The encrypted message is 68408084029415^{65537}\equiv 174251293767624831600040592853

Crack that!  Who knows what $\phi(n)$ is?

But if I know it, I can calculate the inverse of $e$:

f=mod(e,phi)^-1 html("My original primes were $%s$ and $%s$"%(p,q)) html("So $\phi(n)=(%s-1)(%s-1)=%s$"%(p,q,phi)) html("Which makes $f=%s$"%f) html("And the decrypted message turns out to be:") print ''.join(decode(code^f)) 
       
My original primes were 914666189264551 and 759791644659907
So \phi(n)=(914666189264551-1)(759791644659907-1)=694955728256121301713012132300
Which makes f=551716611767974353522839132573
And the decrypted message turns out to be:
MATHISCOOL
My original primes were 914666189264551 and 759791644659907
So \phi(n)=(914666189264551-1)(759791644659907-1)=694955728256121301713012132300
Which makes f=551716611767974353522839132573
And the decrypted message turns out to be:
MATHISCOOL

Beating the man in the middle:

One final thing to note about this.  Remember our problem with Diffie-Hellman key exchange - that someone who can control your messages can actually fake them?  This can't happen with public-key systems (at least not as easily).  Here's why.

Suppose I want to let someone verify I am who I say I am.  In a public-key system, I never need to let $f$ get known, so I encode my signature with $f$ itself as the exponent!  

First, I just turn my signature into a number.

signature='Crisman' code=encode(signature) 
       

Then I raise it to the power of the secret key $f$, the inverse of the public key $e$.

secret=mod(code,n)^f secret 
       
218933833603512471755737012414
218933833603512471755737012414

Now anyone in the world can check my signature by raising this version of the signature to the public power $e$ modulo $n$.  

secret^e; decode(secret^e) 
       
4342983427
'CRISMAN'
4342983427
'CRISMAN'

The reason this works is because $$ef\equiv 1\text{ mod }(\phi(n))$$ and $ef=fe$ in a commutative setting: $$\left(Name^f\right)^e=\left(Name\right)^{ef}\equiv Name^1\equiv Name \text{ mod }(n)$$

For next time, you should read the rest of this carefully, and try to do the following.

  1. Pick two primes between 1000 and 2000 and create a public key $(n,e)$ for them.  What is the decryption key $f$?  Show your work.
  2. Suppose that $n=9211$ and $e=539$.  Encrypt a (short) message.
  3. Find the decryption key $f$ for this situation, and decrypt your message.  
  4. Use $f$ to sign your name!
  5. Come up with your own RSA public-key system by choosing $p$ and $q$ and $e$ as appropriate, but with $n>10000$; then encrypt a short numerical message and hand in only the public key $(n,e)$ and the encrypted message.  (My job will be to crack it!)