## Thursday, February 27, 2014

### Finding primitive roots

First off, it has to be noted that the constraint that the modulo value is a prime number is just for simplicity (hint: all values from 1 to p-1 are coprime to p for a prime number p).
Now, what EVEN is a primitive root modulo p you may ask. This is just a number g which generates all numbers from 1 to p-1 with $$g^{t} \: mod \: p$$. That is, for every positive integer x < p there exists a t, so that $$g^{t} \equiv x \pmod{p}$$.
Properties of primitive roots:
• for a given p there are phi (phi(p))=phi(p-1) primitive roots
• g is a primitive root $$\Leftrightarrow$$ for every divisor d of $$(p-1)=phi(p)$$, $$g^{d} \not\equiv 1\pmod{p}$$ holds $$\Leftrightarrow$$ for every prime $$x$$ from the prime factorization of $$p-1$$, $$g^{(p-1)/x} \not\equiv 1\pmod{p}$$ holds.
• (comment: this is essentially there to check for repeating sequences in $$g^{t}$$ for $$t \in {1,..,p-1} \pmod{p}$$.
Some research: I've ran this code to look at the distribution of the primitive roots for primes between 100 and 1,000,000,000. By running the code we see this from the output:
• the lower bound for the number of primitive roots is $$0.15 \cdot p$$. Therefore, 15% of the numbers smaller than $$p$$ are primitive roots for a given p (in worst case!). By selecting a random positive integer < p we have a 15% chance of selecting a primitive root!
• given a random p, the chance of hitting a primitive root on random is even ~40%.

Therefore, in order to find a primitive root, we may simply pick up random numbers and check the above property which has to hold for primitive roots. Code which does that: link.

### Now to the discrete logarithm (the index):

for a given x, how to find a $$y=ind_{g}(x)$$ with $$g^{y} \equiv x \pmod{p}$$. The answer is: you can't. Many encryption algorithms are based on this fact. Though, they have some nice properties as normal logarithms:
• $$ind_{g}(ab) \equiv ind_{g}(a) + ind_{g}(b)\pmod{p-1}$$
• $$ind_{g}(a^{k}) \equiv k \cdot ind_{g}(a)\pmod{p-1}$$

Assume we have a problem where we have to iterate over all $$x = \sum\limits_{i=1}^n c_{i} \cdot a_{i}$$, where the $$c_{i}$$ are freely selectable. In such problems a very easy simplification can be made: each such $$x$$ can be represented in the form $$x = t \cdot q$$, where $$t = gcd(a_{1},...,a_{n})$$ and $$q \in \mathbb Z$$.
To understand the above statement, we only need to think of the extended euclidean algorithm. Inside of it we find the integers $$x$$ and $$y$$ so that $$x \cdot a_{1} + y \cdot a_{2} = gcd(a_{1},a_{2})$$ holds. Additionally, we can easily imply that each number $$v$$ written in the form $$v = c_{1} \cdot a_{1} + c_{2} \cdot a_{2}$$ must be a multiple of $$g:=gcd(a_{1},a_{2})$$ (as g divides $$a_{1}$$and $$a_{2}$$ it must divide v for the equation to hold (!)).
We show now that the same holds for 3 variables. Let $$x = a_{1} \cdot c_{1} + a_{2} \cdot c_{2} + a_{3} \cdot c_{3}$$ and $$g:=gcd(a_{1},a_{2},a_{3})$$. We have to prove that $$x$$ is a multiple of $$g$$. Consider the first two variables and write: $$a_{1} \cdot c_{1} + a_{2} \cdot c_{2} = t_{1} \cdot gcd(a_{1},a_{2})$$. This gives us $$x = a_{1} \cdot c_{1} + a_{2} \cdot c_{2} + a_{3} \cdot c_{3} = t_{1} \cdot gcd(a_{1},a_{2}) + a_{3} \cdot c_{3}$$. Simply seen here: there are again two fixed numbers and two free! Smells like the extended euclidean again :)
We write: $$G := gcd( gcd(a_{1},a_{2}) , a_{3} ) = gcd(a_{1},a_{2},a_{3}) = g$$ and get: $$x = t_{1} \cdot gcd(a_{1},a_{2}) + a_{3} \cdot c_{3} = t_{2} \cdot g$$ - with this we're done! Inductively the statement follows for 4 and more variables.