A euclidean domain is a generalization of the integers, that still supports Euclid's gcd algorithm. Like the integers, it is an integral domain, which means multiplication commutes, and there are no zero divisors. A metric function, d(), maps the nonzero elements into the positive integers, and satisfies d(a*b) ≥ d(a) and d(b), and for any nonzero a and b there is q and r, such that a = q*b+r, and r = 0 or d(r) < d(b). I use q and r to remind you of quotient and remainder. Either b goes into a evenly, or the remainder, when a is divided by b, is in some sense smaller than b. This is the heart of Euclid's gcd algorithm.
By convention, d(0) = 0.
Verify the following about d().
d(1) is minimal, as 1 divides everything else.
d(u) = d(1) for any unit u, as 1 and u divide each other.
If everything is a unit, like the rationals, then d(x) = d(1) across the board.
If a and b are associates, wherein a and b divide each other, like 5 and -5, then d(a) = d(b).
If d(u) = d(1), write 1 = qu + r. Since d(r) cannot be less than d(u), u divides 1, and u is a unit.
u is a unit iff d(u) = d(1).
d() can be normalized by subtracting d(1)-1 across the board, whence d(1) = 1. Now d(u) = 1 iff u is a unit.
The integers form a euclidean domain, with d(n) = the absolute value of n.
Even if you do not know that d(a) ≤ d(ab), the units all come out lower than the nonunits. Let u be a unit and let x be a nonunit. Since x does not divide x+u, the remainder, u, has a lower metric than that of x. There is some x with d(x) minimal, and all the units lie below. Once again there is no harm in setting the metric of all the units to 1.
You can see how we're going to run the gcd algorithm. If any factor f, or the greatest factor g, divides a and b, it divides b and r; and if g divides b and r it divides a and b. Replace a and b with b and r, and since d(r) is smaller, we have made progress. This leads to g, the greatest common divisor of a and b.
One more thing; we need to prove a euclidean domain is a factorization domain. Suppose f cannot be written as a finite product of irreducibles. Think of f as f0, and write f0 = x1f1, where f1 cannot be written as a finite product of irreducibles. By construction, f0 is a multiple of f1. If f1 is a multiple of f0, say cf0, then f0 = x1cf0, which makes x1 a unit. Since f0 was separated into nonunits, this is a contradiction. Thus f1 is not a multiple of f0.
Let H0 be all the multiples of f0, and let H1 be all the multiples of f1. H1 properly contains H0.
Write f1 = x2f2, where f2 cannot be written as a finite product of irreducibles. Repeating the above, H2, the multiples of f2, properly contains H1. Continue this forever, building an ascending chain of sets Hi.
Let U be the union over Hi. Let s be the nonzero element of U with the least metric. For purposes of illustration, let H7 bring s into the picture. Thus s is a multiple of f7.
For any t in U, write t = qs + r. Since d(r) cannot be smaller than d(s), t divides s. Everything in U is a multiple of s. Since f8 belongs to U, f8 is a multiple of s. Yet s is a multiple of f7, hence f8 is a multiple of f7. This is a contradiction. Irreducibles cannot spin off forever. Every nonunit is a finite product of irreducibles, and the euclidean domain is a factorization domain.
A factorization domain with a gcd algorithm becomes a unique factorization domain, or a ufd. A Euclidean domain is a ufd. Everything factors uniquely into primes.
Nowhere in this proof do I use the fact that d(ab) ≥ d(a) or d(b). It doesn't seem to be necessary to produce the ufd. I'm not sure why it is included in the definition of a euclidean domain; perhaps to keep things neat and tidy.
Run the gcd algorithm to find g, the gcd of s and t, and remember the quotients along the way. Then backtrack to write g as a linear combination of s and t. This procedure was described for the integers, but it applies to any euclidean domain. The algebra is the same.
From here you can characterize the solutions to various equations such as xs + yt = n. The solution space might not be a lattice in the geometric sense, but the algebra is the same.
Finally, find the inverse of x mod m by solving yx + km = 1, assuming x and m are coprime. Use the same algorithm as the integers.
Review the Gaussian integers, a grid of points that defines a checkerboard in the complex plane.
Let d() be the norm of a+bi, which is a2+b2. This is a positive integer everywhere except the origin, and it is 1 only for the units ±1 and ±i. Since the norm of the product is the product of the norms, d() increases as gaussian integers are multiplied together, or stays the same if one of the operands is a unit. The tricky part is the quotient remainder criterion. I'll illustrate by looking for a common divisor of 10 and 17+24i.
If g divides both numbers, it divides 17+24i - 10×(2+2i).
But how did I select 2+2i?
The number 10 acts as a base for a lattice in the complex plane.
Ten times the integers produces multiples of ten along the x axis.
Ten times the integers times i produces multiples of ten up the y axis.
Ten times the gaussian integers produces a larger grid in the complex plane.
These dots determine an infinite checkerboard
where each square is ten units on a side.
Find the square that contains 17+24i.
Its coordinates run from 10 to 20 along the x axis, and from 20 to 30 along the y axis.
Which corner is closes to 17+24i?
The answer is 20+20i, being just 5 units away
by the pythagorean theorem.
Subtract this corner, and g divides -3+4i.
This is a smaller number, i.e. closer to the origin, than 10,
so we've made progress.
Actually we were bound to make progress,
because the distance from any point in a square to its nearest corner
is always less than the side of the square.
Since norm is distance squared, this is the euclidean criterion we were looking for.
Continue the gcd algorithm by using -3+4i as the base for a new checkerboard in the complex plane. This checkerboard is tilted with respect to the axes. Multiply the base vector -3+4i by 0, 1, i, and 1+i to build the base cell. This square, and the entire checkerboard, is tilted, but that doesn't matter. The squares are smaller, just 5 units on a side, and that's what counts. The point 10 on the x axis lies inside a square, and the corner closest to 10 is 11+2i. Subtract these two points, giving a difference of 1+2i. This is even closer to the origin. If this vector acts as the base for a smaller, tilted checkerboard, -3+4i lands smack on one of the grid points. In other words, 1+2i divides evenly into -3+4i, and we've found our gcd. Both 10 and 17+24i are divisible by 1+2i, and any other factor that divides both numbers divides into 1+2i. Norm is a euclidean metric that facilitates the gcd algorithm, and proves unique factorization. The Gaussian integers forma ufd. There are infinitely many gaussian primes, via the same proof as the integers. If z is the product of all the primes on the finite list, spin it around to the first quadrant, then add 1, giving a number with a larger norm. Given a set of gaussian integers s1 through sn, one can solve the equation x1s1 + x2s2 + … + xnsn = z. As before, the solutions form an n-1 dimensional lattice, shifted away from the origin, but this time the numbers are all complex. The dimension of the lattice doubles to 2n-2 if you want to picture the solutions in real space with real variables. |
|
Remember that the norm of u is 1 iff u is a unit, that is, 1, -1, i, or -i. With this in mind, a factorization of z into smaller primes implies a factorization of its norm. Turn this around, and |z| prime implies z is prime. Thus 1+i, 2+i, 3+2i, and 4+i are prime, with norms 2, 5, 13, and 17 respectively.
The first of these, 1+i, is special in that its conjugate 1-i is really the same prime. That is, (1+i) × -i = 1-i. The two primes are associates of each other, just like 7 and -7 in the integers. Take any other z in the first quadrant, and assume its conjugate, in the fourth quadrant, is z × -i. That puts z on the 45 degree line, and makes z a multiple of 1+i, whence z is not prime. So 1+i is the only prime whose conjugate is an associate of itself. Any other prime, such as 2+i, implies a distinct conjugate prime 2-i.
So what is the unique factorization of 2? Write 2 = 1+i × 1-i. Remember that 1+i and 1-i are associates, so 2 is actually (1+i)2 (up to associates). In other words, 2 is a prime squared. The integer 2 is called a ramified prime, relative to the gaussian integers. Its factorization is a complex prime squared.
The norm of 2+i is 5, hence 2+i is prime. As shown above, 2-i is not an associate of 2+i. It's a different number - a different prime. Thus 5 = 2+i × 2-i. This is the unique factorization of 5. There are many other examples, such as 13 = 3+2i × 3-2i, and 17 = 4+i × 4-i. These primes are called unramified.
What is the factorization of 73? The norm of 73 is 732. If 73 is not itself prime then it is the product of primes by unique factorization. The norm of each prime factor properly divides 732, hence the norm of each prime factor is 73. If the first such factor is s, then the conjugate of s also divides 73, hence that is the "other" factor of 73. An integer prime remains prime or it splits into a complex prime times its conjugate. In our example, a2+b2 = 73. Reduce mod 73, and (a/b)2 = -1. However, -1 is not a square mod 73. Therefore 73 remains prime, even in the complex numbers. This holds for every prime that is 3 mod 4, such as 11 or 103. These are called inert primes.
If p = 1 mod 4, -1 is a square mod p, and there is some x such that x2 + 1 = 0 mod p. Write this as x+i × x-i = kp. Since p does not divide x+i or x-i, p is not prime. Instead, p = a+bi × a-bi, and the norm of a+bi is p. That makes p an unramified prime. All the real primes are either ramified (2), unramified (1 mod 4), or inert (3 mod 4). (Technically, inert primes are also unramified, but I refer to them as inert for clarity; they are inert because they don't move or change in the extension.)
In practice, we can find a+bi, even if p is a thousand digits long. Fish around for a primitive root mod p. You'll find one within a few tries. Let r be such a root, then x is r(p-1)/4. That gives x+i; run gcd(x+i,p) as described above to produce a+bi.
Let z be a complex prime between the x axis and the 45 degree line. Thus z conjugate is a different prime. Multiply z by its conjugate to get the norm. If the norm factors into a product of primes, then assume z divides the first prime p. If z divides p then so does z conjugate. The norm of z divides p, and equals p, and that case has already been covered. All the gaussian primes have been characterized. They are 1+i (a special case), unramified (such as 2+i and 2-i over 5), or inert (such as 73).
Recall that 2+i is prime, with norm 5. What do the gaussian integers look like mod 2+i? Start with 1 and march along the x axis. When you get to 5 you are back to 0, because 5 is a multiple of 2+i. On the other hand, 2+i does not divide any of the integers smaller than 5, so the elements 0 through 4 are distinct mod 2+i. There are at least 5 values mod 2+i.
In fact there are only 5, as shown by a geometric argument. Multiply 2+i by 0, 1, 1+i, and i to build the base cell. This is a square that runs from 0 to 2+i to 1+3i to -1+2i and back to 0. This cell repeats to produce a tilted checkerboard across the plane. Each square has a side of length sqrt(5), and an area of 5. Step back far enough, and lattice points have to agree with area. Each lattice point has area 1 and each square has area 5. Thus each square defines 5 lattice points, one at its bottom corner and 4 inside. The complex integers mod 2+i are 0, 1, 2, 3, 4, or if you prefer, 0, i, 1+i, 2i, and 1+2i, inside the base cell. The correspondence is as follows:
0 → 0
1 → 2i
2 → 1+2i
3 → i
4 → 1+i
Reducing mod 2+i is a homomorphism that respects addition and multiplication, so these 5 elements, in any cell you like, or anywhere in the plane, act just like 0 1 2 3 and 4 mod 5.
The same thing happens for any prime a+bi, with norm p, where p = 1 mod 4. Reduce mod a+bi and find a structure that is isomorphic to the integers mod p.
If p is an inert prime, like 73, then reduce mod p, and pull the plane back onto the base cell from 0 to p to p+pi to pi. This has an area of p2, like the norm, and contains p2 lattice points. Add and multiply using the rules for complex numbers, then reduce the components mod p. Since p is prime, everything in the square, other than 0, is invertible. Use the gcd algorithm and backtrack to find the inverse of z mod p.
Let m be the product of 3 distinct complex primes q1×q2×q3. What do the gaussian integers look like mod m? Reduce mod q1, mod q2, and mod q3, as described above, then take the cross product of these three structures. This works because of the chinese remainder theorem. That was developed for integers, but it works here as well. The bijection relies on the inverse of q2×q3 mod q1, and thanks to the gcd algorithm, this inverse can be computed. The rest of the proof is the same. And this holds for any euclidean domain. If you know the structure of the domain mod pk for any prime p, then factor m into distinct prime powers, find the quotient mod pk for each pk in m, and paste the results together as a direct product of rings.
Find all integer solutions for x2 + y2 = z2. These are called pythagorean triples, because they form the sides of a right triangle, following the pythagorean theorem. The smallest solution is 3,4,5, hence a right triangle with sides of lengths 3 and 4 has a hypotenuse of length 5.
If any two variables have a common factor then that factor can be divided out of the equation. It is enough to look for solutions where the three variables are mutually coprime. No need to talk about 6, 8, 10, if you already have 3, 4, 5 in hand.
Replace x2 + y2 with (x+yi) × (x-yi). Consider the prime factors of x+yi. If an integer prime divides x+yi then x and y have a common factor, which we have just ruled out. Hence all the prime factors of x+yi are complex.
If 1+i divides x+yi then z is even. Yet x and y are odd, to remain coprime to z. When the equation is reduced mod 4 we obtain 1+1 = 0, which is impossible. Thus z is odd, and without loss of generality, let y be even.
If some prime s divides both x+yi and x-yi, take the difference and sum to show s divides 2x and 2y. We've already ruled out 1+i, so s cannot divide 2. It must divide both x and y. The same holds for s conjugate, so s times s conjugate, a real number, divides x and y, which contradicts x and y coprime. Therefore x+yi and x-yi are coprime, no factors in common.
Every prime dividing x+yi divides z, and z is squared, so every prime in x+yi is taken to an even power. Gather the primes together into the complex number u+vi, such that x+yi is the square of u+vi, or an associate thereof. The conjugate primes form u-vi, and their square builds x-yi. Square u+vi and equate terms, giving the following.
u > v > 0, u and v coprime, u and v of different parity:
x ← u2-v2
y ← 2uv
z ← u2+v2
Note that u and v are coprime, else their common factor would divide x y and z. They also have different parity, else z is even.
We're assuming y is even, and since 2uv is even, components correspond.
If either u or v is negative, make it positive. This might change y from negative to positive, but we only need positive pythagorean triples, so that's fine.
If v > u, swap them, making x positive.
Every coprime pythagorean triple leads to u > v > 0, with u and v coprime, and having opposite parity. Conversely, every u v pair with these properties produces a valid pythagorean triple. Verify this by squaring x y and z, using the formulas above. Pythagorean triples and u v pairs correspond 1 for 1.
As an example, set u = 5 and v = 2, hence x = 21, y = 20, and z = 29. Sure enough, 441 + 400 = 841.
If z2 had a coefficient of 7, we'd be in trouble. Inert primes that are 3 mod 4, like 7, can't be split between x+yi and x-yi, hence there is no solution. A coefficient of 73 is no better. When x and y are divisible by 7, that takes two of the factors away, but there's still a factor of 7. In general, a coefficient on z2 that contains a prime p to an odd power, where p = 3 mod 4, prohibits any integer solutions.
A coefficient of 5, however, is just fine. Multiply by 1+2i and x = u2-v2-4uv, and y = 2u2-2v2+2uv. Set u = 9 and v = 4 and get 792 + 2022 = 5×972. Or if you prefer, multiply by 1-2i and get 2092 + 582 = 5×972.
Multiply u+vi by i, and x y and z don't change, up to sign, so associates aren't significant. Replace v with -v, and that is like multiplying (u+vi)2 by 1-2i instead of 1+2i, so conjugation is not significant either. Thus the criteria are the same as they were before: u > v > 0, u and v coprime, and u and v of different parity. But be careful; u = 7 and v = 4 is fine when multiplied by 1+2i, giving 792 + 1222 = 5×652; but multiply by 1-2i instead and get 1452 + 102 = 5×652. That's fine, mathematically, but it's not a coprime solution. u+vi already had a factor of 1+2i, so when you bring in 1-2i you get 5.
If the coefficient on z2 is 65, then multiply (u+vi)2 by 1±2i and 3±2i, for 4 classes of solutions. The u v criteria are the same, and some of the emerging triples may not be coprime.
How many ways can n be written as a sum of squares?
If n = a2+b2 then n is divisible by a+bi. Conversely, each a+bi leads to a representation of n as a sum of squares. We're not interested in associates or conjugates of a+bi, which simply change signs or swap variables. Count the unique factors of n, of the form a+bi, disregarding associates, then divide by two to get rid of the conjugates.
If p prime divides n, and p = 3 mod 4, p doesn't split into complex primes. It has to divide a+bi, hence p2 divides n. All the primes in n that are 3 mod 4 must appear in even powers, or n is not the sum of squares at all.
If n is even, let a+bi hold all the instances of 1+i, and put the conjugates, the powers of 1-i, in a-bi. Since 1+i and 1-i are associates, it really doesn't matter how we divide them up.
Let pk divide n, where p is an integer prime that is 1 mod 4. Select anywhere from 0 to k instances of the prime u+vi lying over p and fold them into a+bi. If j instances are selected, then k-j instances of u-vi are folded into a+bi. We have k+1 options. Do this for all the other primes mod 4. The number of representations is the product of ki+1, where n contains ki factors of pi, and pi is 1 mod 4.
As mentioned earlier, we must divide this combinatorial product by 2, to compensate for the double counting of conjugate pairs a+bi and a-bi. This makes sense when the product is even, but when the product is odd, because every ki is even, it is possible to choose half and half for each prime p in n. In this one case, conjugation produces the same sum of squares. We don't want to throw this one away, so round the product up to the nearest even number, then divide by 2.
At this point an example is desperately needed. Consider 23×54×72×372. Since 7 is 3 mod 4, it has to be squared, and it is. Combine 7 with 3 instances of 1+i to get -14+14i. The exponents on the remaining primes are 4 and 2, so the number of representations is 5×3 or 15. Round this up to 16 and divide by 2, giving 8. These are summarized below, in no particular order, except that the last entry is the odd one, that acts as its own conjugate, and was not double-counted in the 15 combinations. Verify that each pair of numbers below, when squared, yields 335405000.
3122, 18046 16310, 8330 16450, 8050 3430, 17990 12334, 13538 8806, 16058 18130, 2590 12950, 12950
Find all solutions to x2 + y2 = z4. Assume variables are pairwise coprime.
This is of course a pythagorean triple, but this time u2+v2 = z2. Well that is another pythagorean triple. Characterize these triples as: u = s2-t2, v = 2st, and z = s2+t2. This does not assure u > v, but if v > u, x will come out negative, and that's no big deal.
s > t > 0, s and t coprime, and s and t of different parity:
x ← s4 - 6s2t2 + t4
y ← 4st×(s2-t2)
z ← s2 + t2
The first example, with s=2 and t=1, is 72 + 242 = 54.
Set s = 3 and t = 2 to get 1192 + 1202 = 134.
However, x and y don't have to be coprime. If x and y are divisible by m2 then do so, and divide z by m. That reduces to a smaller triple. Assume instead that x, y, and z2 are divisible by m, where m is squarefree. Dividing z by m won't help in this case. We can't reduce to a smaller triple.
Let k2 + l2 = m2. Multiply x+yi times k+li to get new values of x and y, then multiply these, and z, by m. Take the last example, 1192 + 1202 = 134, and fold in the 3 4 5 triangle, giving (123×5)2 + (836×5)2 = (13×5)4 for 3+4i, and (837×5)2 + (116×5)2 = (13×5)4 for 3-4i.
Now look for coprime solutions to x4 + y2 = z2. This time u2 - v2 = x2, or x2 + v2 = u2. Since x is still odd, v is even. Characterize these triples as follows.
s > t > 0, s and t coprime, and s and t of different parity:
u ← s2+t2
v ← 2st
x ← s2-t2
y ← 4st×(s2+t2)
z ← s4 + 6s2t2 + t4
The first example, with s=2 and t=1, is 34 + 402 = 412.
Again, you can fold in a klm triple, where k is squarefree. The fold-in process is a bit different. Move y to the other side: x4 = z2 - y2 = (z+y) × (z-y). Adjust z+y by m+l and m-l. The 3,40,41 example, with 5,12,13 folded in, yields (3×5)4 + (1012×5)2 = (1013×5)2 for 13+12, and (3×5)4 + (28×5)2 = (53×5)2 for 13-12.
Finally raise y, the even variable, to the fourth power: x2 + y4 = z2. Assume a coprime solution, whence 2uv = y2. With u and v coprime, one is an odd square and the other is twice a square. It looks like this.
s and t positive, s and t coprime, and s odd:
x ← s4 - 4t4
y ← 2st
z ← s4 + 4t4
Set s = t = 1 and get the familiar 3,4,5 triangle in a new way.
32 + 24 = 52
Set s = 7 and t = 2 and get:
23372 + 284 = 24652
Again, a klm triple can be folded in, where l is squarefree. Move x2 to the other side and adjust z+x by m+k and m-k.
Instead of adjoining the square root of -1, as we did to build the gaussian integers, adjoin the square root of -2. Let q be the square root of -2, not to be confused with i, the square root of -1. In the complex plane, q is the point on the imaginary axis approximately 1.414 units up from the origin. Move one unit to the right to find 1+q, one unit to the left to find -1+q, and so on. This extension creates a regular grid, a lattice in the plane, but this time the grid defines rectangles, rather than squares. Each rectangle is almost half again as tall as it is wide.
Recall that the gaussian integers had a norm, the square of the distance from the origin. This norm is still valid, and still commutes with multiplication. This makes sense, since the norm is valid over the entire complex plane. Furthermore, this norm makes the extension a euclidean domain. As before, the only tricky part is the quotient remainder criterion. If b is a complex number, the base cell is b, multiplied by 0, 1, 1+q, and q. This defines a tilted rectangle in the plane. All the multiples of b create a grid of rectangles across the plane. If a is another point in the plane, it winds up in one of these rectangles. Its distance to the nearest corner is less than the length of b. This is the remainder when a is divided by b, and it has a smaller norm. (Norm is actually distance squared, but a smaller distance implies a smaller norm.) The quotient remainder criterion is satisfied, and the ring is a euclidean domain, and a unique factorization domain (ufd).
The units have norm 1, and since the norm of a+bq is a2+2b2, the only units are ±1.
If a+bi has norm p, where p is prime, then -2 is a square mod p. This happens iff p = 1 or 3 mod 8. The integer primes are: 2, ramified, since q2 is an associate of 2, p= 1 or 3 mod 8, unramified, and p = 5 or 7 mod 8, inert.
If p is a thousand digits long, can we find the complex prime over p, as was done successfully in the gaussian integers? Assume p = 1 mod 8. Find a primitive root r and raise to the (p-1)/8. Call this x, the eighth root of 1. Remember that x is also the fourth root of -1. Square x3+x and get -2. Thus x3+x+q has a norm that is some multiple of p. Since p does not divide x3+x+q or x3+x-q, some prime factor of p does. This is once again a consequence of unique factorization. Take gcd(x3+x+q,p) to extract the complex prime over p.
If p = 3 mod 8 things are a bit trickier. Adjoin, to the integers mod p, the element i, the square root of -1, which was not there before. I'm getting ahead of myself here, but this creates a finite field of order p2, and it has a primitive root r, and from here you can do the same thing. But how do you know x3+x is going to be a pure integer; what if it still has an i component? Remember that -2 is a square mod p, so it has 2 square roots in the integers. It can only have 2 square roots, because y2+2 has at most 2 solutions. So x3+x is an integer, and x3+x+q makes sense. Take the gcd with p to find the prime over p.
Note that x3-x is the square root of 2.
If z is a complex prime, then the analysis is the same as it was in the gaussian integers. z*z is some integer with prime factors, so z divides some prime p. The conjugate z is a distinct prime that also divides p, |z| is a nontrivial factor of p, |z| = p, and z is a prime lying over p. All the primes of this ufd have been characterized.
With the primes in hand, use unique factorization to find all integer solutions to x2 + 2y2 = z2. This is another variation on the pythagorean triple.
Once again we are looking for positive, coprime solutions. This means x and z are odd, else all variables would be even.
Looking mod 4, y must be even, else 1+2 = 1.
Rewrite x2 + 2y2 as (x+yq)×(x-yq). As before, only complex primes divide x+yq, x+yq and x-yq are coprime, all primes in x+yq are squared, x+yq is the square of some u+vq, and the following equations hold.
u > 0, v > 0, u and v coprime, u is odd:
x ← u2-2v2
y ← 2uv
z ← u2+2v2
As before, u and v are coprime, but this time u is odd, and v can attain any parity. Both y and 2uv are even, so components correspond. Make u and v positive, which makes y positive. However, if x is negative, we can't simply swap u and v as we did before. That's all right; we'll allow x to be negative when u < sqrt(2)v. That leaves u and v in the first quadrant. Once again the map is 1 for 1.
When u = 5 and v = 2, x = 17, y = 20, and z = 33.
When u = 7 and v = 3, a same parity solution, x = 31, y = 42, and z = 67.
When u = 3 and v = 7, x = -89, y = 42, and z = 107.
Place a coefficient of 23 on z2, and there is no solution. This because 23 is an inert prime, and does not factor. But a coefficient of 11 = 3+q × 3-q is fine.
At this point you're asking yourself, "Is there a ring that is not a euclidean domain, not even a unique factorization domain? When does unique factorization fail? Is there some element that factors into irreducibles in two different ways?"
Let q be the square root of -5 and adjoin q to the integers. This builds another grid of rectangles in the complex plane. These rectangles are a bit taller than those of the previous section, 2.236 versus 1.414. The norm, a2+5b2, is still a valid multiplicative homomorphism into the positive integers. The only units are ±1, having norm 1.
Consider (2+q) × (2-q) = 9. Yet 3 × 3 is also 9. The norm of 3 is 9, and if s was a proper factor of 3 it would have norm 3, but nothing has norm 3, (a2+5b2 is never equal to 3), so 3 is irreducible. In the same way, 2+q and 2-q are irreducible. The composite number 9 is the product of irreducible factors in two different ways.
So what happened to our gcd algorithm? It must fail somewhere. The problem is, the rectangles are too tall. The middle is too far from the corners, and if we subtract, the difference can be larger than the width of the rectangle. Try running gcd(3,2+q). Let 3 act as the base vector, spawning a series of rectangles in the complex plane. These cells are lined up with the axes, and are easy to picture. They are 3 units wide and approximately 6.7 units tall. The element 2+q lies in the first cell, and the closest corner is at 3,0, so subtract, giving 1-q. Use this as a base vector; it's smaller, with norm 6. Now the rectangles are tilted, so get ready to turn your head. The base rectangle with its bottom corners at the origin and 1-q runs up and to the right, with its top corners at 5+q and 6. Now 3 is smack in the center of this rectangle, and is 3 units away from all four corners. If we consider 2+q instead, it is the center of the adjacent cell, and is 3 units from its four corners. The gcd algorithm cannot proceed. This is not a ufd.
Neither can we characterize all the solutions to x2 + 5y2 = z2, as we did before. You can make x+yq a square, and that works, giving x = u2-5v2, y = 2uv, and z = u2+5v2, but it doesn't cover everything. The solution x=2, y=1, z=3 has no corresponding uv pair, since 2uv would have to equal 1.
For notational convenience, let q be the square root of -3. In the world of complex numbers, the sixth root of 1 is (1+q)/2. Multiply it out and see, or plot it on the unit circle. Adjoin this to Z and see what happens. The resulting ring is called the Eisenstein integers, just as Z adjoin i forms the Gaussian integers.
This time the adjoined element is not a pure imaginary number, hence the vector that is based at the origin is not perpendicular to the x axis. In other words, the cells of the lattice are no longer rectangles. They are pushed over rectangles, or parallelograms. It is tempting to cut each parallelogram in half, making an infinite lattice of perfect equilateral triangles, and that is what I did in my original characterization of this ring, but that disguises some of the algebra. There are really two vectors, 1 and (1+q)/2, and they span a lattice of parallelograms, just like any other pair of independent vectors in the plane.
The lattice consists of multiples of 1 and q, but the coefficients can be half integers. Both are integers or both are half integers. This holds as lattice points are added together. Verify the same for multiplication: (1+q)/2 times {1, or q, or (1+q)/2}, produces half integers. The ring is well defined.
The norm, a2+3b2, is still a valid multiplicative map into the positive integers, even if a and b are half integers. All this works for q the square root of -7, -11, -15, -19, and so on.
There are six different units with norm 1, namely the six roots of 1 in the complex plane. These are spaced 60 degrees apart around the unit circle.
Now let's go for the gold. Consider the gcd algorithm. Let the shorter vector establish a lattice of parallelograms, (in this case rhombuses), find the cell that contains the longer vector, and measure the distance from the end of the longer vector to the four corners of the cell that contains it. The distance is always less than the side of the rhombus. Thus the norm (which is distance squared) has decreased. The gcd algorithm runs to completion, giving a euclidean domain and a unique factorization domain (ufd).
If q is the square root of -7 or -11, and you adjoin (1+q)/2, the parallelograms grow taller, but the distance between the center and the nearest corner is still less than the short side of the parallelogram. The gcd algorithm works, and these are all ufds.
Returning to q = sqrt(-3), every prime, and every ring element for that matter, has an associate that does not require half integer coefficients. If x has coefficients a/2 and b/2, ask whether a = b mod 4. If they are equal, multiply x by (1-q)/2; otherwise multiply x by (1+q)/2. The result presents whole coefficients. These form a subring, which is coming up in the next section.
Going the other direction, any prime other than 2 has associates with half integer coefficients. Assume a prime other than 2 has whole coefficients. The norm is odd, so a and b have opposite parity. Multiply by (1+q)/2 and get half integers. The exception is 2, with norm 4, and associates -2 and ±1±q.
Speaking of primes, they are easier to categorize when the extension is viewed as cyclotomic. Adjoin w, the cube root of 1, rather than q. Since w+1 = (1+q)/2, the resulting grid is the same. But the coefficients are integers once again; no need to deal with half integers.
Since w3 = 1, and w is not 1, divide by w-1 (taking that root out of the picture), and find w2+w+1 = 0. Solve this by the quadratic formula and get w = ½(-1±q)).
The conjugate of w is w2, and the norm of a+bw is a2-ab+b2. At first this looks like it might come out negative, but it is still the distance squared to the origin, and has to be a positive integer, or 0 if a = b = 0.
As per the above formula, the norm of 1-w is 3. This makes sense if you plot it in the complex plane; it is sqrt(3) distance from the origin. In confirmation, multiply 1-w by its conjugate and get 3.
(1-w) * (1-w2) =
1 - w - w2 + w3 =
2 - w - w2 =
3 - (1 + w + w2) =
3 - 0 = 3
But the conjugate, 1-w2, is 1-w times the unit 1+w. The primes 1-w and 1-w2 are associates; they are essentially the same prime. In other words, (1-w)2 * (1+w) = 3. This makes 3 a ramified prime, with prime factorization (1-w)2.
When we adjoined the square root of -1, or the square root of -2, every complex prime (i.e. not on the x axis) and its conjugate were distinct primes, not associates of each other, except for the prime lying over 2. We want to prove the same thing here. Setting aside 1-w and 1-w2, lying over 3, all other complex primes come in distinct conjugate pairs. It's not quite as easy to prove however, so let's get started.
Remember that the units are the sixth roots of 1, thus conjugate pairs will be associates iff they differ by 60, 120, 180, 240, or 300 degrees at the origin.
An element of the form a-aw is a times 1-w, and is certainly not prime, unless a is 1, giving the prime over 3. So if the prime is a+bw, we don't have to worry about a = -b any more. Our prime cannot have an angle of -30 with the x axis. Taking the 6 associates, our prime cannot meet the x axis at an angle of ±30, ±90, or ±150 degrees.
Suppose a prime a+bw and its conjugate are associates. If b is 0 the prime is real. If a is 0 the prime is w times b, thus an associate of a real prime. Thus a and b are nonzero.
Multiply by -1 if need be, so that a is positive. Travel a distance a along the x axis, then a vector of length b rises up and to the left, or down and to the right if b is negative, at an angle of 60 degrees. Considering both the prime and its conjugate together, the picture looks like an arrow or a capital Y lying on its side.
Pushing Y to the right, the angle decreases from 120 to 60 degrees, but this is the prime 1-w, which lies over 3. Larger values of a make a smaller angle; no associates here.
Move to the arrow, wherein b is positive. As the vertex moves right the angle becomes 180, which has been ruled out, then 120. This happens when a = b, as in 1+w. Once again we have a real prime times a unit. Further to the right, the angle becomes 60 degrees, but that puts the prime at 30 degrees to the x axis, which has been ruled out. These are all the cases; hence a prime and its conjugate are distinct.
Let p be a real prime that is 2 mod 3. Its norm is p2. If a complex prime divides p then so does its conjugate, which is a different prime, hence a2-ab+b2 = p. Reduce mod p, then multiply by a+b, giving a3+b3. Since a ≠ -b, a2-ab+b2 = 0 iff a3+b3 = 0. This makes a/b a cube root of -1 mod p. If p = 2 mod 3, then p-1 is coprime to 3. Cube all the integers mod p and you simply permute them about. The cube of -1 is -1, and nothing else lands on -1. Thus a/b = -1, and a = -b, which we already ruled out. Therefore p is an inert prime.
When p = 1 mod 3 there is a nontrivial cube root of 1. In fact this can be found computationally, even if p is a thousand digits long. Fish around for a primitive root and raise it to the (p-1)/3. Call this x, and x-w has norm x2+x+1, which is 0 mod p. Take the gcd of x-w and p to find the prime a+bw over p. These are the unramified primes.
Given x as above, (2x+1)2 = -3. If p is also 1 mod 4, multiply this by the square root of -1 to get the square root of 3. There is an efficient process to find the square root of -1, ±2, and ±3 mod p, whenever these numbers are squares mod p. Actually we can find the square root of anything mod p, but that's coming up later.
How many ways can n be represented as a2 - ab + b2? This is similar to the sum of squares problem that we solved earlier. Factor a2-ab+b2 into a+bw times its conjugate. Group primes together to build a+bw, leaving the conjugate primes for the conjugate factor.
To keep things simple, let the order of a and b be significant, and allow a and b to be negative. Any solution can be multiplied by ± a power of w to get a new solution. Only 0 stays put when multiplied by w, so each solution expands into 6.
So how many solutions are there?
If an inert prime, such as 5, divides n, it must appear to an even power. If n is divisible by 25 then a and b are divisible by 5, and we can move on.
If 3 divides n, then 1+q or its conjugate is part of a+bw. It doesn't matter which one, as they are associates.
That leaves the primes that are 1 mod 3. If pi is raised to the ki, there are ki+1 possibilities. None, some, or all of the upper primes can be folded into a+bw. Take the product over ki+1, then multiply by 6 as described above. This is the number of a,b pairs that yield n.
If signs don't matter, then multiplying by -1 gives the same solution. Divide through by 2. More combinatorial analysis could be done here, e.g. the effect of swapping variables etc, but this problem doesn't come up very often, so I'm not going to worry about it.
Here is a simple example, with n = 7×13 = 91.
5, -6 6, 11 -11, -5 10,1 -1, 9 -9, -10 9 -1 1, 10 -10, -9 11, 6 -6, 5 -5, -11
You can also characterize the solutions to a2 - ab + b2 = c2. I'll leave that as an exercise.
a2+ab+b2 falls out from the above, for (a,b) is a solution to the former iff (a,-b) is a solution to the latter.
Let q be the square root of -3, as we did above, but this time simply adjoin q, rather than w the cube root of 1. Half the points in the previous lattice go away, leaving a lattice of rectangles. The points that are missing are the centers of these rectangles. Put these back in and the previous lattice appears.
These missing centers are important for another reason; their absence derails the gcd algorithm, which use to work. Try running gcd(2,1+q). If 2 starts a series of rectangles, 1+q lands smack in the center of the first cell, and is exactly 2 distant from all four corners. The gcd algorithm is stuck, and we don't have a ufd.
As if in confirmation, 4 = 2×2 = (1+q)×(1-q). The numbers 2 and 1+q have norm 4, and there is no a+bq such that a2+3b2 = 2, so 2 and 1+q are irreducible, and 4 factors in two different ways.
If s factors in Z[q] then it factors in Z[w]. Conversely let s factor in Z[w]. Spin the two factors around to have integer coefficients. Their product is now a unit times s. The only units in Z[q] are ±1. If the product is -s then negate one of the factors. Thus s factors in Z[q]. In summary, s in Z[q] is irreducible iff it is also irreducible in Z[w].
Boldly try to factor s in Z[q] by factoring s in the underlying ufd The separation of s into primes is unique, up to associates. For any prime other than 2, there are only two associates with integer coefficients, the prime and its opposite. These are associates in Z[q]. It's just a matter of sign. However, 2 has 6 associates, all with integer coordinates. Any of these are fair game. These map to three different irreducibles in Z[q]. The factorization of s in Z[q] is unique, except for the ambiguity among the three irreducibles 2, -1+q, and -1-q. Z[q] is almost a ufd.
Find all integer solutions for x2 + 3y2 = z2.
To be coprime, 3 cannot divide x or z.
Look mod 4 to show x cannot be even.
Look mod 8 to show 4 cannot divide z. If z is even it only has one factor of 2. That's good, since 2 is a trouble maker.
Split x2+3y2 into (x+yq)×(x-yq). If these have a common factor, it also divides x and 2yq. We already ruled out 2 into x, and if q divides x then 3 divides x, which we ruled out. Thus all odd primes dividing x+yq are squared, and we can work with (u+vq)2 as before.
If z is odd then we are done, and the following formulas apply.
u > 0, v > 0, u and v coprime, u and v of different parity:
x ← u2 - 3v2
y ← 2uv
z ← u2 + 3v2
When z is odd, x is odd and y is even (look mod 4), so our components are in the right places. Make u and v positive, and understand that x might come out negative. Show that u and v are coprime, and have opposite parity. When u = 5 and v = 2, x = 13, y = 20, and z = 37. Swap u and v to get x = -71, y = 20, and z = 79.
If z is even we need an associate of 2. We can't use 2, because then x and y would be even. Since x+yq and x-yq have the same norm, we need to assign one associate to each factor. Multiply the above formulas by 1+q or 1-q to get the following.
u > 0, v ≥ 0, u and v coprime, u and v of different parity:
x ← u2 - 3v2 - 6uv
y ← u2 - 3v2 + 2uv
z ← 2u2 + 6v2
x ← u2 - 3v2 + 6uv
y ← -(u2 - 3v2) + 2uv
z ← 2u2 + 6v2
This time v can be 0, making u = x = y = 1, and z = 2. In this case 1 + q is the one and only prime. This is the only overlap between the two characterizations; since it does not particularly matter whether y is 1 or -1. Other than that the triples are unique.
When u = 5 and v = 2, x = -47, y = 33, z=74, or, x = 73, y = -7, and z = 74.
In the margin of one of his notebooks, Fermat asserted that xn + yn = zn has no integer solutions for n > 2, and x, y, and z positive. He claimed to have a beautiful proof, but the margin was too small to contain it. After his death, the most brilliant mathematical minds attacked the problem for 350 years, culminating in a proof that involves modular forms and elliptic curves, modern concepts that Fermat could have had no knowledge of. PBS has put together a marvelous explanation of Fermat's last theorem, and the towering proof that puts this question to rest. Click here for the nova special.
When n = 2 the solutions are pythagorean, and these were characterized above.
If there is a solution with n = 15, then there is a solution with n = 3. Replace x with x5, y with y5, and z with z5. Thus we can assume n is prime, or 4. The latter allows for the fact that there are indeed solutions when n = 2.
Once n is fixed, the smallest solution has coprime variables. If p divides x and y it divides z, and we can divide through by pn. The same thing happens if x and z are divisible by p, and so on, so the three variables are pairwise coprime.
If two coprime variables are equal then they are set to 1. This case can be ruled out: 1 + 0 = 1, or 1 + 1 = zn; so no two variables are equal.
When n is even, the variables can be negative or positive; it doesn't matter. For n odd, you could negate y, for instance, but that just moves it to the other side of the equation: xn = yn + zn. In general, the problem is the same whether variables are negative or positive.
The case of n = 4 will be handled later; for now let n be an odd prime. Write the equation as xn + yn + zn = 0, with the understanding that z is negative, or more generally, some of the variables are negative.
Let d = y+z, replace z with d-y, and expand (d-y)n by the binomial theorem.
xn + yn - yn + nyn-1d - (n:2)yn-2d2 + … - nydn-1 + dn = 0
xn + nyn-1d - (n:2)yn-2d2 + … - nydn-1 + dn = 0
Assume pk divides d. Everything beyond d is divisible by d2, and is divisible by p2k. If p also divides y it divides z, so y is nonzero mod p. If p is not n, then nyn-1 is nonzero mod p. The sum beyond xn is divisible by pk, but not pk+1. Thus xn is divisible by pk, but not pk+1. It follows that k is a multiple of n.
Next let p = n, and assume nk divides d. That means nk+1 divides nyn-1d. The remaining terms have n, from the binomial coefficient, and higher powers of d, thus ratcheting up the exponent on n. The last term dn does not have n from the binomial coefficient, but it represents at least n3k. With precisely nk+1 dividing xn, k is one less than a multiple of n. If n = 3, k is 2, or 5, or 8, etc.
Put this all together and d is a perfect nth power, or an nth power divided by n. This holds for x+y, x+z, and y+z. Wow! That's a lot to ask. No wonder there are no such triples; although this does not constitute a proof.
Suppose x3 + y3 = z3, where x y and z are pairwise coprime.
Let z = y+d, so that d is the difference between y and z. Expand (y+d)3 by the binomial theorem and simplify, giving this.
x3 - d3 = 3y2d + 3yd2
Reduce mod 3, and x3 = d3. Raising to the 3 doesn't change anything, so x = d mod 3. Given this, x3 = d3 mod 9. Look mod 9, and y2d + yd2 is divisible by 3. This means y, or d, or y+d, is divisible by 3. In other words, y, or z-y, or z, is divisible by 3. If y = z mod 3, then y3 = z3 mod 3, and x3 = 0 mod 3, and x is divisible by 3. Exactly one of our variables is divisible by 3.
By the way, the same thing happens when n = 5.
x5 - d5 = 5yd × (y+d) × (y2+yd+d2)
The left side comes out 0 mod 5, then 0 mod 25. Divide through by 5, then pull out y, d, and y+d, assuming these are nonzero mod 5. The last factor cannot come out 0 mod 5.
Returning to cubes, rearrange the equation so that z, on the right, is divisible by 3. This may cause some of the variables to be negative, but that's ok.
Adjoin w, the cube root of 1. This creates the eisenstein integers, which forms a ufd.
Factor the left side into (x+y) × (x+yw) × (x+yw2).
Suppose two of these factors are divisible by a prime p. Subtract, and p = 1-w, or p divides y. If p divides y then so does its conjugate. A real prime divides y, and x, and that is a contradiction. Thus the only common factor might be 1-w, lying over 3. Other than that these three expressions are pairwise coprime.
With z3 on the right, each factor onthe left is a cube, except for the distribution of 1-w. In the previous section we showed x+y is a cube divided by 3, consuming all but one of the factors of 3 in z3. That leaves (1-w)2 for the other two factors, and since x and y are not divisible by 3, one instance has to go to each. Write x+y as 9 times a cube, and let 1-w divide x+yw, and 1-w2 divide x+yw2.
Expand (u+vw)3 * (1-w), and equate this with x+yw. But we're not sure which associate to use, so follow up with 1, w, or w2. In each case, u and v are nonzero, u and v are coprime, and u and v are not both equal to ±1.
x ← u3 + 3u2v - 6uv2 + v3
y ← - u3 + 6u2v - 3uv2 - v3
x ← u3 - 6u2v + 3uv2 + v3
y ← 2u3 - 3u2v - 3uv2 + 2v3
x ← - 2u3 + 3u2v + 3uv2 - 2v3
y ← - u3 - 3u2v + 6uv2 - v3
Take the second associate and add x and y together, giving 3u3 - 9u2v + 3v3, or the third associate, giving -3u3 + 9uv2 - 3v3. This is 9 times a cube, so divide by 3 and look mod 3. Thus u3+v3 = 0, which means u = -v mod 3. The norm of u+vw, u2-uv+v2, is divisible by 3. Thus u+vw already contains a factor of 1-w, which is a contradiction. So return to the original associate: (u+vw)3 * (1-w), and add x and y together, giving 9uv(u-v). This is 9 times a cube, hence uv(u-v) is a cube. Since u and v are coprime, each is coprime to u-v. All three are perfect cubes. The sum of two cubes, namely v and u-v, yields another cube, namely u. We only need show this is a smaller triple.
Smaller in what way? Some of the variables are negative, we don't know which ones. Let the size be the absolute value of the product of the three variables. Remember that 9uv(u-v) = x+y. So x+y is already larger than the new product. This is at least abs(x) + abs(y), and since x and y are at least 1, this is at least xy. Multiply by z and get something larger still. The new triple is definitely smaller than the original.
Triples cannot descend forever, therefore two cubes cannot sum to a cube, and Fermat's last theorem holds for n = 3.
In this section all exponents are even, so it doesn't matter if variables are positive or negative.
The proof is infinite descent across several patterns. Pattern 1 implies a smaller instance of pattern 2, and pattern 2 implies a smaller instance of pattern 1. This time the "size" is the right hand side. This is a positive integer that cannot decrease forever, hence none of these patterns are possible. The first pattern is x4 + y4 = z2, and that settles Fermat's last theorem for n = 4.
x4 + y4 = z2
If p divides all three terms then p4 divides all three terms, and we can divide through to get something smaller. Thus variables are pairwise coprime. Characterize the pythagorean triples as follows.
x2 ← u2 - v2
y2 ← 2uv
z ← u2 + v2
Since 2uv is a square, and u and v are coprime, u and v are s2 and 2t2. Therefore x2 becomes s4 - 4t4 or 4t4 - s4. Rewrite this as x2 + 4t4 = s4 or x2 + s4 = 4t4. These are patterns (2) and (3) below.
Each time we defer to another pattern you should verify that the right side has become smaller. In this case the right side is u2. This is smaller than z, which is smaller than z2, so we're ok.
x2 + y4 = 4z4
Odd squares cannot sum to a multiple of 4, so either x or y is even, whence the other variable is also even. If z is even, divide x by 4 and y and z by 2, producing a smaller solution. Thus z is odd, and (looking mod 8) x is not divisible by 4. Divide through by 4 to get this.
(½x)2 + (½y2)2 = (z2)2
Remember that ½x is odd, and so is z2. Characterize the triple as follows.
½x ← u2 - v2
½y2 ← 2uv
z2 ← u2 + v2
With u and v coprime, the second equation shows u and v are squares. The third equation becomes s4+t4 = z2. With z2 < 4z4, we have no trouble deferring to pattern (1).
x2 + 4y4 = z4
Suppose x and z are odd, and characterize the triple this way.
x ← u2 - v2
2y2 ← 2uv
z2 ← u2 + v2
The second shows u and v are squares, and the third becomes s4+t4 = z2. With z2 < z4, we defer to pattern (1).
Let either x or z be even, whence the other is also even. If y is even, divide x by 4 and y and z by 2 to find a smaller solution. Thus y is odd. Look mod 8 and x = 2 mod 4. Divide through by 4 to get this.
(½x)2 + (y2)2 = (½z2)2
The sum of two odd squares is an even square, and a mod 4 argument shows this is impossible.
Patterns (1), (2), and (3) all fail together, and the sum of fourth powers is never a fourth power. Here are a couple more patterns involving fourth powers, if you're interested.
x2 + y4 = z4
If y is even, characterize the triple as y2 = 2uv and z2 = u2 + v2. Thus s4 + 4t4 = z2, pattern (5).
If y is odd, characterize the triple as y2 = u2 - v2 and z2 = u2 + v2. Multiply these together to show u4 - v4 = (yz)2, pattern (4). This is a self reference, but u4 < z4, so we're ok.
x4 + 4y4 = z2
If x and z are odd, characterize the triple this way.
x2 ← u2 - v2
2y2 ← 2uv
z ← u2 + v2
Thus u and v are squares, and x2 = s4-t4, pattern (4).
Let x and z be even. If z is divisible by 4 we can divide through and find a smaller solution, so let z = 2 mod 4. This makes y odd. Divide through by 4 and characterize as follows.
y2 ← u2 - v2
½x2 ← 2uv
½z ← u2 + v2
This implies y2 = s4-t4, pattern (4).
Mordell proposed a conjecture that states there are finitely many rational solutions for the polynomial expression p(x,y) = 0 when the genus of p is at least 2. As a rough approximation, this means p has degree 3 or higher, e.g. p = x3 + 7x2y - 5y2 - 11y + 12x - 59. Mordell's conjecture was proved by Faltings in 1983, and is sometimes called Falting's theorem in honor of the accomplishment. Fermat's last theorem actually describes rational solutions to xn + yn - 1. Hence there are finitely many fermat triples for any n > 2. Of course we know there are no such triples, since Fermat's last theorem has also been proved. When n = 2, there are infinitely many rational solutions to x2 + y2 - 1, namely the pythagorean triples, as illustrated by x = 5/13 and y = 12/13. These are rational points on the unit circle.
The degree is not a simple litmus test however; some cubic polynomials describe curves of genus 1, such as x3 - y2 + 3, which has infinitely many rational solutions.
Intrigued by Fermat's last theorem, Euler proposed a variant which states that the sum of n-1 nth powers cannot equal another nth power. When n = 3, we have the sum of two cubes equal to another cube. This is impossible, as shown above. What about larger values of n? Now that we have computers, we believe the conjecture fails for all n beyond 3. Here are counter examples for 4 and 5.
958004 + 2175194 + 4145604 = 4224814
275 + 845 + 1105 + 1335 = 1445
We have adjoined the fourth roots of 1 (gaussian), and the sixth roots of 1 (eisenstein), so let w be the eighth root of 1 in the first quadrant of the complex plane, and consider Z[w]. If you like, adjoin i first, then adjoin w the square root of i; the result is the same. This because w2 = i, so w brings in i anyways.
The earlier extensions satisfied a quadratic polynomial, like x2 + 1; this one does not. This is a quartic extension, satisfying x4 + 1. Verify that w4 = -1. When w is brought into the picture, the four roots are w, iw, -w, and -iw, or if you prefer, w, w3, w5, and w7. Each of these, raised to the fourth, yields -1. In the integers however, x4 + 1 is irreducible. In fact it is irreducible over Q. It has no root in the rationals, no linear factor looking like x-r, so if it splits, it factors into two quadratics. These look like x2+ax+c and x2+bx+d, where c and d are inverses. The product has no x3 term, hence a+b = 0. Replace b with -a. The product has no x term, so ad-ac = 0. Assume a is nonzero, whence c and d are both 1 or -1. Expand the product again and look at x2, whence -a2±2 = 0. This is not possible in the rationals, so let a = 0. Now the squared term is c+d, which means c = -d. Since cd = 1, this is also impossible. Hence x4 + 1 is irreducible over Q, and Z.
This extension does not make a simple lattice of parallelograms in the plane. Two dimensions is not enough; we need four. Each a+bw+cw2+dw3 is distinct. If any two of them were the same point then subtract, and find a cubic polynomial p(x) with w as a root. Use synthetic division to find g(x), the greatest common divisor of p(x) and x4 + 1. At each step of the algorithm, the remainder still has w as a root. At the end, g(w) still equals 0, and g cannot be a constant, or even a linear polynomial x+c, since c would have to be rational. Thus g is a polynomial of degree 2 or higher. It also divides x4 + 1, which is impossible, since x4 + 1 is irreducible. Therefore this 4 dimensional spread of numbers is well defined. It is spanned by the integers a, b, c, and d, via a + bw + cw2 + dw3.
It is still convenient however to view them in the complex plane. As complex numbers, multiplication is commutative, associative, and distributive, and the product of two nonzero numbers is never 0. In other words, it is an integral domain.
Each element has three different conjugates, not just one. Of course there is the conjugate you already know - reflection through the x axis. This carries w to w7, i.e. one root of x4 + 1 to another. The other two conjugates carry w to w3, and w to w5, the other two roots of x4 + 1. Along with the identity map carrying w to w, these are the four conjugates of w.
Each of these conjugations is an automorphism. To begin, pretend like w is just a variable, and the extension creates polynomials. Replacing w with w3 doesn't really change anything. w2 becomes w6, w3 becomes w9, and so on. You can replace before or after addition and multiplication, the result is the same; thus the map is a homomorphism. In fact, think of s as w3, and the polynomials look exactly the same, just replacing w with s. But remember that w4 turns into -1. In the same way, s4 also turns into -1, because both s and w are roots of x4 + 1. It's the same ring.
Similar reasoning shows w → w5 and w → w7 are automorphisms. By design, conjugation carries roots of p(x) to other roots of p(x), so that the resulting polynomials collapse in exactly the same way. Complex conjugation took i to -i, both roots of x2 + 1, and Eisenstein conjugation swapped the two roots of x2 + x + 1. Here, a fourth root of -1 becomes another fourth root of -1.
The norm is the product of these four conjugates, just as the norm of a+bi was the product of a+bi and a-bi in the gaussian world. Call these four conjugation functions c1, c2, c3, and c4, and consider the conjugate of xy. This is c1(xy) × c2(xy) × c3(xy) × c4(xy). Yet each is a homomorphism that respects multiplication, so write it this way.
c1(x) c1(y) c2(x) c2(y) c3(x) c3(y) c4(x) c4(y)
c1(x) c2(x) c3(x) c4(x) c1(y) c2(y) c3(y) c4(y)
|x| * |y
Norm is once again a homomorphism that respects multiplication. The norm of xy is the norm of x times the norm of y.
And what is the formula for the norm? Well - just multiply these conjugates together.
a + bw + cw2 + dw3
a + bw3 + cw6 + dw9
a + bw5 + cw10 + dw15
a + bw7 + cw14 + dw21
I ran this through some software and got a rather large polynomial in a, b, c, and d. The polynomial does not contain w at all; nor should it. Thus the norm is an integer. It is in fact a nonzero integer, since it is the product of four conjugates, and each conjugate is a nonzero point in the complex plane (unless a = b = c = d = 0).
a4 + b4 + c4 + d4
+ 4(a2bd - ab2c + acd2 - bc2d)
+ 2(a2c2 + b2d2)
In fact the norm is positive. Compute the norm by multiplying two complex conjugates, then the other to complex conjugates.
a + bw + cw2 + dw3
a + bw7 + cw14 + dw21
a + bw3 + cw6 + dw9
a + bw5 + cw10 + dw15
The first two are complex conjugates, reflections through the x axis, and their product is a distance squared to the origin. This is positive. The next two are also complex conjugates of each other, and their product is a distance squared to the origin, hence positive. The product of two positive numbers is positive, so the norm is a positive integer.
The norm of a real integer a is a4. This is the product of the four conjugates of a, or the result of the earlier polynomial in a, b, c, and d, with b c and d set to 0.
The norm of a gaussian integer a+ci, also known as a+cw2, is(a2 + c2)2. Again, this is the product of the four conjugates, or the result of setting b = d = 0 in the norm polynomial.
The norm facilitates division in the ring. In the gaussian integers, the inverse of s was s/|s|. The formula is similar here. In Z[w], the inverse of s is the product of the three nontrivial conjugates of s, over the norm of s. Multiply this expression by s and get |s|/|s|, or 1.
As usual, u is a unit iff |u| = 1. If |u| is 1 then u times its three conjugates is 1, and u is invertible. Conversely, if uv = 1 then take norms and |u| is a positive integer that divides 1, hence 1.
There are some new units here, not on the unit circle. Verify that 1+w-w3 × -1+w-w3 = 1. Call the first unit f, for the fundamental unit. These units lie on the x axis; f = sqrt(2)+1 and 1/f = sqrt(2)-1. They both have norm 1, as they should. This shows that norm is no longer a measure of distance to the origin.
The powers of f march along the x axis forever, and each is a unit, each with norm 1. The powers of 1/f approach 0. There are infinitely many units.
Verify that the norm of 1-w, i.e. 1-w times its conjugates, is 2. However, these conjugates are all associates. Multiply 1-w by w3, fw, and fw2, where f is the fundamental unit 1+w-w3, to get the other three conjugates. Since 2 is prime (in the integers), and |1-w| = 2, 1-w is irreducible. In fact 2 is ramified, with (1-w)4 as its unique factorization. I'll prove this ring is a ufd next, but first, let's prove it is a factorization domain.
Let n be any nonunit. If n is not irreducible then write it as s times t. Take norms, and |s|×|t| = |n|. Since s and t are not units, their norms are not 1. Thus |s| and |t| are less than |n|. By induction, s and t each factor into a finite product of irreducibles. Put this together and n has a finite factorization. This is typical; the norm is usually able to prove the extension is a factorization domain.
Now it's time to leave the complex plane and look in 4 dimensions. The axes are 1, w, w2, and w3. An element like a+bw+cw2+dw3 is plotted with coordinates [a,b,c,d] in 4-space.
Let v be a vector [a,b,c,d]. What kind of lattice is spanned by v? Multiply v by 1, w, w2, and w3 to build the base cell. This cell is a parallelotope, a pushed over box, and it repeats to cover all of 4 space. Here are the independent vectors determined by v, and the matrix that defines the base cell.
a + bw + cw2 + dw3
-d + aw + bw2 + cw3 -c - dw + aw2 + bw3 -b - cw - dw2 + aw3 |
|
Let e() be the distance squared to the origin, that is, a2+b2+c2+d2. This will be the euclidean metric that proves unique factorization. Do not confuse this with the norm; they are unrelated.
If v is nonzero, e(v) is a positive integer.
If v spans a cell, and u lies in this cell, u is closer to one of the 16 corners than the length of v. I'll prove this in 2 dimensions first, because it's easier to picture.
Let one vector of length l run down the x axis, and let another vector of length w run up the y axis. These span a perfect rectangle of area l×w, and these rectangles line up along the x axis to make a strip of width w.
A point c at the center of the rectangle, [½l,½w], is as far as it can be from the corners.
Push the strip of rectangles over, so they become parallelograms. The second vector still has length w, but is at an angle to the y axis. Let u be a point anywhere in this strip. Assume without loss of generality that u is closer to the floor than the ceiling. Drop a perpendicular from u to the floor, in this case the x axis. This point is at worse ½l from a lattice point on the floor. The y coordinate is at worse ½w, and probably less, depending on the angle of the second vector. The strip could be thin indeed if the parallelograms are skinny. The point u is closer to a lattice point than c was to the corners of its rectangle.
In 3 dimensions, let three vectors have lengths l, w, and h. If the vectors span a perfect box, these represent length width and height, the dimensions of the box. A point c at the center of the box is farthest from its 8 corners, and the distance is ½sqrt(l2+w2+h2). But the box is probably pushed over. Consider a layer of parallelotopes spanned by these three vectors. Let u lie in one of these, and assume, without loss of generality, that u is closer to the floor. Drop a perpendicular from u to the floor. This projection is at most ½sqrt(l2+w2) from its nearest lattice point, probably less. This was demonstrated in the previous paragraph. The distance to the floor is at most ½h. Put this together and u is at most ½sqrt(l2+w2+h2) from its nearest lattice point.
Extend this to 4 dimensions, and apply it to four vectors all of length l. Review the earlier matrix; each row has coordinates [±a,±b,±c,±d], in some jumbled order. The four vectors have a common length that I will call l. A point u is at most ½sqrt(l2+l2+l2+l2) from its nearest lattice point. The distance is at most l. In fact it equals l only if all vectors are orthogonal, making a perfect box, and u is at the center. Thus u = ½(1+w+w2+w3)×v. This could happen, e.g. if v is 2 and u is 1+w+w2+w3. When it happens, v = (1-w)u. Swap the roles of u and v, as the algorithm would dictate, let u seed the next lattice, and v is a multiple of u. At this point the algorithm stops, giving a gcd of u.
This may not fit the technical definition of a euclidean domain; because of the u v problem sited above, and e(ab) can sometimes be smaller than e(a) (consider f × 1/f). However, the gcd algorithm works, and Z[w] is a ufd.
Let s be prime in Z[w], and let p be a prime integer factor of |s|, where s divides p. We already factored 2, so p is odd. Write s×t = p and take norms. Now |S| × |t| = p4. The norm of any prime is a power of p ≤ p4.
Let p = 1 mod 8, and find x such that x4 = -1 mod p. This can be done computationally; x = r(p-1)/8, where r is a primitive root. The norm of x+w is x4 + 1, which is 0 mod p. In other words, x+w times its three conjugates is an integer multiple of p. Since p does not divide x+w or any of its conjugates, p is not an inert prime. Take the gcd of x+w and p to find the prime over p. Call this prime s. If s also divides x+w3, one of the conjugates of x+w, then s divides w-w3. This is the same as 1+i, or sqrt(2). We're looking at p an odd prime, so s does not divide 1+i. Thus x+w and x+w3 cannot both contain s. Similar results hold for x+w5 and x+w7. The four conjugates of s are all distinct. If any two of them were associates they would both divide the corresponding conjugates of x+w, yet the conjugates of x+w are all coprime. Therefore the 4 conjugates of s are all distinct primes that divide p. The product is the norm of s, which is a power of p. With |s| a power of p that divides p, |s| = p. There are four conjugate primes over p, and their product is p, whence the norm of each conjugate of s is p.
To illustrate, let p = 97. (Note that 97 is prime, and is 1 mod 8.) Select 5 as a primitive root mod 97. Let x = 596/8 = 64. Take the gcd of 64+w and 97. The first remainder is 33-w. Double this and subtract 64+w to get 2-3w. At this point 2-3w goes in evenly.
2-3w times 3+4w+6w2+9w3 = 33-w
The algorithm stops, with 2-3w as the gcd. Sure enough, |2-3w| = 97. The unique factorization of 97 is 2-3w × 2-3w3 × 2+3w3 × 2+3w.
Let p = 3, 5, or 7 mod 8, whence there is no naturally occurring eighth root of 1 mod p. If p = 7 mod 8 then adjoin, to the integers mod p, the element i, the square root of -1, which was not there before. If p = 3 or 5 mod 8 then adjoin the square root of 2, which was not there before. The new field has p2 elements, which is 1 mod 8. There is a primitive root r (proved later on in this book), and when r is raised to the (p2-1)/8, we find x, an eighth root of 1. If x is not itself an integer, it is an integer combination of 1 and i (p = 7 mod 8), or 1 and (w-w3) (p = 3 or 5 mod 8). Thus x+w has a representation in Z[w].
Let p = 7 mod 8. Think of x as x+yi if you like, since it has an i component. The constructed element is x+yi+w. One of its conjugates is x+yi-w. This form of conjugation maps w to w5, and thus w2 to w10, which means i maps to i. There is no need to change x+yi. Multiply x+yi+w times x+yi-w and get (x+yi)2-w2. The second term is i. The first term is the square of the eighth root of 1, which is the fourth root of 1, which is also i. (If this square comes out -i, multiply x+yi by i, so that its square comes out i. This can also be accomplished by cubing x+yi.) The result is i - i, which is 0 mod p, thus some multiple of p.
The product of two of the conjugates of x+yi+w, not all four conjugates, just two of them, gives an integer multiple of p. Since p does not divide x, or y, or 1, some prime factor of p, call it s, divides x+yi+w. Suppose some associate of s divides x+yi-w. Subtract, and s divides 2w, which is impossible. The conjugate of s divides x+yi-w, but the associate of s does not. The conjugate of s is distinct from s. These two primes are separate factors of p.
As an example, let p = 7, and the two primes over p are 2+w+2w2 and 2-w+2w2. Each has norm 49.
The same thing happens when p is 3 or 5 mod 8. The constructed element is x+y(w-w3) + i, where the first term is a fourth root of 1. Multiply this by x+y(w-w3) - i to get -1 + 1. (Conjugation from w → w7 does not change w-w3.) Let s be the prime common to p and the constructed element. The conjugate of s lives in the conjugate x+y(w-w3) - i, and is a distinct prime. There are once again two primes lying over p.
As an example, let p = 3, and the two primes lying over 3 are 1+w-w3 and 1-w-w3. The primes lying over 5 are 2+w2 and 2-w2. You know these better from the gaussian world as 2+i and 2-i. They remain prime here.
In all these cases, the gcd of a carefully constructed element and p extracts the prime over p. This can be done even for vary large primes.
I went through this rather long and complicated example to: lay the foundation for algebraic number theory, continue the exploration of cyclotomic extensions, describe conjugation as a permutation of roots, and present a workable gcd algorithm in more than 2 dimensions. The lattice lives in 4 dimensions, and the longer vector u lives inside some cell spanned by the shorter vector v, and u is closer to one of the 16 corners of this cell than the length of v. The remainder (u minus a multiple of v) is smaller, the gcd algorithm runs to completion, and the ring is a ufd. The next example also presents a 4 dimensional lattice, with a one-sided gcd algorithm.
A euclidean domain is commutative by definition, but sometimes the gcd algorithm works, even when multiplication is not commutative.
Let f be a left divisor of a if there is some h such that fh = a. In other words, a is a multiple of f, with f on the left.
Let g be a left gcd of a and b if g is a left divisor of a and b, and any f that is also a left divisor of a and b is a left divisor of g.
Assume a = bq + r. It is important to keep the quotient on the right. If f is a left divisor of a and b, then f is a left divisor of bq, and of r. Conversely, a left divisor of r and b becomes a left divisor of a. The left divisors are the same. This sets the stage for the gcd algorithm. Repeat until r = 0, whence g = b. Every f that divided our original pair divides g, and g is the left gcd.
Let R be the quaternions over Z. Let v be a quaternion and picture v as a vector in 4-space. If v = a+bi+cj+dk, then v is plotted with coordinates [a,b,c,d].
What kind of lattice is spanned by v? Multiply by 1, i, j, and k on the right, and build the following matrix, which defines the base cell.
a | b | c | d |
-b | a | d | -c |
-c | -d | a | b |
-d | c | -b | a |
The rows are all orthogonal, and all four rows have the same length. Each cell is a perfect cube in 4 dimensions.
In the previous section, the norm and the euclidean metric were unrelated; this time they are the same function, namely the distance squared to the origin. The gcd algorithm proceeds as long as the point u is closer to one of the 16 corners of its containing cell then the length of that cell. This happens every time, except when u lands smack in the center of the cell. In this case u is ½(1+i+j+k) times v. Yes, this can happen, e.g. when v = 2 and u = 1+i+j+k. Swapping vectors doesn't help, for then v is ½(1-i-j-k) times u, and v lands in the center of a cell spanned by u. We are in an infinite loop.
There is a way around this; switch to the half integer quaternions. This brings in all the centers, and gcd runs to completion. Does this sound familiar? Earlier we adjoined the square root of 3, and that did not create a ufd, because a point could land smack in the center of a rectangle; so we brought in the centers, creating the eisenstein integers, and that produced a ufd. The same thing is happening here.
If g is the left gcd of a and b, there are coefficients x and y on the right such that g = ax + by. Run the gcd algorithm and push the quotients onto a stack, then backtrack, using the same procedure as the integers. Just make sure each quotient, popped off the stack, is multiplied on the left. In the same way, you can find the inverse of one quaternion mod another, assuming the two quaternions are relatively prime.
Let S be a finite set of half quaternions, and let H be everything that is generated, on the right, from S. In other words, H is all the linear combinations of elements of S. Let g be the left gcd of S. Thus g generates everything in S, and g generates everything in H. They are all right multiples of g. Furthermore, g is a linear combination of S, and belongs to H. Using terminology from ring theory, which is coming up later in thisw book, every finitely generated right (or left) ideal is principal.
The norm proves R is a factorization domain, but despite the gcd algorithm, R is not a ufd, not even close. If p is a prime integer, there are many ways to write p as the product of two irreducibles. Each quaternion with norm p is irreducible, since its norm is prime, and each can be multiplied by its conjugate to produce p. Even if p is as small as 7, there are many quaternions with norm 7.
±2±i±j±k
±1±2i±j±k
±1±i±2j±k
±1±i±j±2k
Gather these up by associates, and allow a quaternion and its conjugate to commute, and there are still for distinct factorizations.
2+i+j+k * 2-i-j-k
2+i+j-k * 2-i-j+k
2+i-j+k * 2-i+j-k
2+i-j-k * 2-i+j+k
Don't assume the left gcd is the right gcd. In fact two quaternions could be coprime from one side and not from the other. Let g = 1+3i+k, and let h = g*i = -3+i+j. Since h is a multiple of g, g is the left gcd of g and h. Everything spanned by g and h, on the right, is some multiple of g. All norms are divisible by |g| = 11. On the other side, multiply i*g and get -3+i-j. Subtract this from h and get 2j. This implies 2i, and g-2i = 1+i+k. Remember that gcd only works in the half quaternions, so multiply 2j by a half unit to get 1+i+j+k. Subtract 1+i+k and get j. This is the same as 1, which brings in everything. The left multiples of g and h cover everything, thus their right gcd is 1. These two quaternions are associates from one side, and coprime from the other.
the polynomials over a field F form a euclidean domain, and a ufd. This is because one coefficient can always be divided by another. The metric of a polynomial is its degree, the highest exponent. The gcd algorithm is accomplished by synthetic division. Since the ring is a ufd, every polynomial factors uniquely into a product of prime polynomials.
In the rationals, x3 + 1 becomes x+1 times x2-x+1, and nothing else.
This assumes multiplication in F is commutative, but if it is not, a one-sided gcd still exists. This was described in the previous section. Run the gcd algorithm as usual, but make sure the quotient is always on the right, giving a left gcd. This assumes F commutes with x, which is the standard definition of a polynomial extension.
As with any euclidean domain, you can use the gcd algorithm to find the inverse of x, or 3x+5, or anything else, mod some prime polynomial p(x).
If p(x) and q(x) are polynomials that are relatively prime, then F[x] mod p*q is the cross product of F[x] mod p and F[x] mod q. This is the chinese remainder theorem, and the proof is exactly the same. The homomorphism from the larger ring to the two smaller rings is reduction mod p on one side, and reduction mod q on the other, and since p has an inverse mod q, and q has an inverse mod p, the map can be reversed.
This section separates a fraction into partial fractions, e.g. 5/6 = 1/2 + 1/3. However, instead of numbers, we are dealing with polynomials over a field. A quotient of polynomials p/q can be separated into partial fractions, as long as q can be factored.
If you are a student of calculus then you will recognize this technique - integration by partial fractions. The integral of (4x2 + 3x + 10) / (x3 + 2x2 + x + 2) is not obvious at first; but separate it into 4/(x+2) + 3/(x2+1), and the integral is 4log(x+2) + 3atan(x). For purposes of integration, fractions are separated over the reals, but this method generalizes to any field.
Let F be a field, so that the polynomials F[x] form a ufd. In what follows, fractions are always assumed to be proper. In other words, the polynomial on top has degree strictly less than the polynomial downstairs. Given an improper fraction, use synthetic division to create a polynomial plus a proper fraction.
Let p/q be a quotient of polynomials, with q monic, and with the degree of q greater than the degree of p. Let w1 w2 w3 … wk be monic irreducible polynomials that are the distinct prime factors of q. Let ei be the power of wi in q, and let vi = wi raised to the ei. Thus q has been factored; it is the product over vi. We would like to express p/q as a sum of proper fractions whose denominators are v1 through vk. In other words, p/q can be separated into partial fractions.
Let di be the degree of vi, and let s be the degree of q. Thus s is the sum over di.
Let ui be q/vi. Thus 1/vi is the same as ui/q. Let ui/q stand in for 1/vi, thus everything has a common denominator of q.
Let ni be the desired numerator, so that p/q is the sum of ni/vi. Seen another way, p = ∑ niui.
To be a proper fraction, the degree of ni is less than di. Thus ni is a polynomial with d coefficients, and we must find these coefficients. Each one is multiplied by a shifted instance of ui. Add these up across each ni*ui, and the result is an exercise in linear algebra. There are s coefficients to find, di for each ni, and these coefficients are multiplied by various vectors, shifted versions of ui. The result is the vector p, as determined by p(x). This can always be done, as long as the vectors are independent.
Suppose they are not. Suppose a nontrivial sum of niui = 0. To keep things simple, assume u1 participates in this sum. In other words, n1 is nonzero. If this is the only term, then n1u1 = 0. Since F[x] is an integral domain, this is impossible, thus more terms are involved in the sum. Move them to the right side of the equation, so that n1u1 is minus the sum of n2u2 + n3u3 + … nkuk. Everything on the right is a multiple of v1. By unique factorization, v1 is a factor of the left. Since v1 and u1 are coprime, v1 is a factor of n1. But the degree of v1 is d1, and the degree of n1 is less than d1. This is a contradiction, hence the vectors are independent, and p/q can be represented uniquely as a sum of ni/vi.
Computers are good at linear algebra, even for large matrices, so this process is entirely computational, assuming q can be factored.
Here's a silly example where denominators are not coprime. Try to write 1/x2 as a linear combination of 1/x and 1/x. Thus 1/x2 = (a+b)/x, and (clearing denominators), 1 = (a+b)x, which is impossible.
Q[x], the polynomials with rational coefficients, form a ufd, as described in the previous section. But a given polynomial, and all its factor polynomials, can be brought into Z[x] simply by multiplying through by a common denominator. This gives us a gut feeling that Z[x], the polynomials with integer coefficients, also form a ufd.
Indeed they do. In general, the polynomials over a ufd create another ufd. The proof is another gem from the mind of Gauss, hence this theorem is called Gauss' lemma.
Let R be any ufd. The content of a polynomial p(x) in R[x] is the greatest common divisor of its coefficients. The content of 2x2+6x+10 is 2. A primitive polynomial has content 1.
The product of two primitive polynomials is primitive. Suppose q, a prime element of R, divides the content of a(x)*b(x), and let ai and bj be the first coefficients, reading left to right, in a and b respectively, that are not divisible by q. The coefficient i+j in the product is not divisible by q. This is a contradiction, hence the product has content 1.
Pull out the content of two polynomials to derive the corollary: content(a) * content(b) = content(a*b). Thus content is a homomorphism from polynomials onto R.
Let F be the fractions of R. For example, if R is the integers, then F is the rationals. In general, put elements of R in the numerator, and nonzero elements of R in the denominator, and manipulate fractions in the usual way.
If p(x) primitive is reducible in R[x] it is certainly reducible in F[x]. Conversely, if p(x) is reducible in F[x], write p(x) = a(x) * b(x) / d, where d is a common denominator of all the fractions. Multiply through by d, and the equation lies in R[x].
d*p(x) = a(x) * b(x)
Recall that content(a*b) = content(a)*content(b). With p primitive, this content must be d. Divide through by d, pulling out the content of a and the content of b. That leaves p as a product of primitive polnomials in R[x]. Therefore a primitive polynomial is irreducible in R[x] iff it is irreducible in F[x].
Pause here and look at an example when R is not a ufd. Adjoin q the square root of -3 to Z. This is a ring that is not a ufd, as described earlier. Let p(x) = x2 + x + 1. A root of this polynomial has to divide 1. The only units in Z[q] are 1 and -1. Neither of these is a root of p(x), hence p is irreducible in Z[q]. However, when fractions are brought in, p becomes (x+(1+q)/2) times (x+(1-q)/2). Thus p factors over F, but not over R.
Return to R a ufd, and factor p(x) in R[x], starting with the content, which belongs to R and factors uniquely. What remains is primitive, and factors into primitive polynomials. Each of these is irreducible over R, and is also irreducible over F. If this could be done two ways over R, then it could be done two ways over F. Yet the polynomials over F form a ufd. Therefore R[x] is a ufd.
As we suspected, the integer polynomials exhibit unique factorization. You can refer to an integer polynomial as irreducible, or prime; there is no difference.
Applying Gauss' lemma twice, R[x][y] is a ufd. Keep adjoining indeterminants, as many as you like, and by induction the result is always a ufd. Thus Z[x,y,z] is a ufd.
Here's a really big ufd! Choose your favorite infinite cardinal and adjoin that many indeterminants to Z. Call this ring S.
Order the indeterminants in any way you like. The degree of a polynomial in S is the highest indeterminant, followed by the highest exponent on that particular indeterminant. If the degree is t4, then p could be rewritten as a polynomial in t, of degree 4, whose coefficients are other polynomials in S, all having indeterminants below t.
Multiply two polynomials together and look at the highest terms. If both start with t then you might have t4 * t5 = t9, times the two lead coefficients, which are lesser polynomials. By induction these coefficient polynomials have a nonzero product, hence this term is nonzero, and the product of these two polynomials is nonzero. If one polynomial starts with t4 and the other starts with u5 then u5 wins, and again the product is nonzero. S is an integral domain.
Use induction on degree to show S is a factorization domain. If p is based on t, pull out the content, which has a finite factorization, then, if the remaining primitive polynomial is not already irreducible, write it as the product of two polynomials with lesser degree, which have finite factorizations.
Now suppose a polynomial p factors two ways in S. These factorizations use a finite number of indeterminants, which contradicts the fact that Z[x1,x2,…xn] is a ufd. Therefore all of S is a ufd.
Here is a trick you may have learned (without proof) in high school. Let p(x) have integer coefficients, and look for all rational roots. Let a/b be such a root, so that x-a/b divides p in the rationals. Apply Gauss' lemma, and x-a/b, times its denominator, divides p(x). Thus p(x) = q(x)×(bx-a). The numerator divides the constant coefficient, and the denominator divides the lead coefficient.
Consider the polynomial 3x3 - 2x2 + 12x - 8. A rational root has numerator 1 2 4 or 8, and denominator 1 or 3. And of course the root could be negative. In this case the only rational root is 2/3. The other roots are ±2i.
Once again lets move to a polynomial with two variables. Let p = u2-3uv+7v-1. When factoring p, you can view it as a polynomial in u, with coefficients taken from the ufd Z[v]. The lead coefficient is 1, and the constant coefficient is 7v-1. Since p is monic, its content is 1. If q is a factor of p, its lead coefficient is 1, and its constant coefficient divides 7v-1, which is prime in Z[v]. Thus p is the product of u±1 times u±(7v-1), wherein the signs on the constant terms agree. This produces the middle term ±7uv, which cannot equal -3uv, hence p is irreducible.
Yet p does have rational solutions, lots of them. This because p is a polynomial in two variables. Set u to any rational number and solve for v.
How about integer solutions? If u is an integer, and 3u-7 divides u2-1, then u and v form an integer solution to p. Examples include ±1,0 and 5,3.
If you are given a 2 variable polynomial in u and v, where each term has the same degree d, and you are looking for integer solutions beyond [0,0], divide through by vd, and let r be the ratio u/v. The result is a polynomial in r, which must obtain a rational root. The numerator of R divides the constant term, which was the coefficient on vd. Similarly the denominator is drawn from the coefficient on ud. Therefore v divides the coefficient of ud and u divides the coefficient of vd.
Consider p = u2 - uv - 2v2. The only nonzero coprime solutions are 2,1 and 1,-1.
To close this section, let's factor the polynomial p = x2 + y2 - z2 in the integers. View p as a polynomial in x with coefficients in Z[y,z]. Since there is no content, the two factors have to begin with x or -x. By symmetry the same holds for y and z. The only possible nontrivial factor is x±y±z, and whenever you multiply two of these factors together, mixed terms remain. It is not possible to cancel them out the way you might with (x+y) × (x-y) = x2 - y2. Therefore x2 + y2 - z2 is irreducible. Yet it has infinitely many integer solutions, namely the pythagorean triples.
Let R be a ufd. A polynomial p(x) is irreducible in R[x], (and F[x] by Gauss' lemma), if all coefficients except the lead coefficient are divisible by q (prime), and the constant is not divisible by q2. This is called Eisenstein's criterion.
Suppose p(x) = a(x) * b(x). Start with the constant term p0 = a0×b0. The product has precisely one factor of q. Let q divide a0 and not b0. Since q divides p1 it must divide a1, and so on down the coefficients of a. Thus q divides a(x), and p(x), which is a contradiction.
The same proof works if the constant and leading terms are swapped. Start with an instead of a0.
Polynomials like x2 + 7x + 21 and x19 - 8x3 - 2 are irreducible; you can tell at a glance.
Here is a multivariable example.
5x8 + 2y2x5 + 6x5 - 7y2 - 21
Rewrite this as:
5x8 + 2(y2+3)x5 - 7(y2+3)
Verify y2+3 is prime in the polynomials of y with real coefficients. This "prime" entity divides x5 and the constant term, but not the lead coefficient. And it isn't squared in the constant term. Apply eisenstein's criterion, and the polynomial is irreducible in R[y][x]. Since there is no content, it is irreducible in R[x,y].
A wonderful property of a ufd is that fractions in lowest terms are distinct. No fraction is equal to 3/5, unless it is something like 48/80, which becomes 3/5 when you pull 16 out of the numerator and denominator.
Assume a/b = c/d, and multiply through by the common denominator bd. That gives ad = bc. Let a prime p divide a, whence it divides b or c. If p divides c then divide a and c by p, and defer to the smaller fractions. If p divides b then a/b was not in lowest terms. This applies to a, b, c, and d. After reductions, all four variables are units.
Yes, 1/1 could equal (-1)/(-1). Solve this by setting the denominator to a preferred associate. A fraction in Z is in lowest terms if the denominator is positive. A fraction in Z[i] is in lowest terms if the denominator is in the first quadrant. A fraction of polynomials over a ufd is in lowest terms if the lead coefficient of the denominator is the preferred associate for its ufd, such as (-3x-1) / (2x+5). Now two fractions are equal iff they are symbolically equal, same numerators and same denominators.
This is still a bit fluffy, since I have not yet developed a formal definition of fractions, but it will do for now.