Modules, or unitary modules, can act as objects in a category, with R module homomorphisms acting as morphisms. Thus a projective module is merely a projective object within its category, and an injective module is an injective object within its category. If you're not familiar with category theory, don't panic; I'm going to describe projective and injective modules below. I just wanted you to know that these adjectives have general definitions that go beyond modules.

Informally, a module is projective if every homomorphism it sends into the target of an epimorphism lifts to a compatible homomorphism into the source. Here is the technical definition.

A module P is projective if, for any pair of modules A and B, and any epimorphism f from A onto B, and any homomorphism g from P into B, there is at least one homomorphism h from P into A such that hf = g. A map from P into B lifts up to a compatible map from P into A.

P
h ↓   ↘ g
A ──f──→ B

Every free module is projective. See projective objects for a general proof - free objects are always projective in any category. So I suppose we don't have to prove it here, but the proof is easy. If x is a basis element of P, and g(x) = y, then let h(x) be anything in the preimage of y under f. Do this for each basis element of P, and the homomorphism h follows from there.
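
If you want to see this lifting in action, here is a minimal computational sketch using Z modules; the specific choices P = Z, A = Z/12, B = Z/6, and g(1) = 5 are mine, not part of the text.

# A sketch of lifting a map from the free Z module P = Z through an epimorphism.
# A = Z/12 maps onto B = Z/6 by reduction (f), and g sends the basis element 1 of P to 5 in B.

def f(a):          # the epimorphism from A = Z/12 onto B = Z/6
    return a % 6

def g(x):          # a homomorphism from P = Z into B, determined by g(1) = 5
    return (5 * x) % 6

h_of_1 = next(a for a in range(12) if f(a) == g(1))    # any preimage of g(1) under f

def h(x):          # the lift from P into A, determined by h(1)
    return (h_of_1 * x) % 12

assert all(f(h(x)) == g(x) for x in range(-20, 20))    # h followed by f equals g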

As a special case, the ring R is always a projective R module. This is because R is a free R module of rank 1.

A module J is injective if, for any pair of modules A and B, and any monomorphism f from A to B, and any homomorphism g from A to J, there is at least one homomorphism h from B to J such that fh = g. These morphisms run up towards J, while the projective morphisms ran down and away from P.

J
g ↑   ↖ h
A ──f──→ B

Let P be the direct product or direct sum of modules Ci, where i ranges over an indexing set, and assume P is projective. We are given f(A) onto B, and g(Ci) into B. Extend g to all of P by applying g to the ith component and ignoring the others; everything outside of Ci maps to 0. This is a valid homomorphism from P into B, and it has a compatible lift h, such that hf = g. Restrict h to Ci; the extension agrees with g on Ci, so the restriction of h lifts g(Ci). Therefore each component Ci is projective.

If P is a direct sum then the converse holds. Assume each Ci is projective. Given g(P) into B, realize that g defines, and is defined by, its action on each Ci. Lift each g(Ci) up to an h(Ci), and put these component functions together to build a composite function h from P into A. Verify that h is indeed a lift for g, hence P is projective.

Similar reasoning applies when J is injective. Let J be a direct sum or product, and injective, and prove the components of J are injective. Given a monomorphism f(A) into B and a function g(A) into Ci, follow g with the inclusion of Ci into J, setting the other components to 0. Injectivity provides a function h from B into J; follow h with the projection of J onto Ci, and Ci is injective.

Conversely, if J is a direct product, and each Ci is injective, the function g(A) into J defines, and is defined by, component functions from A into each Ci. Each of these component functions implies a compatible function hi from B into Ci. Put these together to build a function h from B into J, and J is injective. (For an infinite direct sum this argument breaks down; the functions hi, assembled together, land in the direct product, and need not land in the sum.)

Although R is a projective R module, it need not be injective.

The 0 module is injective, but nontrivial injective modules are harder to come by. Q, the rationals, is injective as a Z module, i.e. in the category of abelian groups.
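
To see the defining property in a concrete case, here is a small sketch; the subgroup 3Z of Z and the value g(3) = 1/2 are my own choices. Since every element of Q is divisible, the map extends.

from fractions import Fraction

def g(a):                        # a homomorphism from A = 3Z into Q, sending 3 to 1/2
    return Fraction(1, 2) * (a // 3)

def h(b):                        # the extension to all of B = Z, sending 1 to 1/6
    return Fraction(1, 6) * b

assert all(h(a) == g(a) for a in range(-30, 31, 3))    # h agrees with g on 3Z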

Here is some notation from homology theory that you will need. The variables are modules and the arrows are R module homomorphisms. The chain could extend indefinitely in either direction.

… → A → B → C → …

The image of A in B acts as the kernel of the next homomorphism from B into C, and so on. Each kernel is the previous image; a sequence with this property is called exact.

The following is a short exact sequence.

0 → A → B → C → 0

The first homomorphism takes 0 into A, and is 0. Its image, 0, acts as the kernel of the next homomorphism, hence A embeds in B. The image of A in B is the kernel of the map from B into C, so the quotient B/A maps into C; and the image of B in C is the kernel of the final homomorphism from C to 0, which is all of C, hence B maps onto C. Therefore C = B/A. The short exact sequence is fancy notation for C = B/A; C is the quotient module of B with kernel A.

A short exact sequence is split exact if there is a copy of C in B, which I will call D, and the homomorphism B → C maps D isomorphically onto C. In other words, the elements of D form a submodule of B, yet when they represent cosets of A in B, they define the quotient module B/A = C. D and C are isomorphic - so you can draw arrows from D to C, or from C back to D if you like. The latter is called a reverse homomorphism from C back into B.

0 → A → B ⇔ C → 0

Since A is the kernel, and the elements of D represent distinct cosets of A in B, and since all the cosets are represented, everything in B is spanned, uniquely, by an element of D and an element of A. Adding and scaling are accomplished per component, thus B is the direct product A*D. Conversely, if B = A*C then the sequence is split exact, with kernel A and quotient C, and the inclusion of C into B is a reverse homomorphism, compatible with the forward projection from B onto C. The sequence is split exact iff B is the direct product of A and C.

Here is another criterion for split exact. Let k be a reverse homomorphism from B onto A, such that the composition A → B → A produces the identity map on A. A direct product implies such a homomorphism. When B is A*C, k is the projection from the direct product onto the first component, namely A. Conversely, assume k exists and let D be the kernel of k. The elements of A serve as coset representatives (cosreps) of D in B, one per coset. Thus B is isomorphic to A*D. Since D forms the cosets of A in B, D is isomorphic to C. In other words, B = A*C, and the sequence is split exact.
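
Here is a quick numerical check of this criterion on the sequence 0 → Z/3 → Z/6 → Z/2 → 0; the particular maps e, q and k below are my own choices.

e = lambda a: (2 * a) % 6        # embed A = Z/3 into B = Z/6
q = lambda b: b % 2              # project B = Z/6 onto C = Z/2
k = lambda b: (2 * b) % 3        # the reverse homomorphism from B onto A

assert all(k(e(a)) == a for a in range(3))       # A -> B -> A is the identity
D = [b for b in range(6) if k(b) == 0]           # the kernel of k
assert D == [0, 3]
assert sorted(q(d) for d in D) == [0, 1]         # D maps one for one onto C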

Since split exact implies two reverse homomorphisms, it is sometimes written like this.

0 → A ⇔ B ⇔ C → 0

In general I will avoid this notation, because it looks like A and B and C are isomorphic, which they aren't. I'll just use forward arrows, and say the sequence is split exact.

In the world of groups, things are more complicated, because the primary operation is not commutative. When a short exact sequence is split exact, B is a semidirect product of A by C, rather than a direct product. If you haven't seen semidirect products before, that's ok; nonabelian groups will not come into play here. This chapter deals with modules. Of course an abelian group is a Z module, where scaling an element x by an integer m simply adds x to itself m times.

Let P be projective and consider the short exact sequence

0 → A → B → P → 0.

Let f be the homomorphism that maps B onto P, with kernel A. Let I be the identity map on P. There is then a lift h, from P into B, such that hf = I. Now h is the reverse homomorphism that embeds P into B. The image of h is the copy of P in B that is compatible with f. The sequence is split exact, and B = A*P.

P
h ↓   ↘ I
B ──f──→ P

Next assume every short exact sequence ending in P is split exact. Let W be a free module that maps onto P. Now W = K*P, where K is the kernel of the homomorphism. In other words, P is a summand of a free module.

Finally assume P is a summand of a free module. Every free module is projective, and a direct sum is projective iff its summands are projective, hence P is projective. This completes the circle. A module P is projective iff every short exact sequence ending in P is split exact, iff P is a summand of a free module.

Free modules are projective, but projective modules need not be free. Let R be the ring Z/6, i.e. the integers mod 6. This is isomorphic to Z/2 * Z/3. Now Z/3 is a summand of R, hence a summand of a free module, yet Z/3 is not free. A free module has to look like so many copies of R, so a nonzero free R module contains at least 6 elements, while Z/3 contains only 3.
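
The decomposition of Z/6 can be verified directly; in the sketch below, the two submodules are the ideals 3R and 2R (a copy of Z/2 and a copy of Z/3), and every element of R is uniquely a sum of one element from each.

R = range(6)                     # the ring Z/6
A = {0, 3}                       # the ideal 3R, a copy of Z/2
B = {0, 2, 4}                    # the ideal 2R, a copy of Z/3

# Both pieces are submodules: closed under addition and under scaling by R.
assert all((x + y) % 6 in A and (r * x) % 6 in A for x in A for y in A for r in R)
assert all((x + y) % 6 in B and (r * x) % 6 in B for x in B for y in B for r in R)

# Every element of Z/6 is a + b for exactly one a in A and one b in B, so R = A * B.
assert sorted((a + b) % 6 for a in A for b in B) == list(R)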

When R is a pid, a submodule of a free module is free, hence projective and free are synonymous. Since Z is a pid, the projective abelian groups are exactly the free abelian groups.

Throughout this chapter, and subsequent chapters, I will often prove a statement s() for the base ring R, and for direct sums, whence s() is true for all free R modules. Then pass to summands, and s() is true for every projective module, since a projective module is a summand of a free module. Thus R parlays into projective modules. If s() is only assured for a finite direct sum, rather than an infinite one, then s() is still true for free modules of finite rank, and for finitely generated projective modules. If M is such a module, it is the quotient of a free module F of finite rank, and M sits at the end of a short exact sequence 0 → K → F → M → 0. The sequence splits, M is a summand of F, s(F) is true, and s(M) is true. This technique will become clearer as we use it in specific situations.

There is no analogous technique for injective modules, thus projective modules are, by and large, more important than injective modules.

Given modules A and B, the dual of A with respect to B is the set of module homomorphisms from A into B. This is written hom(A,B). If B is not specified then hom(A,R) is assumed. This is called the dual of A.

A → B

Realize that hom(A,B) is itself an R module. Homomorphisms can be added together, and, since R is commutative, a homomorphism can be scaled by an element of R, the scaling taking place in B.

The dual of the dual of M is not always M. Start with Z/2, as a Z module. Its dual is 0, and the dual of 0 is 0. In this sense, dual is perhaps a misnomer, since the word "dual" usually implies its own inverse.

Assume B is the direct product of modules Bj. A function from A into B is now the direct product of functions from A into each Bj. Each function into B defines component functions into each Bj, and a collection of component functions can be combined to build a composite function from A into B. Verify that the composite function is a homomorphism iff each component function is a homomorphism. Therefore hom(A,B) is canonically equivalent to the direct product of hom(A,Bj).

This does not hold when B is the direct sum over Bj. Let A be a free module with infinitely many generators, and let a function map A into B by carrying the jth generator of A to a nontrivial element of Bj. This is a map from A into the direct sum B. This function cannot be realized through the direct sum of functions from A into Bj, for each such function maps A into 0 at almost every Bj.

From the other side, let A be a direct sum of modules Aj. A function f from A into B defines, and is defined by, its component functions fj on each Aj. The component functions combine, by linearity, to define f on the direct sum. Therefore f corresponds to the direct product over fj. A is a direct sum, yet hom(A,B) becomes the direct product of the modules hom(Aj,B).

If A is an infinite direct product, the component homomorphisms determine f only on the direct sum sitting inside A, as described above; they need not determine f on the elements of A that are nonzero in infinitely many components.

Consider the R homomorphisms from R into R. They define, and are defined by, the image of 1. Thus the dual of R equals R. In general, hom(R,M) = M.

Apply these results to a free module of finite rank, and hom(R^n,M) = M^n.
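
As a small illustration, a homomorphism from a free module is pinned down by the images of its basis; the choices R = Z, M = Z/6, and the images 2 and 5 below are mine.

def make_hom(m1, m2):            # the element (m1, m2) of M^2, M = Z/6
    return lambda x, y: (m1 * x + m2 * y) % 6

phi = make_hom(2, 5)             # a homomorphism from Z^2 into Z/6
assert phi(1, 0) == 2 and phi(0, 1) == 5         # it sends the basis to (2, 5)
assert phi(3 + 4, 1 + 2) == (phi(3, 1) + phi(4, 2)) % 6    # and it is additive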

A homomorphism from one module into another can induce a homomorphism on the homomorphisms going into or coming out of those modules. That's a mouthful, so let's illustrate with a square. Imagine a square with R modules at the corners. Place A1 at the upper left and B1 at the upper right. Place A2 at the lower left and B2 at the lower right. Now hom(A1,B1) is a collection of arrows going across the top of the square, from left to right, and hom(A2,B2) is a collection of arrows going across the bottom of the square, from left to right. Assume there is a given homomorphism from A2 to A1, going up the left side, and another homomorphism from B1 to B2, going down the right side. This establishes a function from hom(A1,B1) into hom(A2,B2). Given a homomorphism from A1 to B1, prepend the homomorphism going up the left side, and append the homomorphism going down the right side, to find a homomorphism from A2 to B2. Verify that this map from hom(A1,B1) into hom(A2,B2) is an R module homomorphism.

A1 ────→ B1
 ↑          ↓
A2 ────→ B2
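
The square can be written out explicitly. In this sketch all four corners are Z/10, and the maps up the left side and down the right side are multiplication by 3 and by 7; these numbers are my own choices.

up = lambda a: (3 * a) % 10      # the map up the left side, A2 -> A1
down = lambda b: (7 * b) % 10    # the map down the right side, B1 -> B2

def induced(f):                  # hom(A1,B1) -> hom(A2,B2): prepend up, append down
    return lambda a: down(f(up(a)))

phi = lambda a: (9 * a) % 10     # an element of hom(A1,B1)
psi = induced(phi)               # the corresponding element of hom(A2,B2)
assert all(psi(a) == (7 * 9 * 3 * a) % 10 for a in range(10))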

In general, a left R module and a right R module combine to form an abelian group, which is their tensor product. However, this operation is usually applied to modules over a commutative ring, whence the result is another R module. Thus tensor product becomes a binary operation on modules, which is, as we'll see, commutative and associative.

If A and B are R modules, it is somewhat awkward to write (A tensor B) over and over again. The abstract multiplicative operator *, when applied to sets, often means a cross product, so A*B might be confused with the direct product of A and B. In fact * is often used for this purpose. Similarly, + often indicates a direct product when the underlying groups are abelian, which is the case for modules. Therefore I will use the times symbol (A×B) to indicate tensor product.

Given a right module A and a left module B, the tensor product of A and B consists of an abelian group T, and a bilinear map f(A,B) onto T. The term "bilinear" means f respects addition in A and in B, and, for any d in R and any x in A and any y in B, f(xd,y) = f(x,dy). The action of d can be passed from one module to the other.

Don't confuse bilinearity with a group homomorphism on the direct product A*B. They are not the same. Bilinearity means f(x1,y) + f(x2,y) = f(x1+x2,y). If this were a group homomorphism on A*B, then f(x1,y) + f(x2,y) would equal f(x1+x2,2y).

Let A and B be the integers, denoted Z. Integer multiplication is an example of a bilinear map from Z cross Z onto Z. Use the distributive property to show f respects addition in A and in B. Use the associative property of multiplication to show f(xd,y) = f(x,dy) = xdy. This map, integer multiplication, is not a group homomorphism on Z*Z. If it were, f(1+1,1+1) would equal 2, rather than 4.

Note that 0 in either component maps to 0. If f(0,x) = z, use linearity to write: z + z = f(0,x) + f(0,x) = f(0+0,x) = z. This implies z = 0.

For T to be a tensor product it must be universal. Here's what we mean by universal. Consider any other bilinear map g(A,B) into another abelian group U. There is a unique group homomorphism h(T) into U satisfying fh = g.

If you are familiar with category theory, then you already know the tensor product is unique up to isomorphism. A universal object in any category is unique up to isomorphism. Let's prove it anyways, using an argument that is essentially the same as that used in category theory.

Let f map A cross B onto the tensor product T, and suppose U is also universal through a function g. There are unique homomorphisms j from T to U, and k from U back to T, that make the diagrams commute. g = fj, and f = gk, hence f = fjk. Apply the definition of universal from T to T, and jk is an endomorphism that makes the diagram commute. But the identity map on T also makes the diagram commute, hence jk = 1. Turn this around and kj = 1. This makes j and k isomorphisms, whence T and U are isomorphic. There is one tensor product up to isomorphism.

So the tensor product is unique, but does it exist? Let's build it from the ground up.

For each x in A and each y in B, let (x,y) be a generator of the free abelian group S. Each of these symbols is independent in S. There is no meaningful way to add (x1,y1) + (x2,y2), even though x1+x2 is a particular element in A, and y1+y2 is a particular element in B. One can however add (x1,y3) + (x1,y3), giving 2(x1,y3). This is what we mean by a free group.

Next, build a collection of relations that will span a subgroup K inside S. For instance, let x1 + x2 = x3 in A. This gives the relation (x3,y) = (x1,y) + (x2,y), or if you prefer the relator, (x3,y) - (x1,y) - (x2,y), which is a syntactically correct generator of the kernel K. Do this for all pairs x1 x2 in A and all y in B, then do the same for all pairs y1 y2 in B and all x in A. Finally include relations for passing the action of R between components. These have the form (xd,y) = (x,dy). Setting d = 1 is pointless, giving (x,y) = (x,y), and even d = 2 is implied by linearity, (2x,y) = (x,2y), both equal to 2(x,y). If R is the integers we don't need to pass R between components, but larger rings require these relations. All these generators span the kernel K, and the quotient S/K = T is the tensor product.

Of course we have to prove T is the tensor product, and that is rather technical. First show f(x,y) → (x,y) is a bilinear map. This is straightforward given the relations in the kernel K.

Now let g be a bilinear map from A cross B into an abelian group U. We need a function h that makes the diagram commute. Start by defining h on the generators of S. Specifically, h(x,y) = g(x,y). Then extend this definition to all of S. We can do this because S is a free group. Thus h is a group homomorphism from S into U.

What happens to K? Consider a generator of K, and pull back to the corresponding expression in A cross B, then map this forward to U. The result has to be 0. This is because g is also a bilinear map. The generators of K are the bilinear relations, and these have to go to 0 in any bilinear map. Thus h(K) = 0. Therefore the cosets of K, which are the elements of T, have well defined images in U. Furthermore, the map from T into U is a group homomorphism.

By construction, h makes the diagram commute. In other words, fh = g. Furthermore, any deviation from h would force g(x,y) ≠ h(f(x,y)) for some pair x,y, hence h is unique. That completes the proof.

A*B
f ↓   ↘ g
T ──h──→ U

When rings are commutative, T becomes an R module. An element c in R acts on S by mapping (x,y) to (cx,y). Verify that this action turns S into an R module.

Consider any of the generators of K, and apply the action of c in R. The result is another generator of K. For instance, c acts on (dx,y) - (x,dy) to give (dcx,y) - (cx,dy), which is the generator in K that passes d between cx and y. Thus K is an R submodule of S, and T is a quotient R module. We didn't have to do anything to T to make it an R module, it just is.

Apply c to x, then evaluate f(cx,y). This is the same as c times f(x,y). Either is (cx,y) in S, and in T. Our bilinear map f respects the action of R on x. It also respects the action of R on y, only because (x,cy) is equivalent to (cx,y) in T.

Let U be an R module, and let g be a bilinear map from A cross B into U that respects the action of R. Let h be the unique group homomorphism from T into U. Now h(c*(x,y)) = h((cx,y)) = g(cx,y) = c*g(x,y) = c*h((x,y)). The unique group homomorphism has become an R module homomorphism.

This suggests a new category, where an object O is an R module and an R bilinear map from A cross B into O. Morphisms are R module homomorphisms between objects that cause the diagram to commute, and the tensor product T is the universal object in this category. As shown above, the bilinear map onto T respects the action of R on each component, and the unique group homomorphism from T into U is also an R module homomorphism, hence T is universal in its new category.

When constructing T, you don't really have to use every x in A and every y in B. Let A and B have a fixed set of generators as R modules. Use these generators to build the free module S as above.

Recall that A is the free module on its generators, mod some kernel KA, and similarly for B and KB. The relations in KA must become relations in the kernel K that leads to the tensor product. For instance, let w be a relation in KA, hence w is a linear combination of generators of A that yields 0. Perhaps w = 17x3 + 9x5 - 8x9. (Coefficients could come from R, not just integers.) Select any y in B and cross y with w, giving 17(x3,y) + 9(x5,y) - 8(x9,y). This becomes a relation in K.

You don't have to join a relation in KA with each y in B; you can restrict attention to the generators of B. Assume the relation w above has been joined with y1 and y2, generators of B, and let y = 3y1 + 5y2. Since w cross y1 and w cross y2 are relations in K, regroup the generators of w with 3y1 + 5y2, merge these into y, and find the relation w cross y from KA cross B. All of KA cross all of B is in the kernel K, as it should be. The same holds for A cross KB. The modules A and B map into T.

You don't have to bring in every relation in KA, i.e. every linear combination that yields 0; a set of generating relations will do. Bring in x1-x2 and x2-x3, and x1-x3 is implied. After all, these relations wind up acting as generators for K, so all of KA is implied. Similar results hold for KB.

Look again at linearity. Assume x3 = x1 + x2, (not necessarily generators), thus x3 - x1 - x2 lives in KA. Expand this into a linear combination of generators from A, which also lives in KA. This is a relation in the generators of A, like w in the previous paragraphs. It is crossed with every generator of B, and indirectly with all of B. Therefore, linearity is covered by what we have already done.

The relations that pass the action of R between A and B, (xd)y-x(dy), are still in K, as they were before, but this time you only need include relations for the generators of A and B. Let x = ax1+bx2+cx3, representing x as generators of A, and let y0 be a generator of B. Start with (xd,y0), which is d*(x,y0), and replace x with its linear combination. Look at the first term: d*(ax1,y0) = ad*(x1,y0) = a*(x1d,y0) = a*(x1,dy0) = (ax1,dy0). Do this for each of the three terms and apply linearity, and (xd,y0) = (x,dy0). Apply linearity on the other side, where y is perhaps ay1 + by2 + cy3. Relations transmute the first term: (xd,ay1) = (xda,y1) = (x,day1). Linearity puts it all back together as (xd,y) = (x,dy).

With all these relations in place, K is a generated submodule of S. The quotient is T, which becomes the tensor product. Once again we must prove T is the tensor product, but the proof is the same as that given above. K covers KA and KB, and all the bilinear relations, so that A and B map meaningfully into T, and the map is bilinear. Another bilinear map into U allows the construction of h(T) into U that makes the diagram commute. T is universal, and thus the tensor product A×B.

It follows that the tensor of finitely generated modules is finitely generated. I'd like to say the same about finitely presented modules, and I can, if R is a finitely generated ring. The relations in KA and KB are finite, and they combine with the generators of A and B to build a finite collection of relations in K. The only tricky part is the relations that pass the action of R from x to y. By linearity, the relation (dx,y) = (x,dy) also implies (2dx,y) = (x,2dy), and so on for all the multiples of d. If other relations pass c from x to y then follow this path to put c and d together.

(cdx,y) = (cx,dy) = (x,cdy)

If R can be built, as a ring, from the integers or the integers mod p by adjoining finitely many generators, using addition and multiplication, then the set of relations is finite, and T is finitely presented.

By equating (x,y) with (y,x), it is easy to see that A×B is isomorphic to B×A, as abelian groups. Since R passes from x to y, it doesn't really matter whether you let R act on x or y in (x,y). Thus they are isomorphic as R modules, and tensor product is commutative.

Now take a look at associativity. Choose your favorite generators for the three modules A B and C. Now A×B can be expressed using generators and relations. In fact, that is how A×B was constructed. When this is tensored with C, the resulting generators are the generators of A, cross the generators of B, cross the generators of C. These are the generators of (A×B)×C, or A×(B×C). The result is symbolically symmetric.

What can we say about the kernel of (A×B)×C? Following the procedure in the previous section, cross (x,y) with the relations that generate KC. This is done for the generators of A and the generators of B. Similarly, the relations that lead to A×B are crossed with each generator z in C. But what are the relations that produce A×B? They are the relations of KA, crossed with each generator y in B, and the relations of KB crossed with every generator x in A. Put this all together and find triples with two generators and one relation, drawn from KA, KB, or KC. This is symbolically symmetric.

Finally we need to pass the action of R between the operands of (A×B)×C. This produces relations of the form (d(x,y),z) = ((x,y),dz). A×B is an R module, hence d(x,y) is the same as (dx,y), which equals (x,dy). Relations combine all three generators, and place d on any of the three variables. This is symbolically symmetric. As an abelian group, (A×B)×C is the same as A×(B×C).

Look at the action of R. When d acts on (A×B)×C, the result is something like (d(x,y),z). When d acts on A×(B×C), the result is something like (dx,(y,z)). In each case the result is the triple (dx,y,z). The action of R is the same, and tensor product is associative.

Notice that f(x,y,z) → (x,y,z) defines an R trilinear map from A cross B cross C into the tensor product. This map is linear in A B and C (per component), and respects the action of R (per component), and allows the action of R to be passed between components.

Assume another R trilinear map g carries A cross B cross C into an R module M. If T = A×B×C, a unique R module homomorphism h carries T into M, such that fh = g. T is once again universal in its category. This generalizes to a finite tensor product of R modules.

Let B be the direct sum of modules B1 B2 B3 etc. Thus the generators of B are the union of the generators of the component modules. Use these generators, along with the generators of A, to build the tensor product.

The free group S is the generators of A cross the generators of B. The kernel K is generated by the relations that produce KA, cross each generator of B, and the relations that build the kernel of each component module Bi cross the generators of A. We also have relations that pass the action of R between the generators of A and the generators of each Bi.

Let Ti be the submodule of T spanned by the generators of A cross the generators of Bi. Clearly the submodules Ti span T. We want to show T is the direct sum over Ti.

Let a sum of elements wi from two or more submodules Ti produce 0. Pulling back to S, the sum over wi lies in K. The sum is equal to a linear combination of relations drawn from K. Each wi is spanned by the relations of K that come from A and Bi. This puts wi in Ki, a slice of the kernel K, thus each wi already lies in K, and is 0 in T. The submodules Ti are linearly independent, and T is the direct sum over Ti. Direct sum commutes with tensor product.

Apply the above to a free R module, where R may or may not be commutative. Let M be a left R module and consider R×M. Write f(R,M) = R*M, i.e. the action of R on M. Verify that this is a bilinear map from R cross M onto M. Let g(R,M) be some other bilinear map into U. We need a function h that completes the diagram. Given c in M, pull back to (x,y) in R cross M. Since R contains 1, this is the same as (1,xy), which is the same as (1,c). All the preimages of c are equivalent to (1,c), and must map to the same element in U, which becomes h(c). Thus h is well defined, and unique, and the diagram commutes. Pull c+d back through R cross M, and forward to U, to show h is a group homomorphism; and if R is commutative h is a module homomorphism. Therefore R tensor M = M, with a canonical bilinear map.

Let W be a free R module with rank j. In other words, W = R^j. Here j could be an infinite cardinal, in which case we're talking about the direct sum. Since tensor and direct sum commute, W×M = M^j, i.e. the direct sum of j copies of M.

This provides some justification for the × notation. The tensor of two free modules multiplies their ranks. For instance, R^3 × R^4 = R^12.

Let J be a right ideal and let M be a left module. Let JM be the abelian group generated by the products of elements of J with elements of M, i.e. the action of J on M. Thus JM is a subgroup of M, and if R is commutative JM is an R submodule of M. The map from J cross M into JM is bilinear. Is JM the same as J×M, i.e. the tensor product? Not always, as the next example shows.

Let R = Z[p]/(p^2). This is the integers adjoin p, where p^2 is 0. Let J be the principal ideal generated by p.

Let M be R adjoin q, with pq = 0. M is integer polynomials in p and q with no mixed terms and no higher powers of p. M is an R module. Note that J kills p and q in M, hence JM = Z*p. Is this the tensor product?

Let U = Z^2, two parallel copies of the integers, and let p*U = 0, so that U is an R module. Let g map J cross M into U by carrying xp cross a+bp+cq to x*(a,c). (This map ignores the higher powers of q.) Clearly g respects addition in J and in M. It also respects the action of R from J or from M. How about passing R between components? Take d+ep in R. On one side, (xp)*(d+ep) = xdp. On the other, (d+ep)*(a+bp+cq) has constant term da and q coefficient dc. Either way the image under g is xd*(a,c). Thus the action of R can be passed between J and M, and the map is bilinear.

Since the tensor product is universal, a module homomorphism completes the diagram. Ignore R, and call it a group homomorphism. It agrees with g, and is onto, since g is onto. If JM, which is a copy of Z, were the tensor product, this would give a surjective group homomorphism from Z onto Z^2, which is impossible. Even though J is principal, JM does not equal J×M.

Are there ideals and modules that satisfy JM = J×M? Start again, with J a right ideal and M a left module. Let T be J×M, which exists, and is unique up to isomorphism. Construct T, as demonstrated earlier, with generators (x,y) for every x in J and every y in M.

Let g(x,y) = xy in JM. Verify that g is a bilinear map. There is therefore a unique homomorphism h(T) into JM that makes the diagram commute. If h is an isomorphism, then JM is indeed the tensor product.

Since h is a homomorphism, we only need show injective and surjective. Let's start with the latter. Each generator xy in JM comes from g(x,y), and it also comes from h((x,y)) for (x,y) in T. Linear combinations of generators in T map to linear combinations of generators in JM, and every element of JM has some preimage in T. Thus h is onto.

To show injective, simplify the problem a bit, and let J be a principal right ideal with generator p. (This was the example given above.) Suppose a linear combination of generators lies in the kernel of h. Since J is principal we can "normalize" each generator. Instead of using (xp,y), use (p,xy). Thus each generator is p combined with something in M. By linearity, any such combination collapses to a single generator (p,q), where p generates J and q is an element of M. Thus (p,q) is in the kernel of h, and pq is 0 in M. We saw this in our earlier example; pq was equal to 0. Now suppose p kills nothing in M other than 0. If q is nonzero then pq is nonzero, the image of (p,q) is nonzero in JM, and (p,q) is not in the kernel of h after all; and if q is 0 then (p,q) is already 0. The homomorphism is injective, and JM is the tensor product.

Similar reasoning holds when M is cyclic, generated by some element q, and nothing in the right ideal J kills q.

Let J be a right ideal as above, and let R/J be the cosets of J in R, which form a right R module. (I'm abusing the notation slightly, since R/J usually denotes a quotient ring.) What is (R/J)×M?

Let JM be the subgroup of M spanned by the elements of J times the elements of M. If R is commutative JM is a submodule of M.

Let T consist of cosets of JM in M, i.e. the quotient group. Given x in R and y in M, f(x,y) = xy represents a coset of JM in M, i.e. an element of T. If x changes by an element in J, the difference lies in JM, and represents the same element of T. Therefore f is a well defined map from R/J cross M into T. Verify that f is a bilinear map, and if R is commutative it respects the action of R.

Since R/J contains 1, and M is unitary, the map is onto. The pair 1 cross c is a valid preimage of c in T.

Let g(R/J,M) be another bilinear map into U. Pull c in T back to a preimage in R/J cross M. A preimage such as (x,y) can be normalized to (1,xy), or (1,c), where c now stands for a cosrep of our coset in T. Another cosrep has the form c+e with e in JM, leading to the same element of T. Write e as a sum of products jm with j in J; then g(1,jm) = g(j,m) = g(0,m) = 0, since j is 0 in R/J. Hence g(1,c) = g(1,c+e), and no matter the cosrep, we may define h(c) = g(1,c).

Pull c+d back to R/J cross M, and forward to U to show h is a group homomorphism, or a module homomorphism if R is commutative. T is universal, and (R/J)×M equals M/JM.
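
As a sanity check of the quotient formula, take R = Z, J = 6Z, and M = Z/9 (my own choices); then (R/J)×M should be Z/9 mod 6·Z/9, which is Z/3.

M = set(range(9))                          # M = Z/9
JM = {(6 * m) % 9 for m in M}              # the subgroup 6M inside M
assert JM == {0, 3, 6}

cosets = {frozenset((m + j) % 9 for j in JM) for m in M}
assert len(cosets) == 3                    # M/JM has 3 elements, so Z/6 tensor Z/9 = Z/3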

Continuing the above, let J be principal, generated by p, and let K be a right ideal inside J. Let J/K be the cosets of K in J. What is J/K tensor M?

As before, JM is shorthand for the finite sums of elements of J times elements of M. Note that KM is a subgroup of JM is a subgroup of M. These subgroups become submodules if R is commutative.

Let T be the cosets of KM in JM, and map x in J and y in M to xy, a cosrep of KM in JM. Varying x by something in K doesn't change the coset of KM in JM. The map is well defined and bilinear.

We need J to be principal to show the map is onto. Given c in T, or in JM, remember that c is a linear combination of product pairs. However, each pair can be normalized to p times something in M. The components from M can then be added together, hence something in J times something in M yields c. Every element of T is accessible.

Given g(J/K,M) into U, build h as we did before. The normalized preimage of c is py, for some y in M. If y deviates by e, pe lies in KM. e is a sum of products from K cross M, each product starts with something in K, which is really 0 in J/K, and g(e) = 0. Let h(c) = g(p,y), which is the same even if y changes by e. The map h is well defined, and dictated by g. T is universal, and J/K tensor M equals JM/KM.

Can we tensor one quotient ring with another? Let J be a right ideal and let K be a left ideal, and consider R/J tensor R/K. Use the quotient formula above. View R/K as a left module M, so that the tensor product is M/JM.

Elements of M are cosets of K in R. If such a coset is in JM then write a cosrep as a sum of products from J cross M. With J a right ideal, this puts the cosrep in J, whence that coset and J intersect. Conversely, let J intersect a coset of K in a cosrep z. Premultiply z by anything in J and find something else in the same coset, K being a left ideal. This coset is now part of JM. In summary, JM consists of those cosets of K that intersect J.

As an abelian group, M = R/K. Take the image of J in this quotient group and find the elements that we call JM. We must mod out by the image of J, in R/K, to find the tensor product. By correspondence, this quotient group is the same as R mod J+K, that is, R mod the subgroup generated by J and K. This is an R module if R is commutative. The bilinear map, from R/J cross R/K onto R/(J+K), is multiplication in R.

When R is a pid, wherein all ideals are principal, J+K is generated by the gcd of the generators of J and of K. The integers form a pid, and tensor distributes across direct product, thus giving equations that look like this.

(Z/6 * Z/15) × (Z/9 * Z/35) = (Z/3)^2 * Z/5 * Z/1

Of course Z/1 is trivial, and drops out.
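
The displayed equation can be checked mechanically with the gcd formula; the small helper below is mine, not something from the text.

from math import gcd

def tensor(A, B):
    # A and B list the cyclic orders in two direct sums of finite cyclic groups.
    # Z/m tensor Z/n = Z/gcd(m,n), and tensor distributes over the direct sums.
    return sorted(gcd(m, n) for m in A for n in B)

# (Z/6 * Z/15) tensor (Z/9 * Z/35) = Z/1 * (Z/3)^2 * Z/5
assert tensor([6, 15], [9, 35]) == [1, 3, 3, 5]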

Let A×B = T, where T is nonzero. Select a nonzero element c of T. Now c is a linear combination of product pairs, e.g. (x1,y7) + 3(x5,y2) - 29(x4,y8). Let the aforementioned values of x generate a submodule inside A and let the values of y generate a submodule inside B. Tensor these two submodules and find the same element c. There are even fewer relations implementing bilinearity, so c remains nonzero. Thus the tensor product of these two finitely generated modules is nonzero.

If the tensor product of every finitely generated submodule of A with every finitely generated submodule of B is zero, then A×B is zero.

If A×B is zero, not much can be asserted about their submodules. For instance, let A = Z/p, the integers mod p, and let B = Q, the rationals. Both are Z modules. What does the tensor product look like? Recall that 0 in either component maps to 0. If the second component is a nonzero rational number it is divisible by p. Move p to the other side and make the first component zero. Therefore the tensor product is 0. However, if we restrict B to Z, a submodule of Q, Z/p tensor Z = Z/p, which is nonzero. Restricting to a finitely generated submodule can turn a zero tensor product into something nonzero.

Let C1 be the category of pairs of left and right R modules, where pairs of module homomorphisms act as morphisms. Let C2 be the category of abelian groups, where group homomorphisms act as morphisms. The tensor product induces a functor from C1 into C2.

Consider a morphism in C1. Specifically, let f1 map A1 into B1, and let f2 map A2 into B2. Build the tensor products S = A1×A2, and T = B1×B2. Remember that S and T are abelian groups, members of C2. Thus tensor product is a functor that carries objects in C1 to objects in C2. How about the functions?

Build g, from A1 cross A2 into T, by applying f1 and f2, then the bilinear map from B1 cross B2 onto T. Verify that g is a bilinear map. By the universality of S, there is a unique group homomorphism h(S) into T that agrees with g. By definition, h is the tensor product of f1 and f2, and h is the corresponding morphism in the category C2.

Show that the composition of morphisms in C1 leads to the composition of morphisms in C2. Either way we have a group homomorphism in C2 that makes the diagram commute, and since S is universal, the morphisms must agree. Therefore our map is a functor between categories.

If R is commutative, then this functor carries C1 into C3, the category of R modules and R module homomorphisms.

Since h is compatible with what goes on upstairs, you can get your hands on it. Take x from A1 and y from A2, and h maps (x,y) in S to (f1(x),f2(y)) in T. This is a homomorphism from the free group that is the parent of S, into the free group that is the parent of T. Let w be a bilinear relation on the symbols drawn from A1 cross A2, hence 0 in S. Map w through to B1 and B2, and down through g, and into T, and it must be 0. This is because g is a bilinear map. Therefore h is a well defined compatible homomorphism from S into T. Since S is universal, h is unique. h is the tensor of f1 and f2.

Tensor does more than map f1 cross f2 into h; it respects function addition and scaling by R. Add functions upstairs, e.g. f1+g1 or f2+g2, and you add the corresponding functions downstairs. This can be seen from the characterization of h in the previous paragraph. Therefore tensor product induces a map from hom(A1,B1) cross hom(A2,B2) into hom(S,T) that is additive, and R linear, in each argument.

As a special case of tensoring two functions together, assume one of the two functions is the identity map on M. Tensor A and B with a module M, and tensor f(A) into B with the identity map from M onto M. If x in A is mapped to y in B, and w is an element of M, then (x,w) in A×M is mapped to (y,w) in B×M. This is a useful technique, as we'll see below.
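
For a concrete instance (my own numbers), let f be multiplication by 2 from Z to Z and let M = Z/4. Identifying Z×M with Z/4, the map f tensored with the identity on M becomes multiplication by 2 on Z/4, and it is no longer injective.

induced = lambda w: (2 * w) % 4            # f tensor identity, viewed on Z tensor Z/4 = Z/4

assert sorted({induced(w) for w in range(4)}) == [0, 2]    # not injective, and not onto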

What happens when M = R? We know that A×R is isomorphic to A, and B×R is isomorphic to B. This isomorphism is realized by equating x in A with (x,1) in A×R. Since 1 maps to 1, (x,1) maps to (y,1). It's the same function, acting on the same modules. Tensoring with R doesn't change a thing.

What if M is the direct sum of modules Mi? Let w be an element in the direct sum. Remember that (x,w) leads to (y,w). Break w up into its components, and (x,wi) leads to (y,wi). This is what we would get if we tensored f with the identity map on Mi. Therefore, tensoring f with the identity on a direct sum M is the direct sum of the individual tensor products with Mi.

As a corollary, tensor with a free module, and obtain the direct sum of that many instances of f, from that many instances of A into that many instances of B.

What if M is itself a tensor product - say M = S×T? Tensoring is associative, so A×M = A×S×T. Let's look at the homomorphisms. Start with x → y from f and pair this with v in S. This gives (x,v) → (y,v). Tensor with T and find ((x,v),w) → ((y,v),w). On the other hand, join x → y with the pair generator (v,w) in M and find (x,(v,w)) → (y,(v,w)). These are the same generators in A×S×T and B×S×T respectively. Tensoring with M is equivalent to tensoring with S, and then with T.

Here is another variation of tensoring with a direct sum. Let A be the direct sum of left modules A1 A2 A3 etc. A homomorphism f that carries A into B defines, and is defined by, the homomorphisms fi from each Ai into B.

Let another homomorphism g carry C into D. What is the induced homomorphism h from A×C into B×D? In other words, what is f×g?

Since tensor and direct sum commute, A×C is the direct sum of Ai×C. Thus any homomorphism h on A×C defines, and is defined by, component homomorphisms hi from Ai×C into B×D. Since f and h are compatible, i.e. h is (f,g) applied to the generators of A×C, h and f must agree per component. If they didn't agree per component they wouldn't agree overall. Therefore hi is fi×g. Conversely, the function f×g is the direct sum of the functions fi×g, across all components, as is implied by linearity in h.

In earlier sections, juxtaposition of an ideal and a module, such as JM, indicated the action of J on M, that is, sums of products of elements in J times elements in M. Modules were never juxtaposed. However, for the rest of this chapter, juxtaposition of two modules indicates a tensor product. Thus CM is the same as C×M. It's just algebraic shorthand.

Recall that a short exact sequence is an embedding of A into B, with quotient module C, and is denoted as follows.

0 → A → B → C → 0

If these are left modules, and M is a right module, consider the three tensor products: AM, BM, and CM. These are abelian groups, or R modules if R is commutative.

A homomorphism carries A into B, and another homomorphism carries M onto M, namely the identity map. In an earlier section I described the tensor product of two homomorphisms. Thus there is an induced homomorphism from AM into BM, and another induced homomorphism from BM into CM. The derived sequence is almost short exact. The first 0 is lost; AM need not embed into BM.

AM → BM → CM → 0

Don't let the notation confuse you. We are really saying that there is a homomorphism from AM into BM, and the quotient is CM. That's all.

We know that AM maps into BM and BM maps into CM. To complete the proof we need to show that BM maps onto CM, and that the kernel of this map is the image of AM. For this proof the map from A to B need not be injective, as long as its image is still the kernel of the next homomorphism onto C.

Onto is easy. Let z lie in C and let w lie in M, so that (z,w) is a generator of CM. Since z has some preimage y in B, (y,w) maps to (z,w), and the induced function from BM to CM is surjective.

Let f map A into B, and let g map B into C. Note that fg maps all of A to 0. Tensor fg with the identity on M, and for any x in A and w in M, (x,w) maps to (0,w), which is 0 in CM. The image of AM lies in the kernel of BM.

Finally show that the kernel of BM lies in the image of AM. Start with the preimages of bilinear relations. A bilinear relation in C cross M might look like (z3,w) - (z2,w) - (z1,w), where z3 = z1 + z2; it becomes 0 in CM. Select any y1 in the preimage of z1, and similarly for z2 and z3. This gives a preimage e = (y3,w) - (y2,w) - (y1,w) in B cross M, and the image of e in BM lies in the kernel of the map onto CM. Now y3 - y1 - y2 need not be 0 in B, but it maps to 0 in C, hence it lies in the image of A; call it y0. Then write (y3,w) - (y2,w) - (y1,w) - (y0,w). This 4 term relation represents bilinearity in B cross M, and drops to 0 in the tensor product BM. Subtract it from e, and find (y0,w), which is in the image of AM. Therefore e, the preimage of a bilinear relation in CM, lies in the image of AM once we pass to BM.

Let's try another bilinear relation in CM: (z,w3) - (z,w2) - (z,w1). Pull each z back to y plus something in the kernel of B, and the result is bilinear in BM, plus three terms in the image of AM.

Finally we have (zd,w) - (z,dw). This also pulls back to a bilinear relation in B cross M, plus two terms in the image of AM.

Now let e be anything in the kernel of BM. It maps forward to a finite sum of bilinear relations in CM. Select a preimage for each of these relations, and subtract these preimages away from e. Each preimage is something in the image of AM. The last relation has a specific preimage, whatever is left of e, but that too is in the image of AM, because every preimage of every bilinear relation in CM is in the image of AM. Therefore e is the sum of elements in the image of AM, and e is in the image of AM. That completes the proof.

As a generalization of the above, tensor A B and C with three isomorphic modules MA MB and MC. The proof is the same, and the following exact sequence appears.

AMA → BMB → CMC → 0

This makes intuitive sense; the elements of M are merely relabeled, as M is tensored with A B and C.

Here is an example where AM does not embed in BM. Let the inclusion of the even integers into Z produce a short exact sequence as follows.

0 → 2Z → Z → Z/2 → 0

Take the tensor product with Z/2. Z/2 is a quotient ring, so use the formula for tensoring with a quotient ring. In each case the tensor product is Z/2. However, we cannot embed Z/2 into Z/2 and find a quotient group equal to Z/2. The first tensor function maps Z/2 to 0: the generator 2 cross 1 of 2Z×Z/2 goes to 2 cross 1 = 1 cross 2 = 1 cross 0 = 0 in Z×Z/2. The second function is an isomorphism.

If a short exact sequence is split exact, then B is the direct product of A and C. Tensor with M, and BM is the direct product of AM and CM. AM embeds into BM, with quotient CM, and the sequence is short exact, and split exact.

There are some modules that always produce a short exact sequence, with AM embedding into BM. If M is such a module it is called a flat module. An example is M = R, which is a flat R module. Tensor any short exact sequence with R and get the same exact sequence back again, the same modules and the same functions. There are many other flat modules, including the fraction field of R if R is an integral domain. Flat modules will be described later on in this chapter. I'm just giving you a heads up that they're coming, and they're important.

Let R and S be commutative rings, where h(R) is a ring homomorphism into S. Let M be an R module. Note that S is also an R module. The action of c, relative to S, is multiplication by h(c) in S. Thus we can tensor M and S, giving a new R module B.

Let (x,y) be a pair generator of B, where x is in M and y is in S. Elements of S act on (x,y) by acting on y. Any bilinear relation, when acted upon by c in S, is still a bilinear relation. This makes B an S module. By tensoring with S we have performed a base change; the result is B, which is an R module and an S module.

If c is in R, let c act on (x,y) by applying it to either x or y. Acting on x, (cx,y) is equivalent to (x,cy), whereupon c acts on y via h(c)*y. This is the same as mapping c into S and then performing the action of S on y. The action of R and the action of S are compatible.

Continuing the above, let N be an S module, which is also an R module. Let W be another S module. Evaluate M×N as R modules, view the result as an S module, and tensor with W. Alternatively, evaluate N×W as S modules, view the result as an R module, tensor with M, and view the result as an S module. Are they the same? This is a variation on associativity.

In each case the result is generated by triples (x,y,z) drawn from M cross N cross W, with relations that reflect linearity, and passing R between M and N, and S between N and W. The generators are the same and the relations are the same, hence the tensor products are the same. Furthermore, each can be acted upon by S, through N, hence they are isomorphic as S modules.

As you might guess, tensor and base change commute. Let M1 and M2 be R modules, and tensor them with S, giving B1 and B2. Let T be the tensor product of B1 and B2. T, as an S module, is the same as (M1×M2) × S.

Start with the first product, M1S × M2S, having generating tuples drawn from M1 cross S cross M2 cross S. The relations are those of linearity of course, and passing the action of S between M1S and M2S. In each case S is acting on the second symbol, the one drawn from S. Rewrite the generators as symbols drawn from M1 cross M2 cross S cross S. This is a permutation that does not change T. Now the action of S passes between S and S, and the second instance of S can always be normalized to 1. In other words, S tensor S = S. The action of R from M2 to the second instance of S can pull back to the first instance of S. That leaves M1 cross M2 cross S cross 1, which is (M1M2) × S.

Let R be a commutative local ring, with A and B finitely generated R modules. If A×B = 0 then either A or B is 0.

Let S be the quotient ring of R mod its maximal ideal J. Thus S is a field. Change base with respect to S. Thus 0 tensor S = AB tensor S = (AS)(BS). Since BS is an S module it is an S vector space of a certain dimension. Vector spaces are free modules, and the tensor product of free modules is free, with rank equal to the product of the individual ranks. If this is zero, then either AS or BS is zero.

Assume without loss of generality that BS = 0. If BS is 0 as an S module it is 0 as an R module. Apply the formula for tensoring with a quotient ring. Thus BS equals B mod JB, where J is the maximal ideal. Since this is zero, B = JB. The action of J maps B onto B. Since J is the Jacobson radical, apply Nakayama's lemma to assert B = 0.

If either module is infinitely generated, all bets are off. Let R be the localization of Z about p, i.e. the rationals whose denominators are prime to p. Now the integers mod p and the rationals are both R modules, yet their tensor product is 0. Equate any pair generator (x,y) with (xp, y/p), which becomes 0, since xp = 0 in the integers mod p.

A or B is once again 0 if the Jacobson radical is nilpotent. The modules need not be finitely generated. Use the same proof and close with a variation of Nakayama's lemma.

In another application of base change, let R be an integral domain, and let A and B be torsion free R modules. In other words, nothing in R (save 0) kills anything nonzero in A or B. If A×B = 0 then either A or B is 0.

Tensor with S, the fraction field of R, so that (AB)S = AS × BS = 0. Assume, without loss of generality, that BS = 0. If B has a nonzero element x, consider (x,1) in BS. Nothing from R can pass from 1 to x and turn x into 0, because nothing kills x. And nothing from R can pass from x to 1 and become 0 in S, because R has no zero divisors. Thus (x,1) remains nonzero in BS. This is a contradiction, hence B = 0.

As long as R is commutative, you can tensor with S, where S is R mod some maximal ideal. Tensoring with a field is useful because the resulting modules are vector spaces. We understand vector spaces, and maps from one vector space to another. This was put to good use in the previous section. Here is another application of base change with respect to a field. If M is a free R module it has a well defined rank. Two free modules that are isomorphic must have the same rank. Start with a short exact sequence and tensor with S.

0 → A → B → C → 0

? → AS → BS → CS → 0

Set A = 0 and let B and C be isomorphic free R modules. This leads to isomorphic vector spaces downstairs. We know from linear algebra that BS and CS have the same dimension. Since tensor and direct sum commute, the rank of B is the dimension of BS, and similarly for C and CS. Thus B and C have the same rank. When R is commutative, every free R module has a well defined rank. This is called the invariant dimension property, a term borrowed from linear algebra.

Next let A be unconstrained, while B and C are free. Now BS maps onto CS, and the dimension of BS is at least the dimension of CS, hence the rank of B is at least the rank of C. If B maps onto C it cannot have a smaller rank.

If R does not contain 1, all bets are off. As an abelian group, let R be the direct sum of infinitely many copies of Z. Let the product of any two elements of R be 0. Embed R in itself by mapping the generators to the even numbered generators. Then embed R in itself by mapping the generators to the odd numbered generators. As an R module, R is isomorphic to R*R, and the rank of a free R module is not well defined.

R has plenty of maximal ideals, but the earlier proof derails, because R mod a maximal ideal is not a field. Let M be the multiples of 5 in the first instance of Z, crossed with all the other copies of Z. This is maximal in R, having a quotient ring that looks like the integers mod 5. Tensor with S = R/M, whence each instance of R becomes R/MR. But MR is 0, so tensoring with S doesn't change a thing. With this example behind us, let's reinstate the assumption that rings contain 1.

Let R be a commutative ring, and let C be a free R module of rank k that is spanned by j generators, where j < k. Let B be the free R module of rank j. Map the basis of B onto the j generators of C, and a unique epimorphism carries B onto C. Yet B has a lower rank than C. This is a contradiction, as shown above. If C has rank k, it cannot be spanned by fewer than k generators.

If C is spanned by precisely k generators, and k is finite, these generators form a basis. Before we prove this, let's see what goes wrong when k is infinite.

Let C be a free module of infinite rank, and build a set of generators from the basis, but toss in the first basis element twice. This set certainly spans C, and its cardinality agrees with the rank of C, but it does not form a basis, because two of the generators are precisely the same element in C.

Now, let C be free with finite rank k, and consider a set D of k generators that span C. Let g() map the standard basis for C onto D. Thus g maps C onto C. We want to show that this is an isomorphism - that D becomes another basis for C.

We have a homomorphism g whose image is all of C; we only need show the kernel is 0. Suppose there is a nonzero element w in the kernel. Remember that C is a free module, so w ∈ C is nonzero across finitely many components. Select one of these components and look at the nonzero projection x, which is an element of R. Since zero is a local property, there is a prime ideal P in R such that x/1 remains nonzero in the localization RP.

Let S be the localization RP, and tensor 0 → A → C → C → 0 with S, where A is the kernel of the map from C onto C. Since tensor and direct sum commute, CS becomes a free S module, having the same rank as C. (w,1), a member of CS, has (x,1) as one of its components, and (x,1) is nonzero in S. Thus (w,1) remains nonzero in CS. Since w maps to 0 in C, (w,1) in CS maps to 0 in CS. The induced function g from CS into CS still has a nontrivial kernel.

The map from CS to CS carries k basis elements onto k generators. These generators span all of C cross 1, and when viewed as an S module, they span all of CS. If g(u) = v, then g(u/f) = v/f, where f is a denominator of S. Therefore g maps CS onto CS. The problem has been reduced to a simpler case, modules over a local ring.

Think of S in the above as the new ring R. Now R is a local ring, and C is a free R module of finite rank k, and a set D of k generators spans C, whence D becomes a basis for C. If this fails then our original premise also fails.

0 → A → C → C → 0

Once again A is the kernel of the map from C onto C that carries the basis of C onto the spanning set D. Since C is free it is projective, and the sequence is split exact. Tensor 0 → A → C → C → 0 with S, where S is R mod its maximal ideal. The result remains split exact. Everything is an S module, where S is a field. Everything is a vector space. Since CS and CS have the same finite dimension, AS has to be zero.

View AS as an R module; hence A tensor S = 0. S is finitely generated, since it contains 1. A is a submodule of C, but it is also a quotient module, since the sequence is split exact. Thus A, like C, is finitely generated. So A tensor S is the tensor product of two finitely generated modules over a local ring, and it is 0; we showed above that one of the two modules has to be 0. Since S is nonzero, A = 0. That completes the proof. Any k generators that span C act as a basis for C.

But what if we don't know ahead of time that C is a free R module? Assume C is trapped between two R modules of rank k, a lower module Cl and an upper module Cu. Map the basis of Cl onto D, the generators of C, and suppose there is a kernel A. Write this as follows.

0 → A → Cl → C → 0

If R is a pid then C is the submodule of a free module Cu, and is free. In the next section I'll prove the rank of such a module is bounded above by k. And since C contains Cl, the rank is bounded below by k. The rank is k, and the previous theorem applies. D is a basis for C.

If R is not a pid, tensor with the localization RP as we did before, choosing P a maximal ideal for which the kernel A remains nonzero. Cl becomes a free RP module of rank k, mapping onto CP with a nontrivial kernel. At the same time, CP is trapped between two RP modules of rank k. (I'll prove that containment is preserved by localization in a later section.) We have pushed the problem down to a local ring. If RP is a pid for every maximal ideal P, then each CP is a free RP module of rank k and D becomes a basis, exhibiting no kernel; that contradicts our choice of P. Thus A = 0, D was a basis for C all along, and C is a free R module of rank k.

A vector space cannot fit into another space of lower dimension. This generalizes to free modules over an integral domain.

If A and B are free modules, tensor 0 → A → B → C → 0 with S, the fraction field of R. I'm getting ahead of myself here, but S is flat, which means tensoring with S keeps the exact sequence exact, including the leading 0. (This is described later on in this chapter.) The free R module A becomes a vector space of the same rank - and similarly for B. The vector space AS cannot have a higher dimension than BS, hence the rank of A cannot exceed the rank of B.
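
As a quick illustration, here is a sketch assuming sympy, with an arbitrary choice of vectors: three vectors in Q^2 are always linearly dependent, so a free Z module of rank 3 cannot embed in one of rank 2.

    from sympy import Matrix

    A = Matrix([[1, 2],
                [3, 5],
                [2, 7]])     # would-be images of the three basis vectors, inside Z^2
    print(A.rank())          # 2, computed over the fraction field Q: the rows are
                             # dependent, so the map from Z^3 is not an embedding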

This theorem does not require R to contain 1. That's because the fraction field of R does contain 1 in the form of w/w for any nonzero w. Take the fractions of the even integers, for instance, to get the rationals. Everything is invertible, the fraction field is really a field, and the theorems from vector spaces apply. As long as R has no zero divisors, we're good to go.

If R does not contain 1, yet it contains at least one prime ideal P, we can get the invariant dimension property back again. Suppose the free modules B and C are isomorphic. Tensor 0 → 0 → B → C → 0 with R/P, and find two isomorphic free modules over a ring with no zero divisors. Tensor this with the fraction field, and the two modules have the same rank, hence B and C have the same rank. More generally, if B maps onto C, its rank is at least as large as the rank of C. In our earlier pathological example, where R = R*R, R has no prime ideals, since 0 drags in all of R.

If R contains at least one prime ideal, (R may or may not contain 1), we can resurrect the free submodule theorem as well. A prime ideal makes everything work.

Suppose a free R module embeds in another free module of lower rank. Let P be a minimal prime ideal. Localize about P, i.e. tensor with RP, and a free RP module embeds in another free RP module of lower rank. Again, I am using the fact that a localization is flat, whereupon the resulting sequence is short exact. A×RP embeds in B×RP. Thus we have reduced the problem to a ring that contains 1, and has but one prime ideal P.

The radical of 0 is the intersection of all the primes containing 0, and that is just P. The radical of 0 is also the set of elements x such that some power of x equals 0. These are called nilpotent elements. Thus every x in P is nilpotent.

Since P is the only prime ideal, it is maximal, and R/P is a field. Take y outside of P and write yz = 1 in R/P. Thus yz = 1+u for some u in P. Since u is nilpotent, 1+u is a unit, hence y is a unit. Everything outside of P is a unit. Every x in R is either nilpotent or a unit.

At this point we need a lemma. Any finite set of elements in P is killed by a nonzero element in P. For a single element x of order n (so x^n = 0 while x^(n-1) is not), x is killed by x^(n-1). Proceed by induction: let u kill a finite set S, and bring in a new element z. If u kills z we're done. If not, consider uz, which still kills all of S; if it kills z we're done. If not, consider uz^2, and continue as far as necessary. Since z is nilpotent, some uz^i is nonzero while uz^(i+1) = 0, and that uz^i kills S and z.
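
Here is a brute force check of the lemma in R = Z/8, whose nonunits 0, 2, 4, 6 are exactly the nilpotents and form the prime ideal P; this is only an illustrative sketch, and the finite set S below is chosen arbitrarily.

    n = 8
    P = [x for x in range(n) if any(pow(x, k, n) == 0 for k in range(1, n + 1))]
    print(P)                         # [0, 2, 4, 6]

    S = [2, 4, 6]                    # a finite subset of P
    killers = [u for u in range(1, n) if all(u * x % n == 0 for x in S)]
    print(killers)                   # [4]: a nonzero element of P that kills all of S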

Now try to embed R^(n+1) in R^n. In other words, there are n+1 vectors in R^n that are the images of the basis of R^(n+1). For convenience, arrange them in an n+1 by n matrix U. A nontrivial linear combination of these rows equals 0 iff the map is not an embedding. This includes the possibility that one of the rows of U equals 0. Turn this around, and linear independence corresponds to an embedding.

Assume a nontrivial linear combination of the rows of U, with coefficients ci, yields 0. Imagine subtracting w times the second row from the first to build another matrix V. Add wc1 to c2, and the rows of V combine to 0 under these adjusted coefficients. If any coefficient other than c2 was nonzero it is still nonzero. If they are all zero (other than c2), then c2 has not changed, and is still nonzero. Thus the rows of V are linearly dependent.

Of course we can add w times the second row to the first, reversing the process. One matrix represents an embedding iff the other one does.

Suppose a row contains no units. By the previous lemma, some nonzero element kills every entry of that row, and we don't have linear independence. If U represents an embedding then every row contains at least one unit.

The first row contains a unit; rearrange columns so that it sits in the upper left. Use Gaussian elimination to clear the first column: subtract multiples of the first row from the rows below, so that everything below this unit is 0.

If the second row now has no units the matrix does not represent an embedding, so assume its unit is in the second position, permuting columns if necessary. Subtract multiples of the second row from the rows below, thus clearing the second column. Continue this process until there are units down the main diagonal, and zeros below. With only n columns and n+1 rows, the bottom row is all zeros. A combination with coefficient 1 on that row, and 0 elsewhere, yields 0, contradicting linear independence.

As long as R has a prime ideal, a free module cannot embed in a free module of lesser, finite rank. This result extends to infinite ranks using the corresponding proof from linear algebra.
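
To see the dependence appear concretely, here is a brute force sketch in R = Z/8, with three arbitrary vectors in R^2; the theorem promises a nontrivial combination equal to 0, so R^3 cannot embed in R^2.

    from itertools import product

    n = 8
    rows = [(1, 2), (3, 5), (2, 7)]          # three vectors in (Z/8)^2

    for c in product(range(n), repeat=3):    # search for coefficients c1, c2, c3
        if any(c) and all(sum(ci * v[j] for ci, v in zip(c, rows)) % n == 0
                          for j in range(2)):
            print("dependence:", c)          # first nontrivial combination that vanishes
            break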

The last case to consider, and this doesn't come up very often, is the case where R is a finite ring, having finitely many elements. R may or may not contain 1, or a prime ideal. If B is free of rank n, then the size of B is |R|^n. If A embeds in B then A cannot have a larger rank, or it simply wouldn't fit. Nor can a smaller rank map onto a larger rank. Dimension is directly determined by size, and R has the invariant dimension property.

Let R be an integral domain, and assume U and V are free of finite rank, with U containing V. As shown above, V cannot have a larger rank. If V has a smaller rank than U, then the quotient module U/V cannot be torsion. Something in U/V looks like R. There just aren't enough dimensions in V to take care of U. Suppose V has lesser rank, with a quotient module that is torsion, and tensor with the fraction field F. Now everything is a vector space over F. The image of VF in UF has dimension at most the rank of V, strictly less than the dimension of UF, so right exactness leaves a nontrivial quotient vector space. However the quotient was torsion, and a torsion module tensored with F is 0, since any annihilating scalar becomes a unit. This is a contradiction, hence our original quotient module U/V contains at least one free copy of R.

If R does not contain 1, but still has a prime ideal, then localize about P first, then tensor with the quotient field.

V doesn't have to be a free module, as long as it is generated by m generators, where m < n. Once you pass to vector spaces, the image has dimension at most m inside an n dimensional space, leaving a nontrivial quotient.

If the ranks are equal, a torsion quotient module is implied, as when the even integers live inside the integers as Z modules, having a quotient of Z/2. Nothing is free in the quotient; in fact it is killed by 2. In general, if y is free in the quotient, with preimage x, then x and V together span a module of rank n+1 inside a module of rank n, which is impossible. The quotient is torsion iff the ranks are equal.

If a module M is torsion, and finitely generated, using n generators, then M is the homomorphic image of a free module F of rank n. If R is a pid then the kernel is another free R module, and its rank has to be n.

Assume R is commutative. An R module M is flat if every short exact sequence tensored with M gives another short exact sequence. The leading 0 is preserved.

Tensoring with R doesn't change a thing, hence R is a flat R module.

Let M be the direct sum of modules Mi, and tensor M with 0 → A → B → C → 0. In an earlier section we showed that BM is the direct sum over BMi, and similarly for the other modules. If f embeds A into B, and x is in A and w is in M, then the induced tensor function maps (x,w) to (f(x),w). Perhaps w is w1 + w2 + w3, having nonzero projections in the first 3 components of M. The image (f(x),w) is 0 in BM iff each (f(x),wi) is 0 in BMi, for i = 1, 2, and 3. Generalize this to any w in M, and the tensor function from AM into BM is injective iff each function from AMi into BMi is injective. Therefore M is flat iff each Mi is flat.

Using this, and the fact that R is flat, any free module is flat.

Being a summand of a free module, every projective module is flat. There are flat modules that are not projective however, as demonstrated by Q, which is a flat Z module (this will be demonstrated below), yet Q is not projective.

If U and V are flat R modules then so is their tensor product. Tensor with U and get an exact sequence, then tensor with V and get another exact sequence. This is equivalent to tensoring with UV, in terms of modules and functions.

Sometimes it is possible to chain a flat ring and a flat module together. Let a ring homomorphism take R into S, where S is a flat R module. Let M be a flat S module. Note that M is also an R module, courtesy of the homomorphism from R into S. Tensor an R exact sequence with S, giving an exact sequence of R modules. This is also an exact sequence of S modules. Tensor with M and find another exact sequence of S modules, which is an exact sequence of R modules. Thus, S×M is a flat R module. But M is an S module, so tensoring with S isn't going to change a thing. Therefore M is a flat R module. A flat S module becomes a flat R module.

Assume instead that M is a flat R module and perform a base change. M×S is an S module, and it is also an R module. Tensor it with an exact sequence of S modules, which are all R modules courtesy of the ring homomorphism from R into S. The sequence remains exact when viewed as R modules. The first module embeds, and it is an S module, hence it embeds as an S module. Therefore M×S is a flat S module. The base change of a flat module is flat.

A ring homomorphism from R into S is flat if S is a flat R module. This is merely nomenclature.

A module M is faithful if, for every module U, M×U = 0 implies U = 0. This definition remains valid when M is a left module over a noncommutative ring, but more often, faithful refers to modules over a commutative ring.

The tensor product of two faithful modules is faithful.

If one component of a direct sum is faithful, then the entire direct sum is faithful. MU = 0 implies MiU = 0 implies U = 0.

Since RU = U, R is faithful.

Take the direct sum of copies of R, and every free module is faithful.

If R is a division ring then every R module is free, and faithful.

If R is a pid, then every finitely generated torsion free module is free, and faithful.

We now have a round-about proof that Q is not a free Z module. If it were it would be faithful, but we saw, earlier in this chapter, that Q×Z/p = 0. So Q is not faithful, or free, though it is flat, as will be demonstrated below.

From the other side, a faithful module need not be flat. Let R be the integers mod p^2, with J equal to the ideal generated by p, i.e. the multiples of p. Note that R is a local ring, and its maximal ideal J is nilpotent. If the tensor product of two R modules is zero then one of the two modules is zero. Therefore every nonzero module over R is faithful. We only need find one that isn't flat.

Let M = Z/p, which is an R module; in fact it is isomorphic to R/J. Tensor M with the exact sequence 0 → J → R → M → 0. The middle module is easy; R tensor M is M. Evaluate M×M by the two quotient ring formula, giving R mod (J+J), or M. Finally J×M becomes J mod JJ, which is J, since JJ = 0. The result is J → M → M → 0. These modules all have size p, so if J embedded, its image would be all of M, the next map would have to be 0, and yet that map is onto M. Thus J cannot embed, and M is not flat.
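
For the skeptical, here is a tiny computation with p = 3, a sketch in which the prime is arbitrary: every pair generator of J×M maps to a multiple of p in M = Z/p, which is 0, so J×M cannot embed in R×M.

    p = 3
    J = [x for x in range(p * p) if x % p == 0]        # the ideal (p) inside R = Z/p^2
    images = {j * m % p for j in J for m in range(p)}  # image of the pair (j, m) in R x M = M
    print(images)                                      # {0}: the induced map is identically zero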

Free is faithful, but projective need not be faithful. Let F be a field and let R be the direct product F cross F. A and B are R modules, both an instance of F, but R acts on A through its first component, with the second component mapping to 0, and R acts on B through its second component. A*B is R, as an R module, thus A and B are projective. Consider the tensor product A×B. A is a quotient ring, R mod its second component J. Thus A×B is B/JB, or B/B, or 0, even though B is nonzero. Thus A is not faithful.

A ring homomorphism from R into S is faithful if S is a faithful R module. This is similar to the definition of a flat homomorphism, as presented in the previous section. A homomorphism is faithfully flat if it is faithful and flat, i.e. if S is a faithful flat R module. For example, embed R into R[x], the constants in the ring of polynomials. Since R[x] is a free R module it is both flat and faithful.

Let T be a faithfully flat R module, and let M and N be R modules, with a module homomorphism f from M into N. Tensor M and N with T, and assume the resulting homomorphism, f cross the identity map on T, is an isomorphism. Pull back and prove M and N are isomorphic.

Suppose f is not injective, with kernel K, and restrict N to the image of M under f. Tensor the following sequence with T.

0 → K → M → N → 0

Since T is flat, the resulting sequence is exact. Also, since T is faithful, K×T is nonzero. Thus K×T is a nontrivial kernel for the map from M×T into N×T, which was assumed to be an isomorphism. This is impossible, hence f is injective.

Suppose f is not surjective, and let C be the cokernel. Tensor the following sequence with T.

M → N → C → 0

Again, C×T becomes a nonzero cokernel, and that is impossible. Therefore f is an isomorphism, and M and N are equivalent.

As you recall, a projective module is flat. If M is finitely generated and projective, it is called finite flat. Think of "finite flat" as one word. A module could be finitely generated over R, and flat, without being finite flat, though if it is finite flat it is finite over R and flat.

Let R and S be commutative, with a ring homomorphism from R into S, and watch what happens to a finite flat R module under base change.

If M is a free R module then M×S becomes a free S module of the same rank. This because direct sum and tensor product commute. Thus free becomes free.

Now assume M is the homomorphic image of a free R module F. Thus M is the end of an exact sequence F → M → 0. Tensor with S, tensoring the homomorphism with the identity map on S. Thus M×S is the homomorphic image of a free S module of the same rank. If M was a finitely generated R module then M×S is a finitely generated S module, needing no more generators than M.

Let M be projective, a summand of F. Write F as the direct product of M and U, and tensor with S, whence M×S is the summand of a free module, and is projective over S. Put these together and M finite flat implies M×S is finite flat.

In the previous sentence, and in most sentences involving base change, the first adjective refers to M as an R module, and the second adjective refers to M×S as an S module. The base change of a projective R module is a projective S module, but need not be another projective R module. An example is the integers and the rationals as Z modules. Z is a projective Z module, in fact it is free, but tensor with Q and get Q, which is a free Q module, and not a projective Z module.

How about the tensor of two finite flat modules? Let M and N be finite flat modules over R. Construct the tensor product using generators and relations, and M×N is finitely generated. In fact the generators of M×N are the generators of M cross the generators of N. Then let M*U be free, and let N*V be free, and expand (M*U)×(N*V), giving a free module. Thus M×N is the summand of a free module, and is projective, and finite flat.

Finally let's look at a tower of finite flat modules. If S is a finite flat R algebra, and T is a finite flat S module, then T is a finite flat R module.

Take a step back and assume S and T are free. Write T as a direct sum of S, and expand each S into a direct sum of R, whence T is a direct sum of R, and a free R module.

Next assume S and T are projective. Let S*V = F, where F is a free R module. Let T*W = S^n. (In this proof, n can be any cardinal number.) Write F^n = S^n*V^n = T*W*V^n. Thus T is the summand of a free R module, and projective over R.

Finally assume S and T are finitely generated. Cross the generators of T, as an S module, with the generators of S, as an R module, to get finitely many generators spanning T as an R module. Put this all together and the composition of finite flat is finite flat.

Let R be a local ring, and let M be a finite flat R module. Let K be the residue field of R, i.e. R mod its maximal ideal H. Let W = M×K, which is a K vector space of rank n for some integer n.

Recall the quotient formula; M×K = M mod HM, where H is the maximal ideal of R. Let b1, b2, …, bn be elements of M that become basis elements in M/HM, also known as K^n. This defines a canonical map from the free module F = R^n into M, carrying the generators of F onto b1 through bn.

Let C be the cokernel, that is, M mod the image of F. Write an exact sequence like this.

F → M → C → 0

Tensor with K and get this.

K^n → K^n → C×K → 0

The map from F into M was carefully chosen. In particular, the image of F contains elements that become the basis of K^n. In other words, the first homomorphism in the above sequence is onto. The image of the first homomorphism is the kernel of the second, and that means C×K = 0. Using the quotient formula, C equals HC. (Since H acts on M it acts on the quotient module C.) The images of the generators of M span C, so C is finitely generated. Apply Nakayama's lemma, and C = 0. Thus F maps onto M. If this map is injective then M is free.

Let V be the kernel of this map. Remember that M is projective, and an equivalent characterization says every short exact sequence ending in M splits. Thus F is the direct product of M and V.

Tensor the split exact sequence 0 → V → F → M → 0 with K. The result is split exact, so V×K joins a free module of rank n to give a free module of rank n. Therefore V×K = 0. Since V is the quotient of a finitely generated module F, V is finitely generated. Nakayama's lemma shows V = 0, hence F = M. Our finite flat module is free.

The above can be generalized to other rings that may not be local. Let H be a nonzero ideal in the Jacobson radical (which lets us use Nakayama's lemma), and assume M/HM happens to be a free R/H module. Let K = R/H, realizing that K need not be a field, and apply the above proof. Once again M is free with rank equal to that of M/HM.

There is one tricky part to this generalization - the dimensionality argument that forces V×K = 0. This time K might not be a field. Suppose x is a nonzero element in V×K. Since F×K is free, x generates a copy of K, which is separate from the submodule M×K. Together they span a submodule of rank n+1 inside a free module of rank n. We showed in an earlier section that this is impossible.

Of course finite flat over a pid is free, because every projective module over a pid is free.

As you know, the localization of a ring R about a prime ideal P, denoted RP, is the ring of fractions with numerators in R and denominators in R-P. The rationals are the localization of the integers about the prime ideal 0.

To localize a module M about the prime ideal P, denoted MP, tensor M with RP. Since RP is an R module, where the action of x is multiplication by x/1 in RP, the tensor product M×RP is well defined.

If R is not commutative, RP is both a left and a right R module. Tensor the right module RP with the left module M, and MP is a well defined abelian group. However, this does not come up very often. Most of the time R is commutative, whence MP is a base change from an R module to an RP module. For example, if R is an integral domain and P = 0, MP becomes a vector space.

Wait a minute. We defined the fractions of a module M in an earlier chapter as elements of M in the numerator and certain elements of R in the denominator, partitioned into equivalence classes, and here we are defining it again in terms of tensor product. This can only work if the two definitions are equivalent, and they are.

More general than localization, let S be a fraction ring of R, having denominators belonging to a given multiplicatively closed set, and consider M×S. An element in M tensor S is a linear combination of pair generators from M cross S. The elements of S have a common denominator d, and can be represented by fractions that employ this common denominator. These are really the same elements of S, we haven't changed anything. Now pass each numerator from S over to M, so that all pair generators are drawn from M cross 1/d. Invoke linearity, and add the elements of M together. Thus everything in M×S is something in M cross a denominator in S. As a set, this looks like the fractions of M by the denominators of S. It's looking good so far.

If R does not contain 1, use d/d^2 instead of 1/d. Thus everything in M×S is some x in M crossed with some d/d^2, where d is a valid denominator of S.

Fractions are added by obtaining a common denominator and then adding the numerators. Now consider the sum of two pairs in the tensor product. Multiply the element in M by any denominator d, and divide the reciprocal from S by the same denominator d. This is passing the action of R between the two components, as is permitted by the tensor product. Do this to both pairs so that they have a common denominator. Then use linearity to add the elements of M. It's the same formula.

As an R module, scale a fraction of M by scaling the numerator. The same thing happens in the tensor product. They look like the same R module, and even the same S module, but we still have to verify equivalent fractions correspond to equivalent elements in the tensor product.

Two fractions of M are equal iff their difference is 0, and a fraction of M is 0 iff some zero divisor, within the set of denominators, kills the numerator of the fraction. Compare this with the tensor product. (x,1/d) is 0 if one side or the other is 0 or can be made 0 by the action of R. Pass c from x to 1/d and get c/d, which is 0 in S iff uc = 0, but since c was part of x, u kills x, just as it would in the fraction x/d. From the other side, pass c from 1/d to x. If this turns x into 0 then c kills x, and again x/d is 0 in the fractions of M. The zero criteria are the same in the two modules, and the fractions of M via the ring S are the same, as an R module or an S module, as M×S.

If the denominators of S do not kill anything in M, then each x/d is nonzero, and M embeds into M×S via x/1. This is analogous to embedding an integral domain into its fraction field. In fact, if R is an integral domain, and M is a direct sum/product of ideals of R, then M embeds into M×S, where S is any fraction ring of R. As a special case, M could be a free R module, or a submodule thereof.

All these theorems apply even if R does not contain 1. Use d/d^2 instead of 1/d throughout. Note that S always contains 1, in the form of d/d, and M×S becomes an S module.

Several times in this chapter I referred to Q as a flat Z module, and the same for the fractions of any integral domain. Let's prove it now.

Naturally RP is a flat RP module, but it is also a flat R module. Start with the following short exact sequence and ask whether localization embeds KP into MP.

0 → K → M → M/K → 0

Everything in KP can be represented by an element in K over a denominator drawn from R-P. Let x/d be a nonzero member of KP that becomes 0 in MP. As shown in the previous section, some denominator kills x in M, and the same denominator would kill x in K, hence x/d was really 0 after all. Therefore KP embeds, and RP is flat.

There is nothing special about localization here; every fraction ring of R is flat. This holds even if R does not contain 1.

As a special case, Q is a flat Z module. Moving from the rationals to the reals, the real numbers form a Q vector space with a basis, hence a direct sum of copies of Q, each a flat Z module, and the reals are a flat Z module.

The base change of a flat module by a flat ring is flat, so if M is flat, MP is a flat R module and a flat RP module.

Let R be a pid with fraction field F, and build a quotient module R/p, analogous to the integers mod p. When this is tensored with F the result is 0: in each pair generator, write the element of F as p times another fraction, pass p across to R/p, and get 0. Since R/p tensor F is 0, and a nonzero free module is faithful, F cannot be a free R module. Free and projective are the same over a pid, hence F is not projective either.

If R is commutative, 0 is a local property. In other words, x is 0 in R iff x/1 is 0 in every maximal localization RP. This generalizes to any module M, and the proof is essentially the same. Every localization of 0 is 0, so assume x is nonzero, and let J be the ideal that kills x, i.e. the annihilator of x, sometimes written [x:0]. Raise J up to a maximal ideal P. Now x/1 remains nonzero in MP; the denominators, outside of P, do not kill x. Thus x is 0 iff every localization is 0.
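
Here is the idea in miniature for R = Z/6, a brute force sketch: the annihilator of x = 2 is the ideal (3); raise it to the maximal ideal P = (3), and no denominator outside P kills x, so x/1 stays nonzero in RP.

    x = 2
    annihilator = [r for r in range(6) if r * x % 6 == 0]
    print(annihilator)                                   # [0, 3], the ideal (3)

    denominators = [d for d in range(6) if d % 3 != 0]   # R - P for P = (3)
    print(any(d * x % 6 == 0 for d in denominators))     # False: x/1 is nonzero in the localization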

The entire module M is 0 iff each maximal localization of M is 0. Focus on any nonzero x in M and you're there. We're going to use this below, where a kernel, or a cokernel, is 0 iff all its localizations are 0.

Let f be an R module isomorphism from U onto V, so that the sequence 0 → U → V → 0 is exact. If P is a prime ideal, tensor this sequence with RP and find an isomorphism between UP and VP, as R modules, or as RP modules. In fact this generalizes to any base change. This makes sense, intuitively, because the points of V are merely the points of U relabeled.

In general, the tensor product preserves surjective, and injective if the module is flat.

Conversely, assume f is not injective, so that f has a nontrivial kernel K. Select P so that KP remains nonzero. The following sequence is exact at K and at U; tensor it with the flat module RP, and KP embeds in UP yet maps to 0 in VP, so the localized function from UP into VP is not injective.

0 → K → U → V

If f is not surjective, let C be the cokernel.

K → U → V → C → 0

Select P so that CP remains nonzero, and tensor with RP. The new homomorphism from UP into VP is not surjective.

In summary, a function is injective, surjective, or bijective iff it is injective, surjective, or bijective under every maximal localization.

If A is contained in B, write the exact sequence 0 → A → B → C → 0. Tensor with a flat module, such as RP, and AP embeds in BP, and is contained in BP. The original cokernel C is nonzero iff containment is proper. As shown above, the cokernel is 0 iff it is 0 locally. When A is a submodule of B, A = B (module equality) is a local property.

Let A and B be submodules of a module M. Note that B contains A iff (B+A)/B is zero. Tensor the sequence 0 → B → B+A → (B+A)/B → 0 with RP. Remember that B+A commutes with localization. The result embeds BP into BP + AP. As above, the cokernel is zero iff it is zero locally. Thus containment of A inside B is a local property.

Let M and W be R modules, so that hom(M,W), also known as the dual of M into W, is an R module. Let S be an R algebra, so that tensoring with S is a base change. Sometimes S is a fraction ring of R, so that tensoring with S is a form of localization, though other rings will do. We're going to build a map j() from the tensor of the dual into the dual of the tensors. In other words, j maps hom(M,W)×S into hom(M×S,W×S). The domain and range are both S modules, and in some cases, e.g. when M is finite flat, j becomes an S module isomorphism (proved below), so that dual and base change commute. If M is finitely presented, and S is a fraction ring of R, j becomes an isomorphism (not proved in this chapter), so that dual and localization commute.

First define the function j(). Start with a function f from M into W, crossed with an element c from S. This represents an element of hom(M,W)×S. If f(x) = y, let the corresponding homomorphism g map x to (y,c). In other words, g is the application of f to all of M, then the image is crossed with c. Thanks to f, and linearity in W×S, g respects addition in M, and the action of R. In other words, g is an R module homomorphism from M into WS.

Extend g by tensoring with the identity map on S. Now g is an R module homomorphism from MS into WSS. Let's see what this homomorphism looks like. Given (x,b) in MS, let g(x,b) = (y,c,b). In other words, the contribution from S carries across. Given d in S, apply d to (x,b) and (y,c,b), producing (x,bd) and (y,c,bd). Thus g respects the action of S, and g is an S module homomorphism, as well as an R module homomorphism. View everything as S modules, and WSS becomes WS. The two S components are simply folded together. Thus (y,c,b) is simply (y,cb). Now g is an S module homomorphism from MS into WS. This completes the definition of j, turning (f,c) into an S homomorphism from MS into WS.

Add f1 and f2 together, then cross with c to get g1 + g2. This is the sum of the corresponding images in hom(M,WS). Tensor with S, and j is linear in its first component.

Add (f,c) + (f,b), and get c+b all the way across. Thus j is linear in its second component.

Pass d, in R, from f to c, or from c to f; the resulting g is unchanged, since d passes across the pair (y,c) in WS just the same. Therefore j respects bilinearity, and is well defined on hom(M,W)×S.

By adding generating pairs together, and scaling generators, j is an R homomorphism. Apply d in S, and g is multiplied by d, which is the action of d in hom(MS,WS). Thus j is also an S module homomorphism.

Let M be the direct product of M1 and M2. A homomorphism f defines, and is defined by, f on M1 and on M2. In other words, direct product and dual commute. Direct product also commutes with base change, hence the domain of j is hom(M1,W)S * hom(M2,W)S. By the same reasoning, the range is hom(M1S,WS) * hom(M2S,WS). Write a function f as f1 + f2, acting on M1 and M2, and cross this with c, and apply j. The result is g1 + g2, still acting on M1 and M2. Each component function stays within its component, and j is the sum of functions j1 and j2, acting through M1 and M2. It follows that j is injective iff j1 and j2 are injective, and j is surjective iff j1 and j2 are surjective.

You know where we're going to go from here - from R, to free R modules, then back to projective modules. So let M = R, and suppose j carries (f,c) to the trivial homomorphism. Of course f is defined by the image of 1, which I will call y. Thus g, as an S module homomorphism, is defined by g(1,1) = (y,c). Therefore (y,c) is equivalent to 0 in WS. Some z in R passes between y and c, making one of the two components 0. Use the same z, passed between f and c, to kill f or c. Therefore (f,c) is equivalent to 0, and j is injective.

Select any homomorphism in hom(RS,WS) and call it g. If g(1,1) = (y,c), let f(1) = y, and j maps (f,c) onto g. Therefore j is surjective, and j is an isomorphism. This is a lot of algebra to prove something rather intuitive. The homomorphisms from R into W are equivalent to W, so tensor this with S and the domain is WS. At the same time, the homomorphisms from S into WS are equivalent to WS. Thus j implements an isomorphism from WS onto WS.

Let M be a free module of finite rank. In other words, M is the direct product of n copies of R. Each copy Ri defines a function ji that is an isomorphism, hence the composite function j, on the free module R^n, is an isomorphism.

Pull back to a finite flat module M, which is the summand of a free module of finite rank, and j is an isomorphism on M. Tensor and dual commute when M is finite flat.

The rank of a free module is well defined; it is the cardinality of its basis. If M is finite flat it also has a rank, the fewest number of generators needed to span M as an R module. Rank is not additive however. Let R = Z/6. The projective modules Z/2 and Z/3 have rank 1 (it couldn't be any less than 1), yet their direct product is R, which also has rank 1.
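
A one line check of the Z/6 example, as a sketch: by the Chinese remainder theorem the single element (1,1) already generates all of Z/2 cross Z/3, so the direct sum needs only one generator, just like R.

    multiples = sorted({(a % 2, a % 3) for a in range(6)})   # the multiples of (1,1) in Z/2 x Z/3
    print(multiples)     # all six pairs appear, so one generator spans the whole module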

The rank of M is the rank of the smallest free module F such that M is the homomorphic image of F. Since M is projective, the sequence 0 → K → F → M → 0 is split exact. Thus M is also a summand of F. Conversely, if M is a summand of F it is a quotient of F, and generated by the basis of F. So the rank of M is the rank of the smallest free module F such that M is a quotient of F, equivalently a summand of F.

Tensor with RP, and M becomes finite flat over a local ring. As shown earlier, MP is a free RP module. The rank of MP as a free RP module is the local rank of M, relative to P. If the local rank is the same for each prime P, then M has constant local rank. If the local rank through every maximal ideal is 0, then the localizations of M are 0, and M = 0.

Let M be a summand of F, such that F has the smallest possible rank, thus establishing the rank of M. Localize about P, and MP is a free module inside another free module derived from F. Therefore the local rank of M is bounded by the rank of M.

Suppose the local rank of M is always positive, yet M×U = 0 for some nonzero module U. Select P so that UP is nonzero. Perform a base change of M×U through RP, and the result is still 0. Tensor and base change commute, thus MP × UP = 0. The first is free of positive rank, and the second is nonzero, hence the tensor product is nonzero. This is a contradiction, hence M is faithful.
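
The contrapositive is already visible over R = Z/6 with M = Z/2, which is not faithful; by the two quotient ring formula, M tensor Z/3 is R mod the unit ideal, which is 0. Consistent with that, the local rank of M is not always positive. Here is a brute force sketch using the fraction criterion established earlier: an element of M dies in MP iff some denominator outside P kills it.

    def dies_locally(x, P):                       # P is a maximal ideal of Z/6, given by its generator
        denominators = [d for d in range(6) if d % P != 0]
        return any(d * x % 2 == 0 for d in denominators)

    print(dies_locally(1, 2))   # False: M_(2) is a copy of Z/2, local rank 1
    print(dies_locally(1, 3))   # True:  M_(3) = 0, local rank 0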

Let P and Q be primes, with Q containing P. Apply the two localizations RQ, then RP. This is M×RQ×RP, or M×(RQ×RP). Tensoring with a fraction ring is just like taking fractions, so RQ×RP is the fractions of RQ with denominators drawn from R-P. Apply a technical theorem about fraction rings, and the result is RP. In other words, M×RQ×RP = M×RP = MP.

If MQ is free of rank n, then tensor with RP and MP has rank n. Both localizations give the same rank.

If Q is a maximal ideal, the rank of MQ is the same as the rank of MP for all the primes P contained in Q. Furthermore, if two maximal ideals both contain a common prime ideal, then all the primes inside either maximal ideal lead to the same local rank.

If the nil radical is prime, as in an integral domain, it lies inside every prime ideal, and M has constant local rank.

Generalizing the above, let S be a fraction ring whose denominators form a multiplicatively closed set containing R-P. Apply RP, then S, to get something equivalent to M×S. If MP is a free RP module of rank n, then M×S is a free S module of rank n.

Assume M does not have constant local rank. Fix a rank n, let A be the set of primes P for which MP has rank n, and let B contain all other primes. Let A0 be the intersection of the primes in A, and similarly for B0. Let S be the complement of a minimal prime lying in A. Since S is derived from a prime in A, M/S has rank n. Also, S is a maximal multiplicative set in R. Naturally S misses A0.

Let P be any prime containing A0. If S does not contain R-P, multiply it by R-P to obtain a strictly larger multiplicative set, contradicting the maximality of S. Thus S contains R-P, and MP, like M/S, has rank n. No prime ideal in B contains A0, and similarly, no prime ideal in A contains B0.

The two ideals A0 and B0 define closed sets in the Zariski topology of R. Together these closed sets partition spec R. Both sets are open and closed, and are disjoint components of spec R. Turning this around, the primes in a connected component of spec R all exhibit the same local rank.

Find a clopen subspace of spec R with local rank 0. (This subspace could be empty.) Then find a clopen subspace with rank 1, then rank 2, and so on up to the rank of M. Thus spec R has been partitioned into a collection of disconnected subspaces, each with constant local rank. Note that a subspace could contain several connected components, but each component belongs entirely to a subspace of a given rank.

Assume R is a reduced ring, with nil(R) = 0. Since two different ranks produce two disjoint clopen sets in spec R, one can produce an idempotent for each piece. If R/nil(R) contains no nontrivial idempotents, then every finite flat module over R has constant local rank.

As mentioned earlier, a module with constant positive local rank is faithful, thus every nonzero finite flat module over a connected ring is faithful.

The support of an R module M, denoted sup(M), is the set of prime ideals P in R such that the localization MP is nonzero. Since 0 is a local property, sup(M) is a nonempty subset of spec R whenever M is nonzero.

Tensor the exact sequence 0 → A → B → C → 0 with the flat module RP. The middle module BP is nonzero iff AP or CP is nonzero. Therefore sup(B) is the union of sup(A) and sup(C).

Tensor a direct sum of modules with RP, giving the direct sum of the individual localizations. Therefore the support of a direct sum is the union of the component supports.

Let L and M be finitely generated R modules, and consider the support of L tensor M. Since tensor and base change commute, the localization of L×M at P is LP×MP. If either localization is zero the result is zero. Conversely, assume LP×MP = 0. Both factors are finitely generated over the local ring RP, and we showed earlier that when the tensor product of two finitely generated modules over a local ring is 0, one of the two modules is 0. Therefore the support of L×M is the intersection of sup(L) and sup(M).
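
A familiar instance over Z, sketched with sympy handling the factorizations: Z/4 is supported on the prime 2, Z/6 on the primes 2 and 3, and their tensor product Z/gcd(4,6) = Z/2 is supported on the intersection.

    from math import gcd
    from sympy import primefactors

    a, b = 4, 6
    print(primefactors(a), primefactors(b))      # [2] and [2, 3]: the supports of Z/4 and Z/6
    print(gcd(a, b), primefactors(gcd(a, b)))    # 2 and [2]: the tensor product Z/2 lives on the intersection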

Assume R is an integral domain, and M an R module. If a/s1 * x/s2 = 0 in M/S, with a nonzero, then some u in S kills ax; since ua is nonzero, x is a torsion element in M. Conversely, if ax = 0 for some nonzero a, then a/1 * x/s = 0, and x over any denominator is torsion in M/S. Therefore, S inverse of the torsion submodule of M equals the torsion submodule of M/S.

If M is torsion free then so is M/S. (The torsion submodule of M is 0, and 0/S = 0.)

Conversely, let x be torsion in M, and let P be a prime (or maximal) ideal of R that contains the annihilator of x. Localize about P, and x/1 remains torsion, and nonzero. (Nothing outside of P kills x.) Thus torsion remains torsion under at least one localization. Put this all together and torsion free is a local property.

Let R be an integral domain, and let F be the fraction field of R. Let M be an R module, and let A be a submodule of a module B. Assume M is not flat, so that a pair generator (x,y), with x in A and y in M, becomes 0 when placed in the larger context of B×M. Move d ∈ R from y to x. If d kills x, so that (x,y) is 0 in BM, then (x,y) is 0 in AM. Instead, move d from x to y. This might have been impossible in A, but in B, x is divisible by d. Thus d kills y in M. In other words, M is not torsion free. Turn this around and torsion free implies flat.

Conversely if M has a torsion element y, killed by d, then tensor 0 → R → F → F/R → 0 with M. y persists in R×M, which is equal to M, but its image (1,y) in F×M equals (1/d,dy), which is 0. The resulting sequence is not short exact, and M is not flat.

M is flat iff M is torsion free. Since torsion free is a local property, flatness is a local property.

Let R have one maximal ideal M, generated by t, such that there are no ideals in between successive powers of M. Thus M = jac(R). R is a pid, a local pid, having one maximal ideal M, and one prime ideal M (besides the prime ideal 0).

Let K be the residue field R/M. The R module M^3/M^4 is generated by t^3, giving a natural isomorphism between this quotient module and R/M. Map x to xt^3. Thus the quotient of successive ideals is K.

Let U and V be free R modules of rank n, with U containing V. In other words, a free module contains another free module of the same rank. If R were a field, we would be talking about vector spaces, and U would equal V, but R is not a field, so perhaps there is more to U than just V. For instance, 2Z (the even numbers) is a free Z module of rank 1 within Z, and the quotient U/V is Z/2. This theorem characterizes the quotient U/V when it is killed by a power of M. The factor group U/V is a direct product of R mod various powers of M, as seen by building a new basis b for U, and this representation is basis invariant, thus it is called the invariant factor theorem.

The quotient module U/V has to be torsion, since U and V have the same rank, thus everything in U/V is killed by something in R. U is free of rank n, hence U/V is finitely generated. Each generator is killed by a nonzero ideal in R, thus U/V is killed by a nonzero ideal in R. This ideal is M^j for some j.

Pause for a moment and reflect. You've seen this theorem before - twice in fact. Recall that a finite abelian group is the direct product of prime power cycles. First, the group is separated into p groups by the Sylow theorems, whereupon each p group can be analyzed in turn. Any one of these p groups, call it G, is finite, and finitely generated, with n generators, thus G is the homomorphic image of a free abelian group of rank n. G is a Z module, and Z is a pid, hence the kernel, a subgroup of a free group, is also free. Its rank has to be n, as shown earlier, thus G is the quotient of two free groups of rank n. If we hadn't already determined the structure of G using a rather mechanical process, we could invoke the invariant factor theorem, and it's a done deal.

The second occurrence was the structure of a finitely generated module over a pid. The torsion free submodule was removed, and the remaining torsion module was then separated into submodules, wherein each was killed by powers of a given prime / maximal ideal. This is analogous to separating a group into p groups by the Sylow theorems. From there, each p module was analyzed, in a rather tedious manner, but it is in fact the quotient of two free modules of rank n, so the invariant factor theorem applies.

What follows is only a modest generalization of the two theorems you have seen before, and it is rather technical, so you can skip ahead to the next chapter if you like. I include it because it is, in the spirit of modern mathematics, a subroutine that can be written once and used again and again. Besides, it is a well known theorem in commutative algebra, and I really shouldn't leave it out.

Let Wi = M^iU ∩ V. These are the successive powers of M applied to U, that is, to the copies of R in U, then intersected with V. Note that W0 = V, and each successive Wi can only get smaller.

Let Zi = Wi mod MV. Remember that Wi is already in V, so map Wi forward into the quotient module V/MV to get Zi. Since R/M = K, V/MV is the vector space K^n. Zi lives in this K vector space. As i advances, Wi can only decrease, and Zi can only decrease.

Wi is an R module, and so is Zi. Since M carries Zi to 0 in V/MV, Zi is a K module. This means Zi is a subspace of the vector space K^n. As mentioned earlier, these subspaces can only decrease.

Remember that M^jU lies entirely in V. This means M^(j+1)U lies in MV. With all of Wj+1 in MV, Zj+1 = 0. The descending chain of vector spaces starts with Z0 = K^n, and becomes 0 at Zj+1.

Start at the smallest nonzero subspace of K^n, perhaps Z7, and give it a basis commensurate with its dimension. Move up to the next subspace, perhaps Z5, and extend the preexisting basis to cover Z5. Repeat until you reach the top, Z0, which is K^n. Call the resulting basis y. Thus each subspace Zi is spanned by a subset of our comprehensive basis y.

If y5 is the fifth element in our basis, it is actually a coset of MV in V. Select a coset representative, but not just any representative. Perhaps we selected y5 to span the subspace Z3, which comes from W3. Select x5 from W3 so that x5 represents the basis element y5 in Z3. Do this across the board, and x is a string of representatives in V, which becomes our basis y when reduced mod MV.

Recall the quotient formula for tensor product. Tensoring V with K is the same as V mod MV. Thus tensoring with K carries the elements of x onto our basis y, which spans K^n.

In the following sequence, X represents the submodule of V spanned by the elements of x, and C is the cokernel V/X.

0 → X → V → C → 0

Tensor with K. The map from X×K into V×K = K^n carries the generators of X onto the basis y, hence it is onto. This means C×K = 0. In other words, C = MC. C is the quotient of V, and finitely generated. Apply Nakayama's lemma, and C = 0. Thus x spans all of V.

If n generators span a free module of rank n, those generators form a basis. Thus x is a basis for the free module V.

The same reasoning, applied to the sequence 0 → Xi → Wi → C → 0, shows that the ith subset of x, the representatives drawn from Wi, forms a basis for Wi. To use Nakayama's lemma, you might need the fact that R is noetherian, so that Wi and C are finitely generated.

Each xi is a power of t times something in U. Recall our example x5, living in W3. Such an element is t^3 times something in U. Suppose two different elements in U lead to x5. Their difference is killed by t^3, and that contradicts the fact that U is a free module, hence torsion free. Thus the preimage is unique. There is a specific b5 such that t^3 b5 = x5.

If the exponent associated with xi is ei, then the basis for V consists of the elements t^(ei) bi.

Step back, and show that b spans all of U. If we can do this, the quotient U/V is well understood. U has basis elements bi, and V has corresponding basis elements bi times various powers of t. U/V is isomorphic to the direct product of R/M^(ei), as i runs from 1 to n. (Some of the ei may be 0, so there are at most n nontrivial components in the direct product.) We just need to show b is a basis for U.

Let q be an element of U. t^i q is in V iff t^i q is in Wi. Let l be the least exponent such that t^l q lies in V. Thus t^l q lies in Wl, and in no earlier Wi. Now Wl is spanned by the first few elements of x, or if you prefer, the first few elements of b multiplied by various powers of t, not to exceed t^l. Some of the basis vectors are multiplied by t^l, since l is minimal. Say b7 and b8 are multiplied by t^l. Divide by t^l, and b7 and b8 span an element that I will call q′. Surely q′ is spanned by b, so if q-q′ is spanned, then so is q. Since the terms with t^l have been subtracted away, q-q′ can be multiplied by t raised to a lesser power, such as t^(l-2), giving an element in Wl-2. By induction, the elements of U that lead to earlier submodules Wi are spanned by b, hence q is spanned by b.

Of course we need to start the inductive process. If q is already in V, i.e. already in W0, it is spanned by b, since b spans all of V.

Once again n generators b1 through bn span a free module of rank n, and this implies b is a basis for U. That completes the characterization of U/V as a direct product of quotient rings R mod various powers of M. This is an R module, but it also happens to be a ring.
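
If you want to compute these invariant factors, Smith normal form does the job. Here is a sketch over Z using sympy, with an arbitrary matrix A whose rows span V inside U = Z^2; localizing at p = 2 turns the diagonal entries into the powers of the maximal ideal described above.

    from sympy import Matrix, ZZ
    from sympy.matrices.normalforms import smith_normal_form

    A = Matrix([[2, 4],
                [0, 8]])                     # rows span V inside U = Z^2
    print(smith_normal_form(A, domain=ZZ))   # diag(2, 8): U/V = Z/2 x Z/8, which becomes
                                             # R/M x R/M^3 after localizing at the prime 2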

Given the aforementioned modules U and V, we might have selected a different basis y, and a different basis x, leading to a different basis b, giving a different representation of the quotient U/V as a product of quotient rings. No matter the basis, the resulting decomposition into quotient rings is the same. This is the "invariant" part of the invariant factor theorem.

Suppose a quotient ring Q = R/M^j is the direct product of two nonzero R modules. These modules are killed by t^j, hence they are also Q modules, or ideals in the ring. Write 1 = e + f, the sum of two idempotents taken from the two ideals. Let g = t^(j-1), the generator of M^(j-1). Within Q, M^(j-1) becomes the field K, generated by g. ge and gf are zero divisors, so one of them is 0 in K. Say it is ge, whence e is a multiple of t. But then e = e^j lies in M^j, which is 0 in Q, and e cannot be the identity of a nonzero summand. Therefore Q is not the direct product of nonzero R modules. In other words, Q is indecomposable. Also, each Q is acc and dcc, having finitely many ideals, hence Krull Schmidt says the representation, as a direct product of indecomposable modules, is unique.

We would like to apply this theorem to a pid, or even a Dedekind domain, having many maximal ideals. Other theorems separate a finitely generated torsion module into a direct product of submodules, where each summand is killed by a power of some maximal ideal. I referred to these theorems at the start of this section. With this behind us, assume M^j kills U/V.

Here is a small lemma. Let C be the quotient U/V, the finitely generated torsion module that is killed by Mj. Suppose d, outside of M, kills some x in C. The annihilator of x includes some power of M, but also d. Either d is a unit, or it generates an ideal that is coprime to every power of M. Either way the annihilator of x is all of R, and that means x is 0 in C. Nothing outside of M kills anything in C.

Tensor 0 → V → U → C → 0 with the flat module RM. The powers of M in R become the powers of MM in RM. This is ideal correspondence in the fraction ring. The local ring has the desired structure. And the localized quotient CM is killed by M^j. Apply the previous theorem, and CM is a direct product of quotient rings, each of the form RM mod some power of MM. The numerators are cosets of M^i, and the denominators come from R-M. Yet the same ring appears if you localize R/M^i about M. Therefore CM is the direct product of the fractions of quotient rings, based on various powers of M.

Let x be nonzero in C, and suppose x/1 becomes 0 in the fractions CM. Some denominator outside of M kills x, but the previous lemma has ruled that out. Therefore C embeds in CM.

Intersect C with each of the quotient rings in CM. This separates C into a direct product of R modules. Analyze one, and the others follow from there.

Consider the cosets of M^i in R, having denominators in R-M. Throw away the denominators and find R/M^i. Some submodule H of R/M^i is the relevant summand of C, and creates the desired quotient ring when tensored with RM. If H is entirely in M, then HM lives in MM. That's not good enough, so H contains a unit in R/M^i. As an R module, H must be all of R/M^i. It is the contraction of the corresponding quotient ring under localization. Put this all together and U/V is a unique direct product of quotient rings R mod various powers of M.