Abstract Algebra and Discrete Mathematics, Generating Functions

Generating Function
Context Free Grammar
Triangulating Polygons
Approximating cat(n)
Polya's Enumeration Theorem
p(n)
Counting Cycle Decompositions
The Power of Exponentiation
Rooted Trees

This chapter addresses some advanced topics in combinatorics. It is not referenced elsewhere in the book, so if this is not your cup of tea you can skip it. I include it because I think it's beautiful - another unexpected connection between analytic functions and discrete math.

At the heart of it, a generating function is a convenience, a form of concise notation.

Assume there are a_n ways to arrange n items, according to some rules or formulas to be named later. The sequence a₀ a₁ a₂ etc is the solution to our combinatorial problem, specifying the number of arrangements for each n, but the pattern may not be obvious, even after 100 terms. The sequence may follow a pattern; we just can't see it.

Sometimes the sequence holds the taylor coefficients for a function f. In other words, f(x) = a₀ + a₁x + a₂x² + a₃x³ + a₄x⁴ + etc. In this case we say that f is the "generating function" that solves the combinatorial problem through its taylor coefficients.

We want f to be analytic, so that it agrees with its power series, at least for some circle of convergence. Thus f can be folded into other expressions, such as f², and the result can be expanded into a power series, just like the original. Unless you're a big fan of real / complex analysis, the details aren't important here. All the functions you know and love, like x³-x, and exp(x), and sin(x), are analytic.

Here is a silly example. There is exactly one way to put n green marbles into a box. Build a generating function whose coefficients are all 1. Set f(x) = 1/(1-x). That's the solution, expressed as a generating function: 1/(1-x). It is analytic, with a radius of convergence of 1.

Now ask how many ways you can select one marble from a batch of n different colored marbles. The answer is n, of course, but what is the generating function? Take the derivative of 1+x+x²+x³+…, and multiply by x, and the coefficients come out right. Thus the corresponding generating function is x/(1-x)².

All this is a little bit backwards. We usually use generating functions to find the coefficients, rather than the other way around. You'll see this as we solve a few problems.

Given a context free grammar, one can, theoretically, derive a generating function that describes the number of derivations leading to words of length n. However, finding the generating function, and its coefficients, can be challenging, even for modest grammars. Furthermore, if the grammar is ambiguous, multiple derivations will lead to the same word. You really want an unambiguous grammar, so the generating function describes the number of words of length n in your language.

Associate a generating function with each nonterminal, and an equation with each rule. If q is one of the nonterminals, the function q(x) determines the number of words of length n that are generated by q. If s is the start symbol, s(x) provides the number of words of length n in the language. That is the solution.

As a warmup, consider this grammar, with one terminal A. Terminals are capitalized as usual, and E is the empty string. There is exactly one word of length n, namely the string AAAA…A.

s → e | A s | s A

The first rule, s → e, implies one word of length zero, hence the series is 1.

The second rule is recursive, with s referring to itself. The terminal A has the series x, 1 word of length 1. Put this together and A s becomes xs. s A creates the same formula, so the composite formula is s = 1+2xs. Solve for s and get s = 1/(1-2x). By synthetic division, the coefficients are a_n = 2ⁿ. Well there aren't 2ⁿ words of length n, there's only one, so what gives?

There are 2ⁿ derivations leading to words of length n. Each step spins off A at the left or the right. Despite all these different derivations, there is only one word of length n. This is why you want an unambiguous grammar. The proper grammar for this language is: s → e | s A. This gives the formula s = 1+xs, or s = 1/(1-x), which has the expected coefficients a_n = 1.

Here is a more interesting grammar that is (thankfully) unambiguous. This has an immediate application; if A is the left parenthesis and B is the right parenthesis, this grammar cranks out strings of balanced parentheses.

s → E | p s
p → A s B

The first production, s → E, gives one word of length zero as before.

The second production, s → ps, produces the equation s(x) = p(x)*s(x). Take a moment to verify this. If a word has length n, p could produce a word of length k, and s must produce a word of length n-k. Add up the words of length k times the words of length n-k, as k runs from 0 to n, which is precisely the n^th coefficient on p(x)*s(x). It just works.

Put this all together to write the following equations for s and p.

s = 1 + ps

p = x²s

Substitute the second equation into the first.

s = x²s² + 1

Strings can never have odd length, so replace x² with y, and realize that the n^th coefficient now describes strings of length 2n.

s = ys² + 1

ys² - s + 1 = 0

s = ( 1 ± sqrt(1-4y) ) / 2y

The last step is an application of the quadratic formula.

Ok, we have a function, but what are the taylor coefficients? The problem is the square root - how do we expand sqrt(1-4y)?

Use the continuous binomial theorem to perform this expansion. In that section, sqrt(c+x) has already been worked out in detail. Replace c with 1 and x with -4y and get coefficients of -(2n:n) / (2n-1).

Note that this comes out 1 when n = 0. Subtract this from 1 and divide by 2y. Remember that dividing by y shifts the series left by 1, so evaluating the formula at n actually produces a_n-1. Plug n+1 into the formula, and a_n looks like this.

a_n = (2n+2:n+1) / 2×(2n+1)

a_n = (2n+2)×(2n+1)×(2n:n)/(n+1)² / 2×(2n+1)

a_n = (2n+2)×(2n:n)/(n+1)² / 2

a_n = (n+1)×(2n:n) / (n+1)²

a_n = (2n:n) / (n+1)

The first few terms, starting at a₀, are as follows.

1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786, 208012, …

These are the catalan numbers, and they do indeed count the strings of balanced pairs of parentheses, as was suggested in the chapter on combinatorics. Now we have a proof. Here are the 14 strings consisting of 4 sets of parentheses.

()()()()
()()(())
()(())()
()(()())
()((()))
(())()()
(())(())
(()())()
(()()())
(()(()))
((()))()
((())())
((()()))
(((())))

Given a labeled convex s-gon, triangulate it by drawing s-3 nonintersecting chords between vertices. For instance, a square needs only one chord, the diagonal, to produce two triangles. This can be done in 2 different ways. The pentagon requires two chords, from one vertex to the two opposite vertices, to produce three triangles. This can be done in 5 different ways. Start at any of the 5 vertices and draw chords to the two on the opposite side. How many ways can the s-gon be triangulated? That answer is once again the catalan numbers, as shown by mapping each triangulation to a string of balanced parentheses. The number of triangulations is cat(s-2). This works for a triangle, or even a line segment (s = 2), which is sort of a degenerate polygon.

With this established, let's triangulate the unlabeled s-gon. If you are allowed to spin the pentagon around in the plane, how many ways can it be triangulated? Only one. Connect the apex, at the top of the pentagon, to the two points at the base, building an isosceles triangle inside the pentagon. Any other triangulation can be rotated to coincide with this one.

The square also has one triangulation, and the hexagon has four, or three if you allow for reflections. Connect one vertex to the other three, place a triangle inside the hexagon, place a Z in the hexagon, or place a backwards Z in the hexagon. The last two coincide if you allow for reflections.

This may remind you of another problem we solved before. Place red yellow and blue beads in a circle, forming a necklace. But two necklaces are really the same if one can be rotated to coincide with the other. This is where the burnside polya theorem comes into play.

The group is the rotations of the vertices, denoted Z/s. For each rotation, identify the triangulations that are fixed by that rotation. For instance, a rotation of 0 causes the vertices to stay put, and every triangulation is valid. Thus χ(0) = cat(s-2).

Let the length of a chord be the number of vertices, traveling around the shorter side of the polygon, to get from one end of the chord to the other. The shortest possible chord has length 2 and the longest possible chord has length s/2.

Let j be the length of the longest chord in a triangulation. Call this chord c₁, e.g. from vertex 0 to vertex j. Which rotations map this triangulation onto itself? Spin the polygon by anything less than j, and the image chord c₂, from i to i+j, crosses c₁, which is a contradiction. You must rotate the shape by j or higher, and of course the rotation must divide s.

Assume c₁ and c₂ coincide. Thus s is even, j is half of s, and the rotation is 180 degrees. The triangulation on one side is rotated to equal the triangulation on the other. These are polygons of size s/2+1. Thus cat(s/2-1) is brought in whenever s is even. In fact it appears s/2 times, for the s/2 possible diameters.

Next assume c₂ follows c₁ at vertex j, that is, the rotation is equal to j. The chords c₁ c₂ c₃ etc build a regular s/j-gon inside the s-gon. If s/j is at least 4 then another chord is required inside the central polygon, and this chord has length > j. Thus three chords build an inscribed triangle, and j = s/3. The triangle can be placed in the s-gon in s/3 ways, and then the outer polygons, of size s/3+1, are all triangulated the same way. The rotation of 120 degrees, or 240 degrees, contributes s/3 × cat(s/3-1).

Assume c₂ starts at j+g, giving a gap of length g. If j connects to the end of c₂, or anything else farther around, the chord is too long. So connect j to one or more points in the gap, one or more points from j+2 to j+g. If the last connection runs from j to j+i, then another chord connects j+i to 0, and that is too long. There can be no gap. The chords must make a triangle, as described above.

In summary, the only valid rotations are 0, 120, 180, and 240 degrees. Put this all together to get the following equation. The second term is only valid if s is even, and the third term is only valid if s is divisible by 3.

triangulations(s) = { cat(s-2) + s/2 × cat(s/2-1) + 2s/3 × cat(s/3-1) } / s

Here are the triangulations for s = 3 to 12, triangle to dodecagon.

3	1
4	1
5	1
6	4
7	6
8	19
9	49
10	150
11	442
12	1424

Now allow for reflections, i.e. flipping the polygon over. To illustrate, put vertex 0 up at the top, as before, and flip through a vertical line. If s is even the line runs from 0 down to s/2, or from the midpoint of 0 and 1 down to the midpoint of s/2 and s/2+1. If s is odd then the line of reflection runs from 0 down to the midpoint of s/2 and s/2+1.

If c₁ runs from the right side to the left side, then it will cross c₂, the image of c₁, running from the left side to the right side. This can't happen, unless c₁ = c₂. Only a perfectly horizontal chord can cut across the polygon.

Suppose there are two horizontal chords. Pick two horizontal chords with no more horizontal chords between. This forms a rectangle of sorts inside the s-gon. To triangulate this shape, something from the right must eventually connect to something on the left with a nonzero slope, which is a contradiction. There is at most one horizontal chord.

This rules out one of our configurations. The line of reflection cannot run from midpoint above to midpoint below, for then the top and bottom both represent horizontal chords, from 0 to 1 and from s/2 to s/2+1.

Assume there are no horizontal chords, so that everything on the right stays on the right, and everything on the left stays on the left. Look at the chords that start at the top, at vertex 0, and flow downward. The last one around the circle, but still on the right, has to connect to s/2. If it connects to j < s/2, then j on the right has to cut across to the left. So we have a diameter, or a tall thin isosceles triangle. For s even, there are s/2 diameters, and cat(s/2-1) patterns on the right, that are reflected on the left. For s odd, there are s isosceles triangles, and cat(s/2-1) patterns on the right, that are reflected on the left. (s/2 is rounded down.) Here is the formula for s odd, accounting for reflections, with the second term valid only when s is a multiple of 3.

triangulations(s) = { cat(s-2) + 2s/3 × cat(s/3-1) + s × cat(s/2-1) } / 2s

Assume there is one internal horizontal chord c₁. The top and bottom can't be horizontal chords, so s is even and the reflection runs from 0 to s/2. Let c₁ connect j to s-j. Above c₁ is a symmetrically triangulated 2j+1 gon, with the formula cat(j-1). Below c₁ is a symmetrically triangulated s-2j+1 gon, with the formula cat(s/2-j-1). Multiply these together, then take the sum as j runs from 1 to s/2-1. This happens for each diameter, thus another factor of s/2. Put this all together and get the following formula for s even, with the third term valid when s is divisible by 3.

triangulations(s) = { cat(s-2) + s × cat(s/2-1) + 2s/3 × cat(s/3-1) +
s/2 × ( ∑(j=1,s/2-1) cat(j-1) × cat(s/2-j-1) ) } / 2s

Here are the triangulations for s = 3 to 12, triangle to dodecagon.

3	1
4	1
5	1
6	3
7	4
8	12
9	27
10	82
11	228
12	733

It's fun - well some people think it's fun - to verify the 27 triangulations of the 9-gon. There are 6 lines to draw inside. If these make a pentagon you need 2 more, so that's no good. A square has to have a diagonal inside, and another tail on one of its corners. Put the tail at the top of the square, and it either touches the diagonal or not; 2 patterns.

If there are 2 triangles then they touch at a vertex on the perimeter, only 1 pattern here.

One triangle inscribes in 3 different ways, depending on where the extra vertices are. The symmetric embedding gives each chord of the triangle a length of 3. We need 3 tails to complete the pattern. If 2 tails start at 1 vertex there is one pattern. Otherwise the tails all run the same direction, 1 more pattern.

Place the "extra" 3 vertices below the base of the triangle, requiring 3 chords below. You can have 3 tails, a tail and a double tail, a tail leading to 2 tails, or a triple tail, 4 patterns.

Put 2 tails or a double tail below the triangle and a tail at the right. This gives 8 patterns, depending on which way the tails flow. That finishes the inscribed triangle.

If one vertex connects to 2 others they have to be adjacent, else you build a triangle in between. Connect vertex 0 to 6 others, or 5 with a tail off of one, or 4 with 2 tails on one side or a double tail or a tail on either side, or 3 with a tail and double tail on one side or a tail leading to 2 more tails or a triple tail or a tail on one side and a double tail on the other; 9 patterns. Finally if each vertex connects to at most 2 others you have a zigzag, 1 pattern. Add them up and find 27 patterns.

Can we find an approximation to the catalan numbers for large n? Once you approximate the binomial coefficient (n:n/2), you're home free.

Follow along as n increases step by step. When n = 2, (n:n/2) = 2. Advance to 3 and bring in a factor of 3/2. Advance to 4 and bring in a factor of 4/2. The next two steps bring in 5/3 and 6/3. The next two steps bring in 7/4 and 8/4. Every other fraction is equivalent to 2. The other fractions are 1/1, 3/2, 5/3, 7/4, etc.

Divide through by 2ⁿ and get 1/2 times 3/4 times 5/6 times 7/8 etc.

Take the reciprocal, because that's a bit easier to analyze. Thus 7/8 becomes 8/7, or 1 + 1/7.

Take the log of the product, which is the sum of the logs. The log of 1+1/r is quite close to 1/r for large r. (This can be made rigorous by the taylor formula.) Add up these logs and you are adding the odd terms of the harmonic series. 1 + 1/3 + 1/5 + 1/7 + …

The sum of the harmonics through n is very close to log(n). Go halfway, and cut the terms in half, to get the even reciprocals. This is half the log of half of n. Equivalently, it is log(sqrt(½)) + log(sqrt(n). Subtract this from the log of n to get the odd terms. That leaves log(sqrt(n)) - log(sqrt(½)).

This is the log of the product of fractions, So take the exponential and write the following approximation.

(n:n/2) approximately equals 2ⁿ / sqrt(2n)

Apply this to the catalan numbers and get this.

cat(n) approximately equals 4ⁿ / sqrt(n) / (2n+2)

Estimate for n = 7 is 387, compared with 429.
Estimate for n = 12 is 186280, compared with 208012.

Polya developed general guidelines for manipulating generating functions, and, as you might imagine, he often combined generating functions with his Burnside Polya counting theorem.

Assume g(x) is a generating function that determines the number of ways you might arrange n balls in one bucket. Bear in mind however, that the concepts of balls and buckets are entirely abstract. The coefficients of g actually play out the first row of some two dimensional combinatorial function, BucketBalls(k,n), where k is set to 1 for one bucket.

How many ways can we arrange n balls in two buckets? Note that the buckets are ordered; number them 1 and 2 if you like. Arrange j balls in bucket 1 and n-j balls in bucket 2, giving a_ja_n-j combinations. Add this up as j runs from 0 to n. The result is the n^th coefficient on g². In other words, g² solves the problem for two buckets, the second row in our two dimensional function.

If you like, set g(0) = 0, so that a bucket cannot be empty. Thus there are no combinations with all n balls in the first bucket, or the second bucket; both buckets are used.

By induction, g^k counts the arrangements of n balls in k buckets, using all k buckets if g(0) = 0. Add these functions together, g + g² + g³ + g⁴ + …, to count the arrangements of n balls in an arbitrary number of buckets. The new generating function is as follows.

f = g / (1-g)

Let's try an example. How many ways can n be written as an ordered sum of positive integers? We're assuming order is significant. That is, 3+5 is different from 5+3. This is the same as placing n balls into a row of buckets, so that each bucket has at least one ball. Each bucket represents a positive integer.

The number of ways n balls can be placed in one bucket is 1. The generating function that implements this is 0 at 0 and 1 elsewhere. Set g(x) = x / (1-x).

To write n as the sum of two integers, employ g². For 3 integers, use g³, and so on. For all possibilities, evaluate g/(1-g).

1-g = 1 - x/(1-x) = (1-2x) / (1-x)

f = g / (1-g) = x / (1-2x)

a_n = 2^n-1, with a₀ = 0

Try this out for low values of n and see. When n = 4 there are 2³ = 8 possibilities.

1+1+1+1 2+1+1 1+2+1 1+1+2 2+2 3+1 1+3 4

How many ways can n be written as the sum of positive integers, disregarding order? This problem, known as p(n), or the partition function of n, is enormously difficult; so difficult that it took the stellar brilliance of Ramanujan to make any progress at all. And even that was just a start. Needless to say, we're not going to solve this problem here, but as we look at unlabeled buckets, you may begin to see why it is so hard.

In 2011, a deep connection was found between this problem and fractals.

Assume there are three buckets on the floor, unlabeled. If the buckets can be moved about, so that one arrangement of n balls looks like another, these arrangements are considered the same. We are really counting distinct orbits, rather than individual arrangements, and that's where the burnside polya theorem comes in.

The number of orbits is the average number of arrangements fixed by each group element. The group is S₃, all permutations of 3 objects, and there are 6 group actions to consider. The first is the identity element, which leaves the buckets alone, and fixes every possible arrangement. Thus the term g³.

If two buckets are swapped, how many arrangements are fixed by this interchange? The arrangement in one equals the arrangement in the other. If the first has k balls, there are a_k arrangements, and the second simply follows suit. Together they consume 2k balls, and we're really looking at the coefficients of g(x²). The third bucket is independent of these, so the generating function is g(x)g(x²). Similarly, a permutation that acts as a 3-cycle fixes arrangements according to g(x³).

This generalizes to all products of cycles. If a permutation on 13 buckets has the cycle decomposition c₁c₂²c₃c₅, the generating function that describes the arrangements fixed by this permutation is g(x)g(x²)²g(x³)g(x⁵).

Return to three unlabeled buckets and write the expression for the generating function that counts the distinct arrangements of n balls distributed among these buckets. S₃ has 3 permutations with an interchange, and 2 3-cycles, hence the coefficients on the second and third terms.

f(x) = { g(x)³ + 3g(x)g(x²) + 2g(x³) } / 6

To find the arrangements among k or fewer buckets, take the sum of expressions similar to the one above, for 1 bucket, 2 buckets, 3 buckets, 4 buckets, and so on up to k buckets. For 3 or fewer buckets, add g(x), and {g(x)²+g(x²)}/2, to the above to get this.

f(x) = { 6g(x) + 3g(x)² + 3g(x²) + g(x)³ + 3g(x)g(x²) + 2g(x³) } / 6

But there is an easier way. Add 1 to g. Remember that g(0) was 0; now it is 1. Derive the formula for k buckets and apply it to g. Since g(0) = 1, a bucket can be empty. In fact any number of buckets could be empty, hence k or fewer buckets are used. You don't have to solve the problem for 1 through k; just solve it for k. Here is the earlier equation for 3 buckets, once again, on a function h, which is then set to g+1. The result is the same as the expression shown above, plus 1. The extra 1 allows for no balls in no buckets, a₀ = 1, which was not permitted earlier.

f(x) = { h(x)³ + 3h(x)h(x²) + 2h(x³) } / 6

Substitute h = g + 1.

f(x) = { g(x)³ + 3g(x)² + 3g(x) + 1 +
3g(x)g(x²) + 3g(x) + 3g(x²) + 3 +
2g(x³) + 2 } / 6

This simplifies to the earlier formula, plus 1.

Now let's take a tentative step towards p(n). Write n as the sum of 3 integers or less, in any order. Recall that g was x/(1-x), but we are adding 1, so now g is 1/(1-x). Put this into the equation for 3 buckets over S₃ and get the following. You can use a program if you wish.

f = 1 / { (1-x) × (1-x²) × (1-x³) }

Start with 1/(1-x), which sets a_n = 1. Then divide by 1/(1-x²), giving the sequence:

1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, …

This isn't bad, but then divide by 1-x³, and a cycle of period 2 beats against a cycle of period 3. The formula for a_n will depend on n mod 6. Let n = 3m and you are adding 1 + 2 + 4 + 5 + 7 + 8 + 10 + 11 etc. Shift by 1 and add 1 + 3 + 4 + 6 + 7 + 9 + 10 etc. Shift by 2 and add 2 + 3 + 5 + 6 + 8 + 9 + 11 etc. I'll do the first case; the others are similar. Start out with a₀ = 1. The difference between a₆ and a₀ is 2+4. The difference between a₁₂ and a₆ is 5+7. The next difference is 8+10. Step by 6, and the difference sequence is 6, 12, 18, 24, etc. The next difference sequence is flat at 6, 6, 6, 6, 6, etc. The formula satisfies a second order difference equation, and is quadratic. The first three terms are enough to define the quadratic function.

a₀ = 1
a₆ = 7
a₁₂ = 19

a_6m = 3m² + 3m + 1

Example m = 3: 18 is the sum of 3 numbers in 37 different ways.

18, 17+1 through 9+9, 16+1+1, 15+2+1, 14+3+1, 14+2+2, 13+4+1, 13+3+2,
12+5+1 through 12+3+3, 11+6+1 through 11+4+3, 10+7+1 through 10+4+4,
9+8+1 through 9+5+4, 8+8+2 through 8+5+5, 7+7+4, 7+6+5, 6+6+6

Derive all 6 formulas, and this problem is under control, however, this is a far cry from p(n). It is merely the representation of n as the sum of three or fewer positive integers disregarding order.

How many times does a particular cycle decomposition appear in the symmetric group S_k? I'll illustrate with c₁c₃²c₆ in S₁₃.

6 of the 13 letters participate in the cycle of length 6, 1 in the cycle of length 1, 3 in the first cycle of length 3, and 3 in the second cycle of length 3. By the multinomial theorem, that is 13! over 1!×3!²×6!. But this overcounts the permutations just a bit. We don't particularly care if abf participates in the first cycle of length 3, and dgj in the second, or if dgj is in the first cycle of length 3 and abf is in the second. I don't want to count that permutation twice, so divide by 2!. For each c_u^v, there is a factor of (u!)^v×v! in the denominator.

Once a set of letters is assigned to a cycle, how many cycles are possible? Let the permutation move the lowest letter in the set to the next and the next and the next and so on, defining a circular permutation. This puts (u-1)! in the numerator, for every cycle of length u. Each (u-1)! upstairs almost cancels its u! downstairs, but there is a u left over. Clean things up, and the formula is k! over the product of u^v×v!. In our example, we get 13! over 1 × 3²×2 × 6!, or 480480.

Within the denominator, cycles contribute independently of one another. In other words, c_u^v, in any permutation, contributes 1/(u^v×v!). This independence will prove valuable in the next section.

Return to the general problem of arranging n balls in k or fewer unlabeled buckets. Let one of the permutations on k buckets, one of the elements of the group S_k, have a cycle decomposition that includes v cycles of length u, as in c_u^v. These v cycles contribute g(x^u)^v. All the permutations having this same cycle decomposition conspire to build a coefficient of 1 / u^vv!, as shown in the previous section. Actually this coefficient has k! in the numerator, but the burnside polya theorem has k! in the denominator, for the size of the group, so these cancel. We are levt with g(x^u)^v / u^vv!, and this comes rolling in for every unique cycle decomposition containing v cycles of length u.

Check with our earlier example on three buckets and see if this holds true.

f(x) = { g(x)³ + 3g(x)g(x²) + 2g(x³) } / 6

The trivial permutation has cycles c₁³, with u = 1 and v = 3, and the coefficient is indeed 1/6. The term c₁c₂ has coefficient 1 × 1/2 = 1/2, and c₃ has coefficient 1/3.

Remember that g(x) describes the number of arrangements of n balls in one bucket. In our "3 buckets or less" example we set g(0) = 1, but let's go back to g(0) = 0. That's what we need for Polya's enumeration theorem. Now g^k is precisely k buckets, with some balls in each bucket. To cover everything, all the orbits in any number of buckets, add up the expressions for k = 1, k = 2, k = 3, etc, all the permutations across each S_k, all the cycle induced expressions in g with all their coefficients for all the cycle decompositions across all S_k.

With this in mind, take a look at e^g. This is 1+g+g²/2+g³/6+g⁴/24 etc. Look at one of the terms in this expansion, such as g³/6. If a permutation on k buckets has c₁³ as part of its cycle decomposition, you need the factor g(x)³, with a coefficient of 1/6. This is precisely what we have here; g³/6.

Consider another expression, the exponential of g(x²)/2. Assume a permutation includes five 2-cycles, written c₂⁵. You need g(x²)⁵, with a coefficient of 1/(2⁵×5!). This is the fifth term in our exponential: g(x²)/2, raised to the fifth, and divided by 5!. It is just what the doctor ordered.

Now put this all together. Let f be the infinite product of the exponentials of g(x^j)/j, as j runs from 1 to infinity. Each exponential expands into an infinite sum, and the product of these infinite sums builds all the terms for all the cycle decompositions for all the permutations for k buckets, for all values of k. Even k = 0 is covered; that's the 1 that leads each exponential, and the product of all these 1's is 1.

Assuming g is analytic at 0, is this product well defined?

To evaluate an infinite product, take logs and turn it into a sum. Of course, taking logs of exponentials is quite delicious. The result is the sum g(x) + g(x²)/2 + g(x³)/3 + etc. Since g(0) = 0, the sum at x = 0 is 0. Everything is ok at 0.

Keep x inside the unit circle, and inside the radius of convergence for g. Thus x² x³ etc get even closer to the origin. Since g is approximately linear near 0, g(x^j) begins to look like a constant times a geometric series. Dividing the terms by j only makes them smaller, hence the series converges near 0. I didn't do that very rigorously, but I think the reasoning is valid.

The sum is analytic at 0, and when you raise e to this power, f is analytic at 0. This function is well behaved.

Given g, derive the formula for f, extract its taylor coefficients, and solve the problem of arranging n balls in any number of buckets, without regard to order, according to the constraints of g(x). This is polya's enumeration theorem.

So - do we have a handle on p(n) yet? Set g(x) = x/(1-x). This becomes no way to put no balls in one bucket, and exactly one way to put any positive number of balls into one bucket. To find f, build the aforementioned series in g, and apply e to the result. The log of f is the following sum.

x/(1-x) + x²/(1-x²)/2 + x³/(1-x³)/3 + x⁴/(1-x⁴)/4 + …

A term like x³/(1-x³)/3 becomes x³/3 + x⁶/3 + x⁹/3 + x¹²/3 etc. Add these terms together and look at the resulting power series. There are no constants, so a₀ = 0. Only the first term contributes an x, hence a₁ = 1. The first two terms contribute to x², and a₂ = 3/2. The first and third terms set a₃ = 4/3. Skip ahead to a_n, and the terms that contribute to a_n are the terms with index d, where d divides n. This gives the following formula.

a_n = ∑(d divides n) 1/d

At this point I will make a conjecture. Recall that n, written as three numbers or less, was 1 / { (1-x) × (1-x²) × (1-x³) }. Perhaps four numbers or less brings in 1 / (1-x⁴). Perhaps all partitions of n is an infinite product 1 / (1-x^j). Take the log of the reciprocal of this product. This is the sum of log(1-x^j). Expand the j^th log by the taylor formula and get - { x^j + x^2j/2 + x^3j/3 + x^4j/4 + …}. Strip off the minus sign in front, thus compensating for the reciprocal of f, add these up for all j, and look at a_n. If d divides n then x^dj/d contributes to xⁿ, and once again a_n is the sum of 1/d as d divides n. This is the same formula produced by polya's enumeration theorem, the same taylor coefficients, and the conjecture is confirmed.

By computer experiment, the inverse of the partition function, 1-x times 1-x² times 1-x³ times 1-x⁴ etc, has taylor coefficients of 1 and -1 that spread thin in a quadratic pattern. Run this program and get the following taylor expansion up to degree 100.

x¹⁰⁰ + x⁹² - x⁷⁷ - x⁷⁰ + x⁵⁷ + x⁵¹ - x⁴⁰ - x³⁵ + x²⁶ + x²² - x¹⁵ - x¹² + x⁷ + x⁵ - x² - x + 1

So all we need do now is prove this quadratically thinning relationship for the taylor coefficients, then find the taylor coefficients for the inverse of this function, and thus p(n) is solved.

"Is that all, Captain? We have five days you know." (Miri)

Let r(x) be the generating function that counts the rooted trees with n nodes. Set r(0) = 0; there are no trees with zero nodes.

Consider a rooted tree on n nodes. Below the root, there are some edges, we don't know how many, and each has a rooted tree below it. Thus the rooted tree on n nodes is synonymous with the rooted forest on n-1 nodes that lies below.

Apply polya's enumeration theorem. Let d(x) = r(x) + r(x²)/2 + r(x³)/3 + r(x⁴)/4 etc, and take the exponential of this function. The result is the number of rooted forests. Connect the trees in this forest back to the root to give the corresponding rooted tree. Here is the equation that relates r to a shifted version of itself.

r(x) = xe^d(x)

This doesn't tell us what r(x) is, nor its taylor coefficients, but it's the best we can do for now.

Let an edge rooted tree be a tree with a designated edge. To count edge rooted trees on n nodes, place a rooted tree at one end and a rooted tree at the other, which is done in r² ways. But swapping the two rooted trees, one for the other, is the same edge rooted tree, so we're double counting. Divide by 2. Finally, the trees that are perfectly symmetric on this edge were not double counted, and we cut those in half. Add them back in again, giving this formula.

e(x) = ½{ r(x)² + r(x²) }