bossylobster: August 2011

Monday, August 29, 2011

Finding (Fibonacci) Golden Nuggets Part 2

This is the mostly code second half of a two part post that delivers on a promise of meaningful uses of some theory I overviewed in my last set of posts. If you see words like topograph, river, and base and you aren't sure what I mean, you may want to read that last set of posts.

In the first half of this post, I outlined a solution to Project Euler problem 137 and will continue with the solution here. Stop reading now if you don't want to be spoiled. There was no code in the first post, so this post will be mostly code, providing a pretty useful abstraction for dealing with binary quadratic forms.

In the very specific solution, I was able to use one picture to completely classify all integer solutions to the equation \(5 x^2 - y^2 = 4\) due to some dumb luck. In the solution, we were able to use "Since every edge protruding from the river on the positive side has a value of 4 on a side...by the climbing lemma, we know all values above those on the river have value greater than 4," but this is no help when trying to find solutions to \(5 x^2 - y^2 = 9\), for example.

To answer the question \(5 x^2 - y^2 = 9\), we'll use the same pretty picture, but emphasize different parts of it. As you can see below, to classify all the values, we only need to travel from the initial base

along the river until we arrive at an identical base as the blue circles indicate below:

As noted above, for problem 137, we luckily were concerned about finding values \(4\) or \(1\), and the climbing lemma saved us from leaving the river. However, as I've noted above with #1, #2, #3, and #4, there are four tributaries coming from the river where we can consider larger values. Using the Arithmetic Progression Rule, we find values \(19\), \(11\), \(11\), and \(19\) as the first set of values above the river. From this point, we can stop checking for solutions to \(f(x, y) = 9\) since the climbing lemma says all further values above the tributaries will be \(11\) or greater. Thus, the only solutions come via scaling solutions of \(f(x, y) = 1\) by a factor of \(3\) (using homogeneity of a quadratic form).

Now for the (Python) code.

First, the data structure will be representative of a base along the river, but will also include the previous and next faces (on the shared superbases) and so we'll call it a juncture (my term, not Conway's). Each face in a juncture needs to be represented by both the pair \((x, y)\) and the value that \(f\) takes on this face. For our sanity, we organize a juncture as a tuple \((B, P, N, F)\), (in that order)

where \(P\) and \(N\) form a base straddling the river, with \(P\) taking the positive value and \(N\) negative, as well as \(B\) the face "back" according to our orientation and \(F\) the face "forward". Note, depending on the value of the form at \(F\), the river may "turn left" or "turn right" at the superbase formed by \(P\), \(N\) and \(F\).

To move "along the river until we arrive at an identical base", we need a way to move "forward" (according to our imposed orientation) to the next juncture on the river. Moving along the river, we'll often come to superbases \((B, N, P)\) and need to calculate the forward face \(F\). To calculate \(F\), assume we have already written a plus function that determines the vector at \(F\) by adding the vectors from \(P\) and \(N\) and determines the value at \(F\) by using the arithmetic progression rule with the values at all three faces in the superbase. Using this helper function, we can define a way to get the next juncture by turning left or right:

def next_juncture_on_river(juncture):
    B, P, N, F = juncture
    forward_val = F[1]
    if forward_val < 0:
        # turn left
        NEXT = plus(P, F, N[1])
        return (N, P, F, NEXT)
    elif forward_val > 0:
        # turn right
        NEXT = plus(N, F, P[1])
        return (P, F, N, NEXT)
    else:
        raise Exception("No infinite river here.")

Next, to know when to stop crawling on the river, we need to know when we have returned to an identical juncture, so we define:

def juncture_ident(juncture1, juncture2):
    B1, P1, N1, F1 = juncture1
    B2, P2, N2, F2 = juncture2
    return ((B1[1] == B2[1]) and (P1[1] == P2[1]) and
            (N1[1] == N2[1]) and (F1[1] == F2[1]))

Using these functions, we can first find the recurrence that will take us from a base of solutions to all solutions and second, keep track of the positive faces on the river to generalize the solution of \(f(x, y) = z\). For both of these problems, we impose a simplification for the sake of illustration. We will only be considering quadratic forms \[f(x, y) = a x^2 + b y^2\] where \(a > 0\), \(b < 0\) and \(\sqrt{\left|\frac{a}{b}\right|}\) is not rational. This guarantees the existence of a river. We will pass such forms as an argument form=(a, b) to our functions. We start our river at the juncture defined by the trivial base \((1, 0), (0, 1)\)

and crawl the river using the functions defined above. (Note: \(f(1, -1) = a(1)^2 + b(-1)^2 = a + b\), etc.)

To find the recurrence, we need just walk along the river until we get an identical juncture where the trivial base is replaced by the base \((p, q), (r, s)\). Using the same terminology as in part one, let the base vectors define a sequence \(\left\{(u_k, v_k)\right\}_{k \geq 0}\) with \(u_0 = (1, 0)\) and \(v_0 = (0, 1)\), then we have a recurrence \begin{align*}u_{k + 1} &= p u_k + q v_k \\ v_{k + 1} &= r u_k + s v_k. \end{align*} Using this -- \(u_{k + 2} - p u_{k + 1} - s(u_{k + 1} - p u_k) = q v_{k + 1} - s (q v_k) = q(r u_k)\) -- hence \(u\) satisfies the recurrence \(u_{k + 2} = (r q - p s)u_k + (p + s)u_{k + 1}\). (You can check that \(v\) satisfies this as well.) Hence our function to spit out the recurrence coefficients is:

def get_recurrence(form):
    a, b = form
    B = ((1, -1), a + b)
    P = ((1, 0), a)
    N = ((0, 1), b)
    F = ((1, 1), a + b)
    J_init = (B, P, N, F)
    J_curr = next_juncture_on_river(J_init)
    while not juncture_ident(J_init, J_curr):
        J_curr = next_juncture_on_river(J_curr)

    final_B, final_P, final_N, final_F = J_curr
    p, q = final_P[0]
    r, s = final_N[0]
    return (r*q - p*s, p + s)

For solving \(f(x, y) = z\), (\(z\) positive) we need to consider all the positive tributaries coming out of the river and let them grow and grow until the climbing lemma tells us we no longer need to consider values larger than \(z\). In order to consider tributaries, we describe a new kind of juncture. Instead of having a positive/negative base in the center of the juncture, we have two consecutive faces from the positive side

and have the negative from across the river as the "back" face. With this definition, we first write a function to return all tributaries:

def all_positive_tributaries(form):
    # ...Initialization logic...

    new_positives = []
    J_curr = next_juncture_on_river(J_init)
    while not juncture_ident(J_init, J_curr):
        # we add a new positive if the forward
        # value is positive
        forward = J_curr[-1]
        if forward[1] > 0:
            new_positives.append(J_curr)
        J_curr = next_juncture_on_river(J_curr)

    # For each (B, P, N, F) in new_positives, we want to
    # transform to a juncture with positive values, which will
    # be (N, P_1, P_2, P_F)
    result = []
    for new_positive in new_positives:
        B, P, N, F = new_positive
        new_face = plus(P, F, N[1])
        tributary = (N, P, F, new_face)
        result.append(tributary)
    return result

For each tributary, we can climb up away from the river until our values are too large. So we write a helper function to take a given tributary and a max value and recursively "climb" the topograph until we exceed the value. This function will naively return all possible faces (value and vector) without checking the actual values.

def seek_up_to_val(juncture, max_value):
    N, P_1, P_2, P_F = juncture
    if P_F[1] > max_value:
        return []
    result = [P_F]

    turn_left = plus(P_1, P_F, P_2[1])
    J_left = (P_2, P_F, P_1, turn_left)
    result.extend(seek_up_to_val(J_left, max_value))

    turn_right = plus(P_2, P_F, P_1[1])
    J_right = (P_1, P_F, P_2, turn_right)
    result.extend(seek_up_to_val(J_right, max_value))
    return result

Finally, we can combine these two helper functions into a function which will find all solutions to \(f(x, y) = z\) above the river. We may have a pair (or pairs) \((x, y)\) on the topograph where \(f(x, y) = \frac{z}{k^2}\) for some integer \(k\); if so, this gives rise to a solution \((kx, ky)\) which we'll be sure to account for in our function.

def all_values_on_form(form, value):
    # Use a helper (factors) to get all positive integer factors of value
    factor_list = factors(value)
    # Use another helper (is_square) to determine which factors can be
    # written as value/k^2 for some integer k
    valid_factors = [factor for factor in factor_list
                     if is_square(value/factor)]

    tributaries = all_positive_tributaries(form)
    found = set()
    for tributary in tributaries:
        candidates = seek_up_to_val(tributary, value)
        found.update([candidate for candidate in candidates
                      if candidate[1] in valid_factors])
        # Since each tributary is of the form (N, P_1, P_2, P_F) for
        # P_1, P_2 on the river, we need only consider P_1 and P_2 since
        # those faces above are in candidates. But P_2 will always be in
        # next tributary, so we need not count it. You may assume this ignores
        # the very final tributary, but here P_2 actually lies in the 
        # second period of the river
        N, P_1, P_2, F = tributary
        if P_1[1] in valid_factors:
            found.add(P_1)

    # Finally we must scale up factors to account for
    # the reduction by a square multiple
    result = []
    for face in found:
        (x, y), val = face
        if val < value:
            ratio = int(sqrt(value/val))
            x *= ratio
            y *= ratio
        result.append((x, y))

    return result

Combining all_values_on_form with get_recurrence, we can parameterize every existing solution!

As far as Project Euler is concerned, in addition to Problem 137, I was able to write lightning fast solutions to Problem 66, Problem 94, Problem 100, Problem 138 and Problem 140 using tools based on the above -- a general purpose library for solving binary quadratic forms over integers!

Sunday, August 28, 2011

Finding (Fibonacci) Golden Nuggets Part 1

As I mentioned in my last set of posts, the content would go somewhere and this post will be the first to deliver on that. In fact this is the all math, no code first half of a two part post that will deliver. If you see words like topograph, river, and base and you aren't sure what I mean, you may want to read that last set of posts.

In this post, I outline a solution to Project Euler problem 137, so stop reading now if you don't want to be spoiled. There is no code here, but the second half of this post has a pretty useful abstraction.

The problems asks us to consider \[A_F(z) = z F_1 + z^2 F_2 + z^3 F_3 + \ldots,\] the generating polynomial for the Fibonacci sequence*. The problem defines (without stating so), a sequence \(\left\{z_n\right\}_{n > 0}\) where \(A_F(z_n) = n\) and asks us to find the values \(n\) for which \(z_n\) is rational. Such a value \(n\) is called a golden nugget. As noted in the problem statement, \(A_F(\frac{1}{2}) = 2\), hence \(z_2 = \frac{1}{2}\) is rational and \(2\) is the first golden nugget.

As a first step, we determine a criterion for \(n\) to be a golden nugget by analyzing the equation \(A_F(z) = n\). With the recurrence relation given by the Fibonacci sequence as inspiration, we consider \begin{align*}(z + z^2)A_F(z) = z^2 F_1 &+ z^3 F_2 + z^4 F_3 + \ldots \\ &+ z^3 F_1 + z^4 F_2 + z^5 F_3 + \ldots. \end{align*}Due to the fact that \(F_2 = F_1 = 1\) and the nature of the recurrence relation, we have \[(z +z^2)A_F(z) = z^2 F_2 + z^3 F_3 + z^4 F_4 + \ldots = A_F(z) -z\] which implies \[A_F(z) = \frac{z}{1 - z - z^2}.\] Now solving \(A_F(z) = n\), we have \[z = n - n z - n z^2 \Rightarrow n z^2 + (n + 1)z - n = 0.\] In order for \(n\) to be a golden nugget, we must have the solutions \(z\) rational. This only occurs if the discriminant \[(n + 1)^2 - 4(n)(-n) = 5 n^2 + 2 n + 1\] in the quadratic is the square of a rational.

So we now positive seek values \(n\) such that \(5 n^2 + 2 n + 1 = m^2\) for some integer \(m\) (the value \(m\) must be an integer since a rational square root of an integer can only be an integer.) This equation is equivalent to \[25n^2 + 10n + 5 = 5m^2\] which is equivalent to \[5m^2 - (5 n + 1)^2 = 4.\] This is where Conway's topograph comes in. We'll use the topograph with the quadratic form \(f(x, y) = 5 x^2 - y^2\) to find our solutions. First note, a solution is only valuable if \(y \equiv 1 \bmod{5}\) since \(y = 5 n + 1\) must hold. Also, since \(\sqrt{5}\) is irrational, \(f\) can never take the value \(0\), but \(f(1, 0) = 5\) and \(f(0, 1) = -1\), hence there is a river on the topograph and the values of \(f\) occur in a periodic fashion. Finally, since we take pairs \((x, y)\) that occur on the topograph, we know \(x\) and \(y\) share no factors. Hence solutions \(f(x, y) = 4\) can come either come from pairs on the topograph or by taking a pair which satisfies \(f(x, y) = 1\) and scaling up by a factor of \(2\) (we will have \(f(2x, 2y) = 2^2 \cdot 1 = 4\) due to the homogeneity of \(f\)).

Starting from the trivial base \(u = (1, 0)\) and \(v = (0, 1)\), the full period of the river has length \(8\) as seen below:

(Note: in the above, the values denoted as combinations of \(u\) and \(v\) are the vectors for each face on the topograph while the numbers are the values of \(f\) on these vectors.) Since every edge protruding from the river on the positive side has a value of \(4\) on a side (the value pairs are \((5, 4)\), \((4, 1)\), \((1, 4)\), and \((4, 5)\)), by the climbing lemma, we know all values above those on the river have value greater than \(4\). Thus, the only solutions we are concerned with -- \(f(x, y) = 1\) or \(4\) -- must appear on the river. Notice on the river, the trivial base \((u, v)\) is replaced by the base \((9 u + 20 v, 4 u + 9 v)\). This actually gives us a concrete recurrence for the river and with it we can get a complete understanding of our solution set.

When we start from the trivial base, we only need consider moving to the right (orientation provided by the above picture) along the river since we only care about the absolute value of the coordinates (\(n\) comes from \(y\), so it must be positive). As such, we have a sequence of bases \(\left\{(u_k, v_k)\right\}_{k \geq 0}\) with \(u_0 = (1, 0)\), \(v_0 = (0, 1)\) and recurrence \begin{align*}u_{k + 1} &= 9 u_k + 20 v_k \\ v_{k + 1} &= 4 u_k + 9 v_k. \end{align*} This implies that both \(\left\{u_k\right\}\) and \(\left\{v_k\right\}\) satisfy the same relation \begin{align*}u_{k + 2} - 9 u_{k + 1} - 9(u_{k + 1} - 9 u_k) &= 20 v_{k + 1} - 9(20 v_k) = 20(4 u_k) \\ v_{k + 2} - 9 v_{k + 1} - 9(v_{k + 1} - 9 v_k) &= 4 u_{k + 1} - 9(4 u_k) = 4(20 v_k). \end{align*} With these recurrences, you can take the three base solutions on the river and quickly find each successive golden nugget. Since each value is a coordinate in a vector, it satisfies the same linear recurrence as the vector. Also, since each of the solution vectors occur as linear combinations of \(u_k\) and \(v_k\), they must satisfy the same recurrence as well.

Since the recurrence is degree two, we need the first two values to determine the entire sequence. For the first solution we start with \(u_0 + v_0 = (1, 1)\) and \(u_1 + v_1 = (13, 29)\); for the second solution we start with \(u_0 + 2 v_0) = (2, 4)\) and \(u_1 + 2 v_1 = (17, 38)\); and for the third solution we start with \(5 u_0 + 11 v_0 = (5, 11)\) and \(5 u_1 + 11 v_1 = (89, 199)\). For the second solution, since \(f(1, 2) = 1\), we use homogeneity to scale up to \((2, 4)\) and \((34, 76)\) to start us off. With these values, we take the second coordinate along the recurrence and get the following values:

n	First	Second	Third
0	1	4	11
1	29	76	199
2	521	1364	3571
3	9349	24476	64079
4	167761	439204	1149851
5	3010349	7881196	20633239
6	54018521	141422324	370248451
7	969323029	2537720636	6643838879
8	17393796001	45537549124	119218851371
9	312119004989	817138163596	2139295485799
10	5600748293801	14662949395604	38388099893011

We don't get our fifteenth golden nugget candidate (value must be one more than a multiple of \(5\)) until \(5600748293801\), which yields \(\boxed{n = 1120149658760}\). By no means did I do this by hand in real life; I didn't make a pretty representation of the river either. I just wanted to make the idea clear without any code. To get to the code (and the way you should approach this stuff), move on to the second half of this post.

*The Fibonacci sequence is given by \(F_0 = 0\), \(F_1 = 1\), and \(F_{n} = F_{n - 1} + F_{n - 2}\).

Tuesday, August 23, 2011

Conway's Topograph Part 3

This is the third (continued from Part 2) in a series of three blog posts. In the following we'll investigate a few properties of an object called Conway’s topograph. John Conway conjured up a way to understand a binary quadratic form – a very important algebraic object – in a geometric context. This is by no means original work, just my interpretation of some key points from his The Sensual (Quadratic) Form that I'll need for some other posts.

Definition: For the form \(f(x, y) = a x^2 + h x y + b y^2\), we define the discriminant as the value \(ab - \left(\frac{1}{2}h\right)^2\). \(\Box\)

The base \((1, 0)\) and \((0, 1)\) take values \(a\) and \(b\) on the form in the Definition above and are part of a sequence with common difference \(h\). In fact, if we know the values \(a'\), \(b'\) and the difference \(h'\) at any base (edge in the topograph), the value \(a'b' - \left(\frac{1}{2}h'\right)^2\) is independent of the base and the direction (left or right) which determines the sign of \(h'\) and hence equal to the discriminant. To see this, first note the sign of \(h'\) is immaterial since it is squared. Also, consider the two other bases (edges) in the superbase. As in the proof of the climbing lemma, one base takes values \(a' = a\) and \(b' = a + b + h\) with common difference \(h' = 2a + h\) which gives
\begin{align*}
a'b' - \left(\frac{1}{2}h'\right)^2 &= a(a + b + h) - \frac{1}{4}\left(2a + h\right)^2 \\
&= a^2 + a b + a h - \left(a^2 + a h + \frac{1}{4} h^2\right) \\
&= ab - \left(\frac{1}{2}h\right)^2.
\end{align*}
Similarly the other base in the given superbase gives
\begin{align*}
a'b' - \left(\frac{1}{2}h'\right)^2 &= (a + b + h)b - \frac{1}{4}\left(2b + h\right)^2 \\
&= b^2 + a b + b h - \left(b^2 + b h + \frac{1}{4} h^2\right) \\
&= ab - \left(\frac{1}{2}h\right)^2.
\end{align*}
Having showed that there are no cycles when starting from a given superbase, our work in understanding the topograph is not complete. We haven't actually showed that we can get from one superbase to any other superbases within the topograph. To show this, we'll use the discriminant and the following.

Definition: A superbase \(W\) is called a well if all the edges at \(W\) point away from \(W\).

\(\Box\)

Notice a well is dependent on the values, hence depends on the form \(f\). In a positive--valued topograph, we may find a well by traveling along the topograph in the opposite direction of the edges. Eventually, we must encounter a superbase where all arrows point out (as above), leaving us nowhere to travel and thus becoming our well. This is because, assuming the topograph is positive--valued, we can only decrease in value for so long (eventually the values must approach the minimum).

Lemma: (The Well Lemma) For a positive--valued form \(f\) and a well (with respect to \(f\)) \(W\), the three values \(f\) takes on the faces in \(W\) are the smallest values that \(f\) takes on the topograph.

Proof: Using the labels from the well in the definition above, the Arithmetic Progression Rule for our differences gives
\[2\alpha = b + c - a, \quad 2\beta = c + a - b, \quad 2\gamma = a + b - c\]
and solving,
\[a = \beta + \gamma, \quad b = \alpha + \gamma, \quad c = \beta + \alpha.\]
Let the superbase \(W = \left\{e_1, e_2, e_3\right\}\). Since \(W\) is a superbase, we may write any vector as
\[v = m_1 e_1 + m_2 e_2 + m_3 e_3\]
for \(m_1\), \(m_2\), \(m_3 \in \mathbf{Z}\). Also due to the fact that \(W\) is a superbase, \(e_1 + e_2 + e_3 = (0, 0)\) and so we may also write
\[v = (m_1 - k) e_1 + (m_2 - k) e_2 + (m_3 - k) e_3\]
for \(k \in \mathbf{Z}\). From this it is clear only the differences of the \(m_i\) matter. With this as our inspiration we write
\[f(v) = \alpha(m_2 - m_3)^2 + \beta(m_1 - m_3)^2 + \gamma(m_1 - m_2)^2,\]
a formula discovered by Selling.

To verify this, notice both sides of the equation are quadratic forms in \(v\) and
\begin{align*}
f(e_1) = a &= \beta + \gamma = \alpha \cdot 0^2 + \beta \cdot 1^2 + \gamma \cdot 1^2 \\
f(e_2) = b &= \alpha + \gamma = \alpha \cdot 1^2 + \beta \cdot 0^2 + \gamma \cdot (-1)^2 \\
f(e_3) = c &= \beta + \alpha = \alpha \cdot (-1)^2 + \beta \cdot (-1)^2 + \gamma \cdot 0^2.
\end{align*}
hence they must be equal since both sides are quadratics that agree on more than two points.

If two of the \(m_i\) are equal, then \(v\) must be an integral multiple of the third vector, hence the value \(f(v)\) will be at least as large as the value of \(f\) on the third vector. If not, all the differences must be non--zero (hence greater than or equal to \(1\) in absolute value, since integers), thus
\[f(v) \geq \alpha \cdot 1^2 + \beta \cdot 1^2 + \gamma \cdot 1^2\]
which is greater than or equal to each of \(a = \beta + \gamma\), \(b = \alpha + \gamma\), and \(c = \beta + \alpha\) since all of \(\alpha\), \(\beta\), and \(\gamma\) are non--negative. \(\Box\)

Corollary: The topograph is connected; one may travel along the topograph from any given superbase to any other.

Proof: Using the same quadratic form \(f\) as we did to show the topograph had no cycles, we can show it is connected. Any arbitrary superbase is on the topograph, hence must be in some connected component of the topograph, but there may be more than one component. Since \(f\) is positive--valued, we must have some well in this component. But, by the above, the values at a well must be the absolute lowest values \(f\) takes on the topograph. This implies the well must take the values \(1\), \(1\), \(1\) and shows all superbases must be in the same component. \(\Box\)

From this point, we will concentrate on a special type of form relevant to our discussion. For a form \(f\) which takes both positive and negative values, but never \(0\), the topograph has a special path that separates the which separates the faces where takes a positive value and those where \(f\) takes a negative value.

Claim: If a form \(f\) takes both positive and negative values, but not zero, then there is a unique path of connected edges separating the positive and negative values. What's more, the values that occur on this river do so periodically.

Proof: Since the topograph is connected, there must be some edge where positive and negative values meet. As we proceed along adjacent edges, we can choose to follow a path of edges which will separate positive and negative (each subsequent value must be positive or negative, allowing us to "turn" left or right).

On first sight, there is no reason that this path should be unique. However, with the climbing lemma in mind, starting on the positive side of the path and moving away from the negative values, we must have only positive values. Using the logic of the climbing lemma instead with negative values, we similarly see that starting on the negative side and more away from the positive values will yield all negative numbers below the path. Hence nowhere above the path can positive and negative values meet and similarly below. Thus the path must be unique.

To show this path is periodic, we must utilize the discriminant. For each edge along the path, we have some positive value \(a\) and a negative \(b\) (by definition of the path) and the common difference \(h\). Thus the determinant \(D\) must be negative since the product \(ab\) is, hence
\[\left|D\right| = \left|ab\right| + \left(\frac{1}{2}h\right)^2.\]
Thus, both \(\left(\frac{1}{2}h\right)^2\) and \(\left|ab\right|\) are bounded by \(\left|D\right|\). So \(a\), \(b\) and \(h\) are bounded (by \(\left|D\right|\). Thus we have finitely many possible triples \((a, b, h)\), hence some value must be repeated in the path. This forces the path to be periodic since the triple starting from one triple \((a, b, h)\) determines next triple along the path and hence the entire path.

This path is so crucial that we give it it's own name.

Definition: If a form \(f\) takes both positive and negative values, but not zero, we call the path separating the positive and negative values the river. \(\Box\)

Thanks for reading, I'll make use of all this in a few days!

Update: This material is intentionally aimed at an intermediate (think college freshman/high school senior) audience. One can go deeper with it, and I'd love to get more technical off the post.

Update: All images were created with the tikz LaTeX library and can be compiled with native LaTeX if pgf is installed.

Conway's Topograph Part 2

This is the second (continued from Part 1) in a series of three blog posts. In the following we'll investigate a few properties of an object called Conway’s topograph. John Conway conjured up a way to understand a binary quadratic form – a very important algebraic object – in a geometric context. This is by no means original work, just my interpretation of some key points from his The Sensual (Quadratic) Form that I'll need for some other posts.

In the following, as mentioned in Part 1, "when referring to a base/superbase, we are referring to the lax equivalent of these notions."

To begin to form the topograph, note each superbase \(\left\{e_1, e_2, e_3\right\}\) contains only three bases
\[\left\{e_1, e_2\right\}, \left\{e_2, e_3\right\}, \left\{e_3, e_1\right\}\]
as subsets. Going the other direction, a base \(\left\{e_1, e_2\right\}\) can only possibly be contained as a subset of two superbases:
\[\langle e_1, e_2, (e_1 + e_2)\rangle, \langle e_1, e_2, (e_1 - e_2)\rangle.\]
With these two facts in hand, we can begin to form the geometric structure of the topograph. The interactions between bases and superbases (as well as the individual vectors themselves) give us the form. In the graph, we join each superbase (\(\bigcirc\)) to the three bases (\(\square\)) in it.

Each edge connecting two superbases (\(\bigcirc\)) represents a base and we mark each of these edges with a (\(\square\)) in the middle. Since each base can only be in two superbases, we have well--defined endpoints for each base (edge). Similarly, since each superbase contains three bases as subsets, each superbase (endpoint) has three bases (edges) coming out of it.

As we traverse each edge (base) surrounding a given vector (\(e_1\) above), we move from superbase (vertex) to superbase (vertex), and form a face. Starting from a base \(e_1, e_2\), traveling along each of the new faces encountered we begin to form the full (labeled) topograph below:

Notice the values of \(f\) on the combinations of \(e_1\) and \(e_2\) is immaterial to the above discussion, hence the shape of the topograph doesn't depend on \(f\).

If we know the values of \(f\) at some superbase, it is actually possibly to find the values of \(f\) at vectors (faces) we encounter on the topograph without actually knowing \(f\).

Claim: For vectors \(v_1, v_2\),
\[f(v_1 + v_2) + f(v_1 - v_2) = 2\left(f(v_1) + f(v_2)\right)\]
Proof: Exercise. (If you really can't get it, let me know in the comments.) \(\Box\)

This implies that if
\[a = f(v_1), \quad b = f(v_2), \quad c = f(v_1 + v_2), \quad d = f(v_1 - v_2)\]
then \(d\), \(a + b\), \(c\) form an arithmetic progression with common difference \(h\). This so--called Arithmetic Progression Rule allows us to mark each edge with a direction based on the value of \(h\). Hence if \(d < a + b < c\), we have \(h > 0\) and the following directed edge:

Obviously starting from a base \(e_1, e_2\), one wonders if it is possible to move to any pair \((x, y)\) with \(x\) and \(y\) coprime along the topograph. It turns out that we can; the topograph forms a structure called a tree, and all nodes are connected.

Lemma: (Climbing Lemma) Given a superbase \(Q\) with the surrounding faces taking values \(a\), \(b\), and \(c\) as below, if the \(a\), \(b\) and the common difference \(h\) are all positive, then \(c\) is positive and the other two edges at \(Q\) point away from \(Q\).

Proof: First \(c\) is positive because \(h = c - (a + b)\), hence \(c = a + b + h > 0\). The two other edges at \(Q\) have common differences \((a + c) - b\) and \((b + c) - a\). Since \(c = a + b + h\) is greater than both \(a\) and \(b\), these differences are positive. \(\Box\)

Notice also that this establishes two new triples \((a, a + b + h, 2 a + h)\) and \((b, a + b + h, 2 b + h)\) that continue to point away from each successive superbase and hence climb the topograph. We can use this lemma (along with a specific form) to show that there are no cycles in the topograph, i.e. the topograph doesn't loop back on itself.

Consider the form which takes the following values at a given superbase:

Due to the symmetry, we may consider traveling along an edge in any direction from this superbase identically. Picking an arbitrary direction, we reach the following superbase:

Since the values must increase indefinitely as laid out by the climbing lemma, the form can't loop back on itself; if it were to, it would need to loop back to a smaller value. Since this holds in all directions from the original well, there are no cycles.

Follow along to Part 3.

Update: This material is intentionally aimed at an intermediate (think college freshman/high school senior) audience. One can go deeper with it, and I'd love to get more technical off the post.

Update: All images were created with the tikz LaTeX library and can be compiled with native LaTeX if pgf is installed.

Conway's Topograph Part 1

This is the first in a series of three blog posts. In the following we'll investigate a few properties of an object called Conway’s topograph. John Conway conjured up a way to understand a binary quadratic form – a very important algebraic object – in a geometric context. This is by no means original work, just my interpretation of some key points from his The Sensual (Quadratic) Form that I'll need for some other posts.

Definition: A binary quadratic form \(f\) is an equation of the form:

\[f(x, y) = A x^2 + H x y + B y^2.\]

That is, a function of two variables which is homogeneous of degree two. The coefficients \(A\), \(H\), and \(B\) and variables \(x\) and \(y\) are often real numbers, rational numbers or integers. \(\Box\)

When we require the coefficients \(A\), \(H\), and \(B\) as well as the variables \(x, y\) to be integers, we get an integer--valued form. In his Disquisitiones Arithmeticae, Gauss asked (and largely answered) the fundamental question: what integer values can each form take? For example, you may have seen the form

\[f(x, y) = x^2 + y^2,\]

where it was determined that the only primes (Gaussian primes) occuring were \(2\) and those odd primes congruent to 1 modulo 4.

As each form \(f\) is homogenous degree two, \(f(\lambda x, \lambda y) = \lambda^2 f(x, y)\). As a result, if we can understand the values of \(f\) for pairs \((x, y)\) which don't share any factors, we can understand the entire set of values that \(f\) takes. Also, letting \(\lambda = -1\), there is no change in the value of \(f\) since \(\lambda^2 = 1\), hence it suffices to think of \(v = (x, y)\) as \(\pm v\), i.e. \(\left\{(x, y), (-x, -y)\right\}\).

For integers \(x\) and \(y\), any point \((x, y)\) can be expressed as an integral linear combination of the vectors \(e_1 = (1, 0)\) and \(e_2 = (0, 1)\). So if we like, we can express all relevant inputs for \(f\) in terms of two vectors. However, instead considering \(e_2 = (1, 1)\), we have

\[(x - y) \cdot e_1 + y \cdot e_2 = (x, y)\]

and realize a different pair \(e_1, e_2\) which again yield all possible integer valued vectors as integral linear combinations.

Definition: A strict base is an ordered pair \((e_1, e_2)\) whose integral linear combinations are exactly all vectors with integer coordinates. A lax base is a set \(\left\{\pm e_1, \pm e_2\right\}\) obtained from a strict base. \(\Box\)

Definition: A strict superbase is an ordered triple \((e_1, e_2, e_3)\), for which \(e_1 + e_2 + e_3 = (0, 0)\) and \((e_1, e_2)\) is a strict base (i.e., with strict vectors), and a lax superbase is a set \(\langle\pm e_1, \pm e_2, \pm e_3\rangle\) where \((e_1, e_2, e_3)\) is a strict superbase. \(\Box\)

For our (and Conway's) purposes, it is useful to consider the lax notions and leave the strict notions as an afterthought since a binary quadratic form is unchanged given a sign change. From here forward, for a vector \(v\), we use the notation \(v\) interchangeably with \(\pm v\) and when referring to a base/superbase, we are referring to the lax equivalent of these notions.

Follow along to Part 2.

Update: This material is intentionally aimed at an intermediate (think college freshman/high school senior) audience. One can go deeper with it, and I'd love to get more technical off the post.

Wednesday, August 17, 2011

The Lesson V8 Can Teach Python and Other Dynamic Languages

Being unable to completely give up math for computers, I am naturally drawn to Project Euler and as a result solved a ridiculous number of the problems posted there while learning Python. A few months ago (March 13), after reading Secrets of a Javascript Ninja, I decided to begin converting my solutions to Javascript. A month and a half later I came back to it, and then finally two months after that, I began to take it seriously.

After making this decision, I noticed the prime Sieve of Eratosthenes was mighty fast when I ran it in Chrome, maybe even faster than my beloved Python. I tabled the thought for a little, but never really forgot it. So a few weeks ago (early August 2011), I finally got a working install of node running on my machine and was able to make more of this thought. (I say finally installed because on two previous tries I gave up because of conflicts with my version of gcc, coupled with the fact that I had no good reason to use node.)

When I originally did the conversion, I had skipped problem 8, because my implementation required pulling in the problem data as text from a file. While hanging out with Boris and Eric from on the Chrome Developer Relations team, I decided to give it another go on node (xhr requests not allowed) and found it to be quite simple with readFileSync in the node native fs module. After witnessing this, over this weekend, I decided to harness the power of V8 -- the Javascript engine that powers Chrome and node -- and run all my scripts locally with node. So over a two day period, I hack-hack-hacked my way into converting the Python solutions for problems 11 through 50 (the remaining unconverted) into their Javascript equivalents, while also converting a good portion of my hefty functions module.

Once this was done, I had also found I could replace most of the nice parts about Python with my own equivalent. For example, I was able to replace functionality I needed from the Python set datatype with

function uniq(arr) {
  var result = {};
  for (var i = 0, val; val = arr[i]; i++) {
    result[val] = true;
  }

  return Object.keys(result);
};

and I was able to replace the (amazingly) useful Python handling of long integers with a non-native node package called bigint that uses libgmp among other usings. Of course, for Python's secret sauce -- the list comprehension -- I was able to substitute enough filter, reduce and map statements to almost make it seem like I had never left Pythonland. After doing all this, I also ended up writing my own operator.js to replace the wonderful Python native module operator, and my own timer.js to stand in for the Python native module time.

Finally, I had working code and could do a side by side comparison of V8 and the Python interpreter. Update: I added a column for PyPy, a just in time implementation of Python. Here is what I found (averaging the runtime over 10 separate calls to each function, the results are):

Problem	Answer	Python	Javascript	Ratio (PY/JS)	PyPy
1*	233168	1804ms	1215ms	1.48	385ms
2*	4613732	247ms	102ms	2.42	85ms
3*	6857	4725ms	1508ms	3.13	582ms
4	906609	8708ms	149ms	58.44	282ms
5*	232792560	136ms	186ms	0.73	114ms
6*	25164150	10ms	4ms	2.50	6ms
7	104743	656ms	12ms	54.67	11ms
8*	40824	18045ms	5014ms	3.60	7042ms
9	31875000	610ms	3ms	203.33	8ms
10	142913828922	6628ms	167ms	39.69	116ms
11	70600674	49ms	2ms	24.50	11ms
12	76576500	5127ms	203ms	25.26	100ms
13*	5537376230	1795ms	10710ms	0.17	1423ms
14	837799	5572ms	1712ms	3.25	362ms
15*	137846528820	54ms	18ms	3.00	55ms
16*	1366	1844ms	265ms	6.96	462ms
17	21124	87ms	4ms	21.75	7ms
18*	1074	2291ms	1790ms	1.28	1090ms
19*	171	2254ms	336ms	6.71	342ms
20*	648	1061ms	9154ms	0.12	374ms
21	31626	18910ms	1038ms	18.22	728ms
22	871198282	188ms	7ms	26.86	8ms
23	4179871	83318ms	1120ms	74.39	1295ms
24*	2783915460	206ms	210ms	0.98	139ms
25	4782	5865ms	35ms	167.57	232ms
26	983	28ms	18ms	1.56	4ms
27	-59231	645738ms	22536ms	28.65	28288ms
28*	669171001	8509ms	1037ms	8.21	981ms
29	9183	184ms	96ms	1.92	20ms
30	443839	52167ms	1037ms	50.31	877ms
31	73682	9606ms	257ms	37.38	154ms
32	45228	206888ms	12096ms	17.10	4266ms
33	100	300ms	6ms	50.00	15ms
34	40730	7462ms	2447ms	3.05	247ms
35	55	8617ms	848ms	10.16	242ms
36	872187	189788ms	2183ms	86.94	3532ms
37	748317	2389022ms	71845ms	33.25	61551ms
38	932718654	506ms	10ms	50.60	12ms
39	840	178ms	6ms	29.67	12ms
40*	210	326ms	202ms	1.61	119ms
41	7652413	2627ms	133ms	19.75	65ms
42	162	65ms	7ms	9.29	8ms
43	16695334890	38ms	2ms	19.00	2ms
44	5482660	384013ms	27744ms	13.84	6621ms
45*	1533776805	17ms	4ms	4.25	8ms
46	5777	2864ms	202ms	14.18	65ms
47	134043	400967ms	12838ms	31.23	4425ms
48	9110846700	46ms	16ms	2.88	6ms
49	296962999629	115ms	8ms	14.38	13ms
50	997651	3277ms	80ms	40.96	51ms
*These were very quick to run, so the runtimes are the time taken to run 10000 times.

As you'll notice, standard Python gets its butt kicked. I was kind of saddened by this, but in the end, just giddy that our web is faster because of it (90% of my life is digital) and also that we can do scripting faster on the server side (attribute to Boris Smus) because of the node project (thanks Ryan Dahl).

Standard Python is actually slower in 46 of the 50 problems. In 28 of the 46 node is faster by a factor of 10 or greater, in 9 of those 28 by a factor of 50 or greater and in 2 of the 9 by a factor of 100 or greater! The only 4 in which Python was faster were from the n = 10000 sample. In fact, I was able to pinpoint exactly why:

#5 - My own Javascript implementation of gcd is slower than the native (from fractions import gcd) Python library (resulting in a difference of 50 ms over 10000 iterations)
#13 - The node package bigint is slower than the Python native long int (Javascript is slower by a factor of 6)
#20 - The node package bigint is slower than the Python native long int (Javascript is slower by a factor of 8.5)
#24 - Having to perform two slices is slower in Javascript than in Python and there is no good way to just remove one element (resulting in a difference of 4 ms over 10000 iterations; a little bit about that here)

So what, you ask, is that lesson I speak of? Well, Javascript didn't used to be this fast. How did it get that way? The brilliant and inspired people behind V8 rethought the Javascript compile steps and after much work, we now have code that is closer to the metal (attribute to: Eric Bidelman, i.e. closer to machine code) than we had ever had before. The use of just-in-time compilation and other incredible techniques has taken a formerly slow and clunky language (Javascript) which was used well beyond its original intended scope, and turned it into a damn fast dynamic language. Hopefully, this movement will make its way to Python and other dynamic languages and we can all have our code end up this close to the metal.

Update: In response to the comments, I ran the same code on the same machine, but with PyPy in place of Python. This is the direction I hope standard Python goes in and commend the guys pumping out faster and faster just in time implementations over at PyPy. I went through and counted 20 node wins, 29 PyPy wins and 1 tie. (I reserve the right to update the post with more detailed metrics.) While I do commend them, the results don't really belong in this post because PyPy is still an offshoot. (However, as I understand, they both have ties to C, as PyPy uses GCC to compile to bytecode and V8 is written in C++. Feel free to supplement my knowledge in the comments.)

Update: All benchmarking was run on my Mac Pro Desktop with a 3.2 GHz Quad-Core Intel Xeon processor and 4 cores for a total of 12 GB 1066 MHz DDR3 memory. I used Python version 2.6.1, node version 0.4.9, and PyPy version 1.5 (running on top of Python 2.7.1 with GCC 4.0.1).