Motivation: I’m writing some F# code to solve Project Euler problem #14. The problem relates to the Collatz conjecture. It goes like this: start with a number – 3, say. Apply the following rules: if your number is odd, multiply by 3 and add 1. If your number is even, divide it by 2. Keep applying those rules until you end up with 1 or the universe reaches heat death. Tell me: of all the starting numbers from 1 to a million, which number takes the most turns before reaching 1?
Incidentally, this problem is typical of the Project Euler problems in that mathematical insight alone is not going to get you the answer (at least, not the level of mathematical insight of which I’m capable ). Also, in most cases, brute force number crunching won’t get you the answer in a reasonable time; it usually takes both mathematical insight and brute force number crunching to get to the answer.
Here’s my naive first attempt.
let rec CollatzLength n =
if n = 1 then 1 else if n % 2 = 1 then CollatzLength (3 * n + 1) + 1 else CollatzLength (n / 2) + 1
let highestCollatzLengthUnder n =
[1..n]
|> Seq.map (fun x -> (x, CollatzLength x) )
let c = highestCollatzLengthUnder 1000000 |> Seq.max
printfn "%A" c
What’s going on here? All the magic is happening in the first two lines.
Line 1, “let rec CollatzLength n =” tells F# that we’re declaring a function called CollatzLength taking a single parameter named n, and that the function may recurse (“rec”). If we leave out the “rec”, the function body is not allowed to refer to itself by name.
Line 2 is the function body. It tells us that:
- the CollatzLength of 1 is 1 (because by definition, it’s the end of the sequence); and
- the CollatzLength of odd numbers is 1 (for the current number) plus the CollatzLength of 3 times the current number plus 1; and
- the CollatzLength for all other numbers (i.e. even numbers) is one plus the CollatzLength of two times the current number.
So, what happens when I run this code? It’s bad news: I get a StackOverflowException when I try to calculate the Collatz length for 113,383:
Process is terminated due to StackOverflowException.
Why is this happening? I thought that tail-recursion voodoo in F# would protect me in just this situation. Well, actually, no, it turns out.
Tail recursion is an optimisation that the compiler can apply to some, but not all, recursive functions. The optimisation can only be applied when the calling code contains no operations that must be performed after having called itself. Tail recursion avoids allocating a new stack frame for each “nested” recursive call by exploiting (where possible) the fact that the state of the parent is no longer required once it has recursed down another level. In this way, an arbitrary depth of tail recursion can be supported without adding anything to the call stack.
Imagine you’re the compiler, and you’re compiling a method that includes a recursive call. At the point of the recursive call, the normal, non-tail-recursion behaviour is for the original instance of the function’s state to remain on the stack while a new instance is reserved above it and used during the “inner” call. This is the normal behaviour in C# (although even the C# compiler will generate tailcall-friendly code in some cases, apparently, and the various JIT-ers may or may not get in there and play ball), and it is the behaviour in F# when tail recursion cannot be applied. It’s also the behaviour which, in my naive implementation above, is leading to a stack overflow when we nest somewhere between 100,000 and 1,000,000 levels deep.
But when applying the tail-recursion optimisation, the compiler observes that the parent’s state (on the stack) no longer influences the result it returns; there’s no need to keep it around, so the compiler exploits this by re-using the parent’s stack frame when executing the child. This continues on down through the recursive depths; no matter how far you go, you’ll still be executing in the original stack frame.
So, given all that, what’s the problem with my naive implementation? It’s pretty simple.
let rec CollatzLength n =
if n = 1 then 1 else if n % 2 = 1 then CollatzLength (3 * n + 1) + 1 else CollatzLength (n / 2) + 1
In the recursive cases (i.e. where n != 1), CollatzLength recursively calls itself, and then adds one to the result before returning. Tail recursion cannot be applied, because CollatzLength keeps executing once its recursive call completes; it therefore needs its state to remain on the stack, meaning the tail recursion optimisation of re-using the parent’s stack frame for the child cannot be applied.
So let’s rewrite CollatzLength to permit the compiler to use tail recursion:
let CollatzLength n =
let rec CollatzLengthInternal n countSoFar =
if n = 1 then
countSoFar
else
if n % 2 = 1 then
CollatzLengthInternal (3 * n + 1) (countSoFar + 1)
else
CollatzLengthInternal (n / 2) (countSoFar + 1)
CollatzLengthInternal n 1
What’s happened here?
- From the first line “let CollatzLength n =”, we’ve dropped the “rec”, meaning that CollatzLength is no longer recursive; it no longer invokes itself.
- We’ve simplified CollatzLength itself – its body consists of a single line (the last one above): “CollatzLengthInternal n 1”. That is, CollatzLength just evaluates to the result of calling CollatzLengthInternal, passing the number we’re calculating the Collatz length for, and a mysterious “1”.
- We’ve added a new nested function “CollatzLengthInternal”. It accepts two parameters: the number we’re calculating the Collatz length for, and “countSoFar”, which tracks how many steps into the Collatz sequence we are. Another way of looking at it is that countSoFar always equals the depth of recursion into CollatzLengthInternal.
- If n is 1, CollatzLengthInternal just evaluates to countSoFar. From this it’s easy to see that (CollatzLength 1) will be equal to 1.
- If n is odd, CollatzLengthInternal evaluates to CollatzLengthInternal of 3n + 1, but with countSoFar increased by 1.
- And the same for even n, except that we take n / 2 in this case.
The key change to notice is that in both branches where CollatzLengthInternal calls itself, the result of that call is immediately returned; no further calculation happens with the result, and we immediately bubble back out to our parent.
So now, the compiler is free to apply the tail recursion optimisation. Let’s run the overall program (which calculates the CollatzLength for every number from 1 to 1,000,000), and see what happens.
…
Seems to be taking a while.
…
Bear with me.
…
Okay, attach in the debugger, and see what’s going on…
Well, from the call stack, I can see that highestCollatzLengthUnder is calling CollatzLength, which is calling CollatzLengthInternal, so we do seem to be doing some work. If I make highestCollatzLengthUnder the active stack frame, I can see that x is 113,383. Wha? That’s our nemesis from the non-tail-recursion version, back again! With a bit of thought, you can see what’s going on – for some reason, when calculating the Collatz sequence for 113,383, my code’s recursing arbitrarily deep. When not tail-recursive, that triggered a stack overflow. When tail-recursive, the code will happily run forever, nesting until… the heat death of the universe, I guess.
In the context of CollatzLengthInternal, I can see that countSoFar is currently 506,136,061, which is a fairly heroic level of recursion. In that same context, I can see that n is 0. There’s our problem! If n becomes 0, we’ll run forever, since it’s even, and for even numbers the successor in the sequence is n / 2, which for 0 is… 0 again.
So, somehow, in both implementations, I seem to have a bug which is causing us to get to 0 for sequence 113,383, but not for any earlier sequence.
First, let’s think about whether 0 could legitimately be in the Collatz sequence of an integer greater than 1. I don’t think so, because:
- n / 2 for any positive even number is a positive number.
- n * 3 + 1 for any positive odd number is a positive even number.
So what’s happening?
Ironically, for a non-tail recursive implementation, it’d be trivial to find out what predecessor in the sequence led to the first 0 – I’d just run to stack overflow in the debugger, then binary search through the stack looking for the first level where n was 0, then step back one level to see what its predecessor was. As you can see from the screenshot above, in tail-recursive land, you only see one level of CollatzLengthInternal in the stack, so that approach isn’t available. So let’s switch back to the non-tail recursive implementation and find the predecessor.
Well, that isn’t going to work. As you can see from the screenshot’s call stack, Visual Studio’s telling me “The maximum number of stack frames supported by Visual Studio has been exceeded.”
So, let’s fall back on a breakpoint where n = 0.
Yep, blargh. Immediately before n became 0, it was –1. How did that happen? This is starting to look like an overflow problem. Let’s just dump the whole sequence and have a look:
113383
340150
170075
510226
255113
…
1103160598
551580299
1654740898
827370449
-1812855948
-906427974
-453213987
-226606993
-113303496
-56651748
-28325874
-14162937
-7081468
-3540734
-1770367
-885183
-442591
-221295
-110647
-55323
-27661
-13830
-6915
-3457
-1728
-864
-432
-216
-108
-54
-27
-13
-6
-3
-1
0
Well, there’s the problem. We got as high as 1.6 billion (perilously close to the maximum signed 32-bit int around 2.1 billion), but didn’t actually overflow into negative territory until we landed on 827,370,449, which was odd, causing us to multiply by 3 and add 1, wrapping over into negative numbers.
So let’s switch to 64 bit ints and try again.
let CollatzLength(n) : int =
let rec CollatzLengthInternal(n, countSoFar) =
if n < 1L then
failwith "blargh"
else
if n = 1L then
countSoFar
else
let nextNumber = (if n % 2L = 1L then (3L * n + 1L) else (n / 2L))
CollatzLengthInternal(nextNumber, countSoFar + 1)
CollatzLengthInternal(n, 1)
let highestCollatzLengthUnder highest =
[1..highest]
|> Seq.map (fun x -> (int64)x)
|> Seq.map (fun x -> (CollatzLength x, x))
let c = highestCollatzLengthUnder 1000000 |> Seq.max
printfn "%A" c
Just by passing an int64 into CollatzLength, F# propagates that type change nicely. Irritatingly, though, I have to jam an L on the end of every integer literal, or else F# complains that I’m performing operations on a mixture of different integer types. In this case, I’d prefer that it widen the integer literals for me so that the caller can effectively specify the bit-ness based on the type of int it supplies, but it’s not the end of the world.
And there you have it. The answer to the question “which is the longest Collatz sequences starting with an integer betwen 1 and a million” is “the sequence starting with 837,799, which is 525 steps long”.
It would be cooler if…
This is one of those Project Euler problems which can be brute-forced. There’s no real mathematical insight in what I wrote above; it’s not much more than a translation of the problem statement into executable code.
I was hoping that the sequences would be long enough, and converge often enough, that it would be necessary to memoize the result of calling CollatzLengthInternal somehow in order to avoid re-tracing well-worn paths.