The undecidability of the halting problem is not very important

Many a programmer, when confronted with a difficult computational problem, claims that it doesn’t make sense to investigate it further, since it is just a special case of the halting problem, which, as Turing proved back in the day, is simply not solvable in the general case. But I will try to show that this is not a very useful or smart way of looking at things.

There are two important and opposite arguments that show how little utility this thought process has. They are two distinct attacks on the same idea, both demonstrably valid, and I hope to make them both easy to understand in this blog post.

The first issue involves the fact that, even if something is not tractable in the fully general case, that doesn’t mean there aren’t many useful cases where it’s not only perfectly tractable, but also well worth doing. Let’s take static code analysis, for example. Here is a nice piece of C code to find the maximum value in an array of unsigned integers:
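```c
#include <stddef.h>

/* Find the maximum value in an array of unsigned integers. */
unsigned int max_value(const unsigned int *values, size_t count)
{
    unsigned int max = 0;
    for (size_t i = 0; i < count; +i) {
        if (values[i] > max)
            max = values[i];
    }
    return max;
}
```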

This code has a subtle but important bug: it will just never terminate. It is, in fact, an infinite loop. Why? Because the loop counter is never incremented. There is a typo: we missed a “+” sign in the “++i” expression, so the expression meant to increment the counter does nothing. The loop counter will stay at zero, and the code will loop until the sun goes nova.

Turing showed that, in the fully general case, it’s impossible to determine whether a given piece of code will terminate or not. But he did not show, at all, that it is impossible to write a tool that detects this particular case and diagnoses, with certainty, that this code is wrong and will hang the program that runs it.

Of course, most modern compilers will emit a diagnostic for the simple typo above: “expression has no side effect”, or something similar. But that is not the point – there are many subtler cases of the same phenomenon which can’t be easily caught by a compiler’s peephole analysis, but which can be analyzed by sufficiently advanced tools. Fortunately, John Carmack’s recent advocacy of static code analysis will hopefully bring interest, attention and funding to this useful area, Gödel and Turing notwithstanding.
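To make “subtler cases” concrete, here is one hypothetical shape such a bug can take (the helper `advance` and the function `max_value_subtle` are made up for illustration): the increment is hidden behind a call that takes its argument by value, so no single expression looks wrong, and only a tool that reasons across function boundaries can prove the loop never terminates.

```c
#include <stddef.h>

/* Takes i by value, so only a local copy is ever incremented. */
static void advance(size_t i)
{
    ++i;
}

unsigned int max_value_subtle(const unsigned int *values, size_t count)
{
    unsigned int max = 0;
    size_t i = 0;
    while (i < count) {        /* still an infinite loop when count > 0 */
        if (values[i] > max)
            max = values[i];
        advance(i);            /* the caller's counter never changes */
    }
    return max;
}
```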

Now, let’s turn to the second angle from which the thought process described at the beginning of this post should be questioned. Let’s imagine designing a language that doesn’t allow non-terminating programs. By definition, it can’t be Turing-complete, but in exchange we should be safe from programs that hang forever. Leaving aside the difficulty of even defining what “correctness” means for a program, we could think that we would have advanced a lot towards that goal: at the very least, our programs would be guaranteed to complete. Following this direction, and trying to find ways to prove that certain classes of programs terminate, advanced concepts and distinctions such as data and codata have been invented. After all, a program that responds to an input stream should be allowed to run at least in proportion to the length of that input, and still be called correct!
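As a minimal sketch of what that means (my example, not tied to any particular total language): this is the kind of loop such research tries to bless. It may run as long as the input stream does – forever, on an infinite stream – but each iteration consumes one input item and does a bounded amount of work.

```c
#include <stdio.h>

/* A "productive" loop: it consumes one input item per iteration, so
   its running time is proportional to the length of the input. A
   total language would accept this loop while still rejecting the
   buggy one above, whose counter provably never advances. */
int main(void)
{
    int c;
    unsigned long lines = 0;
    while ((c = getchar()) != EOF) {  /* one item per iteration */
        if (c == '\n')
            ++lines;
    }
    printf("%lu lines\n", lines);
    return 0;
}
```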

Anyway, if this gets you excited, let me pop your bubble by introducing you to the beautifully named Ackermann function. I won’t go into the mathematical details; it’s just a perfectly well-defined function (or family of related functions, if you will) that can be computed for any input through some simple recursion. The beauty of it is the following: for small values of the input (say, 0 to 3), the function returns normal, low values (1, 3, 7, 61). But as soon as you try to calculate it for an input of 4, 5, or anything higher, the result explodes, and there just aren’t enough atoms in the known universe to even print it out. So even though its calculation is guaranteed to terminate, it will keep chugging along, just like the infinite loop we saw above, until the sun goes nova.
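As a minimal sketch, here is the classic two-argument Ackermann–Péter formulation (the single-argument values quoted above come from its diagonal, A(n) = A(n, n)):

```c
#include <stdio.h>

/* The two-argument Ackermann–Péter function. It provably terminates
   for all m, n >= 0, but the recursion depth and the result size grow
   explosively: ackermann(4, 2) already has 19,729 decimal digits, far
   beyond any machine integer, so keep m <= 3 in practice. */
unsigned long ackermann(unsigned long m, unsigned long n)
{
    if (m == 0)
        return n + 1;
    if (n == 0)
        return ackermann(m - 1, 1);
    return ackermann(m - 1, ackermann(m, n - 1));
}

int main(void)
{
    printf("A(2, 3) = %lu\n", ackermann(2, 3)); /* 9 */
    printf("A(3, 3) = %lu\n", ackermann(3, 3)); /* 61 */
    /* ackermann(4, 4) would terminate in theory, and never in practice. */
    return 0;
}
```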

Isn’t it beautiful? To a computer theorist, the first example loop and the Ackermann function with 10 as input are completely different beasts. One is a demonstrably infinite loop that is guaranteed, for theoretical reasons, to get nowhere. The other is a demonstrably finite loop, which is just as guaranteed, for practical reasons, to get nowhere.

We are still in the infancy of the field of computation and software development. We have achieved very little compared to what can be done, and many of today’s approaches need a lot of rethinking before we can move forward. The boundaries of functional programming, declarative programming, partial evaluation, speculative computation, etc. are often only hazily defined, and each on its own seems to leave very little room for improvement over current techniques. But the real value lies in the gaps between them, in combining the practical strengths of each, and that is why it is so difficult to see the shape of this “new programming”. Slowly, though, we advance, and in a few years it will be obvious that 2012 was the stone age of programming.

We already have the words; we just need to find the way to put them together so that they read like “Hamlet”.

PS: I believe even the genius Roger Penrose himself fell prey to a similar faux pas in the central argument of his book “Shadows of the Mind”. But that is fodder for another post some time in the future.