The undecidability of the halting problem is not very important

Many a programmer, when confronted with a difficult computational problem, claim that it doesn’t make sense to investigate it further, since it is just a special case of the halting problem, and as proved by Turing back in the day, this is just not solvable in the general case. But I will try to show tha this is not a very useful or smart way of looking at things.

There are two important and opposite approaches that show the little utility of this thought process. They are two distinct attacks to the same area, both demostrably valid, and I hope to make them both easy to understand in this blog post.

The first issue involves the fact that, even if something is not tractable in the fully general case, it doesn’t mean there are not many useful cases where it’s not only perfectly tractable, but also well worth doing it. Let’s take static code analysis for example. Here is a nice piece of C code to find the maximum value in an array of unsigned integers:

unsigned int val = 0;
for (int i = 0; i < N; +i)
{
  if (array[i] > val)
    val = array[i];
}

This code has a subtle but important bug: it will just never terminate. It’s actually an infinite loop. Why? Because the loop counter is not incremented. There is a typo, we missed a “+” sign in the “++i” expression, and thus the expression meant to increment the counter does nothing. Thus, the loop counter will stay at zero, and the code will loop until the sun goes nova.

Turing showed that, in the fully general case, it’s impossible to determine whether a given piece of code will terminate or not. But he did not determine, at all, it is impossible to write a tool that would detect this case, and diagnose, for sure, that this code is wrong and will crash the program that runs it.

Of course, most modern compilers will emit a diagnose of the simple typo above: “expression has no side effect”, or something similar. But this is not the point – there are many subtler cases of the same phenomenon, which can’t be easily detected by a compiler’s peephole code analysis, but which can be analyzed by sufficiently advanced tools. Fortunately, John Carmack recently bringing up static code analysis will hopefully help with interest, attention and funding towards this useful area, Gödel and Turing notwithstanding.

Now, let’s go to the second angle from which the though process described at the beginning of the post should be questioned. Let’s imagine designing a language that doesn’t allow non-terminating programs. By definition, this can’t be Turing-complete. And so we should be safe from programs that crash or hang indefinitely. Leaving aside the difficulty of even defining what “correctness” is for a program, we could think that we would have advanced a lot towards this goal. At least, our programs will be determined to complete. Following this direction, and trying to find ways to prove that certain classes of programs terminate, advanced terms and distinctions such as data and codata have been invented. A program that responds to an input stream should actually be allowed to run at least in proportion to the length of the input, and still be called correct!

Anyway, if this gets you excited, let me pop you bubble by introducing you to the beautifully named Ackerman function. I won’t go into the mathematical details, it’s just a perfectly well defined functions (or family of related functions, if you will), which can be simply calculated for any input, which involves some recursion. The beauty of it is the following: for small values of the input (say, 0 to 4), the function returns normal low values (1, 3, 7, 61…). But as soon as you try to calculate the function having as input 5, 6, or something higher, the result explodes, and there just aren’t enough atoms in the known universe to even print out the result. Actually, even if it is guaranteed to terminate, its calculation will keep chugging along the infinite loop seen above until the sun goes nova.

Isn’t it beautiful? As computer theorists, the first example loop and the Ackerman function with 10 as an input are completely different beasts. One is a demonstrably infinite loop that is guaranteed by theoretical reasons to arrive nowhere. And the other one is a demonstrably finite loop, which is also guaranteed by practical reasons to arrive nowhere.

We are still in the infancy of the field of computation and software development. We have achieved very little compared to what can be done, and many of today’s approaches need a lot of rethinking before we can move forward. The boundaries of functional programming, declarative programming, partial evaluation, speculative computation, etc… are often very hardly defined, and seem to leave very little room for improvements over current techniques. But the real value lies in the gaps between those, in combining the practical approaches of each, and thus it is really difficult to really see the shape of this “new programming”. But slowly, we advance, and in a few years, it will be obvious how 2012 is the stone age of programming.

We already have the words, we just need to find the way to put them together to read like “Hamlet”.

PS: I believe even genius Roger Penrose himself fell prey to a similar faux-pas in the central argument in this book “Shadows of the mind”. But this is fodder for another post some time in the future.

The HOW and the WHAT

No, a low-level programming language is not defined by being closer to the metal. And high-level programming is not defined for being closer to English. Heck, BASIC was probably the closest to English, and you’d be hard pressed to defend that it’s higher-level than Haskell. Which is probably as difficult to understand for a native English speaker as any real programming language can get, with the sole exception of APL.

Anyway, indeed, some of those things are true. But they are only secondary attributes. They are like the fact that vi allows you to do much more in less keystrokes than any other editor: the core reason is that vi was designed to work over 300 baud phone lines, and this involved, among many other things, reducing keystrokes – but this was not the goal, but a side-effect.

Lower-level programming languages are created first, higher-level programming languages are created afterwards, in order to make programmers’ lives less painful. You know, programming is in some aspects very similar to being tortured: you have to keep dozens (if not hundreds) of things in mind while you write every line of code. And if you forget any of them, you write a bug – and you don’t know right away. Back in the day when the web wasn’t gobbling up all other platforms like The Nothing in Michael Ende’s The Neverending Story, you used C++ and the compiler would punish you with a basic warning or error for most cases, which is similar to getting a little pinch. Nowadays, it’s Javascript or Python, and you will be punished for the bug with a crash in front of your user or customer, which makes the minimum torture threshold more similar to getting a nail torn out.

But even if the driving force is improving the life of programmers reducing the cost of software development, the real way this is achieved is not by getting further away from the proverbial metal, but by allowing you to describe processes better. A higher level language allows you to describe things more succinctly, without having to get into the nitty-gritty details of every single excruciating little step. See a sample in assembly language:

                MOV EAX, [first_value]
                ADD EAX, [second_value]
                MOV [result], EAX

This calculates and stores the sum of two values. In three steps. Read the first value into EAX. Add the second value into EAX. And store the value of EAX in the result variable. Kids, this is how we did things back in the day.

And see this in C:

result = first_value + second_value;

Thanks progress exists.

But still, even if this shows real evolution in programming, from a decidedly lower-level  language to a higher-level one, it doesn’t show what the underlying key is. Removing the references to specific CPU registers and instructions is one step, but if we don’t understand the core, we can’t further this process and create even more useful programming languages. So, what is it?

In Turing’s spirit, all programming languages are equivalent in what you can do with them. But the truth is that they are not the same at all. Something like the SKI combinators are fully capable of any computation, and what’s more, it’s just three combinators, and one of them is redundant. See a sample program:

(S (S (K S) (S (K K) I))   (S (S (K S) (S (K K) I)) (K I)))

(This piece above is number 2. Imagine how an XML parser written using the SKI combinators must look like.)

You could say SKI combinators are as far from the metal as you can imagine. Functional combinator application. Rewriting rules. Church numerals. No conventions, only a couple of basic core definitions. What not.

But even with this, I doubt anyone would dare call SKI combinators a high-level programming language. It’s about as low-level as it gets, similarly to assembly language. Ok, let’s cut SKI calculus some slack, at the very least, it’s probably the only calculus with its own Facebook page (http://www.facebook.com/pages/SKI-combinator-calculus/138648626160295 – and all of 6 people worldwide like it).

So, if closeness to the metal does not define the high/low level distinction, what does? Because it’s obvious that C and Java Fortran and Haskell share something useful and valuable that assembly language and the SKI combinatory calculus don’t. (Sorry, just kidding about Java.)

And here is the key: if you compare assembly and SKI, you see that most of the code you write deals with the nasty details of each step of doing something: reading a value into a register, or substituting a combinatory parameter in some way to rearrange information. While, on the other hand, the sample C code above depends on the underlying infrastructure, and the C code just lists WHAT the end goal is.

And this is the key: lower-level programming languages make you worry about HOW each thing is done. While a higher-level  language allows you to get busy with WHAT to do, while letting the tools worry about how that goal is achieved.

Why is people’s code so bad? Even after getting a degree in CompSci?

I’m fixing code written by some quite junior programmers. I’m always amazed at the code people write. Only in these moments I really see what good code is about. When you read or write good code, it just seems the obvious thing to do, and no big deal.

Principle #1: if one small piece of code is repeated, and it would be wrong if you changed it in one place but not the other, then THE CODE IS BAD! You have to REWRITE it. Just so that you can’t change things in one place and not in the other(s)! It’s that simple.

The simplest case, and one I STILL see too often is numeric constants. You are adjusting some things (say, things in a UI layout done programmatically). You change it in one place, run the program, and it doesn’t work. Or you do something non-direct (like rotating the screen and then going back to the original place), and things break up. You check it, and the position is set in two places, to a numeric constant, and you had only changed it in one place!

The solution is obvious: have a #define at the top giving the right name to the value, and use the constant name in both places. In Java, you can just use some “private final static int” or so. Whatever your languages buys you, but there’s certainly a way (or switch languages, pronto!).

I had this idea that if someone indents code wrong, you should stop looking at it, because the rest will be wrong too. If a developer can’t get indentation right, it’s impossible they’ll get conceptual consistency right. I’m expanding on this idea now: if someone repeats small non-independent pieces of code or numeric constants, instead of refactoring to some simple function or definition, then I won’t bother with it (unless totally unavoidable).

May you never break the DRY principle: Dont Repeat Yourself. I have a theory that every good development practice boils down to just a circumstancial version of this principle. There is one practice that doesn’t fit the bill, so that’s why I’m withholding from sharing until I can really prove the elegance. I can boil all down to “WRITE HONEST CODE”, but that just isn’t so catchy.

Live hard, pivot hard

From my days as a games developer for PC, PlayStation 2 and Xbox (classic), about 10 years ago, I remember reading some interesting advice from the famous game designer Peter Molyneux, which may come handy now in the world of startups.

See, when creating video games, one of the most important elements to giving players a compelling experience is to reach the correct balance. All the different parameters in the game have to be just right: difficulty, rewards, comparative power of the different options available to players, etc…  it’s really hard to get this right, and a significant part of the development process must be devoted to getting this right. Game developers and designers who can do this well outshine the competition.

In thist context, the advice from Peter Molyneux was something like this: “When you are balancing a game, don’t do small adjustments. They take you nowhere, and you can’t really even know if you are on the right path. If you need to increase the cost of some element in the game, do not add or remove 10%, make it double or half.”

This stroke me back then as genius advice, and it came from someone who really knew what he was doing. It has stuck around in my memory since then, and in twisted comeback, it has resonated with me lately, as it is quite applicable to our situation in our start-up. Let me explain.

The setup

We released our system to easily create native mobile apps a few months ago. After having invested a long time developing it, we finally pushed it online and announced it, where but in Hacker News and the programming Reddit. Quickly, it turned out the reaction from developers on those places was not positive. You can read the comment threads following the previous links. Developers didn’t like that it added custom CSS tags, that it wasn’t open-source, or that it wasn’t based on HTML5.

After reflecting on this a bit, we were thinking how to move the situation forward. There were issues with the presentation of the technology and how it was packaged, presented and documented. We could try to improve the situation that way. But did we want to put all our efforts in improving this? It would still take a long time, and it wasn’t clear whether this would help overcome the barriers people found with our project.

Pivot hard

So, evaluating it, we decided to attack the core problem of lack of attraction for programmers, and embody the same technology in another product that could really overcome most of the obstacles in the previous announcement. Too complex technology and documentation? Go for a vertical target that does without hardly any documentation at all. Not attractive for programmers? Package together a product that doesn’t target programmers.

And so we came up with Mouinpress: a system that allows you to create native mobile apps for your WordPress site. It completely sidesteps the main problems with the original technology: install a WP plugin in 15 minutes and order your app. Hardly any documentation needed. Target customers won’t complain about custom CSS tags or HTML5, since they just don’t care about how the technology achieves the goal.

We’ve been talking about it with different people, potential partners and customers, and the conversations are already much more fluid and constructive than with the previous offereing.

Pivot hard. That was Molyneux’s concept, and we are going to see how well it helps us in our foray.

Live hard

As an extra, another issue has come now, and it is that there is someone else who came up with a similar idea, and they released it a couple weeks ago. And not only that, it’s a fellow HN’er too, mind you, so of course they announced their solution on HN last week!. He also explained that he’s quit his job to do this. Same as ourselves, we’re putting all our passion, effort and illusion behind this product.

So the times coming ahead are really interesting:

  • We are going to see whether the “pivot hard” philosophy really helped.
  • We are going to be competing with a fellow HN’er halfway across the world (we are in Spain, the Appifier guys are in Canada).
  • We are all going to see what size of an opportunity there is in this market, and whether our competitors’ approach and our approach are attractive to the market.

Live hard. You can’t put it in a different light.

Ready… go!

Exciting times ahead. I will try to document our advances. I hope that the reflection about Peter Molyneux’s Pivot hard philosophy can help you when thinking about your own projects. I hope that it serves our project well. And I hope there is a market for all our competitors and for ourselves.

And, by the way, we are looking for partners and investors, so if you can help in these areas, I will be happy to hear from you! I’m available on Twitter as @jonbho.

Quick reflections on yesterday’s post

I really enjoyed the conversations resulting from publishing “I want to fix programming“, and I’ve come to a few quick conclusions I want to share:

  • It’s great to have so much intelligent feedback, both on Hacker News and on the post itself. Very smart and knowledgeable people, some of them having explored the same area. Lots of insights. Lots of pointers to interesting projects. Proposals for approaches to get somewhere. I’m really happy to have brought this up in the open. There are a few “haters” there too, but I guess it’s the price of being out there – and this is key to getting the invaluable feedback from other people!
  • Raganwald suggested that just the notation could be useful. Invaluable advice. Maybe that should be the first goal. Still, I plan to publish the full series with my progresses so far before really getting down to work, so this is some time away. Also, Raganwald is probably the only person who uses github as a blogging platform.
  • Gruseom suggested the great idea that restricting things to a given problem space could be a great way to get somewhere practical.
  • There is interest out there. Many people are (more than reasonably) skeptical, but many other people would love to see some advance in this area and are happy to contribute ideas, etc… this is great.

I do not have an academic background, and I sure lack some or many of the necessary skills. But I’ve done my homework, or at least a lot of it. I have insights and approaches that can add some value. I may be able to contribute a “general sketch” that can help in building a practical tool. But in any case, attempting this here in the open and without any commercial interest can be the best way to advance!

Now, since we are launching a new startup and product later today, please excuse me from this topic and I’ll get back to posting on this topic in a few days. Thanks for all your comments.

I want to fix programming

(The SB series, part 1 of 7)

Programming is broken. Completely broken. The very way we write code today is just so broken that it makes you want to puke. I’ve been saying for years that I hate programming. I am and have been a fulltime developer for over 20 years, I don’t regret it, and I still love what you can do with programming. Still, I hate programming.

The way code is written is just braindead. At each step of the process, you can crash the program, be it by filling up all available memory, accessing a stray pointer or reference, or entering an infinite loop. No doubt programming feels like walking barefeet in a floor filled with broken glass. One tiny inch out of the right place, and bang, you nearly lose one of your toes.

Every step of the way, in every statement, line of code, function call or sequence-point, if you want to write good code, you need to think about all the different possible states of the whole program. This is invisible and impossible to define. Provably. Always. In all existing languages. That’s the reason 100% code-coverage unit-testing doesn’t guarantee bug-free code, and it never will. This is also the reason bad programmers don’t become good: they just can’t think about all those possible cases in a structured way.

(By the way, this only gets about a thousand times worse with multithreaded code – so it’s not getting better, it’s getting worse.)

The problem is that the fundamental way that code is written is wrong. It’s just wrong. You write a bunch of instructions, step by step, which supposedly result in a given wanted state. But each step is separate from all others, it is only understood by the compiler/interpreter separately, and it’s almost easier to screw things up than to get them right.

I thought for a long time that functional programming could be the solution, and studied it in detail. Lisp. Haskell. The Lambda calculus. The functional approach brings indeed several improvements over regular imperative or object-oriented programming. But still it’s not the solution. It is still composed of many discrete simple steps that excruciatingly calculate the output.

What is the key thing that is wrong with code? The fact that you don’t express what you want. You express what steps have to be taken. Imagine telling your pal to grab you a beer from the fridge, step by step, with each step being robotically-rigid and with very little regard for context.  It would be excruciating. And prone to catastrophic failure. Exactly like code is.

Libraries help with this phenomenon, but they are only shortcuts for higher-legoal-specific needs. They don’t really fix it.

There was recently a very interesting post by John Carmack about static code checking, where he cited a great tweet by Dave Revell on code checking:

“The more I push code through static analysis, the more I’m amazed that computers boot at all.”

A vision

So, how should code look? Let’s take a simple example: sorting a list. Say you have an input list, let’s call it, ahem, input. Assume it has several elements. And now we want to compute a new list, called output, containing the same elements that the input has, but sorted in ascending order. How should we do it?

Traditional methods include bubble sort, quick-sort, shell-sort, insertion sort, etc… these are all algorithms that allow us to sort a list in a way that provably works. For example, bubble sort:

def bubble_sort( input, output ):
  output = input    # start with the unsorted list
  swapped = True
  while swapped:
    swapped = False
    for i = 1 to length(output) - 1:
      if output[i+1] > output[i]:
        swap( output[i+1], output[i] )
        swapped = True

Straightforward enough. But if you write the code to do the sort, you may still make a mistake! What if you forget to set the “swapped” parameter to true when you exchange two elements? Or more typical than that, what if you make an off-by-one mistake with the loop counts?

This is why I say that programming is broken: sorting is a simple concept to grasp and describe, and still, the code to do this is convoluted, and full of traps where you can crash the program or generate wrong output. It’s nuts!

Someone could come up with a functional version of one of the algorithms above, but it would still be very similar: no side effects, but many steps to do all the tasks needed. Recursion may be more elegant than iteration (ahem), but it isn’t fundamentally better.

So, how should the code to sort a list really look like? After many years, I’ve come to the conclusion that it must be something like this (pardon the pseudo-code for a style of programming that doesn’t really exist yet):

def sort(input[]) = output[]:
  one_to_one_equal(input, output)
  foreach i, j in 1..len(output):
    where i < j:
      output[i] <= output[j]

Let me try to explain it: the first line of the sort definition says that there exists a one-to-one set of “relations” between elements in the input and output sequences. We will see later how the one_to_one_equal definition achieves this. But this ensures that the input and output list contain the same elements. It defines the space of possible answers.

Second, the key thing, the next line specifies that for each pair of elements in the output sequence, when the first index is lower than the second index in the pair, then the value of the element is less or equal. Which basically states that the output list is sorted. It defines which of the possible answers is the solution.

And it’s that simple. The sort function only specifies WHAT sorting is. Not HOW it is done. It shows the properties of the output and the input, related, and it leaves as an exercise for the compiler to figure out how to get there.

Undoubtedly, there are two key questions here:

  • First, how can the compiler process this? Is it actually possible? In future articles, I will try to show that yes, it is indeed possible, and that the compiler could even be able to figure out the algorithms to get this result.
  • And the second key question is, how applicable is this to more complex problems? I will also try to show that it is indeed applicable to any fully-general programming or computation task, and that such a way of programming can be much simpler, more efficient, and much closer to being failproof!

I had been meaning to save this knowledge to set up a company in the future, but several circumstances have made me rethink this plan. I’d rather share this knowledge now, and see where things go. Watch out for the next articles in the series.

Afternote

We’ll get deeper into this later in the series, but here is a small bite. This one_to_one_equal function would be a “standard library function” in this idealized language, but here it is more or less how it would look, so that you can see the underlying logic:

   def one_to_one_equal(output[], input[]) = c:
     c = relations(input[i], output[j])
     foreach x = input[i]:  len(c(x,)) = 1
     foreach x = output[i]: len(c(,x)) = 1
     foreach x(a,b)=c[i]: a == b

Let me try to explain it: the first line of the definition says that there exists a one-to-one set of “relations” between elements in the input and output sequences.

The second and third lines state that there is a single relation in “c” that pertains to each element in the input and output lists, thus ensuring the relation is one-to-one. Finally, the last line states that the two elements in each relation are of equal value, ensuring the two lists are the same, just reordered.


My current day job involves Mouinpress (a system to turn WordPress sites into native iPhone, Android, Blackberry and J2ME mobile apps), and ViEmu (a vi/vim emulator for Visual Studio, Word & Outlook, and SQL Server). If you want to help me with my project to fix programming, I’ll be grateful if you have a look, evaluate, and maybe buy or promote either of them’products. Or, since I’m looking for investors, maybe you can help with that!

I’m available at http://twitter.com/jonbho


Starting a new blog

I have reflections to share. Stuff I’ve been mulling over for the past few years. Let’s see if I can share these matters in a way you can get some value from what I write. I’ll be happy to host, read and respond to your comments. Mistakenly, I’ve done too much one-way communication in my life. It’s now the time to try and get some two-way communication rolling.