Is your compiler throwing onions in the varnish?
In a by now well-known anecdote, Primo Levi, the twentieth-century Italian chemist turned writer, was reviewing how varnish was produced in an industrial facility. The recipe, involving many chemical components, included instructions to throw a raw, peeled onion into the varnish. This was puzzling to him: he couldn’t figure out why the onion would be needed, or even how such a small amount of material could affect the resulting varnish. Nobody he asked at the factory knew why it was there. Seemingly, it had been done like that forever.
Intrigued, he researched where this came from, and after some heavy digging he discovered the reason: in earlier times, when varnish was produced in a less industrialized fashion, varnish-makers would throw in the onion and watch it, and when it became brown from frying, they’d know the varnish was at the right temperature for the next step. When the production process gained thermometers and automation, someone had just kept throwing in the onion out of habit, and thus it had become an ingredient of the varnish.
As an example of a modern varnish recipe, let’s have a look at the canonical MAX function: let’s see how you would find the maximum out of a list of numbers in any modern imperative language:
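A typical version, sketched here in Python (any imperative language looks much the same; the function name is illustrative):

```python
def find_max(values):
    """Return the largest element of a non-empty list."""
    max_value = values[0]       # take the first element as the candidate
    for v in values[1:]:        # walk the rest of the array
        if v > max_value:       # replace the candidate when a larger value appears
            max_value = v
    return max_value
```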
This loop finds the maximum value in the input array and returns it.
And you may ask, what is the problem with the above code? Isn’t it just pretty conventional code to walk an array and extract some information from it? Isn’t that how programming is supposed to be done?
The problem is the fact that we are writing the detailed steps on how to do something, at a very low level, instead of describing what we are trying to attain. And this has three key negative properties:
- The actual goal of the code is lost. The code does not contain or embody it; it only lists the steps we concluded we needed. Someone reading it needs to deduce the thinking, and the goal, the “spec”, is not actually there. The runtime (whatever it may be: interpreter, compiler or just-in-time trace compiler) understands too little of what it’s doing. It is like following step-by-step directions without a map, or like a blind man taking each step without any knowledge of the environment. Given this, it has limited possibilities to find shortcuts or improvements.
- Code becomes very repetitive and verbose, and takes a long time to write. Current programming practice involves reinventing the wheel too often. After programming for a few years, it often seems as if you are rewriting the same code over and over again. Libraries and frameworks help somewhat, but they become a kind of “straitjacket”: you can benefit from them, but only at a huge extra cost in conventions and other conditions you have to adhere to.
- Writing code this way is very prone to errors. It’s very easy to think you are writing the steps for a given result while you are actually making a mistake and getting something else. Too often, this only becomes apparent when running the code, and even worse, only in edge cases when unforeseen circumstances arise.
When the compiler is generating the code to compare two values, for example, it is doing exactly the same as the worker who threw an onion in the varnish. The compiler doesn’t know what it is for; it is just blindly following instructions, which may or may not be correct for the desired goal. There are none in the example above, but you often find “onions” in real code out there: pieces of code that serve no purpose, but that keep being run because they’ve been there since a time when they were useful. And nobody, not the developers, and especially not the compiler, has bothered to review their value in light of the current state.
I believe there is a better way to express computation, a better way to describe the function to compute the maximum, which can sidestep the drawbacks listed above. And this requires getting to the bottom of the issue, understanding the essence of what the “maximum” function above is supposed to do. Let me give it a try.
This is what I think the core of the MAX function is: the maximum of a set of numbers is the number among them that is larger than or equal to all the others in the set. This is the essence of MAX. We are not saying how it can be calculated; we are defining what we understand as the maximum value in a list or set: just the value that is larger than or equal to all others in the set.
When you see the imperative code above, you see a sequence of steps that will allow you to find the maximum value as defined above. But it’s a specific method, a recipe, and it doesn’t describe either what the final result is, or why we took those specific steps. There are actually other ways to find the maximum, but the definition is unique.
Of course, we, as the developers who wrote the code, know why the steps are there. But by the time the compiler/interpreter/runtime-system reads the steps, the reasons are lost, and only dumb instructions remain, without the reasons why they should be followed, and of course without any possibility of adjusting further.
If we could find a way to embody this essence, we could have something much more valuable than the imperative code above.
Don’t imperative and OO programming cover this with their libraries and frameworks?
Imperative programming is the canonical “all information is lost” approach. State is stored in variables or objects (object-oriented programming is a kind of imperative programming), and each step is free to alter any state it wants. Actually, not only can any statement in the example above botch the calculation, but it can crash the program too. I believe this is the reason programming sometimes feels to me like walking barefoot amid pieces of broken glass.
Even more importantly, this approach is hitting a brick wall: due to the loss of information, it’s nearly impossible to reorganize this kind of code automatically at all. One big loss here, very relevant for the coming years, is that it’s nearly impossible to automatically parallelize imperative code. Those 64-core processors that are coming out soon? You’re out of luck if you want the code your team wrote during the last 5 years to benefit from them.
Many a developer will just say: “duh, the standard library in $MY_FAVORITE_LANGUAGE already has a MAX function, so I don’t even have to think about it – and I’m sure it will parallelize things the day my desktop has 64 cores”. And the answer is: yes, your current language may help with this, but it doesn’t help with the many variations of the MAX function you are going to have to write for your code. Of course, if your favorite environment has a standard call to do something, you will do well in using it. But pretty soon, you will deviate from the exact needs the environment covers: instead of the MAX function, you may need the largest value below a certain threshold, or the first value after some given one, or whatever variation. And right then and there, you will have to start writing code like the above, which won’t benefit from the 64 cores unless you put in serious extra work.
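For instance, the moment you need “the largest value below a certain threshold”, the standard MAX call is no help and you are back to hand-rolling the loop (a sketch in Python; the function name is illustrative):

```python
def max_below(values, threshold):
    """Largest value strictly below `threshold`, or None if there is none."""
    best = None
    for v in values:
        # keep the candidate only among values under the threshold
        if v < threshold and (best is None or v > best):
            best = v
    return best
```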
We need to find something more powerful than imperative calls and standard class libraries.
Does the current trend towards functional programming cover this?
Functional programming is touted as one of the possible solutions to the parallelism problem. Functional programming definitely has some benefits, but I think it has a big problem, and that is the fact that it is based on lists.
Lists in functional programming are a weird creature. In all common functional programming languages, they are defined recursively, with a head that is the first element of the list, and a tail that is another list, the “rest” of the original list. This seems completely normal to functional programmers. If you are not familiar with it, it definitely looks asymmetric and weird. I’d say it actually is weird; this is not how we humans think about sets and sequences. The advantage is that such lists provide a simple and formal way to structure and think about computation, which is why they’re so widespread.
Here is the canonical definition of MAX in most functional programming languages:
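In Haskell, for example, it is usually written along the lines of `maximum' [x] = x; maximum' (x:xs) = max x (maximum' xs)`. A Python transliteration of that head/tail recursion:

```python
def maximum_rec(xs):
    """Recursive head/tail definition, mirroring the usual functional shape."""
    head, *tail = xs            # split the non-empty list into head and tail
    if not tail:                # base case: a one-element list
        return head
    m = maximum_rec(tail)       # maximum of the rest of the list
    return head if head > m else m
```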
There are variations of this, but they are all basically the same. As in imperative programming, we are just walking down the list, recursively, and doing the comparisons that we know will let us find the maximum value. To begin with, its recursive nature makes it more difficult to understand. I’d say our brains are better wired for iteration than recursion.
But apart from that, it is still missing the same information as the imperative version. This doesn’t show anywhere the core property that we understand as being the max value, that is, being larger than all other values in the set or list!
How about Prolog and declarative programming?
Prolog takes an initially promising approach, with declarative code, but it very quickly gets off-track. The language lacks the expressiveness for really strong declarative power, things like “cut” turn off the guaranteed validity of programs, and the canonical definition of MAX in Prolog is as poor as those in imperative and functional programs:
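A common textbook rendering (one of several equivalent variants; the predicate name is illustrative):

```prolog
% max_list(+List, -Max): Max is the largest element of List.
max_list([X], X).
max_list([X|Xs], Max) :-
    max_list(Xs, M),
    ( X >= M -> Max = X ; Max = M ).
```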
This is tied to the fact that sets in Prolog are represented as lists, and lists are defined recursively as a head that is an element, and a tail that is a list. This all but forces the definition of MAX above.
So how would it look?
Here is a pseudocode draft of how I believe the code should look in order to contain the essence of the “maximum” concept:
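Something along these lines (the keywords are illustrative, not a fixed syntax):

```
max(set) is defined as:
    max is an element of set
    foreach x in set:
        if x != max then max >= x
```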
This is a radically different way of defining the maximum function. It is not saying how to calculate the maximum value: it is saying what the maximum value of a set is. It is describing what we understand when we think about the maximum. And it is doing so in formal terms, as just a set of properties and constraints that describe the solution:
- The maximum of a set is one of the elements of that set
- This element, which we call “max”, fulfills some properties
- The properties relate it to the other elements of the set: for every element in the set other than the one we call “max”, its value is less than or equal to the value of max.
As you see, we are doing exactly the opposite of all the other examples above. We’re defining the WHAT, and leaving out the HOW.
As far as I know, there isn’t a programming language out there that works like this. If you know of one, I’m interested, be sure to let me know!
The bad news
The bad news is that, even if the sample code above has all these beautiful properties about describing the solution, there is something that it is missing: it is not describing how to reach that solution. And thus, the next challenge is born: devising a runtime (compiler, interpreter, interactive dashboard, whatever) that can take code like the above, and turn it into something that can actually be run.
This is a tough and large problem, but one I believe is very worth attacking.
And why is this important?
Say you write the MAX function in a way as described above. Describing its essence. And the runtime environment has the brains to run that code, that is, to compute its results. What would that mean?
- The code is guaranteed correct with regards to the spec! There is no possibility of an off-by-one error leaving you lost in those rare cases where the last value in the list is the maximum and it is missed! If you defined the properties correctly, the runtime will reach the right result.
- The runtime or compiler understands much more of what it’s doing. It can cache partial results, it can decide what is safe to cache and what not. It can restart computations. It has much more information to manage how it runs the code.
- Parallelism! While a dumber runtime will decide to iterate over the list to find the highest value, a smarter runtime may decide to break the list in chunks, calculate the maximum of each chunk in a separate core, and then calculate the maximum of the partial results. With only the definition above, and knowing the transitive property of the >= operator, the runtime could reach this conclusion in a formal way. The possibilities that this opens are endless.
- Even if it won’t be easy to write a really smart runtime, we could write a dumber runtime that takes “hints” in order to get good efficiency, with the hints written separately from the code itself. This type of runtime is an easier target, and at the very least, this approach allows much better separation between defining WHAT we are after and HOW we are going to get it. Similarly to SQL databases, it would separate the problem of defining the data model (tables and queries) from that of optimizing execution (deciding which indexes to keep, how to keep them, etc.), ensuring that the optimization will never introduce any bugs, and possibly allowing much better separation of work between people with different skill sets.
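The chunked strategy such a runtime might derive can be sketched by hand in Python (threads here, and an arbitrary chunk count; a real runtime would choose these itself):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_max(values, chunks=4):
    """Maximum via chunking: sound because >= is transitive."""
    size = max(1, len(values) // chunks)
    # split the input into roughly equal chunks
    parts = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=chunks) as pool:
        partial = list(pool.map(max, parts))  # maximum of each chunk, in parallel
    return max(partial)                       # maximum of the partial results
```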
Help is welcome!
I’m trying to address this somehow. I’ve thought a lot about it. I’ve written a bit about it. I’ve even written a bit of code. I have some insights on how to tackle the issues that come up defining such a beast. And there are many reasons I would be grateful for your help:
- First things first: I think I’m following some valuable insight, while going after a practical, realistic, attainable goal. But maybe I’m delusional? So be sure to let me know if I’m missing some key point that invalidates the whole enchilada!
- I have an eminently practical background, not a theoretical or academic one. I think I’ve done my homework, but I may be missing something out there. I know there are researchers in the different related areas: Coq for theorem proving, constraint-system solving, compiler research, static code analysis, etc. If you have a background in these areas and can point me in the required directions, I will be grateful. Especially, if someone is already working on a system like this, or pieces of such a system, I’d definitely like to know!
- If some of these things are already done, and the whole project turns out to be a matter of putting existing pieces together, let me know! It will take less effort that way.
- At least to me, this is a fully new approach to computing and programming. It requires expertise in compilers, languages, type systems, optimizers, constraint-system solving, etc. How could I not need all possible help?
- If you are practically minded, as I am, you can help me make sure we create something that is useful for day-to-day work and can make developers’ lives less miserable, instead of a research project (much respect to those, but that’s not my goal here).
- I have little time and I have to dedicate a lot of my time to mundane things like earning a living and paying off debt. So of course, my resources are limited, and all help is appreciated.
- In general: I’m sure I’ve missed things. Let me know which ones!
This is a long-term project. I doubt much progress can be made in the short term. But hopefully, given enough years, eyeballs, neurons and hands, we will be able to program in much nicer ways in the future.
And we will do away with a lot of onions!
This is a very interesting approach to programming. It reminds me of something that might be called “definition-oriented programming”, where most things depend on that “what”, versus functional programming which is more “how”. I suppose in a sense this could be done by defining things and maintaining those definitions, but how is that not already done by OO programming, with invariants?
I appreciate this new outlook on programming, but how do we begin?
I know in some languages (C++) there are libraries that have been written to multi-thread. I have no experience in multi-threaded programming, though, so I don’t know much about it. Could this approach be applied in conjunction with C++’s multi-threading libraries?
But I guess that would be a moot point, because you’re writing about a compiler that would do all that sort of stuff for us, instead of us having to delegate the threads ourselves.
Thanks for the post!
jyhwei, I think a new modeling of information is necessary, and a new model on how a computation is performed. I hope to work on this over the next few months and years, and post here as I advance!
I’m wondering why your declarative max function checks if x != max. Wouldn’t
“foreach x in set max >= x”
work just as well, because of the greater-than-or-equal sign?
Also, it appears that this language could handle error conditions better. For example, given an empty set, your language could easily output “max is undefined for the empty input set”, whereas imperative languages give errors based on the implementation, like “index 0 is out of range”.
Trishume, indeed, it’s not necessary, but I think it corresponds better with the definition that the element in question need not be compared with itself. The code should actually make that distinction.
Apart from that, I think you are right, and that a description of the goal allows for better error conditions.
So my next steps: start to turn this into something practical. It’s going to be a long journey.
How about expanding on your example one step further and removing the loop? Something like this, perhaps?
where element of set >= set
changed to distinguish a compare of the collection from the items of the collection:
where some element of set >= all elements of set
Essentially, we add the “of”, “some”, and “all” terms that act on the set.
I agree that the bit excluding comparison of the element with itself is not necessary in the example; I’d even argue that it is wrong, a remnant of the imperative mindset. Perhaps you want it there as an example of the type of operators and scaffolding needed for more general solutions, which is fine; but to me, a construction as fundamentally sequential, and thus imperative, as if..then feels out of place. I suppose in this case it may look clearer with a boolean operator, something like “x IS max OR x <= max”. This computation is basically a functional reduce/fold, so it should be easy to spot and compile.
The first question that comes to mind is, would such a compiler consist essentially of a collection of problem patterns and known algorithms or theorems that prove or solve them (or rely on an automatic theorem prover)? Would this be a good first step to at least start validating some questions and ideas about syntax and semantics of this language?
The second question is about the elements of the language: how would you address the category of solutions that are naturally described by the steps needed to find them? For example, the sum of numbers in a set, or some statistical value like the std dev for more fun. Would those need to be primitives?
The next question: you mention hinting to help the compiler create more efficient code that solves the problem, but there's a whole category of problems where we prefer approximate solutions that bound some criteria that is somewhat external to the problem description, normally related to resource consumption during execution: 'in reasonable time', 'using constant memory', 'minimize the max size of all queues', etc.
Ahh such a fascinating topic.
Randy, the “foreach” is not meant to be a loop, it just means that the body “holds” for all elements in the set. It’s “simultaneous”, as in a “simultaneous set of equations”.
You are right that there could be built-in operators or syntax to make expression more succinct. But right now, I’m trying to find the building blocks to define the underlying model, and thus I’m not after succinctness per se, but after the most expressive/powerful primitives and model.
Jare, definitely, the compiler will need to be able to apply methods. I think it really needs to “understand” algorithms, as in being able to deduce why applying a certain sequence of steps reaches a result holding certain properties. It actually has to do the inverse if it’s going to be useful at all, which is much more difficult!
There might be computations that are most naturally described by the steps, I don’t think addition is one of them (it may be with integers defined in the Peano “nat = zero | suc nat” fashion, but I think other definitions should be used). When a sequence of steps is the best (or only!) way, then I’m sure declarative properties can be “shoehorned” into describing imperative steps. Hopefully, that will be an uncommon case.
And the third area you mention is also incredibly interesting, and makes things still more difficult. Approximate computation, or knowingly getting incorrect but sufficient results. Current programming very often makes use of this (see: 99% of all C and C++ code out there, where the possibility of integer overflow in most variables is simply ignored). And in the model I’m proposing, it will definitely need extra work, instead of less work. But if such “relaxing” of conditions can be added, in a controlled way, it can be incredibly useful to use it as a “knob”. It would be great to also apply this when parallelizing code, as a relaxation of correctness requirements can make a big difference here.
But right now, I’d be happy with some simple, deterministic, slow system to start manipulating this kind of code.
My current idea is that a first way to start advancing is to define some data model, a language to describe the constraints, and a system to find solutions by brute force. As you say above, it’s necessary to start researching into validating ideas and questions about syntax and semantics, and we better start with something simple!
Thanks for the detailed & thought-provoking comment!
Another thought I had that kind of expands on raganwald’s idea:
The first step towards a compiler that can understand algorithms is checking existing ones, using the declarative definition together with an existing implementation of the algorithm. The parameters and results represent the same thing in the implementation and the definition, so a checker could be written that uses formal-logic techniques to match the algorithm up with the definition. And if they could not be matched, then there is likely a bug.
One thing to think about that suggests that writing a declarative-definition compiler would be very difficult: it does exactly what programmers do in programming contests. A problem is given that defines very specifically the inputs and outputs, but not the process. The world’s best programmers often have trouble solving these problems and coming up with innovative solutions, so it is likely that a computer would have a very tough time replicating that ability.
trishume, you are right, and I do think that matching an algorithm with a declarative definition is on the early path to writing the whole thing. It requires fully specifying the syntax and semantics for the declarative definition, the data model, and the way to describe the algorithm with regards to these. And these are the same that are needed for the compiler. Actually, I think the compiler needs to use heuristics to search the algorithm space, but the search space is defined in terms of these syntax and semantics.
I will try to post a rough idea of the roadmap I want to follow with all of this. It’s a big challenge and it will be good to get feedback on this. I’m finding the general feedback to be very helpful in focusing. And it’s motivating too.
Your idea is very interesting, but I’m not sure whether this approach can be used as a general programming paradigm. My doubt comes from the following.
1. Do you think that a specification can contain fewer errors than the implementation? Let’s look at your MAX() implementation. If I understand it well, the result is undefined
– if the list is an empty list
– if the list contains only one item, or all items have the same value (e.g. [3,3,3]) – because max is only defined if x != max
So even for such a small function we have 2 bugs in the code – about the same ones we would have in imperative code.
Now, this kind of error (undefined variable/result) can be caught at compile time – but this already works for imperative programming languages with a good compiler. With imperative languages you can mistype or miss things only in the implementation – but with your approach, these bugs will go right into the specification. Do you think that they will be easier to find? Because this [3,3,3] case shows me that these things can indeed be overlooked easily (_some_ of them can be caught by the compiler, but obviously not all).
(Btw. just a thought for this kind of bug: I think you should not allow “if” without “else” if you want to have a defined result in every case).
2. Let’s say that your list is ordered – in this case, max = last(list) if len(list) > 0. Where would you write this code? Because if you write this into the MAX() definition, you will not have any advantage – the goal of the specification (finding the maximum) will not be clear to the reader (they will think that you just take the last item of a list). If you separate this from the MAX() definition, in what language will you specify these algorithms? (you cannot avoid writing algorithms in a non-trivial program)
3. Do you think that your approach is general enough that you can write your compiler in it? I mean, how will you generate an algorithm code in a language with which you do not write algorithms?
These are just my initial thoughts on your idea, and I’m sure you were already thinking about these things. I’m really interested in what you think about this.
Gyim, thanks for sharing your reflections, point by point:
1. That ‘maximum’ is undefined for an empty list is just part of the definition. It’s not a bug, it’s like the square root of -1 when you are only using real numbers, it doesn’t exist. Also, the code above is rough pseudocode, proper syntax and semantics are missing. An “if” applied to a property that “must hold” means that, if the condition is not met, then the property needn’t hold. You are actually defining constraints, and these can be conditional. What is a very interesting area in this point you bring up is the point about ordering and about repeated values. This opens up a huge area. Actually, my MAX definition above is ambiguous, if there are repeated values, there are multiple solutions to this. In the case of sets (no ordering), it can be problematic, because there is simply no way to define MAX with a single solution. In the case of an sequence (ordered), you can define it as the earliers element in the sequence that holds the property, which makes sure that there are no ambiguities. Some time ago, I decided I would focus on sets first, as they’re actually simpler than sequences with regards to declarative properties.
Also, spec correctness is a big issue unto itself. I think this is probably unsolvable. Because if being correct means being what you want, and the spec is the formalization of what you want, then it’s correct by definition. But anyway, I think the approach doesn’t mean you will always get the program you want, just that it will help make it easier for you 🙂
2. This is actually the type of place where I think a declarative definition of MAX can shine! If the list is known to be ordered because of where it comes from in the program, then applying MAX() to it can be compiled down to just taking the last element. But this is the job of the compiler! It just needs to know, handle and apply this behavior! Then, if for whatever reason the input sequence is no longer sure to be sorted, the compiler will generate the right code. This is the kind of thing that will help get good performance and remove bugs at the same time.
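The specialization can be pictured as a rewrite the compiler would apply once sortedness is known (a hand-written Python sketch of hypothetical compiler output; the flag is illustrative):

```python
def max_of(values, known_sorted=False):
    """What a compiler could emit for MAX() given a sortedness fact."""
    if known_sorted:
        return values[-1]   # sorted ascending: the maximum is the last element
    return max(values)      # otherwise, fall back to the general scan
```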
3. In order to write the compiler for this in itself, the syntax and semantics need to cover a lot of ground. Initially, I’m happy if I can find a subset, hopefully a useful one, over which to write a working “proof of concept”, mainly dealing with sets, integers, etc. Once the basics are established, it’ll be easier to cover more ground (ordered sequences, etc.), and hopefully in the end, yes, you can probably describe the implementation of this type of system in its own notation, but that’s not a realistic goal as of now.
Well, if you want to take small subset of problems, e.g. sets of integers, then you can define a reasonable sized model where you can indeed solve problems by defining output properties. For example, in SQL you can define MAX() with a very similar approach (SELECT DISTINCT n FROM table WHERE n >= ALL (SELECT n FROM table)). But it works exactly because its model is limited (relational algebra) and mathematicians worked for many years to find suitable algorithms in this problem area.
But I’m quite sure that it’s theoretically impossible to write a _general_ problem solver. You may be able to _define_ any problem, and your compiler may generate some code for it, but it just won’t be useful. For example, I guess it’s possible to define the properties of the perfect compiler (output = the shortest running code for a given function). A theoretically good solution for this is to generate all possible compiler codes, and see which runs the fastest… In the same way, for MAX(), it’s theoretically correct to generate all numbers the set can possibly hold (0-MAXINT), and check whether it’s the maximum or not. As a developer, how can I be sure that the compiler will not generate this later solution, based on only the declaration of the output properties?
By the way, this problem reminds me of my experience with Prolog: it was easily possible to write a theoretically good definition which just never terminated in practice. You had to know how it works _imperatively_ (using backtrack algorithm, etc.) to write useful code in it. We were taught “think declaratively, verify imperatively”.
I had a similar experience with Haskell as well: I wrote a perfectly clean, theoretically good program, but since I was very new to Haskell, I did not know what _low-level_ code it really creates. The resulting program leaked memory (due to some lazy-evaluation-related behavior) and was completely useless in practice. So I had to dig deeply into the low-level stuff again (how the compiler works, how to trigger its optimizations).
This is my main problem with “smart” compilers: not that they create sub-optimal code, but that they sometimes create completely useless code. And since the basic idea of these languages is that the compiler can optimize better than the user, they usually don’t provide good tools to find/fix what’s wrong.
I think the most loved languages today are those where you can define things declaratively, but you can easily dig deeper if you need. For example, in Rails you can write “validates_presence_of :name”, and not think about it in 99% of the time. But if something goes wrong, you _know_ what it does: it calls a function, which adds some hooks somewhere, etc.
If you want to create a useful language, I think you should also find this middle ground: think declaratively, verify imperatively. For example, if the most naive (non-optimized) implementation could be derived from every declaration, it would be a good starting point for a developer. In the MAX() case, it would be helpful if the writer knew that the worst case is a foreach on the list (e.g. “element-in” always compiles to a foreach if no better solution is found).
Very good points. Indeed, I want to create something useful, so those are things to keep in mind.
Making the “compilation” inspectable and adjustable, using hints or so, could help. And it’s a great idea to have naive solutions as the baseline. Brute force is a possible way to solve many problems, and other solutions are improvements on this. I believe being able to calculate upper and lower bounds for a computation can also be a good way to avoid generating unusable code. Or at least, letting the developer know where it’s happening.
I am not sure, but I think that you are right that a fully general problem solver is provably impossible. But I’m not after that. I hope there is a way to write the kind of code we write today more easily and reliably, and that this will also open the way for more advanced things.
I think at a certain point, the code has to match up with the hardware. A “sufficiently smart compiler” may one day exist, but I think we’re a long way from it. Yes, some C code might resemble a blind person following a map, but that’s because a CPU is closer to a blind person than a thinking person.
I realize that abstracting this away would allow for a language that is hardware agnostic, and compilers would just have to be smart enough to create the right “map.” Might be super useful when there’s a lot more hardware diversity (kind of starting to get there, with GPUs…)
Also, you might want to consider using good ol’ mathematical notation. Math folk have gotten very good at describing algorithms. Prolog is certainly an attempt to do this, but you’re right, in that it doesn’t quite do it.
It’d be great to have a language where I could describe a max function as:
x in a such that for all i, x >= a[i]
Or a shortest-path problem as:
P such that there does not exist an L in (all paths – P) such that W(L) < W(P)
(that one would've looked a lot cleaner with symbols instead of words, but I think you get the idea).
Your post reminded me of the Knuth quote: "Beware of bugs in the above code; I have only proved it correct, not tried it." It would be awesome to have the language that avoids those off-by-one errors, but I just can't imagine that language being anything other than the precise, symbolic language already widely used by mathematicians everywhere.
LaTeX based language? Might be cool. Too bad that "sufficiently smart compiler" is trouble…
Anyway, ignore my rambling. Here's my question: where's the line? I know it'd be great to have a compiler that can look at high-level statements and pick a hardware-specific algorithm to make it work. It'd be great to be able to write one line of code to find the shortest path in a graph, or calculate the neutron flux of a reactor, or to prove unproven mathematical theorems.
It seems like if I extend your logic a few steps further, I should be able to write a program that says "solve my problem" and a "sufficiently smart compiler" could figure it out… obviously that's no good.
*** So where's the line between something like an array_max function and something like solving a differential equation? Or creating an awesome game? ***
Ryan, indeed, math notation can be very useful here. I do want to use an ASCII representation for practical reasons. But you are right that mathematics has many tools that can be useful. On the other hand, mathematical proofs too often resort to using English, and here, everything has to be not only formal, but actually manageable by a real program 🙂
With regards to your question: indeed, there isn’t a line. There are many lines. I’m only trying to advance a bit. If MAX is used in a program, it’s used because of something. For example, it could be a subsystem trying to allocate memory to the size of the largest data set it’s going to handle, and thus there is a higher level than MAX() itself. And if it’s allocating space for something, there is a goal on top of that, etc… so there is no real line, just as there is no real line separating the WHAT and the HOW. In any case, if I am able to define a language and system that can take the declarative definition of MAX above as input, and produce the imperative code to find it as output, I will be more than happy, as I think it’s a tall order, and I will have had to solve several very important and very challenging problems on the way, resulting in some useful tools and knowledge. And building upon that towards real AI (“write me an awesome game”) will be the job for others to come. Plenty of work to do in Information Science and Technology! 🙂
One more point, in case anyone is still reading. The reason I was including the guard “if x != max” is the following: the “max” function is only properly defined if one element is *strictly larger* than all the others. If we are finding the maximum of a set of numbers, this is not very important, as taking any of the repeated ones is equivalent. But in the more frequent case where we have a sequence of elements of some other kind, and we are trying to find the one for which a given attribute is the maximum, the difference is very relevant. The “strict maximum” function is not defined if two or more elements share the maximum value of the tested attribute, but then it is good that this comes up and makes you define how to choose in this case.
Suppose you are trying to find the pupil with the highest grades in a list. If two or more pupils share the highest grade, a “strict maximum” function would force us to define a secondary attribute in case a single solution is required, and make us think about this case. In traditional solutions, we keep either the first or the last value encountered, subtly depending on how we write the code, without making this case explicit.
To sum it up, a “max” function defined with “>=” is defined in more cases, but it is often ill-defined and thus it can be a cop-out or a source of not-properly-considered behavior, while a “strict max” function defined with “>” is undefined in more cases, but it’s conceptually clearer. Remember that, in either case, the function is not defined for an empty set, so you have to deal with that case anyhow, thus aspiring to have something “always defined” is not an attainable goal.
One more note – actually, a “max” function defined with “>=” is not actually “undefined” per se when several elements share the highest value, but it has multiple solutions. So, we should probably provide explicit and simple ways to deal with the cases of undefined/multiple-solution/properly-defined solutions to the stated goals, as these possibilities are certain to come up in many cases.
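A small Python sketch of that undefined / multiple-solution / well-defined trichotomy (the function and the pupil data are made up for the example): returning *all* maximal elements makes ties explicit instead of silently picking one.

```python
# Return ALL items sharing the maximal attribute, so the multiple-solution
# case surfaces instead of being resolved by accident of iteration order.
def maxima(items, key):
    if not items:
        return []          # "max" is undefined for an empty set
    top = max(key(it) for it in items)
    return [it for it in items if key(it) == top]

pupils = [("Ana", 9), ("Bo", 10), ("Cy", 10)]
print(maxima(pupils, key=lambda p: p[1]))  # [('Bo', 10), ('Cy', 10)] -- a tie
```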
If we take “a = b” to mean we can’t tell (or don’t care about) the difference between a and b, then “>=” is a fine and unambiguous definition for max. Most of your concerns above would only apply to argmax.
But I digress… I will reply to the main thread privately assuming your contact info is here somewhere. (If not, email me.)
Yep, especially in the “really rough pseudocode” above, the devil is in the details even more than in regular cases. I just wanted to show some “proof of concept” of the type of definition I’m after, and I was avoiding a lot of landmines 🙂 Will go into more detail in the future, but anyway, clarifying all that in a formal language is what the project is all about! I’m actually starting to think that defining the language properly is over half of the solution already… at least at the level of easily having a “reference implementation”.
This steals from Haskell:
“:” means the argument is a list, a is its first element, and b is the rest of the elements.
max(a:b) where (a:b) is a splittable-set
  is null     if the set is empty
  is a        if b is null
  is a        if a >= max(b)
  is max(b)   otherwise
If the set is implemented as a tree, the underlying algorithm can split the tree and run max() on each branch, then execute a max() on a list composed of each of the results.
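Here is a sketch of that split-and-recombine idea on a binary tree, in Python. The tree encoding, a (value, left, right) tuple with None for an empty branch, is an assumption of this example:

```python
# Split the tree, run max() on each branch, then max() over the results.
def tree_max(node):
    if node is None:
        return None
    value, left, right = node
    candidates = [value]
    for branch in (left, right):
        m = tree_max(branch)      # max() of each branch...
        if m is not None:
            candidates.append(m)
    return max(candidates)        # ...then max() over the branch results

t = (3, (7, None, None), (2, None, (9, None, None)))
print(tree_max(t))  # 9
```

Since max is associative, the two branch calls are independent and could run in parallel.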
I completely disagree with your statement that “our brains are better wired for iteration than recursion.” I would agree with something like, “most programmers learnt loops first, and thus prefer them to recursion”, but the thought process is far better with recursion.
Imagine max of the values in the leaves of a tree, for example, and the recursive formulation is immediately clearer. It also makes a bunch of little things more obvious, like the behaviour if the structure is empty. Note that functional programming makes explicit the difference between fold-left and fold-right, and between fold-right and fold-right-one. Your implementation of sequence max is absolutely not the correct one in a functional (or concatenative) language; the correct one in SML is something like “fun max a b = if b < a then a else b; val seqmax = foldl1 max;”. (See Stepanov’s Elements of Programming for why that’s the correct way to write the condition.)
It's hard to make anything more precise than the second part of that definition. The foldl1 tells me immediately that there must be at least one element in the list, and that the result is the values of the list combined by the obvious and understandable max function. In your world, the only thing that's unnecessary is the "l" part, since max is associative, and it should figure out the best way to go about the decomposition. (Which would allow it to do an array in parallel, for example.)
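For readers without SML at hand, roughly the same foldl1-of-max shape in Python (max2 and seqmax are illustrative names):

```python
from functools import reduce

# Rough analogue of "val seqmax = foldl1 max": reduce with a two-argument
# max. Like foldl1, it requires at least one element (reduce raises
# TypeError on an empty sequence when no initial value is given).
def max2(a, b):
    return b if a < b else a   # compare with <, per Stepanov

def seqmax(xs):
    return reduce(max2, xs)

print(seqmax([2, 8, 5]))  # 8
```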
For reference, here's the same thing in the Haskell prelude (link below), so your claim that your “functional” definition above is “canonical” is clearly false.
I suspect implementing this will prove a challenge (quicksort vs heapsort vs funnelsort vs …, and things like O(1) operations vs amortized O(1) operations for hard or soft real-time), but I’ll ignore that for now.
Here’s a simpler question: What’s your “standard library” of function result scopers like the “element-in” you use above? Can I write my own? (And do you have a good name for them, so I can ask better questions?)
Another try at the haskell prelude link: http://ww2.cs.mu.oz.au/172/Haskell/tourofprelude.html#max
Your proposed function is too long.
max (set): element-in set
    foreach x in set
        if x != max
            max >= x
The max >= x condition holds even if x == max! So it can be shortened:
max (set): element-in set
    foreach x in set
        max >= x
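As a sanity check, the shortened declaration can be read off almost literally as brute force: max is an element of set such that max >= x for every x in set. A Python sketch (declarative_max is an invented name):

```python
# Literal brute-force reading of "element-in set, foreach x: max >= x".
def declarative_max(s):
    return next(m for m in s if all(m >= x for x in s))

print(declarative_max([4, 9, 1]))  # 9
```

A smarter compiler would then refine this O(n^2) search into the single-pass foreach.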
Your logical solving notation should work without being tied to some function notation. Basically, it should identify all free variables and instantiate them.
Imagine this Lisp macro/operator:
(logic-bind (logical-expr)
  ;; here, the free variables from logical-expr are instantiated
  ;; and bound to their values
  ...)
;; assume set is a variable holding a set which is in scope
(logic-bind (forall (x set) (>= max x))
  (print max))
Turned into a function:
(defun max (set)
  (logic-bind (forall (x set) (>= max x))
    max))
logic-bind identifies all free variables in the expression. This is easy to do lexically. The bound variables all have quantifiers like exists or forall. So for instance, at the point above where (print max) occurs, the variable x is not in scope: it is bound by the quantifier and so it is internal to the logic.
Some situations could have multiple solutions and so the variable is bound to a list of them. A special syntax could exist for asserting that the solution must be unique, like by wrapping a variable v with (unique v).
A nicer syntax would spell out which variables there are, so that meta-programming is easier:
(logic-bind (max) (forall (x set) (>= max x)) ..)
If the logical condition contains free variables not mentioned in the variable list, then they are just ordinary variable references. For instance:
(let ((fortwo 42))
  (logic-bind (gt-fortwo)
              (exists (x list) (and (= gt-fortwo x) (> x fortwo)))
    ;; gt-fortwo is the list of solutions to the logic:
    ;; all elements of list greater than 42
    ...))
And of course if a free variable in the logic does not have a binding in the surrounding code, and the logic requires its value, then it’s an unbound variable error.
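For readers who don't speak Lisp, here is a rough Python analogue of the proposed logic-bind (logic_bind and unique are hypothetical names; the candidate domain is supplied explicitly, since Python can't enumerate it from the logic):

```python
# Instantiate one free variable over an explicit candidate domain and keep
# the values satisfying the logical expression.
def logic_bind(domain, pred):
    return [v for v in domain if pred(v)]

def unique(solutions):
    # assert that the stated goal has exactly one solution
    if len(solutions) != 1:
        raise ValueError("expected a unique solution, got %d" % len(solutions))
    return solutions[0]

s = [4, 9, 1]
# (logic-bind (max) (forall (x set) (>= max x)))
maxes = logic_bind(s, lambda m: all(m >= x for x in s))
print(unique(maxes))  # 9
```

The undefined and multiple-solution cases from the earlier discussion map onto the empty and longer-than-one solution lists.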
I understand very well what you describe, but I am quite sure the same goals can be achieved with a classical (pure) functional language and an even less complicated sufficiently smart compiler (although still more potent than what exists today). Write me an e-mail if you’d like to discuss, as I have lots of ideas about how the next language should be, too. 😉