Higher calling: Revenge of the functional languages
Wednesday June 25, 2003

- by Joab Jackson -

At midnight June 28 GMT, the International Conference on Functional Programming (ICFP) will post the next challenge in its annual programming contest. Teams from around the world will be given 72 hours (or 24 hours if they pick the lightning round) to write software for the specified task. The goal of the yearly contests is to show what the organizers believe is the superiority of functional languages over the imperative languages more widely in use today.

In years past, ICFP teams have raced to build virtual package-carrying war bots or optimizers for wireless games. As if building these complex creations within the compact time frames isn't taxing enough, the organizers usually put some crimping limitations on computational capacities or the size of the code base.

Entries written in C, C++, Perl, Forth, and other languages are frequently submitted, said Tim Sheard, one of the organizers of the 2002 contest. But more often than not, the victors tend to be programs written in the lesser-known functional languages, such as OCaml, Haskell, and Clean.

Why do they fare so well? Advocates claim that functional languages run faster and take less time to develop, once you know how to program in them.

In contrast to traditional languages, functional languages evaluate expressions rather than execute commands. "Functional programming is a style of programming in which the primary method of computation is applying functions to arguments," says Graham Hutton, a Professor at University of Nottingham's School of Computer Science and IT.

"Functional programs are much more concise," says Hutton, who is writing a book tentatively entitled Programming in Haskell. He estimates that functional code is between two and 10 times more compact than that of normal languages. Given the industry rule-of-thumb that programmers tend to write the same amount of code regardless of the language they write in, writing in a functional language leads them to develop more quickly.

However simple the difference, the disparity in programming styles can be jarring if you haven't been exposed to functional programming before. Variables aren't assigned. Flow-control is virtually non-existent. The whole process might seem terribly limiting compared to, say, Java or even C.

"The functional language community is excessively dour. The functional ascetics forbid themselves facilities which less pious programmers regard as standard," warns the course notes of one functional language programming class.

"The functional programmer sounds rather like a medieval monk, denying himself the pleasures of life in the hope that it will make him virtuous," John Hughes wrote in his 1984 seminal paper "Why Functional Programming Matters."

Perhaps because of such seemingly harsh environs, functional languages never really found a foothold in the commercial field of software development. Instead, they have flourished in the academic realm, and largely within European universities at that.

Advocates, however, say that, with the increasing complexity of software programming, functional languages may address the growing concerns about bugs and security issues.

"I think a big increase in the use of functional programming is imminent, but it will probably be in small specialized niches at first," Sheard says.

The power of the function

Back in the mid-'80s, just before object-oriented programming started making serious inroads into the programming community, Hughes and others viewed functional languages as a solution to the problem of writing complex code. It seemed structured programming techniques could no longer keep up. Programmers had to keep straight in their minds the scopes of the variables and as well as the ways that the pieces of a program fit together. Functional languages could keep the workflow simple.

Programs written in functional languages are nothing more than collections of mathematical functions, often written in an algebraic-like form. Even the functional language-based program itself is one big function, with the input being the argument and the output being the result, according to Hughes's paper. The program, seen as one big function, is defined solely by other functions, which, in turn, may be defined by still other functions.

Take, for instance, a simple task of summing a string of consecutive numbers. In most languages, you would need to declare two variables, one as a counter, the other to serve as an accumulator. Then, you would write a loop. In C, the code for summing up all the numbers between 1 and 10 would look something like this:

int count, total;
count = 0;
total = 0;

for (count = 0; count < =10; count++) {
total += count;
}

With a Haskell functional language interpreter (Hugs 98), the same function can be done thusly:

sum[1..10]

In this instance, obviously, the programmer has less to type, and less to worry about.

"You wouldn't need to break down the task into tiny steps of declaring variables, writing a loop, setting the counter to zero," Hutton says. With functional languages, "you work at much higher level."

Less work is done by the program as well. The above C code depends on changing both the total and count variable 10 times, one for each loop instance. In contrast, the Haskell program breaks it into three functions. The first, "[..]," fleshes out all the numbers between 1 and 10. "Sum" adds these digits together by drawing on a third function, "+," which was placed between each of the digits by the "[..]" function.

The order in which the functions are applied in a calculation, while not affecting the answer, can determine at what point a calculation ends, according to Hutton.

As a result, every function only runs until it is finished and then terminates. By linking functions in sequential order, no computation is done until the result is actually necessary, Hutton said.

"Functional languages emphasize recursion. Many algorithms are recursive by their nature, doing things in terms of themselves. It's a natural way to think, but normal languages don't emphasize it, so you have to go through roundabout ways of doing things [such as looping] that are essentially recursive," Hutton said.

Coding by functions also makes the code itself easier for programmers to understand.

"Functional languages are designed to be compositional. That means that the meaning of a program is determined only by the meaning of its subparts," Sheard wrote by email. "So if I break the program into pieces (say to restructure it to accommodate a new idea or some extension) the pieces still mean the same as they did before I broke the program into pieces."

All of which is not to say that regular languages can't be used to write functional programs. They can, but it requires work.

"Most programming languages support programming with functions, but the language and the features that they provide don't encourage you to think in that way," Hutton says. Imperative languages willfully discourage programmers from doing such things as storing functions in lists, constructing intermediate structures, allowing functions to be defined in terms of themselves, or producing functions as results, Hutton said.

Functional languages

Not surprisingly, the origins of functional languages come from the world of mathematics. The first modern functional language, Meta-Language or ML, was developed in the late '70s by Robin Milner, Mike Gordon, and Chris Wadsworth for use in automatic mathematical theorem proving, according to Robert Harper, who in his post-doctoral work at the University of Edinburgh, worked with Milner.

"ML was particularly well designed for symbolic computing -- manipulating logical data structures," Harper says.

ML was not the first functional programming language. That honor goes to Lisp, which drew some of its ideas from lambda calculus, a mathematical theory of functions developed in the 1930s by Alonzo Church.

However ML was "the first statically typed functional language," according to Harper. Static typing was absolutely essential for theorem proving. Although written specifically for theorem checking, ML served as the platform of many more generalized functional languages in use today, including Standard ML or SML. (For others, see the Comp.Lang.ML FAQ.) SML introduced pattern matching, a feature that made it and its derivatives a natural candidate for writing compilers.

"It supports the manipulation of programs as data objects, which is what you are doing when you are compiling," Harper said.

Last year's winner of the ICFP was written in a dialect of ML called Objective Caml, or OCaml.

"OCaml allows the programmer to write much more compact code than C without losing quality of code," declared the programmers who built the winning Rog-O-Matic III program which best completed year's task, involving helping a robot deliver packages over a deadly water- and hostile-bot-filled terrain.

Caml itself stands for "Categorical Abstract Machine Language." It was created in 1996 by Pierre Weis primarily as a teaching language.

Caml comes in two flavors. Caml Light offers the core features of the language, while OCaml is a full class-based object-oriented language.

For Xavier Leroy, chief developer of the OCaml, functional techniques bring to object-oriented languages the advantage of strong static typing, which eliminates run-time type errors.

He gives an example. In Java, "all collections (such as vectors, hashtables, sets) operate over the universal class Object: A Hashtable maps Objects to Objects, and a downcast is needed to convert the Object read from Hashtable to its real class. Such a downcast can fail at run-time if the programmer made a mistake on the expected class of the object," Leroy wrote by email.

To avoid this pitfall, Leroy said, Caml classes can be parameterized by types. Thus, a hashtable doesn't just have type "Hashtable" but rather "Hashtable associating data of type U to keys of type T."

"Thanks to this refined typing, data read from such a hashtable can be statically guaranteed to have type U," Leroy wrote.

Beyond the ivory towers?

While functional languages flourish in the academic realm, especially in Europe where much of the development on OCaml and Haskell is being done, they have remain largely untouched by the commercial software field.

Perhaps the methodology's academic roots hamper its popularity. As Hutton points out, programmers are often trained on the fly, picking up Java or Perl or other languages as the job requires. However easier they might make the programmer's job, functional languages nonetheless can have a steep learning curve.

"If you are going to program in a functional language, you really need proper training," Hutton said.

Yet, its influence in the commercial world may increase. If the industry shifts its emphasis away from getting products out to market and toward customer demands for better security or more flawless code, then a corresponding shift towards functional languages may also take place.

Take security for instance. Harper's research group at Carnegie Mellon University uses SML to develop certifying compilers that check code generated from high-level language compilers to verify that such code is secure down to the assembly language level.

As developers have increasingly moved to using higher-level languages, the purchasers of their products have increasingly trusted that the compilers that these programs were generated on were bulletproof -- an assumption that doesn't always hold, given the complex nature of the compilers, according to Harper.

"Historically, you have had to trust that the code you are about to run has good safety properties -- that it was written in a high-level language, compiled on a compiler that is correct, and wasn't a compiler whose sole purpose is to defeat security guarantees. Such things could exist," Harper says. The certifying compilers Harper's team has developed can provide "a certificate that can be checked that tells you that the code you have -- regardless of how it was created -- has certain security properties that you can verify yourself."

"This is the way compilation will be done in the future," Harper predicts. "There's a lot of research that still needs to be done but I think the case has been clearly made."

One historical drawback to commercial use, observers have noted, is the lack of commercial-grade tools. That too, however, is changing.

For instance, the development team of the Clean functional language has released its own integrated development platform, called Clean IDE. Clean IDE includes the editor, compiler, code generator, debugger, linker, project manager, and graphical user interface library. The latest version, 2.0, is available only for the Windows platform.

The 2.02 version of the IDE platform, released last December, also supports indexing of generic functions. It is the only functional language IDE platform available, according to Marko van Eekelen, an associate professor of the software technology department of the University of Nijmegen in the Netherlands who is one of Clean's developers.

What sets Clean apart from other functional langauges is a feature called uniqueness typing. Uniqueness typing is the destructive updating of objects, which allows programs to interface directly with foreign language code as well as precisely control the way data is updated in file updates, in-place array updates, and user-defined data structures.

Microsoft may be jumping into the functional fray as well. Earlier this month, word leaked out on Slashdot that Microsoft is implementing its own dialect of ML, called F#, for its .NET Web services development platform.

Well, yes and no, said Don Syme, the creator of F#.

Like most large companies, Microsoft invests in research to anticipate future trends. It sees functional languages as at least having the potential of being useful in the creation of Web services. However, there is no guarantee that F# will be commercially released as a .NET plug-in. (Version 0.6.4.1 of the F# compiler is the latest version that can be downloaded.)

"F# is a general purpose programming language, and can be used for all sorts of things," Syme wrote by email. "F# code is type safe like ... other .NET languages, but the code is often much more compact than other programming languages because of type inference. The programmer doesn't need to know all about classes, interfaces, structs, inheritance, and so on, and just has to master a few simpler constructs."

Syme started out with the intention of simply implementing OCaml for .NET. However, he saw that some of "hard" features of ML, such as functors, were not immediately necessary at least for initial versions. "Functors are an advanced modularization mechanism that can express abstractions that are impossible to express in a typesafe way in other languages. However this kind of abstraction is rarely needed, and can make code hard to understand," Syme wrote.

F# is not the only .NET functional language implementation underway. A team at the University of Cambridge has just released a SML compiler for .NET, called, appropriately enough, SML.NET. Also, work seems to be underway in making a version of Haskell.NET by the University of Canterbury in New Zealand.

While .NET is far from the only commercial development platform, the appearance of F# and similar .NET implementations sends a message. Industry may finally be ready for the functional paradigm. Those developers willing to expand the way they think about programming may be rewarded with stronger code.

--Joab Jackson



[ The archive || [E-mail]