Thursday, 21 March 2013

Design by committee works slowly

The latest draft for polymorphic lambda expressions, which I advocated for in a post about 3 thousand years ago, is a step in the right direction. I greatly appreciate the time that the authors are taking to push C++ forward. I know they do it on a volunteer basis and I believe their passion for it makes C++ one of the best languages to use for a variety of projects. On reading the draft though, I'm still a little underwhelmed.

Lambda expressions are anonymous functions that are common in languages with first-class functions like Lisp. Roughly, the language gives you the ability to create functions "at runtime" which allows you to store data and other state. Once this is possible, anything is possible. You can read more about anonymous functions at Wikipedia.

When a programmer uses anonymous functions, he or she is not doing it for a technical reason (e.g., performance). They are doing it for one or both of the following reasons:

  • Lexical locality: The data that the anonymous function will be operating on is somewhere nearby and we just need to do a little transform on it to make it useful to something else which is also nearby.
  • Readability: x => 2*x+y is much easier to read and understand than MyFunctor f(y), because with the latter you need to go look up the definition of MyFunctor.

In x => 2*x+y, you can see that the 'y' value must come from somewhere else in the function: capturing data in lexical scope is an important part of anonymous functions. This is the reason why MyFunctor takes in 'y' as a parameter.
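To make the comparison concrete, here is a minimal sketch in C++11. The body of MyFunctor and the helper names (with_functor, with_lambda, capture_demo) are my own assumptions for illustration; the only thing taken from the text above is the expression 2*x+y and the fact that 'y' has to come from the surrounding scope.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical hand-written functor: 'y' has to be passed in and stored by hand,
// and the reader has to come here to find out what the call actually does.
struct MyFunctor
{
    explicit MyFunctor(int y) : y_(y) {}
    int operator()(int x) const { return 2 * x + y_; }
private:
    int y_;
};

// Applies 2*x+y to every element using the functor.
inline std::vector<int> with_functor(const std::vector<int>& in, int y)
{
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), MyFunctor(y));
    return out;
}

// Same transform with a lambda: [y] captures the local 'y' by value,
// and the whole computation is visible at the call site.
inline std::vector<int> with_lambda(const std::vector<int>& in, int y)
{
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [y](int x) { return 2 * x + y; });
    return out;
}

// Small self-check: both versions compute the same thing.
inline void capture_demo()
{
    std::vector<int> in{1, 2, 3};
    assert(with_lambda(in, 10) == with_functor(in, 10));
}
```

The two helpers do the same work; the difference is only in how far you have to look to find out what happens to each element.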

Anyway, as my post tried to explain, one of the main problems is the ridiculous verbosity inherent in monomorphic lambda expressions. Polymorphic lambdas give us a chance to cut that verbosity down to the bare minimum. The latest draft, however, makes an "auto" necessary on every lambda expression parameter.

To recap, C++11 transforms a lambda expression like:

    [](double slope, double intercept, double x){ return slope * x + intercept; }

into a function object not completely unlike:

    struct LOL
    {
        double operator()(double slope, double intercept, double x){ return ... ;}
    };
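As a rough, compilable version of that correspondence: the sketch below puts the lambda next to a hand-written struct and checks that they agree. LineClosure and the two eval_ helpers are names I made up for illustration; the real closure type has no name you can spell, and this is only an approximation of what the compiler generates.

```cpp
#include <cassert>

// Hypothetical stand-in for the compiler-generated closure type.
struct LineClosure
{
    // A lambda that is not marked 'mutable' gets a const call operator.
    double operator()(double slope, double intercept, double x) const
    {
        return slope * x + intercept;
    }
};

// Evaluates the line through the C++11 lambda.
inline double eval_with_lambda(double slope, double intercept, double x)
{
    auto line = [](double s, double i, double v) { return s * v + i; };
    return line(slope, intercept, x);
}

// Evaluates the line through the hand-written function object.
inline double eval_with_struct(double slope, double intercept, double x)
{
    return LineClosure()(slope, intercept, x);
}

// Small self-check: the two forms are interchangeable for this input.
inline void closure_demo()
{
    assert(eval_with_lambda(2.0, 1.0, 3.0) == eval_with_struct(2.0, 1.0, 3.0));
}
```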

Most lambda expressions will only ever be used with one set of parameter types and in one situation, so it is not hard to understand why this is an acceptable syntax. However, languages like C# have a much more concise syntax for the above case:

    (slope, intercept, x) => slope * x + intercept

The compiler figures out the types since it is a statically typed language and everyone is happy.

Before lambda expressions, in C++, we might have written:

    namespace bl = boost::lambda;
    ...    bl::_1*bl::_3 + bl::_2 ...

My goal for C++ lambda expressions would be to never use any of the Boost lambda libraries again, as useful and awesome as they are. With the new draft, the C++11 version becomes:

    [](auto slope, auto intercept, auto x){ return slope*x + intercept; }
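Here is a small sketch of what the auto parameters buy you, assuming a compiler that accepts auto in lambda parameter lists (this is what became standard in C++14). The names line and generic_line_demo are mine, purely for illustration.

```cpp
#include <cassert>

// One lambda, usable with any argument types that support * and +.
const auto line = [](auto slope, auto intercept, auto x) { return slope * x + intercept; };

// Small self-check: the same lambda object is called with ints, doubles, and a mix.
inline void generic_line_demo()
{
    assert(line(2, 1, 3) == 7);          // all int
    assert(line(0.5, 1.0, 4.0) == 3.0);  // all double
    assert(line(2, 0.5, 3) == 6.5);      // mixed: result is a double
}
```

With the monomorphic C++11 form you would need a separate lambda (or a hand-written template) for each of those calls.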

As you can see, the above Boost Lambda form is arguably still preferable to the draft version of polymorphic lambdas on length alone. The draft version is longer, but it is slightly easier to read and understand because of the named parameters. But why can't we spoil ourselves? There aren't too many technical tricks required to automatically turn:

    [](slope,intercept,x){ return slope*x + intercept; }

into the same form behind the scenes.
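By "the same form" I mean, roughly, a function object whose call operator is a member template with one type parameter per lambda parameter. The sketch below is my own approximation in plain C++11, not the draft's wording; PolyLine and poly_line_demo are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical hand-written equivalent of a polymorphic lambda:
// each parameter gets its own deduced type.
struct PolyLine
{
    template <typename S, typename I, typename X>
    auto operator()(S slope, I intercept, X x) const -> decltype(slope * x + intercept)
    {
        return slope * x + intercept;
    }
};

// The result type follows from the argument types, just as with 'auto' parameters.
static_assert(std::is_same<decltype(PolyLine()(2, 1, 3)), int>::value,
              "int arguments give an int result");
static_assert(std::is_same<decltype(PolyLine()(0.5, 1, 3)), double>::value,
              "a double argument promotes the result to double");

// Small self-check with two different sets of argument types.
inline void poly_line_demo()
{
    assert(PolyLine()(2, 1, 3) == 7);
    assert(PolyLine()(0.5, 1.0, 4.0) == 3.0);
}
```

Nothing in this struct depends on the word auto appearing at the call site; the template parameters are what carry the information.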

In my humble opinion, the auto adds nothing to readability; it actually takes some away, because I am required to read more to understand what is going on. Multiply this by thousands of expressions and multiple projects and it is just another thing I have to skip over. There is very little reason to require auto. Even with this extension, it is still easier to use Boost Lambda.

The 5 people who voted "strongly against" making auto optional should rethink their votes. This is the best chance we have of getting it right the second time.
