Nested Lambdas and Move Capture in C++14

Friday, September 12, 2014

Anonymous functions, or lambdas, were introduced in C++11 as a convenient, lightweight syntax for creating one-off functions. What excited me most about this development was that I could now compile a functional language into C++ by constructing a more faithful embedding of higher-order functions, without requiring me to deal with issues like closures and lifting. It wasn’t perfect though, and there were a number of warts that made lambdas unusable for anything other than the simplest cases.

In this post I’m going to motivate C++14’s initialized lambda capture as a solution to one of the numerous problems hitting one of my most common use-cases: nested lambdas.

Take #1: Pass by Value, Capture by Value

Consider a simple haskell-esque function:

add :: Int -> Int -> Int
add = \x -> \y -> x + y

This translates into a correspondingly simple C++11 lambda as follows:

auto add = [] (int x) {
    return [x] (int y) {
        return x + y;
    }
}

On the face of it, this doesn’t look so bad – an outer lambda which takes an argument and returns an inner lambda which takes its own argument and while capturing the outer argument. This function can be called as add(5)(2), in the usual curried form.

On closer inspection, things can get less than pretty. Specifically, see if you can spot the number of times the value x is actually copied before the addition is performed.

Since add takes x by value, that’s potentially where the first copy happens. x is then copied again, when it is captured in the inner lambda. That’s two copies, when at worst we should only have one (the outer one), and at best none at all.

There are several methods to eliminate some or all of these copies, methods that are available in C++11. They all have some drawbacks, making you forfeit either convenience, semantics or safety. Let’s take a look at a few of these.

Take #2: Pass by Value, Capture by Reference

The obvious way to eliminate the capture copy is to capture by reference, instead of by value. After all, that’s what it’s there for. This leads to issues however, because we’re allowing a reference to a stack-allocated value escape its scope. Here’s the same example, capturing x by reference:

auto add = [] (int x) {
    return [&x] (int y) {
        return x + y;
    }
}

While x is no longer copied at the capture point, this opens us up to a whole host of unforeseen behaviors. Since x is stack allocated, it technically doesn’t exist after add returns. This means that any piece of code that can conceivably call the inner lambda is already dealing with a reference to a memory location that doesn’t necessarily contain the value it did when the lambda was created. At best you’ll get the wrong answer immediately; at worst you’ll get it sometime later where you can no longer correlate the error and the cause.

Take #3: Pass by Reference, Capture by Reference

What if x were guaranteed to exist after add returned? This would solve the inner lambda’s problems, but the only way to accomplish this to force add to accept its own argument by reference.

auto add = [] (int& x) {
    return [&x] (int y) {
        return x + y;
    }
}

Combined with reference capture, this eliminates both copies. This is not however without its own problems. The semantic issue from the previous solution still exists, since we are still capturing in the inner lambda by reference.

A more obvious problem is that we can no longer call add with r-value arguments, such as temporaries and literals. For example, we can’t call add(5)(2), as 5 isn’t an l-value, and cannot therefore be passed by reference. We’re forced to do one of two things: declare the argument ahead of time (int x = 5; add(x)(2)), or use constant references.

Take #4: Pass by Constant Reference, Capture by Reference

Accepting an argument by constant reference allows us to pass in both l-values and r-values – the compiler extends the lifetime of an r-value by just long enough that things work out. Our add function now looks like:

auto add = [] (const int& x) {
    return [&x] (int y) {
        return x + y;
    }
}

Which is great – no copies anywhere, and we can call the function however we’d like. Semantic issues still abound, but apart from the fact that the program might not do what we want it to do, we’re fine.

But we’re not done yet.

Suppose we wanted to change the definition of add just a little bit – increment x before adding it. This is a contrived example, but it’s not that difficult to imagine a function where you’d like to modify your arguments.

auto increment_add = [] (const int& x) {
    ++x;
    return [&x] (int y) {
        return x + y;
    }
}

But this can’t work! We’ve already declared x to be passed in by constant reference, which means we can’t change it. What we really want is for x to be passed in by value. At this point, we realize that we’re precisely back at square one, and throw up our hands.

At least that’s how it was before initialized lambda capture in C++14.

Take #5: Pass by Value, Capture by Move

Initialized lambda capture gives us access to the [id = expression] syntax in the capture specifier list, allowing us to initialize the captured variables however we want.

The utility in our recurrent example is in the realization that x isn’t used in the outer function after the return statement (for obvious reasons), so it can technically be moved into the returned lambda.

auto add = [] (int x) {
    ++x;
    return [x = std::move(x)] (int y) {
        return x + y;
    }
}

By moving x into the lambda, we prevent the capture copy, while permitting x to be modified in the outer function (and indeed, in the inner function as well). Additionally, by accepting x by value in the outer function, we permit the compiler to infer moves there as well. This will result in exactly as many copies as are necessary in order to get the desired behaviour.

This technique exhibits varying degrees of success for different data-types; primitive types such as int and float arguably don’t benefit that much, since a move is about as expensive as a copy. More complex types with heap-allocated resources such as std::vector might take arbitrarily long to copy, making a move quite appealing.

Lastly, for types which can only be moved and not copied – such as std::unique_ptr, this is the only way to capture them.