What Happened To R-Value

Introduction

An interesting question was posted on Usenet with a C++ example showing an apparent conflict between the rules that govern r-value and l-value requirements in assignment expressions.

A bit of introduction is in order to show where the problem comes from.

In the world of fundamental types

The following function returns an int by value:

int foo()
{
    return 7;
}

The value returned by this function cannot be assigned to, because an expression that calls this function is what is formally called an r-value. An r-value does not designate any object that could be named or pointed to; it is a pure value of some type that can only be consumed or read.

One of the most important consequences of being an r-value is that r-values cannot be assigned to. In other words, the following is incorrect:

foo() = 5;

The above statement will not compile, as assignment requires that the left side of the operation is a proper l-value - that is, an expression that denotes an object or variable and not just a consumable value of some type.

There are two important rules that together prevent the above assignment - one is that function call expressions returning values have the property of being r-values and another is that r-values cannot be assigned to.
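The two rules can also be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; the helper assign_via_variable is a hypothetical name introduced only for this illustration. It uses std::is_assignable, which reports whether an assignment expression with the given left-side and right-side types would compile.

```cpp
#include <type_traits>

int foo()
{
    return 7;
}

// The type of the call expression is plain int: foo() is an r-value.
static_assert(std::is_same<decltype(foo()), int>::value,
              "foo() yields a plain int value");

// is_assignable<T, U> models the expression declval<T>() = declval<U>().
// With T = int the left side is an r-value, so the assignment is rejected.
static_assert(!std::is_assignable<int, int>::value,
              "an r-value of type int cannot be assigned to");

// With T = int& the left side is an l-value and the assignment is accepted.
static_assert(std::is_assignable<int&, int>::value,
              "an l-value of type int can be assigned to");

// Hypothetical helper: store the returned value in a named object first.
int assign_via_variable()
{
    int x = foo(); // x is an l-value initialized with the returned value
    x = 5;         // allowed, because x designates an object
    return x;
}
```

The named variable x is what turns the consumable value into something that can appear on the left side of an assignment.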

Strange things start to happen when classes are involved instead of simple types.

Classes do it differently

The following class can be used to wrap integers:

class Int
{
public:
    Int(int i) : i_(i) {}

    void operator=(int other)
    {
        i_ = other;
    }
  
private:
    int i_;
};

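The same effect could have been achieved without this operator, as the compiler-generated copy assignment operator combined with the converting constructor would accept an int as well. Showing the assignment operator explicitly highlights the problem, though, which can be expressed as: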

Int bar()
{
    return 7;
}

// and later:
bar() = 5;

Interestingly, the above compiles and works properly - that is, the function returns a temporary object and the assignment operation modifies that temporary. How is that possible in light of the two standard rules described above? The function call expression is clearly an r-value, which is not supposed to be modified, and yet some l-value is clearly involved in the assignment operation.

What happened to r-value?

The Usenet discussion that was triggered by a very similar example gravitated initially towards the concept of an implicit conversion of r-values to l-values. This idea is quite intuitive, but it cannot be right, as there is no such conversion in the language standard. What is the explanation, then?

What is really happening above is that the standard rules that would normally prohibit such code with fundamental types (as described earlier) no longer apply when overloaded operators are in use.

Most importantly, invocations of overloaded operators behave as regular function calls, and any special requirement that applies to fundamental types is gone - in particular, it is no longer required that the assignment operator is called with an l-value on the left side. It can be called with any expression of the class type as its prefix, and in fact the above example can be rewritten as:

bar().operator=(5);

where operator= is really nothing more than a member function with a somewhat unusual name.
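The equivalence of the two spellings can be observed with a small experiment. The sketch below is an instrumented variant of the Int class, not the original one: the destructor and the global last_destroyed are assumptions added only to make the final value of the temporary visible, and the two helper functions are hypothetical names.

```cpp
// Illustration only: the value held by the most recently destroyed Int.
int last_destroyed = 0;

class Int
{
public:
    Int(int i) : i_(i) {}

    // Added for observation only: report the value at destruction time.
    ~Int() { last_destroyed = i_; }

    void operator=(int other)
    {
        i_ = other;
    }

private:
    int i_;
};

Int bar()
{
    return 7;
}

// Operator syntax: the temporary is modified and then destroyed
// at the end of the full expression.
int assign_with_operator_syntax()
{
    bar() = 5;
    return last_destroyed;
}

// The same call spelled as an ordinary member function call.
int assign_with_call_syntax()
{
    bar().operator=(6);
    return last_destroyed;
}
```

In both functions the temporary returned by bar() is the last Int to be destroyed, so last_destroyed holds the assigned value rather than the original 7.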

Still, the prefix expression is an r-value and there is clearly some l-value within the implementation of the assignment operator - and there is no standard conversion between the two. This is probably the most interesting aspect of this example.

The mystery can be solved once it is realized how these two sides of a single operation are connected. On one side there is an assignment expression (with an r-value as the prefix expression) and on the other side there is an implementation of the assignment operator (where the object denoted by *this behaves as an l-value). These two sides are connected only by the contract of the operator's signature, which can be repeated as:

void operator=(int other) // with additional implicit "this"

This signature isolates the caller from the callee in the sense that only the guarantees of the signature can be relied upon in reasoning about what happens on the other side. For example, the operator's implementation is guaranteed that some integer value (or something convertible to it) was provided by the caller as an actual parameter. Similarly, the caller can assume that the object in question might be modified by the call, as there is nothing in the signature that would prevent such a modification. All such two-sided guarantees are visible in the operation signature.

The point of this reasoning is that there is nothing in this signature that would propagate the distinction between r-value and l-value from the caller to the implementation, which means that the implementation of the assignment operator is not affected by the value category (r-value or l-value) of the expression that was used to invoke it. In other words, just as the operation signature introduces some guarantees for both sides, it also introduces some form of isolation between the two and breaks the continuity that would otherwise bind both sides and require formal conversions for every mismatch.

So what happened to the r-value in this example? Being an r-value is a property of the expression that was used to invoke the assignment operator, and that property was not propagated down the chain to the implementation of the operator. The fact that no conversion from r-value to l-value exists in the language is not a problem here, as these expression properties do not affect each other when they relate to separate regions of code connected only by a function signature.
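The signature shown above carries no information about the value category of the caller's expression. As a side note, C++11 lets a signature opt in to carrying it by means of a ref-qualifier. The following is a minimal sketch under that assumption; the class LvalueOnlyInt, its accessor and the helper function are hypothetical names that do not appear in the original example.

```cpp
#include <type_traits>

// Hypothetical variant of Int whose assignment operators accept only l-values.
class LvalueOnlyInt
{
public:
    LvalueOnlyInt(int i) : i_(i) {}

    // The trailing & is a ref-qualifier: the object on the left side
    // must be an l-value for this operator to be viable.
    void operator=(int other) &
    {
        i_ = other;
    }

    // The copy assignment is restricted in the same way. Otherwise
    // "temporary = 5" could still compile by converting 5 to LvalueOnlyInt.
    LvalueOnlyInt& operator=(const LvalueOnlyInt&) & = default;

    // Accessor added only so the result can be inspected.
    int get() const { return i_; }

private:
    int i_;
};

// Assignment to an l-value is accepted, assignment to an r-value is rejected.
static_assert(std::is_assignable<LvalueOnlyInt&, int>::value,
              "l-value = int compiles");
static_assert(!std::is_assignable<LvalueOnlyInt, int>::value,
              "r-value = int is rejected");

// Hypothetical helper: assignment through a named object still works.
int assign_to_named_object()
{
    LvalueOnlyInt x(7);
    x = 5;
    return x.get();
}
```

With the ref-qualifier in place the distinction becomes part of the contract, so a class written this way behaves like a fundamental type in this respect.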

Readers interested in the gory details of the C++ language standard might find the following paragraphs of interest: