Parameters and Return Values

What is there to say about parameters and return values? Quite a bit as it turns out!

As with everything else, it helps to understand how things work "under the hood" in order to motivate the recommendations.

One important "under the hood" nugget is that the x86_64 architecture has 16 general-purpose registers that can each hold a 64-bit scalar. These registers are used for local variables, intermediate results, and for passing parameters to functions and grabbing returned results. The x86_64 ABI (application binary interface) used on Linux and macOS specifies that the first six integer or pointer parameters are passed via six specific registers (doesn't matter what they are, just that there are six of them) and the rest are passed via the stack (which is a part of memory). Floating-point parameters have their own set of registers, and the Windows ABI has only four parameter registers, but the principle is the same. The same ABI specifies that a small result (a scalar, or a simple struct of up to 16 bytes) can be returned via registers but that larger results must be returned via memory. Registers are faster than memory--passing a parameter via the stack costs an additional instruction in both the caller and callee, and these additional instructions are a STORE and a LOAD which are somewhat more expensive than instructions like ADD--and so the goal is to maximize the use of registers and minimize the use of the stack in the function call process.

Another "under the hood" nugget is that passing parameters by pointer/reference interferes with compiler optimizations in the caller because the caller cannot keep a variable that is passed by pointer/reference in a register across a call. There is no such thing as a pointer/reference to a register and so the caller has to place the variable in memory before the call--this may not cost anything if the variable is already in memory (e.g., on the heap as part of state) but it may if it is just a simple local variable. In addition, the callee could have changed the value of the variable and so it has to be re-LOAD-ed into a register after the call. The upshot here is that parameters should be passed by value as much as possible.
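To make the contrast concrete, here is a minimal sketch. The names scaleByValue and scaleByRef are hypothetical and exist only for this illustration, and the comments describe what a compiler typically does when the call is not inlined.

```cpp
// Hypothetical helpers, for illustration only.

// Input passed by value, output returned by value: the caller can keep
// both x and the result in registers across the call.
double scaleByValue(double x, double factor)
{
    return x * factor;
}

// In/out parameter passed by reference: the caller must give x an
// address, so x sits in memory before the call and is re-read after it.
void scaleByRef(double &x, double factor)
{
    x *= factor;
}
```

Both functions compute the same thing. The difference is only in what the caller is forced to do with its local variable around the call.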

Let's get to the meat and potatoes.

The Basic Rules

There are three basic ways to pass parameters: by value, by reference, and by constant reference. Here are the rules for them.

Scalars that are strictly inputs to a function should be passed by value.
Non-scalars (i.e., objects) that are strictly inputs to the function should be passed by constant reference (i.e., const &). The two exceptions to this are std::string_view and gsl::span which have two members each and are themselves essentially references.
Scalars that are outputs of a function should be returned via the return value as opposed to reference (i.e., &) parameters.
Scalars that are outputs of a function but cannot be returned via the return value (e.g., because another scalar is already being returned) and non-scalar outputs should be passed as reference (i.e., &) parameters.
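The four rules above can be sketched in a few lines. All function names here (square, sum, startsWithZone, collectPositive) are hypothetical and chosen only to show one rule each; none of them are EnergyPlus functions.

```cpp
#include <string_view>
#include <vector>

// Rule 1: scalar input, passed by value.
// Rule 3: scalar output, returned via the return value.
int square(int n)
{
    return n * n;
}

// Rule 2: non-scalar input, passed by constant reference.
double sum(std::vector<double> const &values)
{
    double total = 0.0;
    for (double v : values) total += v;
    return total;
}

// Exception to rule 2: std::string_view is itself a (pointer, length)
// pair, so it is passed by value.
bool startsWithZone(std::string_view name)
{
    return name.substr(0, 4) == "Zone";
}

// Rule 4: the return value is already taken by the count, so the second
// scalar output and the non-scalar output are reference parameters.
int collectPositive(std::vector<double> const &values, // input (rule 2)
                    std::vector<double> &positives,    // non-scalar output (rule 4)
                    double &largest)                   // second scalar output (rule 4)
{
    int count = 0;
    largest = 0.0;
    for (double v : values) {
        if (v > 0.0) {
            positives.push_back(v);
            if (v > largest) largest = v;
            ++count;
        }
    }
    return count; // first scalar output (rule 3)
}
```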

Are we done? Mostly, but there are some nuances.

More than Six Scalar Parameters

What if you have more than six scalar (i.e., register-eligible) parameters? If a number of them are input parameters, are you better off collecting some of them into an object and passing that object by constant reference, or passing the parameters individually by value, understanding that some of them will have to be passed via the stack? This is not empirical but my sense is you are better off collecting some of them into an object--if it makes logical sense to do so--and passing that object by reference. Compilers have no latitude to optimize the calling convention or ABI, it is what it is. They have more freedom with, and are getting better at, optimizing local object storage. This would be easier if EnergyPlus classes had better internal organization and made heavier use of "sub-objects" as opposed to being flat lists of fields. A canonical example of this is the PlantLocation struct. This is a logical group of variables, but too often they are represented as individual variables and passed to functions individually rather than by constant reference (or by reference if they need to be written). Representing these as structures more pervasively would facilitate passing them as a structure, reducing function call cost. Here is a canonical example from EnergyPlus:

void
SetComponentFlowRate(class EnergyPlusData &state,
                     Real64 &CompFlow,      // [kg/s]
                     int const InletNode,   // component's inlet node index in node structure
                     int const OutletNode,  // component's outlet node index in node structure
                     int const LoopNum,     // plant loop index for PlantLoop structure
                     int const LoopSideNum, // loop side index for PlantLoop structure
                     int const BranchNum,   // branch index for PlantLoop
                     int const CompNum)     // component index for PlantLoop

This function has eight parameters, which means that the last two, BranchNum and CompNum, have to be pushed to the stack by the caller and that SetComponentFlowRate has to pop them off. Now, the quartet LoopNum, LoopSideNum, BranchNum, and CompNum appear together so often that they are already collected in a struct.

struct PlantLocation {
   int LoopNum;
   int LoopSideNum;
   int BranchNum;
   int CompNum;
};

We could then reimplement the function this way.

void
SetComponentFlowRate(class EnergyPlusData &state,
                     Real64 &CompFlow,      // [kg/s]
                     int const InletNode,   // component's inlet node index in node structure
                     int const OutletNode,  // component's outlet node index in node structure
                     struct PlantLocation const &ploc) // loop, loop side, branch, and component indices for PlantLoop

But wait a minute, you say. Now SetComponentFlowRate has to load all four members from PlantLocation, instead of the two parameters it previously loaded off the stack. That is true. But in the previous example, the caller had to load four elements into registers whereas in this one it has to load only one (the address of the PlantLocation struct), and loading an address into a register usually does not require loading anything from memory. It usually requires adding a constant offset to an address in another register, and a register-to-register ADD is cheaper than a memory-to-register LOAD. If you count caller and callee together--and after all you have to, because you need both for a function call--then the example with the PlantLocation struct wins: four LOADs plus one ADD versus six LOADs (four in the caller and two in the callee, not counting the caller's two STOREs to the stack).
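The same reasoning applies to any function that takes plant locations as inputs. Here is a minimal sketch; sameLocation is a hypothetical helper, not an EnergyPlus function, and the struct simply mirrors the PlantLocation fields discussed in the text.

```cpp
// Mirrors the PlantLocation struct from the text.
struct PlantLocation {
    int LoopNum;
    int LoopSideNum;
    int BranchNum;
    int CompNum;
};

// Hypothetical helper. Passed individually, the two locations would be
// eight int parameters, two of which would go via the stack. Passed by
// constant reference, they are two addresses, both in registers.
bool sameLocation(PlantLocation const &a, PlantLocation const &b)
{
    return a.LoopNum == b.LoopNum && a.LoopSideNum == b.LoopSideNum &&
           a.BranchNum == b.BranchNum && a.CompNum == b.CompNum;
}
```

A call then reads sameLocation(boilerLoc, pumpLoc), which is also easier on the eyes than eight loose integers.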

It's even more of a no-brainer to collect a bunch of output parameters into a struct and pass it by reference. If you have multiple output parameters, you are already committed to returning them via memory, and so you may as well combine them into an object. On top of this, you only need to explicitly pass the address of the object rather than the address of every field in the object. The compiler knows the relative positions of fields in the object, and in LOAD and STORE instructions constant offset calculations are essentially "free", i.e., it is no more expensive to STORE a value at a constant offset from an address than it is to STORE it at that address directly. Collecting related fields into objects for input parameter purposes is usually helpful when it brings the number of parameters down to six or fewer. Collecting related fields into objects for output parameter purposes is always helpful.

void 
ScanPlantLoopsForObject(class EnergyPlusData &state,
                        std::string_view CompName,
                        int const CompType,
                        int &LoopNum,
                        int &LoopSideNum,
                        int &BranchNum,
                        int &CompNum,
                        bool &errFlag)

This function actually has a couple of optional arguments that we are ignoring for now. In this implementation, the caller has to execute four ADDs to place the addresses of LoopNum, LoopSideNum, BranchNum, and CompNum into registers. It then has to store the addresses of CompNum and errFlag on the stack because those are parameters seven and eight and there are only six parameter registers. (Strictly speaking, a std::string_view occupies two registers, one for the pointer and one for the length, so the address of BranchNum ends up on the stack as well.) The callee, of course, has to LOAD those addresses back from the stack.

In the implementation below, the caller has to execute only one ADD to place the address of ploc into a register, and the stack is not used at all since all five parameters fit in the six parameter registers.

In both cases, ScanPlantLoopsForObject has to execute four STOREs.

void 
ScanPlantLoopsForObject(class EnergyPlusData &state,
                        std::string_view CompName,
                        int const CompType,
                        struct PlantLocation &ploc,
                        bool &errFlag)
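As a rough illustration of the output-struct pattern, here is a heavily simplified stand-in for a scan function. The Component struct, the findComponent function, and the flat vector it searches are all assumptions made for this sketch; the real ScanPlantLoopsForObject walks the plant loop data in state.

```cpp
#include <string>
#include <string_view>
#include <vector>

struct PlantLocation {
    int LoopNum = 0;
    int LoopSideNum = 0;
    int BranchNum = 0;
    int CompNum = 0;
};

// Hypothetical record pairing a component name with its location.
struct Component {
    std::string Name;
    PlantLocation Loc;
};

// Hypothetical scan: one address (ploc) carries all four outputs.
void findComponent(std::vector<Component> const &comps,
                   std::string_view CompName,
                   PlantLocation &ploc,
                   bool &errFlag)
{
    for (Component const &comp : comps) {
        if (comp.Name == CompName) {
            ploc = comp.Loc; // four fields written at constant offsets from one address
            errFlag = false;
            return;
        }
    }
    errFlag = true; // not found; ploc is left untouched
}
```

The caller declares a single PlantLocation, passes it once, and reads the four indices out of it afterwards.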

Returning Multiple Values

If you need to return multiple values, which one should you return as the return value? Should you use std::pair or std::tuple to return multiple values? This is a good discussion for a future EnergyPlus Technicalities meeting.
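To seed that discussion, here is what two of the options look like side by side. This is only a sketch of the syntax, not a recommendation; Range, orderedStruct, and orderedPair are hypothetical names.

```cpp
#include <utility>

// Hypothetical result type with named fields.
struct Range {
    int Min;
    int Max;
};

// Option 1: return a small struct; callers read result.Min and result.Max.
Range orderedStruct(int a, int b)
{
    return (a < b) ? Range{a, b} : Range{b, a};
}

// Option 2: return a std::pair; callers read .first and .second, or
// unpack the result with C++17 structured bindings.
std::pair<int, int> orderedPair(int a, int b)
{
    return (a < b) ? std::make_pair(a, b) : std::make_pair(b, a);
}
```

With the pair, a caller can write auto [lo, hi] = orderedPair(9, 4); with the struct, the field names travel with the result.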

Pointer vs. Reference Arguments

Here's a religious argument for you, the pointer vs. reference argument. The C language had only pointers; references are a C++ construct. And they are probably the worst C++ construct at that. What is the difference between a reference and a pointer, you ask? Nothing except for syntactic sugar and the fact that references technically cannot be null. Other than that, all a reference does is obscure the fact that something is actually a pointer. If it were up to me--and it may be--I would say that arguments should be passed by pointer rather than by reference. This would require a & in front of the argument at the call site, making it obvious that the argument is being passed by pointer rather than by value. And it would require the use of * or -> inside the function, again making it clear that we are dealing with a parameter that was passed by address rather than by value.
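Here is the difference at the call site, as a minimal sketch. The two clamp functions are hypothetical and do exactly the same work.

```cpp
// Hypothetical helpers that clamp a value in place.

// Reference version: the call reads clampByRef(x, 0, 10), which gives
// no hint that x may be modified.
void clampByRef(int &value, int lo, int hi)
{
    if (value < lo) value = lo;
    if (value > hi) value = hi;
}

// Pointer version: the call reads clampByPtr(&x, 0, 10), and the body
// uses *value, so the indirection is visible on both sides.
void clampByPtr(int *value, int lo, int hi)
{
    if (*value < lo) *value = lo;
    if (*value > hi) *value = hi;
}
```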

Although I am against references in general, I am particularly against this specific use of them. I am fine with local reference variables to shorten what would otherwise be long names. In fact, we don't do this enough in EnergyPlus.

ZoneData &zone = state.dataHeatBal->Zone(ZoneNum);

I am also somewhat fine with constant reference function parameters, because there is not a big logical difference between a constant reference parameter and a value parameter, but non-const reference parameters are evil in my opinion.

Optional Arguments

EnergyPlus makes reasonably heavy use of the ObjexxFCL::Optional template and its variants. The C++ standard library also has a std::optional template that is lighter-weight than ObjexxFCL::Optional and essentially combines a value with a present/not-present bool, much as a std::pair would. Are these things useful?

In my opinion, they are not. The C++ language already has a mechanism for optional parameters and that is default values.

int
aFunction(int requiredArgument, 
          int optionalArgument = -1); // if an argument is not supplied a default value of -1 will be passed instead

If an optional argument is not provided, the function typically adopts some default value. It is both faster and cleaner (in my opinion) to use parameters with appropriate default values than to use any one of several optional containers like std::optional or ObjexxFCL::Optional.
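For comparison, here are the two styles next to each other in a minimal sketch. Both functions are hypothetical, and the default of 1 is just a convenient neutral value for this example.

```cpp
#include <optional>

// Default value: the "missing" case is an ordinary int, so the callee
// needs no presence check.
int scaleWithDefault(int value, int factor = 1)
{
    return value * factor;
}

// std::optional: same behavior, but the argument carries a present flag
// and the callee has to test it (here via value_or).
int scaleWithOptional(int value, std::optional<int> factor = std::nullopt)
{
    return value * factor.value_or(1);
}
```

The call sites are identical in both styles, e.g. scaleWithDefault(5) and scaleWithDefault(5, 3).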

Avoiding the use of optional containers is another reason to use pointer rather than reference parameters. A pointer argument can be given the default value of nullptr, whereas there is no such thing as a nullref. Ironically, the same mechanism is used to implement ObjexxFCL::Optional. The container stores a pointer to the optional parameter and the present() function checks if that pointer is nullptr. Remember, references are just pointers in syntactic disguise and ObjexxFCL undisguises them for this purpose.
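A pointer parameter with a nullptr default might look like the sketch below. The function countPositive and its optional firstIndex output are hypothetical, made up for this illustration.

```cpp
// Hypothetical function with an optional output parameter.
// Returns the number of positive entries. If firstIndex is supplied, it
// receives the index of the first positive entry, or -1 if there is none.
int countPositive(int const *values, int n, int *firstIndex = nullptr)
{
    int count = 0;
    int first = -1;
    for (int i = 0; i < n; ++i) {
        if (values[i] > 0) {
            if (first < 0) first = i;
            ++count;
        }
    }
    if (firstIndex != nullptr) *firstIndex = first; // write only when requested
    return count;
}
```

Callers that do not care about the extra output simply leave it off; callers that do pass &someInt, which again makes the output visible at the call site.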
