The Best Way to Return Multiple Values from a C++17 Function

Published July 1st

This article originally appeared on Medium.

What is the best way to return multiple values from a C++17 function?

  1. Using output parameters:
    auto output_1(int &i1) { i1 = 11; return 12; }
  2. Using a local structure:
    auto struct_2() { struct _ { int i1, i2; }; return _{21, 22}; }
  3. Using an std::pair:
    auto pair_2() { return std::make_pair(31, 32); }
  4. Using an std::tuple:
    auto tuple_2() { return std::make_tuple(41, 42); }

Keep reading for the answer.

A typical example is std::from_chars(), a C++17 function similar to strtol(). Unlike strtol(), from_chars() returns 3 values: the parsed number, an error code, and a pointer to the first invalid character.

The function uses a mix of techniques: the number is returned as an output parameter, but the error code and the pointer are returned as a structure. Why is this so? Let’s take a closer look…

1. Using output parameters

Example code:

#include <cstdio>

auto output_1(int &i1) {
  i1 = 11;   // Output first parameter
  return 12; // Return second value
}

// Use a volatile function pointer so the compiler cannot inline the call
auto (*volatile output_1_ptr)(int &i1) = output_1;

int main() {
  int o1, o2;            // Define local variables
  o2 = output_1_ptr(o1); // Output 1st param and assign the 2nd
  printf("output_1 o1 = %d, o2 = %d\n", o1, o2);
}

The code compiles to:

output_1(int&):
  mov [rdi], 11       # Output first param to the address in rdi
  mov eax, 12         # Return second value in eax
  ret

main:                 # Note: simplified
  lea rdi, [rsp + 4]  # Load address of the 1st param (on stack)
  call [output_1_ptr] # Call output_1 using a pointer
  mov esi, [rsp + 4]  # Load 1st param from the stack
  mov ecx, eax        # Load 2nd param from eax
  call printf

Compiler Explorer: https://godbolt.org/z/Fan8OH

Pros:

  • Classic. Easy to understand.
  • Works with any C++ standard, including C (using pointers).
  • Supports function overloading.

Cons:

  • The address of the first parameter must be loaded before the function call.
  • The first value is passed back through memory (the stack) instead of a register. Slow 🙁
  • Under the System V AMD64 ABI, at most 6 addresses can be passed in registers. With more than 6 params, the rest go through the stack. Even slower 🙁

To illustrate the last con, here is example code that outputs 7 params:

// Output more than 6 params
int output_7(int &i1, int &i2, int &i3, int &i4,
             int &i5, int &i6, int &i7) {
  i1 = 11;
  i2 = 12;
  i3 = 13;
  i4 = 14;
  i5 = 15;
  i6 = 16;
  i7 = 17;
  return 18;
}

And the disassembly of the output_7():

output_7(int&, int&, int&, int&, int&, int&, int&):
  mov [rdi], 11      #
  mov [rsi], 12      # Addresses of the first 6 params get passed
  mov [rdx], 13      # via rdi, rsi, rdx, rcx, r8, and r9
  mov [rcx], 14      # according to System V AMD64 ABI
  mov [r8], 15       # (for Linux, macOS, FreeBSD etc)
  mov [r9], 16       #
  mov rax, [rsp + 8] # But address for the 7th is on the stack,
  mov [rax], 17      # which is slow
  mov eax, 18
  ret

The seventh address is passed via the stack: the caller puts the address on the stack, the function reads it back from the stack, and only then writes the value to that address. Too many memory operations. Slow 🙁

2. Using a local structure

Example code:

#include <cstdio>

auto struct_2() {
  struct _ {        // Declare a local structure with 2 integers
    int i1, i2;
  };
  return _{21, 22}; // Return the local structure
}

// Use a volatile function pointer so the compiler cannot inline the call
auto (*volatile struct_2_ptr)() = struct_2;

int main() {
  auto [s1, s2] = struct_2_ptr(); // Structured binding declaration
  printf("struct_2 s1 = %d, s2 = %d\n", s1, s2);
}

Disassembly:

struct_2():
  movabs rax, 0x1600000015  # Just return 2 integers in rax
  ret

main:                 # Note: simplified
  call [struct_2_ptr] # No need to load output param addresses
  mov rdx, rax        # Just use the values returned in rax
  shr rdx, 32         # High 32 bits of rax
  mov rcx, rax
  mov esi, ecx        # Low 32 bits of rax
  call printf

Compiler Explorer: https://godbolt.org/z/Q7P4q0

Pros:

  • Works with any C++ standard, including C, though the structure must be declared outside the function scope.
  • Returns up to 128 bits in registers, no stack is used. Fast!
  • Does not require addresses of the params, which allows the compiler to better optimize the code.

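Cons:

  • Requires declaring a structure.
  • The function can’t be overloaded, since C++ overloads cannot differ only in return type.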

What happens when we try to return more values? According to the System V AMD64 ABI, values of up to 128 bits are returned in RAX and RDX, so up to four 32-bit integers fit in registers. One byte more and the stack has to be used.

Still, we don’t need to load output param addresses, so it is faster than the output parameters method.

3. Using an std::pair

Example:

#include <cstdio>
#include <utility>

auto pair_2() { return std::make_pair(31, 32); } // Just one line!
// Use a volatile function pointer so the compiler cannot inline the call
auto (*volatile pair_2_ptr)() = pair_2;
int main() {
  auto [p1, p2] = pair_2_ptr();  // Structured binding declaration
  printf("pair_2 p1 = %d, p2 = %d\n", p1, p2);
}

The generated assembly code:

pair_2():
  movabs rax, 0x200000001f  # Just return 2 integers in rax
  ret

main:                 # Note: simplified
  call [pair_2_ptr]   # Just call the function
  mov rdx, rax        # Use the values returned in rax
  shr rdx, 32
  mov rcx, rax
  mov esi, ecx
  call printf

Compiler Explorer: https://godbolt.org/z/9iXzSb

Pros:

  • Just one line of code!
  • No need to declare the local structure.
  • Just like with the structures, returns up to 128 bits in registers, no stack is used.

Cons:

  • A pair holds exactly two return values.
  • Just like with the structures, the function can’t be overloaded.

4. Using an std::tuple

Example:

#include <cstdio>
#include <tuple>

auto tuple_2() { return std::make_tuple(41, 42); } // Just one line!
// Use a volatile function pointer so the compiler cannot inline the call
auto (*volatile tuple_2_ptr)() = tuple_2;
int main() {
  auto [t1, t2] = tuple_2_ptr();  // Structured binding declaration
  printf("tuple_2 t1 = %d, t2 = %d\n", t1, t2);
}

The code compiles to:

tuple_2():
  movabs rax, 0x290000002a  # Good start, but...
  mov [rdi], rax            # Indirect write to an output parameter?
  mov rax, rdi              # Return the address of the parameter
  ret

main:                 # Note: simplified
  mov rdi, rsp        # Pass stack pointer as a parameter
  call [tuple_2_ptr]  # Call the function
  mov edx, [rsp]      # Get the values from the stack
  mov esi, [rsp + 4]
  call printf

Compiler Explorer: https://godbolt.org/z/hSVV72

Pros:

  • The source code is a one-liner, just like with the std::pair.
  • Unlike the std::pair, it is easy to add more values.

Cons:

  • Unfortunately, the disassembly is a mixed bag. We need to pass the address of the output tuple to the function (one address per tuple).
  • Even for two integers (64 bits), the return values are always on the stack. Slow 🙁

What if we return more values in the tuple? Adding more values does not change the disassembly much: we still pass just one address pointing to the stack, then we store the values at that address (on the stack), and then we load them back from the stack to use for printf().

It’s slower than the pair and the structure, which both return up to 128 bits in the registers. But it’s faster than the output parameters, where we need to pass a few addresses to the function, not just one.

Summary:

  1. The fastest methods to return multiple values in C++17 are a local structure and std::pair.
  2. Prefer std::pair to return two values: it is the most convenient and the fastest method.
  3. Use output parameters when function overloading is needed. That’s why std::from_chars() uses an output parameter together with a returned structure.

Full source code: https://github.com/berestovskyy/applied-cpp

The std::pair is the most convenient and fastest method to return two values. To return more than two values, use a local structure (faster) or std::tuple (more convenient) instead.