Les Imbroglios d'Alexis Breust


Alexis Breust - May 28, 2018

Every now and then, I find myself looking for a way to log an enum. The fun fact is that, while it is easy to log one, the logging code is hard to maintain. Keeping a switch statement in sync with the enum definition, or generating a new file during a pre-build step, is not exciting. We can do better, right?

Expected behavior

Imagine some natural developer code:

#include <iostream>

enum PowerLevel {
    High,
    Low,
};

int main(void)
{
    PowerLevel powerLevel = PowerLevel::High;
    std::cout << "Current power level: " << powerLevel << std::endl;

    return 0;
}

The fact is, this code logs "Current power level: 0", which is useless more often than not.

I would love to get "Current power level: PowerLevel::High", and explicitly cast to an int only when needed. It would make C++ feel like what it should be.

Being a developer-friendly low-level language is basically what drives the Jai language.
But that is another topic...

1st solution: DIY

One straightforward way to get this behaviour is to implement everything manually:

constexpr const char* stringify(PowerLevel value) {
    switch (value) {
        case PowerLevel::High: return "PowerLevel::High";
        case PowerLevel::Low: return "PowerLevel::Low";
        // No default case so that a compiler warning
        // can be emitted when missing something.
    }
    return "PowerLevel::<Unknown>";
}

Convert a PowerLevel value to a string.

inline std::ostream& operator<<(std::ostream& os, PowerLevel value) {
    os << stringify(value);
    return os;
}

Overload ostream.operator<<.

That's the simplest way, but that stringify function is hard to maintain and prone to copy-paste errors.

Thankfully (?), there are reflection proposals for C++20, but that might not be for any time soon. And, as usual with C++ metaprogramming, it will be ugly and impractical.

std::meta::get_base_name_v<
    std::meta::get_element_m<
        std::meta::get_enumerators_m<reflexpr(PowerLevel)>, 0
    >
>

The proposed C++20 way of getting the name of element 0 of PowerLevel: "High".

2nd solution: Build task

One way to handle the problem would be to generate code during your build process.

But, as you know, whatever tool you use, this is always a bad idea, as you would have to parse C++ and generate a new file.

That might sound fun but:

If you're using bash, Python or any other scripting language, it will add that dependency to your project.
If you're using cmake… "Ah! I laugh!" (Seriously, don't use cmake anymore.)
If you're using premake, that's a bit better, but it might be some work to parse the C++.
If you're creating a dedicated C++ binary during the first step of your build process, I'm impressed, but that sounds like overkill.
One cannot just take the include/ folder content from your repo to link against prebuilt binaries, as they will need to build your project first.

So, I wouldn't recommend that to anyone.

3rd solution: Macros

The previous solutions are not perfect, and the following one will not be either. Here, I will present a solution based on macros. If you are a bit like me, you might learn a lot about them.

I always say that the plural of macro should be macry, and I am definitely not the only one to avoid them as much as possible. Because, as you know, macros tend to invade your codebase once you start using them. It's like a broken window, except you won't fix it because it's so useful. And it ends up like the Unreal Engine C++ interface.

Wanted API

Whenever you design anything, start by thinking about how you would like to use the feature you are building. In our case, the perfect API would surely be something like:

$enum PowerLevel {
    High,
    Low,
};

But if our macro token is only $enum we will have a hard time generating code that checks each enum type, as it is not forwarded to the macro context via arguments.

I prefix all my macros with $. It is clearly an arbitrary convention, but I like it better than full uppercase words. However, this is not MSVC-friendly. Be aware of that.
Moreover, the dollar sign $ might become a "reflection operator" in future C++. So it is certainly not safe to use.

Thus, the way macros work forces us to use:

$enum(PowerLevel,
    High,
    Low,
);

And, somehow, that does not feel too bad to me.

We start our code with the structure of what we need:

#define $enum(...)                  \
    $enum_enum(__VA_ARGS__)         \
    $enum_stringify(__VA_ARGS__)    \
    $enum_ostream(__VA_ARGS__)

#define $enum_enum(...)
#define $enum_stringify(...)
#define $enum_ostream(...)

$enum_enum will generate the pure enum PowerLevel;
$enum_stringify will generate the stringify function;
$enum_ostream will generate the operator<< overload.

Generating enum

That's the easiest part of all:

#define $enum_enum(Enum, ...)   \
    enum Enum {                 \
        __VA_ARGS__             \
    };

The variadic arguments are simply pasted as they were, and a nice enum is generated. That is what we started with, so it's a good step in the right direction.

The hard part is the following section.

Generating stringify overload

We want to generate the way-above stringify function. Let's start with a simple base.

#define $enum_stringify(Enum, ...)                    \
    constexpr const char *stringify(Enum value) {     \
        switch (value) {                              \
            $enum_stringify_cases(Enum, __VA_ARGS__); \
        }                                             \
        return #Enum "::<Unknown>";                   \
    }

The #Enum will be translated to "PowerLevel", and C++ treats two consecutive string literals as a single one, so "PowerLevel" "::<Unknown>" is all right.

We just have to generate all the individual cases: case PowerLevel::High: return "PowerLevel" "::" "High";. And that's hard because macros offer neither loops nor recursion.

At least, not officially.

Without recursivity in macros

Basically this code:

#define $rec(Value, ...) \
    Value;               \
    $rec(__VA_ARGS__)

$rec(a, b, c);

Just expands to:

a;
$rec(b, c);

And that's sad… The fact is that, while $rec is being expanded, any $rec found in its own output is marked as already seen, and the preprocessor won't expand it anymore. Moreover, $rec() would expand to ; $rec(), so we would need to handle this end-case somehow.

Trying something new, we could do something like:

#define $rec_3(Value, ...) \
    Value;                 \
    $rec_2(__VA_ARGS__)
#define $rec_2(Value, ...) \
    Value;                 \
    $rec_1(__VA_ARGS__)
#define $rec_1(Value) \
    Value;

$rec_3(a, b, c) expands to a; b; c; as expected.

That's nice but the user has to know how many arguments there are. That's no fun.

But being clever helps us a lot:

#define $args_count(...) $args_count_(__VA_ARGS__, 4, 3, 2, 1, 0)
#define $args_count_(_1, _2, _3, _4, n, ...) n

$args_count(a, b, c) expands to 3. (Yeah, that's clever.)

Yeah, we can count arguments ourselves! Let's use that:

#define $rec(...) $rec_$args_count(__VA_ARGS__)(__VA_ARGS__)

Sadly, $rec(a, b, c) expands to $rec_$args_count(a, b, c)(a, b, c). That's because $rec_$args_count is seen as a single token, why would it expand?

Trying $rec_##$args_count to exploit the macro concatenation operator ## won't work, because it concatenates before evaluating $args_count(a, b, c).

#define $rec(...) $rec_($args_count(__VA_ARGS__))(__VA_ARGS__)
#define $rec_(...) $rec_##__VA_ARGS__

Trying to force evaluation of $args_count(__VA_ARGS__).

However, it still evaluates to $rec_$args_count(a, b, c)(a, b, c).

#define $rec(...) $rec_($args_count(__VA_ARGS__))(__VA_ARGS__)
#define $rec_(...) $rec__(__VA_ARGS__)
#define $rec__(...) $rec_##__VA_ARGS__

Trying to really force evaluation of $args_count(__VA_ARGS__).

Somehow, that works, and $rec(a, b, c) expands to a; b; c; thanks to the above $rec_3.

And we could stop there. You would just have to add more numbers to $args_count and more variants of $rec. But we won't: enums tend to grow arbitrarily fast, and writing all these cases by hand is neither satisfying nor concise enough.

Fact is, if you have three functions that handle 50 arguments each, that will add at least 150 code lines to your code base. Still no fun.

Let's try something else, more complex, but easier to maintain.

Pseudo-recursivity with macros

First, we need to experiment:

#define $rec(...) $rec_(__VA_ARGS__)
#define $rec_(Value, ...) \
    Value;                \
    $rec_(__VA_ARGS__)

$rec(a, b, c) expands to a; $rec_(b, c).

Even with this one level of indirection, it is still not expanding fully. But, like we did for $args_count not evaluating, we could try some trick:

#define $eval_once(...) __VA_ARGS__

#define $rec(...) $eval_once($rec_(__VA_ARGS__))
#define $rec_(Value, ...) \
    Value;                \
    $rec_(__VA_ARGS__)

Trying to force pseudo-recursive evaluation.

The fact is that it does not work either. The preprocessor prevents expansion of the second $rec_ because the first one is known to be the direct source of the second.

We need to trick the preprocessor so that it is not $rec_ that is generated after first step.

#define $rec(...) $rec_(__VA_ARGS__)
#define $rec_(Value, ...) \
    Value;                \
    $rec_indirect()(__VA_ARGS__)
#define $rec_indirect() $rec_

Won't work: $rec_indirect is evaluated immediately.

After the first evaluation, $rec(a, b, c) still expands to a; $rec_(b, c). We have to somehow delay the evaluation of $rec_indirect.

#define $void()

#define $rec(...) $rec_(__VA_ARGS__)
#define $rec_(Value, ...) \
    Value;                \
    $rec_indirect $void()()(__VA_ARGS__)
#define $rec_indirect() $rec_

Good start, $rec(a, b, c) expands to a; $rec_indirect ()(b, c).

Please note that $rec_indirect has to be a function-like macro. If it were defined without parentheses, it would evaluate immediately.

The first pass evaluates $void() to nothing, but by then it has already moved past $rec_indirect without seeing anything interesting to expand.

#define $rec(...) $eval_once($eval_once($eval_once($rec_(__VA_ARGS__))))

By forcing evaluation, $rec(a, b, c) now expands to a; b; c; ; $rec_indirect ()().

Oh boy, we are so close! We only miss two things:

an end-case;
a way to not write thousands of $eval_once.

#define $check_(x, n, ...) n
#define $check(...) $check_(__VA_ARGS__, 0)
#define $probe(x) x, 1,

#define $is_string_empty(s) $check($is_string_empty_##s)
#define $is_string_empty_ $probe(~)

#define $rec(...) $eval_once($eval_once($eval_once($rec_(__VA_ARGS__))))
#define $rec_(Value, ...) $rec__($is_string_empty(Value), Value, __VA_ARGS__)
#define $rec__(n, ...) $rec___(n, __VA_ARGS__)
#define $rec___(n, ...) $rec_check_##n(__VA_ARGS__)
#define $rec_indirect() $rec_

#define $rec_check_0(...) $rec_base(__VA_ARGS__)
#define $rec_check_1(...) /* Empty end-case */

#define $rec_base(Value, ...) \
    Value;                    \
    $rec_indirect $void()()(__VA_ARGS__)

Handling end-case for pseudo-recursivity.

I won't spend too long on this code, as the $rec__{_} thing is what we have already seen to really force evaluation of the argument. And $check/$probe is some clever classic macro code that would certainly need its own blog article.

Anyway, with this, $rec(a, b, c) now expands to a; b; c;. That's nice, except we have a limit on the number of arguments because of $eval_once. But that is easily fixed:

#define $eval(...) $eval_1($eval_1($eval_1(__VA_ARGS__)))
#define $eval_1(...) $eval_2($eval_2($eval_2(__VA_ARGS__)))
#define $eval_2(...) $eval_3($eval_3($eval_3(__VA_ARGS__)))
#define $eval_3(...) $eval_4($eval_4($eval_4(__VA_ARGS__)))
#define $eval_4(...) $eval_5($eval_5($eval_5(__VA_ARGS__)))
#define $eval_5(...) $eval_6($eval_6($eval_6(__VA_ARGS__)))
#define $eval_6(...) __VA_ARGS__

#define $rec(...) $eval($rec_(__VA_ARGS__))
/* ... */

A way to force a high amount of $eval_once.

That's why it's "pseudo-recursivity". It has a "callstack limit", but this one is set to 3^6 = 729, which seems enough and can be easily expanded if needed.

We did it, we tackled pseudo-recursivity!

Use pseudo-recursivity on stringify

Basically, we just need to adapt our $rec_base to generate all the switch-cases:

#define $rec_base(Value, ...) \
    case Value:               \
        return #Value;        \
        $rec_indirect $void()()(__VA_ARGS__)

And that's it! Our $rec is our $enum_stringify_cases and everything is fine.

Generating ostream.operator<< overload

So… we worked a lot so far. Let's go back to something simple:

#define $enum_ostream(Enum, ...)                                    \
    inline std::ostream& operator<<(std::ostream& os, Enum value) { \
        os << stringify(value);                                     \
        return os;                                                  \
    }

We just use that stringify function!

Full example

Download the full macros-based example:

enum-to-string.cpp

This example uses some more complex macros ($cat, $bool) that I didn't explain in this article. But if you followed and understood a bit how things work, you should be able to get it without too much trouble.

#include <iostream>

$enum(PowerLevel,
      High,
      Low
);

int main(void)
{
    PowerLevel powerLevel = PowerLevel::High;
    std::cout << "Current power level: " << powerLevel << std::endl;

    return 0;
}

Logs "Current power level: PowerLevel::High".

Please note that this does not handle custom values for enums nicely. Some more work has to be done to handle that. Try it out!

References

pfultz2's C Preprocessor tricks, tips and idioms

C++ enum to string: the Macros Way