Vaxry's blog

C++ Interacting With C? Wrap it!

19 VII 2023

C++ is a generally great language, offering versatility, speed and power to those who have the time to learn it well. It does have some weaknesses, like the well-known "10 ways to accomplish one thing, C++ has them all", but one of the most criticized weak spots of C++ is the apparent lack of memory safety.

That is true in the sense that no C++ compiler will stop you from making memory faults, unlike Java (where the language does not allow them by default) or Rust (where the compiler does not allow them by default). Still, C++ is a pretty memory-safe language as long as you understand what you are writing and follow basic C++ etiquette.

Most C++ memory issues arise from two main causes:

I will not focus on the first here, because that might be too obvious, but rather on the latter.

If we show a very basic example of C vs C++:

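#include <stdio.h>

int main() {
    char* myString; // never initialized, so it points nowhere meaningful
    printf("%s\n", myString); // reads through a garbage pointer
    return 0;
}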

This is obviously not memory-safe: reading through an uninitialized pointer is undefined behavior, and in practice it will either segfault or print garbage.

#include <iostream>
#include <string>

int main() {
    std::string myString; // default-constructed, so it is a valid empty string
    std::cout << myString << "\n";
    return 0;
}

This is perfectly memory-safe and will print a single newline, because a default-constructed std::string is empty. The C code is also valid in C++, but it carries its lack of memory safety with it.

As another example, it's common practice to return a nullptr (in C, NULL) from a function to indicate that no object was returned. This, obviously, can lead to various segfaults if we do not check that an object was actually returned. There is a solution to that, however, in C++:

std::optional<SomeClass*> getObject(int someCriteria);

This makes it visible to the programmer that the function might return no value. It also discourages using the object directly and encourages you to call .has_value() first to ensure its validity, or to use something like .value_or(). You can still ignore that and call .value() directly, but then you're just fully consciously asking for trouble. Even then, calling .value() on an std::optional with no value throws an exception (std::bad_optional_access) rather than segfaulting, which is easier to debug, is catchable, etc.
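To make the three access styles concrete, here is a minimal sketch. SomeClass, the body of getObject() and the two helper functions are invented for illustration; only the std::optional calls are standard.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical class, invented for this sketch.
struct SomeClass {
    std::string name;
};

// Hypothetical lookup: only criteria 1 matches anything.
std::optional<SomeClass*> getObject(int someCriteria) {
    static SomeClass instance{"found"};
    if (someCriteria == 1)
        return &instance;
    return std::nullopt;
}

// Careful access: check has_value() before touching the pointer.
std::string describe(int someCriteria) {
    const auto OBJ = getObject(someCriteria);
    if (!OBJ.has_value())
        return "nothing";
    // The optional only encodes "no value"; the pointer inside could still be null.
    assert(*OBJ != nullptr);
    return (*OBJ)->name;
}

// Careless access: .value() on an empty optional throws instead of segfaulting.
bool carelessAccessThrows(int someCriteria) {
    try {
        (void)getObject(someCriteria).value();
    } catch (const std::bad_optional_access&) {
        return true;
    }
    return false;
}
```

Calling describe(0) takes the "nothing" branch, while carelessAccessThrows(0) ends up in the catch block, which is the catchable failure mode described above.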

We could go on and on about the various examples, but I think you get the point by now.

The meat of the article

The title is about wrapping, so let's get on to that.

Many of the issues also arise from interacting with C libraries. As an example, wayland.

In wayland, you get a bunch of so-called "signals", which are basically events. You can register your functions to listen to them.

Cool, now what?

Since this is a C API, it will store your pointer until you tell it not to anymore. This is obviously not very memory-safe: if you register a pointer to something and that thing then gets destroyed, the pointer won't go away. Why would it? The library does not know.
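A rough model of that behavior, without pulling in wayland itself. MockSignal, MockListener and the helper functions are hypothetical stand-ins, not the wayland API; they only mimic the rule that the library keeps whatever pointer it was handed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C library signal. This is not the wayland API,
// only a model of the same rule: the library keeps whatever pointer it is given.
struct MockListener {
    void (*notify)(MockListener* self, void* data);
};

struct MockSignal {
    std::vector<MockListener*> listeners;
};

void mockSignalAdd(MockSignal* signal, MockListener* listener) {
    assert(signal && listener);
    signal->listeners.push_back(listener);
}

void mockSignalRemove(MockSignal* signal, MockListener* listener) {
    assert(signal && listener);
    auto& list = signal->listeners;
    list.erase(std::remove(list.begin(), list.end(), listener), list.end());
}

// Callback that does nothing; the sketch never emits the signal.
void noopNotify(MockListener*, void*) {}

// Returns how many entries the signal still holds once the listener object
// itself has gone out of scope. The stale entry is only counted, never used.
std::size_t entriesLeftAfterScopeExit(bool unregisterFirst) {
    MockSignal signal;
    {
        MockListener listener{noopNotify};
        mockSignalAdd(&signal, &listener);
        if (unregisterFirst)
            mockSignalRemove(&signal, &listener);
    } // listener is destroyed here and nobody tells the signal
    return signal.listeners.size();
}
```

Without the explicit remove, the signal is left holding one pointer to an object that no longer exists. Emitting the signal at that point would call through it.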

This is where wrapping comes into play.

As an example, let's look at Hyprland's wayland signal wrapper.

CHyprWLListener::CHyprWLListener(wl_signal* pSignal,
        std::function<void(void*, void*)> callback, void* pOwner) {
    initCallback(pSignal, callback, pOwner);
}

CHyprWLListener::~CHyprWLListener() {
    removeCallback();
}

Simple. When the listener is created, it gets registered; when it gets destroyed, it gets unregistered. Great, memory issues avoided, right?
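The same idea as a self-contained sketch. This is not Hyprland's actual implementation: MockSignal and CScopedListener are hypothetical names, and the signal is a plain vector rather than a wl_signal.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical signal type standing in for wl_signal.
struct MockSignal {
    std::vector<std::function<void(void*)>*> callbacks;

    void emit(void* data) {
        for (auto* cb : callbacks)
            (*cb)(data);
    }
};

// Registers on construction, unregisters on destruction.
class CScopedListener {
  public:
    CScopedListener(MockSignal* pSignal, std::function<void(void*)> callback)
        : m_pSignal(pSignal), m_fCallback(std::move(callback)) {
        assert(m_pSignal);
        m_pSignal->callbacks.push_back(&m_fCallback);
    }

    ~CScopedListener() {
        auto& cbs = m_pSignal->callbacks;
        cbs.erase(std::remove(cbs.begin(), cbs.end(), &m_fCallback), cbs.end());
    }

    // The signal stores the address of m_fCallback, so copies must be forbidden.
    CScopedListener(const CScopedListener&)            = delete;
    CScopedListener& operator=(const CScopedListener&) = delete;

  private:
    MockSignal*                m_pSignal;
    std::function<void(void*)> m_fCallback;
};

// Emits once while the listener is alive and once after it is gone.
int countCallsAcrossLifetime() {
    MockSignal signal;
    int        calls = 0;
    {
        CScopedListener listener(&signal, [&calls](void*) { calls++; });
        signal.emit(nullptr); // listener alive: callback runs
    }
    signal.emit(nullptr); // listener gone: nothing is registered anymore
    return calls;
}
```

Note that the destructor assumes the signal is still alive when it runs, which is exactly the assumption the rest of this post is about.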

Well, no.

The importance of proper wrapping

Wrappers are great, especially for cases like above, where you have to explicitly remove something when the object gets destroyed.

Why, then, has Hyprland had two separate issues during the last month where the callback was not unregistered when the object was destroyed? What if I told you that unregistering these callbacks could itself be a memory safety violation?

That is why proper wrapping is important.

If I could tell my younger self starting Hyprland one thing, it would be to wrap stuff better. I will be getting to it some day, but for now I am too lazy to recode such a core part of Hyprland.

It's important to understand that these callbacks also reside somewhere. Removing them means removing an object from a list of callbacks, somewhere.

Both of these issues arose because the object holding that list was destroyed, but the C++ object was not destroyed in the destroy event.

Picture this:

A C struct, wlr_output, has some events, for example .frame, which is a member of the struct. We register our handler for that event from a C++ wrapper called CMonitor.

wlr_output is being destroyed. It sends us the event that it's gonna be destroyed soon, so we should unregister the events. However, due to poor design decisions, our C++ CMonitor wrapper stays around for a bit longer, maybe for compatibility reasons, maybe for some other reasons. It gets marked as disabled or whatever, but stays around. The callbacks do not get unregistered, as CMonitor continues living.

The C struct member holding our callbacks gets destroyed.

Now, a while later, CMonitor gets destroyed, so we removeCallback(). But wait, the list holding the callbacks is destroyed, so we have a memory violation! We are writing to an object that no longer exists...

This is why it's important to have your wrappers live if and only if the thing they are wrapping is also alive. Alternatively, where possible, you could listen for an event telling you it's being destroyed and unregister stuff in that handler; however, that might become confusing quickly.
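A minimal sketch of the second option, unregistering from a destroy handler. MockSignal, MockOutput and CMonitorSketch are hypothetical stand-ins for wl_signal, wlr_output and CMonitor, and the code assumes the wrapped object announces its destruction before its signals go away.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-ins for wl_signal and wlr_output; not the real types.
struct MockSignal {
    std::vector<std::function<void()>*> callbacks;

    void add(std::function<void()>* cb) { callbacks.push_back(cb); }
    void remove(std::function<void()>* cb) {
        callbacks.erase(std::remove(callbacks.begin(), callbacks.end(), cb), callbacks.end());
    }
    void emit() {
        // Iterate over a copy so a handler may unregister itself while running.
        const auto COPY = callbacks;
        for (auto* cb : COPY)
            (*cb)();
    }
};

struct MockOutput {
    MockSignal frame;
    MockSignal destroy;
    ~MockOutput() { destroy.emit(); } // announce destruction while the lists still exist
};

class CMonitorSketch {
  public:
    explicit CMonitorSketch(MockOutput* pOutput) : m_pOutput(pOutput) {
        assert(m_pOutput);
        m_fOnFrame   = [this] { m_iFrames++; };
        m_fOnDestroy = [this] { unregister(); };
        m_pOutput->frame.add(&m_fOnFrame);
        m_pOutput->destroy.add(&m_fOnDestroy);
    }
    ~CMonitorSketch() { unregister(); }

    CMonitorSketch(const CMonitorSketch&)            = delete;
    CMonitorSketch& operator=(const CMonitorSketch&) = delete;

    bool registered() const { return m_pOutput != nullptr; }
    int  frames() const { return m_iFrames; }

  private:
    void unregister() {
        if (!m_pOutput)
            return; // the output is already gone, its lists must not be touched
        m_pOutput->frame.remove(&m_fOnFrame);
        m_pOutput->destroy.remove(&m_fOnDestroy);
        m_pOutput = nullptr;
    }

    MockOutput*           m_pOutput;
    int                   m_iFrames = 0;
    std::function<void()> m_fOnFrame, m_fOnDestroy;
};

// The output dies first while the wrapper lingers, as in the scenario above.
bool wrapperOutlivesOutputSafely() {
    auto           output = std::make_unique<MockOutput>();
    CMonitorSketch monitor(output.get());
    output->frame.emit();
    output.reset(); // destroy event fires and the wrapper unregisters itself
    return !monitor.registered() && monitor.frames() == 1;
} // ~CMonitorSketch runs here and finds nothing left to remove

// The opposite order: the wrapper dies first and cleans up after itself.
std::size_t callbacksLeftWhenWrapperDiesFirst() {
    MockOutput output;
    { CMonitorSketch monitor(&output); }
    return output.frame.callbacks.size() + output.destroy.callbacks.size();
}
```

The extra m_pOutput check is the bookkeeping that can become confusing quickly: every path that touches the wrapped object now has to account for it possibly being gone. Tying the wrapper's lifetime to the wrapped object avoids that state entirely.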

The moral of the story

Wrap your C stuff well to avoid headaches in the future. Pure, modern C++ will rarely produce memory faults, as long as your code is not doing stupid stuff because you asked for it :)

Small closer

I will talk a bit more about avoiding footguns in C++ in a later blogpost, stay tuned :)


Questions, comments, mistakes? Ping me a mail at vaxry [at] vaxry.net and I'll get back to ya.