Vaxry's blog

I code, sometimes.

Writing my own malloc to fix my trampolines

14 IV 2024

This is a continuation of a blog post I wrote quite a while back about x86 trampolines on Linux. Read it before this one: link.

A while back I tweeted a bit of code that fixed almost all my problems with trampoline hooks. It's been my most successful tweet of all time, probably because I consciously made a memory leak and made a meme out of it.

Let's explain what happened and why it's so cool.

The key problem we had

The key problem was that we had to rewrite the copied assembly. Instructions that take addresses relative to the instruction pointer register (%rip) encode them as a signed 32-bit displacement, so they can only reach about 2 GiB in either direction and could not handle such a long jump. Our trampoline memory would be allocated at a virtual address further away than that (that's just kinda how virtual memory in Linux works).

Thus, we had to change the operations.
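To make the range limit concrete, here is a minimal sketch of the check, not code from the actual hook system. The name fitsInRel32 is made up for illustration. It just asks whether the distance from the end of an instruction to its target fits into a signed 32-bit displacement.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true if `target` is reachable from the address right
// after an instruction (`nextInsn`) with a signed 32-bit %rip-relative offset.
bool fitsInRel32(uint64_t nextInsn, uint64_t target) {
    // Unsigned subtraction wraps, the cast turns it into a signed distance.
    const int64_t delta = static_cast<int64_t>(target - nextInsn);
    return delta >= std::numeric_limits<int32_t>::min() &&
           delta <= std::numeric_limits<int32_t>::max();
}
```

Code sitting around 0x5555... and a heap allocation around 0x7fff... fail this check by a wide margin, which is exactly the situation described above.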

This, however, is quite tedious and leads to bugs, as I've mentioned previously, because pretty much every instruction had to be examined manually and get its own specific fix.

Well, not anymore!

A custom malloc

All in all, malloc is not a syscall. It's just a C library function, written in C, by someone that wrote the C standard library. Probably the GNU guys in most cases.

This means that nothing is stopping us from rewriting it ourselves, in C or C++. That's precisely what I did.

See, by default virtual memory works a bit like this:

0x4000000000000000:
<program code>
<grows down>

...


<grows up>
<heap, stack, data, etc>
0x7fffffffffffffff:

(This is a simplified, platform-dependent example, please do not take it as 100% accurate)

So, if we just malloc some data for our trampoline, it gets sent all the way down to hell somewhere around 0x7fffffffffffffff.

However, nobody is saying that we have to abide by the "heap grows up" rule. There is no such "rule", it's merely a convention.

So, if we are able to allocate memory closer to program code, we won't need to rewrite anything, as we can just add the offset to all %rip uses and call it a day!

Writing your own malloc

The syscalls that govern memory allocation are mmap and munmap.

Unfortunately, in order to make our systems run smoothly, we can't allocate arbitrary amounts of memory. We need to allocate in "pages", which are a fixed size. On x86-64 Linux, that's 4KiB by default.
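If you'd rather not hardcode the page size, it can be queried at runtime. This is a small sketch with made-up helper names (pageSize, roundUpToPage), using the POSIX sysconf call. It is not taken from the actual code.

```cpp
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: ask the system for its page size (4096 on typical x86-64 Linux).
size_t pageSize() {
    return static_cast<size_t>(sysconf(_SC_PAGESIZE));
}

// Hypothetical helper: round a byte count up to a whole number of pages.
size_t roundUpToPage(size_t bytes, size_t page) {
    return (bytes + page - 1) / page * page;
}
```

So a request for a single 64B hook still costs a whole page from the kernel's point of view, which is why it makes sense to pack many hooks into one page.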

The malloc for trampolines here is simple. Whenever a new call to "getAddressForTrampo" is made, we:

As you can see, we never "free" the allocated memory. Why? Simple.

Every page will have dozens of users using a few bytes each at best (a 4KiB page fits 64 hooks). Tracking every user would be painful, and ultimately not worth it. Every hook is 64B. Nobody hooks more than 100 times, like, ever.

And to all people who suggest shared_ptr and stuff, man, we're writing our own malloc here. You can't just "free()".
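For illustration, a never-freeing allocator like the one described in this section could look roughly like the sketch below. This is not the actual code: the names (CTrampolineArena, SPage, kSlotSize, kPageSize) are made up, the 64B slot size comes from the numbers above, and the 4KiB page size is an assumption. To keep it short it lets the kernel pick the address and maps the page read/write only. Real trampoline memory would also need to be executable and placed near the program code.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <vector>

constexpr size_t kSlotSize = 64;   // one hook, per the post
constexpr size_t kPageSize = 4096; // assumed page size

struct SPage {
    uint8_t* base; // start of the mapped page
    size_t   used; // bytes already handed out
};

// Hypothetical bump allocator: hands out 64B slots and never frees them.
class CTrampolineArena {
  public:
    void* allocate() {
        // Map a fresh page if there is none yet or the last one is full.
        if (m_pages.empty() || m_pages.back().used + kSlotSize > kPageSize) {
            void* p = mmap(nullptr, kPageSize, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (p == MAP_FAILED)
                return nullptr;
            m_pages.push_back({static_cast<uint8_t*>(p), 0});
        }
        SPage& page = m_pages.back();
        void*  slot = page.base + page.used;
        page.used += kSlotSize;
        return slot;
    }

    size_t pageCount() const { return m_pages.size(); }

  private:
    std::vector<SPage> m_pages; // intentionally never munmap'd
};
```

With these numbers, 100 hooks need two pages, so the "leak" tops out at around 8KiB.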

Allocating a new page

Since we're only dealing with Linux (this probably doesn't work on BSD but whatever) we can do it the Linux way.

Open a terminal and type cat /proc/self/maps:

These are all the allocated pages. As you can see, we got the program code for cat at around 5b7... and other stuff like the stack, heap, libraries, etc, at 7ff... and 797...

This means that all we need to do is start reading line by line from the top, and find the first available place in our memory after the program code.
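As a rough sketch of that scan (not the code from the commit), the function below walks text in /proc/self/maps format and returns the first hole between two consecutive mappings that is big enough. The names SGap, firstGap and firstGapInSelf are made up, and it assumes the first mapping listed is the program itself.

```cpp
#include <cstdint>
#include <fstream>
#include <optional>
#include <sstream>
#include <string>

struct SGap {
    uint64_t start; // first free address
    uint64_t end;   // start of the next mapping
};

// Hypothetical helper: find the first hole of at least `minSize` bytes.
// Each maps line begins with "start-end perms ...", addresses in hex.
std::optional<SGap> firstGap(const std::string& mapsText, uint64_t minSize) {
    std::istringstream in(mapsText);
    std::string        line;
    uint64_t           prevEnd  = 0;
    bool               havePrev = false;
    while (std::getline(in, line)) {
        const auto dash  = line.find('-');
        const auto space = line.find(' ');
        if (dash == std::string::npos || space == std::string::npos || dash > space)
            continue; // not a mapping line
        const uint64_t start = std::stoull(line.substr(0, dash), nullptr, 16);
        const uint64_t end   = std::stoull(line.substr(dash + 1, space - dash - 1), nullptr, 16);
        if (havePrev && start > prevEnd && start - prevEnd >= minSize)
            return SGap{prevEnd, start};
        prevEnd  = end;
        havePrev = true;
    }
    return std::nullopt;
}

// Convenience wrapper that reads the current process' own map.
std::optional<SGap> firstGapInSelf(uint64_t minSize) {
    std::ifstream     file("/proc/self/maps");
    std::stringstream buf;
    buf << file.rdbuf();
    return firstGap(buf.str(), minSize);
}
```

Calling firstGapInSelf(4096) gives a candidate address range for one page of trampolines.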

Once we find it (in this case, it's from 5b712b54c000 until 79786e400000) we can pass it on to mmap as our desired place in memory. Without MAP_FIXED the address is only a hint, but mmap will grant us that memory as long as it's not already allocated.

Bingo! We got 4KiB very close to our program code that we can now use for our trampoline code.
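The mmap call itself can be sketched like this. It's a minimal example with a made-up name (mapNear), not the code from the commit. The address is passed as a plain hint, so the kernel is free to put the mapping somewhere else if the range is taken. The caller should compare the returned address against what it asked for.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Hypothetical helper: ask for `len` bytes of anonymous memory at or near `hint`.
// Returns nullptr on failure. The result is NOT guaranteed to equal `hint`.
void* mapNear(uint64_t hint, size_t len) {
    // Read/write only in this sketch. Trampoline memory would also need PROT_EXEC.
    void* p = mmap(reinterpret_cast<void*>(hint), len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}
```

If you want a hard failure instead of a silently relocated mapping, Linux 4.17 and newer offer MAP_FIXED_NOREPLACE, which refuses to map over an existing range.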

All that's left is to re-calculate all offsets for %rip calls and we're done. No more assembly code rewriting.
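The offset math is short enough to sketch. A %rip-relative operand points at (address right after the instruction) + displacement, so when the instruction gets copied into the trampoline, the new displacement is the old target minus the new "address right after". The function name relocateDisp and its parameters are made up for this example.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: recompute a %rip-relative displacement for an
// instruction that was copied from one address to another.
//   origInsnEnd: address right after the original instruction
//   newInsnEnd:  address right after the copy in the trampoline
// Returns nullopt if the target is out of the 32-bit range from the new spot.
std::optional<int32_t> relocateDisp(int32_t origDisp, uint64_t origInsnEnd, uint64_t newInsnEnd) {
    // Absolute address the original instruction referred to.
    const uint64_t target = origInsnEnd + static_cast<uint64_t>(static_cast<int64_t>(origDisp));
    // Signed distance from the copied instruction to that same target.
    const int64_t newDisp = static_cast<int64_t>(target - newInsnEnd);
    if (newDisp < INT32_MIN || newDisp > INT32_MAX)
        return std::nullopt;
    return static_cast<int32_t>(newDisp);
}
```

Since the trampoline page now sits right next to the program code, the nullopt case should basically never trigger, and that's the whole point.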

You can check out the actual commit for the exact code here, on github.

Closer

This is not undefined behavior, nor memory-unsafe code, surprisingly. Memory leaks are not memory-unsafe, and although a bad actor technically can write to the memory addresses if they can add a hook, at this point, you are running a .so from someone malicious, which means they can also do std::filesystem::remove_all("/home/you/") in the code anyways.

No other app running on your system can write there, so it's not a vulnerability.

Rust would also not have fixed this, we'd be writing the same thing, just in an unsafe{} block.

Special thanks to all the developers telling me the hook system is cursed. It's the most used feature of the plugin API. :P


Questions, comments, mistakes? Ping me a mail at vaxry [at] vaxry.net and I'll get back to ya.