Writing my own malloc to fix my trampolines
14 IV 2024
This is a continuation to a blogpost I made quite a while back about x86 trampolines in Linux, read it before this one: link.
A while back I made a tweet about a bit of code that fixed almost all my problems with trampoline hooks, and it's been the most successful tweet of mine of all time, probably because I consciously made a memory leak, and made a meme out of it:
Let's explain what happened and why it's so cool.
The key problem we had was that we needed to rewrite assembly, because all opcodes taking addresses relative to the instruction pointer register (%rip) would not handle such a long jump: a %rip-relative operand only carries a signed 32-bit displacement. Our memory would be allocated at a virtual address that was more than 2.1GB away (that's just kinda how virtual memory in Linux works). Thus, we had to change the operations.
This, however, is quite tedious and leads to bugs, as I've mentioned previously: pretty much every instruction had to be manually examined and given its own specific fix.
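To make the distance problem concrete, here's a minimal sketch of the check that decides whether a %rip-relative operand can still reach its target. The function name is made up for illustration, it's not from the actual hook code.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: can an instruction whose *next* instruction starts at
// `nextInsn` reach `target` using a signed 32-bit %rip-relative displacement?
bool fitsInRel32(std::uint64_t nextInsn, std::uint64_t target) {
    // Unsigned subtraction wraps around; reading the result as signed gives
    // the distance in either direction.
    const auto diff = static_cast<std::int64_t>(target - nextInsn);
    return diff >= std::numeric_limits<std::int32_t>::min() &&
           diff <= std::numeric_limits<std::int32_t>::max();
}
```

With program code at something like 0x5b712b54c000 and the copy living at 0x79786e400000, this returns false, which is exactly why the instructions had to be rewritten.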
Well, not anymore!
All in all, malloc is not a syscall. It's just a C library function, written in C, by someone that wrote the C standard library. Probably the GNU guys in most cases.
This means that nothing is stopping us from rewriting it ourselves, in C or C++. That's precisely what I did.
See, by default virtual memory works a bit like this:
0x4000000000000000:
<program code>
<grows down>
...
<grows up>
<heap, stack, data, etc>
0x7fffffffffffffff:
(This is a simplified, platform-dependent example, please do not take it as 100% accurate)
So, if we just malloc some data for our trampoline, it gets sent all the way down to hell, somewhere around 0x7fffffffffffffff.
However, nobody is saying that we have to abide by the "heap grows up" rule. There is no such "rule", it's merely a convention.
So, if we are able to allocate memory closer to program code, we won't need to rewrite anything, as we can just add the offset to all %rip uses and call it a day!
The syscalls that govern memory allocation are mmap and munmap.
Unfortunately, in order to make our systems run smoothly, we can't allocate arbitrary amounts of memory. We need to allocate in "pages", which are a fixed size. On x86-64 Linux, that's 4KiB by default.
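If you don't want to hard-code 4096, the page size can be queried at runtime. A small sketch (both helper names are mine, not from the real code):

```cpp
#include <cstddef>
#include <unistd.h>

// Ask the system for the page size instead of assuming 4 KiB.
std::size_t pageSize() {
    return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}

// Round `bytes` up to a whole number of pages.
// Assumes `page` is a power of two, which page sizes always are.
std::size_t roundUpToPage(std::size_t bytes, std::size_t page) {
    return (bytes + page - 1) & ~(page - 1);
}
```

So asking for a single byte still costs a full page: roundUpToPage(1, 4096) is 4096.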
The malloc for trampolines here is simple. Whenever a new call to "getAddressForTrampo" is made, we:
As you can see, we never "free" the allocated memory. Why? Simple.
Every page will have dozens of users using a few bytes each at best. Tracking every user would be painful, and ultimately not worth it. Every hook is 64B, so a 4KiB page fits 64 of them. Nobody hooks more than 100 times, like, ever.
And to all people who suggest shared_ptr and stuff, man, we're writing our own malloc here. You can't just "free()".
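The "allocate and never free" idea boils down to a bump allocator. Here's a rough sketch, assuming 64-byte slots as mentioned above. The class and member names are invented for this example, and the page here comes from a plain anonymous mmap wherever the kernel feels like putting it. The real thing is in the commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical bump allocator: hands out fixed-size slots and never frees them.
class TrampolineArena {
  public:
    static constexpr std::size_t SLOT_SIZE = 64; // per-hook size quoted in the post

    // Returns a fresh 64-byte slot, or nullptr if the kernel refuses a new page.
    void* allocate() {
        if (m_used + SLOT_SIZE > m_pageLen) {
            // Current page is full (or there is none yet): grab a new one.
            // The old page is intentionally leaked, its slots are still in use.
            const auto len = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
            void* page = mmap(nullptr, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (page == MAP_FAILED)
                return nullptr;
            m_page    = static_cast<std::uint8_t*>(page);
            m_pageLen = len;
            m_used    = 0;
        }
        void* slot = m_page + m_used;
        m_used += SLOT_SIZE; // bump the cursor, nothing is ever handed back
        return slot;
    }

  private:
    std::uint8_t* m_page    = nullptr;
    std::size_t   m_pageLen = 0;
    std::size_t   m_used    = 0;
};
```

There is no deallocate() on purpose. The whole state is one pointer and two counters, which is the entire appeal.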
Since we're only dealing with Linux (this probably doesn't work on BSD but whatever) we can do it the Linux way.
Open a terminal and type cat /proc/self/maps:
These are all the allocated pages. As you can see, we got the program code for cat at around 5b7... and other stuff like the stack, heap, libraries, etc, at 7ff... and 797...
This means that all we need to do is start reading line by line from the top, and find the first available place in our memory after the program code.
Once we find it (in this case, it's from 5b712b54c000 until 79786e400000) we can pass it onto mmap as our desired place in memory, and mmap will grant you that memory as long as it's not already allocated.
Bingo! We got 4KiB very close to our program code that we can now use for our trampoline code.
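For reference, here's roughly what asking mmap for a specific address looks like. Without MAP_FIXED the address is only a hint, so this sketch double-checks what came back. The function name is made up, and the actual commit may do it differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Hypothetical helper: try to map `len` bytes at exactly `wanted`.
// Returns nullptr if the kernel could not (or would not) place it there.
void* mapAt(std::uintptr_t wanted, std::size_t len) {
    void* got = mmap(reinterpret_cast<void*>(wanted), len,
                     PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED)
        return nullptr;

    if (reinterpret_cast<std::uintptr_t>(got) != wanted) {
        // The hint was not honored (the spot is probably taken). Give it back.
        munmap(got, len);
        return nullptr;
    }
    return got;
}
```

I'm deliberately not using MAP_FIXED here, because that would silently replace whatever is already mapped at that address.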
All that's left is to re-calculate all offsets for %rip calls and we're done. No more assembly code rewriting.
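The re-calculation itself is one subtraction: the target stays where it is, the instruction moves, so the displacement changes by exactly how far we moved. A sketch under the assumption that we already know where the 4-byte displacement sits inside the instruction (both function names and the dispOffset parameter are mine, for illustration only):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <optional>

// New disp32 for an instruction copied from oldAddr to newAddr.
// target = oldNext + oldDisp = newNext + newDisp, and the instruction length
// does not change, so newDisp = oldDisp + (oldAddr - newAddr).
std::optional<std::int32_t> rebaseDisp(std::int32_t oldDisp, std::uint64_t oldAddr,
                                       std::uint64_t newAddr) {
    const auto moved   = static_cast<std::int64_t>(oldAddr - newAddr);
    const auto newDisp = static_cast<std::int64_t>(oldDisp) + moved;
    if (newDisp < std::numeric_limits<std::int32_t>::min() ||
        newDisp > std::numeric_limits<std::int32_t>::max())
        return std::nullopt; // too far away, the copy cannot reach the target
    return static_cast<std::int32_t>(newDisp);
}

// Patches the displacement stored at insn + dispOffset in the copied bytes.
// x86 is little-endian, so a plain memcpy reads and writes it correctly there.
bool patchDisp(std::uint8_t* insn, std::size_t dispOffset, std::uint64_t oldAddr,
               std::uint64_t newAddr) {
    std::int32_t disp = 0;
    std::memcpy(&disp, insn + dispOffset, sizeof(disp));
    const auto fixed = rebaseDisp(disp, oldAddr, newAddr);
    if (!fixed)
        return false;
    std::memcpy(insn + dispOffset, &*fixed, sizeof(*fixed));
    return true;
}
```

The nullopt branch is the case we used to hit all the time. With the page sitting right next to program code it should basically never trigger.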
You can check out the actual commit for the exact code here, on github.
This is not undefined behavior, nor memory-unsafe code, surprisingly. Memory leaks are not memory-unsafe, and although a bad actor technically can write to the memory addresses if they can add a hook, at this point, you are running a .so from someone malicious, which means they can also do std::filesystem::remove_all("/home/you/") in the code anyways.
No other app running on your system can write there, so it's not a vulnerability.
Rust would also not have fixed this, we'd be writing the same thing, just in an unsafe{} block.
Special thanks to all the developers telling me the hook system is cursed. It's the most used feature of the plugin API. :P