Case Study: Buffer Overflow and Arbitrary Code Execution in C
Vinicius Aragão
Posted on October 21, 2022
Hi everyone, first of all, sorry for my poor English.
It's a simple concept with an even simpler case, but it can light a spark in people who are not yet aware of the security implications of their code and programs. This is aimed mostly at new programmers who are starting their studies and careers. (But of course, even the experienced can pick up something new today 😁, this is what a community is made for.)
OK, to be more specific about this case, these are the versions used:
C -> 201112L (C11),
GCC -> 6.3.0,
x32dbg -> Jan 1 2022 20:06:58,
Python -> 3.10.2
So, to start this...
There is this simple code in C that has only 4 lines inside the main() function and another function declared above it called vitoria(). OK, very simple...
As you can see, the function vitoria() is never called from main(), so how can we reach it without calling it inside main()?
When the program is running, the whole program is loaded into memory, without exception. Even vitoria() sits at some memory address, even though it is never reached through ordinary means.
To locate it, we need to use the debugger and search for the string that vitoria() prints. But beware: that search takes us to the place where the string is used inside the function, not to the function's entry point. If compiled with GCC, the function usually starts with push ebp, and that instruction should be somewhere near the place where the string is used.
(this is where the string inside the function of interest is located, but we should look for its push ebp, which is the start of the function vitoria())
When we hit the part of the program that waits for our input, shown in the debugger as call <JMP.&gets>, that is where we enter the exploit. gets() does not check the size of the destination buffer, so a long enough input runs past the end of the buffer and overwrites the return address stored on the stack. What we need to do now is enter an input big enough to reach the return address, ending with "special characters" whose byte values are the address of the entry point of vitoria(), so that when the function returns, the top of the stack holds the address of vitoria().
(here I entered 66 bytes of 'a' to reach the exact region of the return address, to leave the gets() function...)
Depending on whether your system is 32 or 64 bits, the "payload" will have a different size and different characters in order to do the correct buffer overflow... just count how many bytes are necessary to overwrite the return address and what the last characters should be to reach the desired address.
After that input, we should see a bunch of "61" bytes in our stack, which is the bunch of 'a' characters that I entered ('a' is 0x61 in ASCII)...
(this sausage of "61" bytes runs all the way down to the return address...)
Now check the highlighted address in the picture below. See the little red marker on the left? It indicates the return address, which I overwrote with the huge input that I sent. If the program proceeds with this new value, it should hit a "Segmentation fault" or an access violation exception: it simply returns to an address it was never supposed to return to, resulting in a crash...
OK, so far we know how many bytes we have to send to overwrite the return address, but which bytes should we send to land on the correct address and display the desired message?
Going back to where we found the desired string, we can see the push ebp instruction just a few addresses above it. The address where this instruction is placed gives us the bytes that we need to send in the input. Only the last 3 bytes need to change for the return address to become exactly that address, in hexadecimal of course.
To get this done correctly, we replace the last 3 'a' in the input with the characters whose ASCII codes are those address bytes (which is \x10\x14\x40), and we should get the right address to return to, so let's give it a try... (Notice that I've put the bytes from back to front. This is because x86 stores multi-byte values in little endian order, which means the least significant byte comes first in memory and the most significant byte ends up in the rightmost position.)
If you are familiar with the ASCII table, you will know that \x10 and \x14 are not printable characters, so we need to type them as ^P and ^T, which translate to \x10 and \x14 (the ^ symbol means "control", so ctrl + shift + P and ctrl + shift + T). \x40 is printable: it is the '@' symbol. Now let's put everything together...
Running the program again, but this time sending the correct input to overwrite the return address with the one I want...
And we got it!! Now if I step through a few more instructions, we should get the message from the vitoria() function without ever calling it inside the main() function...
OK, nice... but how do we fix this vulnerability? Well, by using safer input functions such as fgets, or fscanf with a field width, which limit how many bytes are written to the buffer. The syntax is a little bit different but not really complicated, google a little bit and it should do. That's it for today, hope you learned something new! Please feel free to comment below.