Tut04: Bypassing Stack Canaries

In this tutorial, we'll explore a defense mechanism against stack overflows; namely, the stack canary. Although it's the most primitive form of defense, it's powerful and performant, which is why it's very popular in most, if not all, binaries you can find in modern systems. This lab's challenges showcase a variety of stack canary designs, and highlight their subtle pros and cons in various target applications.

Step 0. Revisiting "crackme0x00"

This is the original source code of the crackme0x00 challenge that we're quite familiar with by now:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

int main(int argc, char *argv[])
{
  setreuid(geteuid(), geteuid());
  char buf[16];
  printf("IOLI Crackme Level 0x00\n");
  printf("Password:");

  scanf("%s", buf);

  if (!strcmp(buf, "250382"))
    printf("Password OK :)\n");
  else
    printf("Invalid Password!\n");

  return 0;
}

We're going to compile this source code into four different binaries, with different options:

$ make
cc -m32 -g -O0 -mpreferred-stack-boundary=2 -no-pie -fno-stack-protector -z execstack -o crackme0x00-nossp-exec crackme0x00.c
checksec --file crackme0x00-nossp-exec
[*] '/tmp/.../tut04-ssp/crackme0x00-nossp-exec'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX disabled
    PIE:      No PIE (0x8048000)
    RWX:      Has RWX segments
cc -m32 -g -O0 -mpreferred-stack-boundary=2 -no-pie -fno-stack-protector -o crackme0x00-nossp-noexec crackme0x00.c
checksec --file crackme0x00-nossp-noexec
[*] '/tmp/.../tut04-ssp/crackme0x00-nossp-noexec'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x8048000)
cc -m32 -g -O0 -mpreferred-stack-boundary=2 -no-pie -fstack-protector -o crackme0x00-ssp-exec -z execstack crackme0x00.c
checksec --file crackme0x00-ssp-exec
[*] '/tmp/.../tut04-ssp/crackme0x00-ssp-exec'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX disabled
    PIE:      No PIE (0x8048000)
    RWX:      Has RWX segments
cc -m32 -g -O0 -mpreferred-stack-boundary=2 -no-pie -fstack-protector -o crackme0x00-ssp-noexec crackme0x00.c
checksec --file crackme0x00-ssp-noexec
[*] '/tmp/.../tut04-ssp/crackme0x00-ssp-noexec'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      No PIE (0x8048000)

Our goal is to test the effects of two interesting compilation options:

-fno-stack-protector: do not use a stack protector
-z execstack: make the binary's stack "executable"

We name the four binaries using the following convention:

crackme0x00-{ssp|nossp}-{exec|noexec}

Step 1. Crashing the "crackme0x00" binary

crackme0x00-nossp-exec behaves exactly the same as the original crackme0x00. Unsurprisingly, it crashes on a long input:

$ echo aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa | ./crackme0x00-nossp-exec
IOLI Crackme Level 0x00
Password:Invalid Password!
Segmentation fault

What about crackme0x00-ssp-exec, compiled with a stack protector?

$ echo aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa | ./crackme0x00-ssp-exec
IOLI Crackme Level 0x00
Password:Invalid Password!
*** stack smashing detected ***: <unknown> terminated
Aborted

The binary detects "stack smashing", and simply terminates itself to prevent possible exploitation, resulting in a crash instead of being hijacked.

You might want to run GDB to figure out what's going on in this binary:

$ gdb  ./crackme0x00-ssp-noexec
Reading symbols from ./crackme0x00-ssp-noexec...done.
(gdb) r
Starting program: crackme0x00-ssp-noexec
IOLI Crackme Level 0x00
Password:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Invalid Password!
*** stack smashing detected ***: <unknown> terminated

Program received signal SIGABRT, Aborted.
0xf7fd5079 in __kernel_vsyscall ()
(gdb) bt
#0  0xf7fd5079 in __kernel_vsyscall ()
#1  0xf7e14832 in __libc_signal_restore_set (set=0xffffd1d4) at ../sysdeps/unix/sysv/linux/nptl-signals.h:80
#2  __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:48
#3  0xf7e15cc1 in __GI_abort () at abort.c:79
#4  0xf7e56bd3 in __libc_message (action=do_abort, fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:181
#5  0xf7ef0bca in __GI___fortify_fail_abort (need_backtrace=false, msg=0xf7f677fa "stack smashing detected") at fortify_fail.c:33
#6  0xf7ef0b7b in __stack_chk_fail () at stack_chk_fail.c:29
#7  0x080486e4 in __stack_chk_fail_local ()
#8  0x0804864e in main (argc=97, argv=0xffffd684) at crackme0x00.c:21

Step 2. Let's analyze!

To figure out how two binaries are different, we've kindly provided you a script, "diff.sh", that can help you compare the disassemblies of two binaries.

$ ./diff.sh crackme0x00-nossp-noexec crackme0x00-ssp-noexec
--- /dev/fd/63  2019-09-16 16:31:16.066674521 -0500
+++ /dev/fd/62  2019-09-16 16:31:16.066674521 -0500
@@ -3,38 +3,46 @@
        mov    ebp,esp
        push   esi
        push   ebx
-       sub    esp,0x10
-       call   0x8048480 <__x86.get_pc_thunk.bx>
-       add    ebx,0x1aad
-       call   0x80483d0 <geteuid@plt>
+       sub    esp,0x18
+       call   0x80484d0 <__x86.get_pc_thunk.bx>
+       add    ebx,0x1a5d
+       mov    eax,DWORD PTR [ebp+0xc]
+       mov    DWORD PTR [ebp-0x20],eax
+       mov    eax,gs:0x14
+       mov    DWORD PTR [ebp-0xc],eax
+       xor    eax,eax
+       call   0x8048420 <geteuid@plt>
        mov    esi,eax

...

        add    esp,0x4
        mov    eax,0x0
+       mov    edx,DWORD PTR [ebp-0xc]
+       xor    edx,DWORD PTR gs:0x14
+       call   0x80486d0 <__stack_chk_fail_local>
        pop    ebx
        pop    esi
        pop    ebp

Some of the function addresses have changed due to main() growing longer; those differences can be ignored. The two notable differences are in main()'s prologue and epilogue. First, in the prologue, there's an extra value (gs:0x14) placed after the frame pointer on the stack:

+       mov    eax,gs:0x14
+       mov    DWORD PTR [ebp-0xc],eax
+       xor    eax,eax

And the epilogue later validates that the inserted value is the same, right before returning to the caller:

+       mov    edx,DWORD PTR [ebp-0xc]
+       xor    edx,DWORD PTR gs:0x14
+       call   0x7c0 <__stack_chk_fail_local>

__stack_chk_fail_local() is the function you observed in GDB's backtrace.

As a result of __stack_chk_fail_local(), the process simply halts (via abort()), as you can see in this code from the glibc library:

void
__attribute__ ((noreturn))
__fortify_fail (const char *msg)
{
  /* The loop is added only to keep gcc happy.  */
  while (1)
    __libc_message (do_abort, "*** %s ***: terminated\n", msg);
}

void
__attribute__ ((noreturn))
__stack_chk_fail (void)
{
  __fortify_fail ("stack smashing detected");
}

Step 3. Stack Canary

This extra value is called a "canary" (a bird? why?). What is its value, precisely?

$ gdb ./crackme0x00-ssp-exec
(gdb) br *0x0804863d
(gdb) r
...
(gdb) x/1i $eip
=> 0x0804863d <main+167>: mov    edx,DWORD PTR [ebp-0xc]
(gdb) si
(gdb) info r edx
edx            0xcddc8000 -841187328

(gdb) r
...
(gdb) x/1i $eip
=> 0x0804863d <main+167>: mov    edx,DWORD PTR [ebp-0xc]
(gdb) si
(gdb) info r edx
edx            0xe4b8800  239831040

Have you noticed that the canary value keeps changing with every execution? This is great, because it means that attackers would need to truly guess (or bypass) the canary value before launching an exploit.

pwndbg also provides a way to look up a process's canary value, with the "canary" command:

...
(gdb) canary
AT_RANDOM = 0xffffcffb # points to (not masked) global canary value
Canary    = 0x724bdc00
Found valid canaries on the stacks:
00:0000|   0xffffcd8c <- 0x724bdc00

You might also be wondering what exactly the gs register is, and the immediate offset like gs:0x14. The gs register is one of the "segment registers" that contain, by the ABI specification, a base address for thread local storage (TLS). TLS contains thread-specific information, such as errno (the most recent error number). The immediate value (e.g., 0x14) simply represents the offset from the TLS base address -- in our case, the offset to the canary value.

Below is an actual definition of the TLS information in glibc. stack_guard contains the canary value. We will later check the other guard, pointer_guard, for hijacking other function pointers in glibc (e.g., atexit).

// @glibc/sysdeps/i386/nptl/tls.h
typedef struct
{
  void *tcb;                /* Pointer to the TCB.  Not necessarily the
                               thread descriptor used by libpthread.  */
  dtv_t *dtv;
  void *self;                /* Pointer to the thread descriptor.  */
  int multiple_threads;
  uintptr_t sysinfo;
  uintptr_t stack_guard;
  uintptr_t pointer_guard;
  int gscope_flag;
  /* Bit 0: X86_FEATURE_1_IBT.
     Bit 1: X86_FEATURE_1_SHSTK.
   */
  unsigned int feature_1;
  /* Reservation of some values for the TM ABI.  */
  void *__private_tm[3];
  /* GCC split stack support.  */
  void *__private_ss;
  /* The lowest address of shadow stack,  */
  unsigned long ssp_base;
} tcbhead_t;

pwndbg provides a way to look up the base address of gs (in i386) and fs (in x86_64), with the gsbase and fsbase commands.

Step 4. Bypassing a Stack Canary

What if the stack canary implementation wasn't "perfect"; that is, an attacker could perhaps guess it (i.e., guess gs:0x14)?

Let's check out this week's tutorial challenge binary:

$ objdump -M intel -d ./target-ssp
...

What if, instead of this like before...

mov    eax,gs:0x14
mov    DWORD PTR [ebp-0xc],eax
xor    eax,eax

...there was now this?

mov    DWORD PTR [ebp-0xc],0xdeadbeef

This implementation uses a known value (0xdeadbeef) as its stack canary.

Recall that the stack has this layout:

esp                                   ebp
 V                                     V
  ... [    buf   ] [canary] [(unused)] [bp] [ra] ...
      |<- 0x10 ->|
      |<-------------- 0x20 -------------->|

[Task] How could we exploit this program (otherwise the same as last week's tutorial)? Try it out and get this week's tutorial flag!

CS6265: Information Security Lab