# designing a memory allocator

- interfaces

    malloc(sz) -> ptr
    free(ptr)

  Q1. ptr = malloc(0); ptr == NULL?
  Q2. ptr = malloc(-1); ptr == NULL?
  Q3. ptr = malloc(size); *ptr?
  Q4. *(ptr +/- sz')?                  ; overflow (O)
  Q5. free(NULL)?
  Q6. free(invalid ptr)?
  Q7. free(ptr); free(ptr)?            ; double free (DF)
  Q8. free(ptr); ptr == NULL?
  Q9. free(ptr); *ptr?                 ; use-after-free (UAF)

- design goals
  1. performance
  2. fragmentation (or memory utilization)
     - internal vs. external
  3. security

V0. simplest, yet secure, malloc (w/ syscall)
=============================================

    malloc -> mmap
    free   -> munmap

- pros/cons (P: performance, IF/EF: internal/external fragmentation, S: security)
  -- P  (invoking syscall is slow)
  -  IF (< page size)
  +  EF (reusing, os handles it for you)
  ++ S  (+O, +DF, +UAF)!

V1. fastest malloc
==================

- introduce: gptr -> indicating the top of the heap

           |          |
           |          |            e.g.,
    gptr'->+----------+ <- ptr1      ptr1 = malloc(sz1)
           |          |              ptr2 = malloc(sz2)
           |          |              free(ptr2)
           |          |              ptr3 = malloc(sz3)
           |          |
    gptr'->+----------+ <- ptr2
           |          |
           |          |
           |          |
           |          |
    gptr ->+----------+ <- ptr3
           |          |
           |          |
           |          |
           |          |
           +----------+
           |          |
           |          |

- impl:

    fn malloc(sz):
      ptr = gptr
      gptr += sz
      return ptr

    fn free(ptr):
      pass

- pros/cons:
  ++ P (just add)
  -- F (wasted)
  +  S (-O, +DF, +UAF)

V2. reusing freed memory
========================

- Idea 1. freelist -> a linked list for freed objs
  (e.g., freelist -> ptr2 -> ptr3 ...)

           |          |
           |          |            e.g.,
    gptr'->+----------+ <- ptr1      ptr1 = malloc(sz1)
           |          |              ptr2 = malloc(sz2)
           |          |              free(ptr2)
           |          |              ptr3 = malloc(sz3)
           |          |              free(ptr3)
    gptr'->+----------+ <- ptr2
           |  FREED   |
           |          |
           |          |
           |          |
    gptr'->+----------+ <- ptr3
           |  FREED   |
           |          |
           |          |
           |          |
    gptr ->+----------+
           |          |
           |          |

    fn free(ptr):
      append(freelist, ptr)        ; Q. how to implement this?
      return

    fn malloc(sz):
      for f in freelist:
        if sz <= f->sz:            ; Q. how to implement this?
          unlink(f)                ; Q. how to implement this?
          return f
      ptr = gptr
      gptr += sz
      return ptr

- Idea 2. putting metadata next to the obj for quick lookup (in-place vs.
  out-of-band)

           |          |
           +----------+
           |   SZ1    |
    gptr'->+----------+ <- ptr1
           |          |
           |          |
           |          |
           +----------+
           |   SZ2    |
    gptr'->+----------+ <- ptr2 <-- freelist
           | ptr3   --|------------+
           |          |            |
           |          |            |
           +----------+            |
           |   SZ3    |            |
    gptr'->+----------+ <- ptr3 <--+
           |   NULL   |
           |          |
           |          |
           |          |
    gptr ->+----------+
           |          |
           |          |

- Idea 3. using "freed" memory for freelist

    freelist ---> ptr2:          +-> ptr3:
                  sz:[SZ2 ]      |   [SZ3 ]
                  fd:[ptr3]------+   [NULL]

- impl:

    fn malloc(sz):
      pf = NULL
      for f in freelist:
        if sz <= f->sz:              ; f->sz = *(f - sizeof(ptr))
          if pf: pf->fd = f->fd      ; unlink f from the middle
          else:  freelist = f->fd    ; unlink f from the head
          return f
        pf = f
      *gptr = sz                     ; write the SZ header (4 bytes here)
      ptr = gptr + 4
      gptr = ptr + sz
      return ptr

    fn free(ptr):
      ptr->fd = freelist
      freelist = ptr
      return

- pros/cons:
  - P: slow malloc (iterating freelist)
  + P: fast free (push onto the freelist head)
  + F: reusing memory
       (==: no internal)
       (<=: internal)
       (yet, external fragmentation; |freed region| > sz)
  - S: unlink? overflow?

V3. handling fragmentation better
=================================

- P1. internal fragmentation: e.g., requesting < sz -> splitting
- P2. external fragmentation: e.g., requesting >> sz -> merging
      (aka, fw/bk consolidation)

    fn malloc(sz):
      for f in freelist:
        if sz <= f->sz:
          unlink(f)
          rsz = f->sz - sz - 4
          if rsz > 0:
            newobj = f + sz + 4
            newobj->sz = rsz
            f->sz = sz               ; f shrinks to the requested size
            append(freelist, newobj)
          return f
      *gptr = sz                     ; write the SZ header
      ptr = gptr + 4
      gptr += sz + 8                 ; 8 = SZ + PSZ (psz is introduced below)
      return ptr

- P2-1. how to know prev/next objs?
  - next: ptr + sz
  - prev: ???
- P2-2. how to know prev/next objs are free?

           |   ...    |
           +----------+
           |   SZ1    |
    gptr'->+----------+ <- ptr1
           |          |
           |          |
           |          |
           +----------+
           |   PSZ1   |
           +----------+
           |   SZ2    |
    gptr'->+----------+ <- ptr2
           |  FREED   |
           |          |
           |          |
           +----------+
           |   PSZ2   |
           +----------+
           |   SZ3    |
    gptr'->+----------+ <- ptr3
           |  FREED   |
           |          |
           |          |
           |          |
    gptr ->+----------+
           |          |
           |          |

- introduce psz (size of the previous obj, stored right before SZ):
  - next: ptr + sz
  - prev: ptr - psz

- merging two freed objs

    fn merge(ptr1, ptr2):            // both are contiguous objs
      nxt = ptr2 + ptr2->sz          // obj right after ptr2
      ptr1->sz += ptr2->sz           // (+ header size in practice)
      nxt->psz = ptr1->sz            // nxt's prev is now the merged obj

- impl:

    fn free(ptr):
      fd = ptr - ptr->psz            ; prev obj
      bk = ptr + ptr->sz             ; next obj

      if freed?(fd) and freed?(bk):
        pop bk from freelist
        merge(ptr, bk)
        merge(fd, ptr)
        return

      if freed?(fd):
        merge(fd, ptr)               ; fd is already on the freelist
        return

      if freed?(bk):
        pop bk from freelist         ; bk is absorbed into ptr
        merge(ptr, bk)               ; fall through: ptr still has to be linked

      ptr->fd = freelist
      freelist = ptr
      return

    # slow!
    fn freed?(ptr):
      for f in freelist:
        if f == ptr:
          return true
      return false

- pros/cons:
  -  P: slow malloc()
  -  P: slow! free()
  +  F: no wasted memory!
  +  F: handling fragmentation
  -- S: ...

V4. optimization
================

- P1. checking an object if freed or used is slow!
      (i.e., iterating freelist)

  -> introduce In-Use flag (U) to indicate an obj's status
     but where? as part of SZ!

           |   ...    |
           +----------+
           |   SZ1   U| = 1
    gptr'->+----------+ <- ptr1
           |          |
           |          |
           |          |
           +----------+
           |   PSZ1   |
           +----------+
           |   SZ2   U| = 0
    gptr'->+----------+ <- ptr2
           |  FREED   |
           |          |
           |          |
           +----------+
           |   PSZ2   |
           +----------+
           |   SZ3   U| = 0
    gptr'->+----------+ <- ptr3
           |  FREED   |
           |          |
           |          |
           |          |
    gptr ->+----------+
           |          |
           |          |

- new in-use check:

    # fast!
    fn freed?(ptr):
      return !(ptr->sz & U)

    next: freed?(ptr + ptr->sz + 8)  ; mask U out of sz when using it as a size
    prev: freed?(ptr - ptr->psz - 8)

- P2. 2 x sizeof(ptr) allocations (SZ/PSZ), but PSZ is only used for freed obj

  -> introduce: in-place "prev" in-use bit (PU)

           |   ...    |
           +----------+
           |   SZ1  PU|
    gptr'->+----------+ <- ptr1
           |          |
           |          |
           |          |
           +----------+
           |   SZ2  PU| = 1
    gptr'->+----------+ <- ptr2
           |  FREED   |
           |          |
           |          |
           +----------+
           |   PSZ2   |
           +----------+
           |   SZ3  PU| = 0
    gptr'->+----------+ <- ptr3
           |  FREED   |
           |          |
           |          |
           +----------+
           |   PSZ3   |
    gptr ->+----------+
           |          |
           |          |

- new check: (ptr->sz & PU) is used in practice

- pros/cons:
  -  P: slow malloc()
  +  P: fast free()
  +  F: no wasted memory
  +  F: handling fragmentation
  -- S: ...

- P3. O(n): iterating freelist

  Idea 1. sorted, tree-list freelist (fd/bk)
    - O(log N)

  Idea 2. binning
    - O(1) if exists
    - bins for free objs of known sizes

        bin[0]: > 10
        bin[1]: > 20
        bin[2]: > 30
        ..

      pick the first obj fetched from the matching bin
    - what if bin[1] is empty for malloc(20) request?
      1) keep checking the rest -> worst case higher but better mem usage
      2) skip, then use gptr

  Ideas! even more
    - bitmap for optimization
    - fastbins for smaller objs (e.g., single link)
    - caching freed objs (i.e., unsorted bin)
    - per-cpu cache!
    - mmap() for larger objs

- pros/cons:
  +   P: fast malloc/free
  +   F: no wasted memory
  --- S: fastbin?

security bugs
=============

0. heap overflows (crafted in-place metadata)

1. use-after-free (recycled for memory utilization)

     int *ptr = malloc(size);
     free(ptr);
     *ptr;                  // BUG. use-after-free!

2. double free (binning)

     char *ptr = malloc(size);
     free(ptr);
     free(ptr);             // BUG!
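
The use-after-free and double-free bugs above can be reproduced without touching the real heap. What follows is a minimal sketch, not a real allocator: it models the V2 design (bump pointer plus a freelist stored inside freed objs) on an array of words, and a "pointer" is just a word index so every access stays well-defined C. All names (toy_malloc, toy_free, toy_reset, heap, NIL, the demo_* helpers) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_WORDS 256
#define NIL ((size_t)0)        /* index 0 is always a header, so it can act as NULL */

static size_t heap[HEAP_WORDS];
static size_t gptr = 0;        /* top of the heap (bump pointer) */
static size_t freelist = NIL;  /* head of the freelist */

/* Chunk layout: heap[p-1] = SZ (payload words); heap[p] = fd while freed. */

void toy_reset(void) { gptr = 0; freelist = NIL; }

size_t toy_malloc(size_t words) {
    if (words == 0) words = 1;                   /* payload must hold the fd link */
    size_t prev = NIL;
    for (size_t f = freelist; f != NIL; prev = f, f = heap[f]) {
        if (words <= heap[f - 1]) {              /* first fit */
            if (prev == NIL) freelist = heap[f]; /* unlink from the head */
            else heap[prev] = heap[f];           /* unlink from the middle */
            return f;
        }
    }
    if (gptr >= HEAP_WORDS || words > HEAP_WORDS - gptr - 1)
        return NIL;                              /* out of memory */
    size_t p = gptr + 1;                         /* skip the SZ header word */
    heap[p - 1] = words;
    gptr = p + words;
    return p;
}

void toy_free(size_t p) {
    if (p == NIL) return;
    assert(p < HEAP_WORDS);
    heap[p] = freelist;                          /* fd lives in the freed payload */
    freelist = p;                                /* no check that p is already free */
}

/* Returns 1 if a double free makes two later allocations alias. */
int demo_double_free(void) {
    size_t a = toy_malloc(4);
    toy_free(a);
    toy_free(a);                /* BUG: freelist is now a -> a -> a ... */
    size_t b = toy_malloc(4);
    size_t c = toy_malloc(4);   /* same size on purpose; a larger request would loop forever */
    return b == c;
}

/* Returns 1 if a read through a dangling index no longer sees the stored value. */
int demo_use_after_free(void) {
    size_t a = toy_malloc(4);
    heap[a] = 0x41;
    toy_free(a);                /* payload word is overwritten with the fd link */
    return heap[a] != 0x41;     /* BUG: use-after-free read */
}
```

Calling toy_reset() and then demo_double_free() shows two live objs sharing one payload; demo_use_after_free() shows the stale read returning allocator metadata instead of user data. A real allocator faces the same hazards with raw pointers, which is why the in-use bits from V4 also get used as sanity checks.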