Skip to content

Latest commit

 

History

History
448 lines (368 loc) · 19.3 KB

lessonplan.md

File metadata and controls

448 lines (368 loc) · 19.3 KB

Introduction to x86 Assembly

Table of Contents

  1. Why Assembly
  2. Binary, Hex and Endians
  3. How Does A Processor Work
  4. Basic Registers
  5. Basic x86 Instructions
  6. Calling Conventions and Stack Frames
  7. Hands-on Exercises
  8. Resources
  9. References

Why Assembly

To understand why assembly is useful, we must be aware of how a programme is created. First, we humans code it in a high level language, then compile it. The compiler then translates that into lower level languages, and finally into binary, which the computer understands.

In between the high level code and binary is the assembly. Oftentimes, we may want to find out what a programme does, but the source code may not be freely available to us. By analysing the assembly through an object dump of an executable programme, we may trace line by line what a programme does, and then reverse engineer it to get the source code.

Binary, Hex, and Endians

Our layman numbers are in base 10 form. Hexadecimal numbers, like its name suggests, are numbers in the base 16 form.

+-------------+----------+--------------+------------+
| Hexadecimal |  Decimal |  Hexadecimal |   Decimal  |
+-------------+----------+--------------+------------+
|      0      |     0    |       A      |      10    |
|      1      |     1    |       B      |      11    |
|      2      |     2    |       C      |      12    |
|      3      |     3    |       D      |      13    |
|      4      |     4    |       E      |      14    |
|      5      |     5    |       F      |      15    |
|      6      |     6    |              |            |
|      7      |     7    |              |            |
|      8      |     8    |              |            |
|      9      |     9    |              |            |
+-------------+----------+--------------+------------+

Decimal

The value of a decimal number is as follows. Considering the number: (1234) 10

(1234) 10 = 1(10)3 + 2(10)2 + 3(10)1+ 4(10)0 = (1234) 10

###Hexadecimal Similarly, the value of hexadecimals can be derived the same way. Considering the number: (3D2F) 16

(3D2F) 16 = 3(16)3 + 13(16)2 + 2(16)1+ 15(16)0 = (15663) 10

###0x prefix In C based languages, hexadecimal numbers are prepended with a 0x to tell the compiler that the number following is a hexadecimal. (3D2F) 16 will therefore appear as 0x3D2F

##Little Endian and Big Endian These two standards describe the order in which a sequence of bytes are stored in a computer’s memory.

Little vs Big Endian

This order is important and must be known to the language compilers in order for values to be stored.

Most of the time, the Big Endian standard is adopted as it is more intuitive. However, in certain cases like in Intel CPUs the Little Endian standard may be used, as this simplifies certain operations, thus improving efficiency.

###Big Endian The ‘Big End’, ie. the most significant value in the sequence of bytes is stored into the smallest address, at the top of the stack.

###Little Endian The ‘Little End’ ie. the least significant value in the sequence of bytes, is stored first into the smallest address, at the top of the stack.

The Little Endian standard is used in x86.

How Does A Processor Work

This is not a computer architecture class. I am just going to tell you that a processor has a number of registers that can be used as temporary store of values, and one or more arithmetic and logic units (ALUs) that can perform operations on the registers.

The processor fetches instructions from the memory, determines what to do, be it adding two numbers or loading data from the memory, and executes it.

Basic Registers

In x86, there are a lot of registers. We will only focus on the ones that we can usually see in assembly codes, including 8 of the 16 general purpose registers and 2 status registers.

General Purpose Registers

First there are the 4 classical A, B, C, D registers. Take A as an example, the 64-bit version is named rax, the 32-bit version is named eax, the 16-bit version is named ax, the higher 8-bit of the 16-bit is named ah and the lower 8-bit of the 16-bit is named al. This is how it looks like:

                           rax
  _________________________/ \_________________________
 /                                                     \
+---------------------------+---------------------------+
|                           |                           |
|                           |            eax            |
|                           |                           |
|                           |             +-------------+
|                           |             |     ax      |
|                           |             +------+------+
|                           |             |  ah  |  al  |
+---------------------------+-------------+------+------+

It is the same for all the four A, B, C, D registers, even for the next four general purpose registers that we are going to talk about: rsi, rdi, rbp, and rsp.

rsi and rdi are index registers, often used for indexed addressing.

rbp and rsp are pointer registers. Although they are general purpose registers, they are often reserved for stack operations: rbp for base pointer, and rsp for stack pointer. We will talk more about stacks later.

Status Registers

There are only two status registers in x86.

The first one is rflags / eflags / flags. It holds a lot of information about the ongoing processing, such as the last comparison result, last computation result, whether there was an overflow, whether interrupts should be ignored, etc. For example, if you have something in your code like if (a > b) { ... }, it will be compiled into two instructions. The first instruction compares a and b and store the result in rflags, the second instructions checks rflags and decide whether to branch or not.

The second one is rip / eip / ip, which is the instruction pointer. It points to the next instruction that the processor will be executing. If we can control where the instruction pointer poionts to, we can manipulate the control flow of the program.

Basic x86 Instructions

Syntax

There are two syntaxes of x86 Assembly:
AT&T Intel
Parameter Order instruction src dst instruction dst src
Sigils Constants are prefixed with $
Registers are prefixed with %
Assembler automatically detects
In this lesson we’ll be referencing the AT&T syntax.

Memory Addressing Modes

There are 3 basic modes of addressing:
- Immediate addressing - Register addressing - Memory addressing
     $0x1  = constant value 1
     %eax  = register value
 0x4(%eax) = memory address(register value + 4)

Data Movement Instructions

Commonly seen instruction suffixes:
- b (byte = 8 bits) - w (word = 2 bytes = 16 bits) - l (long = 4 bytes = 32 bits)

E.g. movw, pushl

mov - Move
Copies the data from src to dst.

mov %eax, %edx       -->  edx = eax
mov $0x4, %edx       -->  edx = 0x4
mov $0x4, 0x4(%edx)  -->  mem(0x8) = 0x4

However memory-to-memory moves are not possible, you will have to load the value into register before copying into the destination memory.

Cannot be done:

mov 0x4(%eax), 0x8(%eax)

Correct method:

mov 0x4(%eax), %edx
mov %edx     , 0x8(%eax)

push - Push stack
Writing a value into the stack.

push $0x4  -->  0x4 is now in stack

pop - Pop stack
Restoring value that is on top of the stack

pop %eax   -->  eax = 0x4, 0x4 is gone from stack

Arithmetic and Logic Instructions

add - Integer Addition
Adds src and dst, and store results in dst ``` mov $0x4, %eax --> eax = 4 add $0x4, %eax --> eax = 8 ```

sub - Integer Subtraction
Subtracts src from dst, and store results in dst

mov $0x8, %eax  -->  eax = 8
sub $0x4, %eax  -->  eax = 4

imul - Integer Multiplication
imul instruction has two formats - two-operands format and three-operands format. For both, the dst must be a register.

Two operands:

mov  $0x4, %eax  -->  eax = 4
imul $0x4, %eax  -->  eax = 16

Three operands:

mov  $0x4, %eax        -->  eax = 4
imul $0x4, %eax, %edx  -->  edx = 16

idiv - Integer Division
idiv instruction takes in only one operand, either register or memory address. It only works with signed numbers. Dividend is always either AH:AL (byte), DX:AX (word) or EDX:EAX (long). Quotient results will be stored in EAX and remainder is stored in EDX.

mov  $0x2, %ebx  -->  ebx = 2
mov  $0x4, %eax  -->  eax = 4
cltd             -->  convert signed long to signed double long (EAX -> EDX:EAX)
idiv %ebx        -->  eax = 2, edx = 0

and, or, xor - Bitwise Operations
Logical bitwise and, or, and exclusive or operations.

mov $0x4, %eax  -->  eax = 4
and $0xf, %eax  -->  eax = 4
and $0x0, %eax  -->  eax = 0

mov $0x4, %eax  -->  eax = 4
or  $0xf, %eax  -->  eax = 15
or  $0x0, %eax  -->  eax = 4

mov $0x4, %eax  -->  eax = 4
xor $0xf, %eax  -->  eax = 11
xor $0x0, %eax  -->  eax = 4

not - Bitwise Logical Not
not instruction takes in only one operand. Flips all bit values in the operand.

mov $0x4, %eax  -->  eax = 4
not %eax        -->  eax = 4,294,967,291 (-5 if signed)

neg - Negate
Performs two’s complement negation on operand contents, which is essentially ‘not + 1’.

mov $0x4, %eax  -->  eax = 4
neg %eax        -->  eax = 4,294,967,292 (-4 if signed)

shl, shr - Shift Left, Shift Right
Shift bits left and shift bits right.

mov $0x4 , %eax  -->  eax = 4
shl $0x1 , %eax  -->  eax = 8
shl $0x1c, %eax  -->  eax = 2,147,483,648 (-2,147,483,648 if signed)
sh1 $0x1 , %eax  -->  eax = 0

mov $0x4,  %eax  -->  eax = 4
shr $0x1,  %eax  -->  eax = 2
shr $0x1c, %eax  -->  eax = 0

Control Flow Instructions

Control flow instructions allow you to make the program execute instructions at another address other than the next instruction in memory. The label notations are symbolic names for a memory address.

jmp - Jump
Causes the program to execute instruction at another memory address.

HelloWorld: mov $0x4, %eax
            jmp HelloWorld  --> Infinite loop, don’t do this.

cmp - Compare
Compare values of src and dst by using sub instruction and stores into a special register called ‘Machine Status Word’ (MSW). cmp instruction is usually used before conditional jump instructions, and the result stored in the special register is referred to by them.

mov $0x4, %eax  -->  eax = 4
mov $0x1, %edx  -->  edx = 1
cmp %eax, %edx  -->  msw = -3

jcondition - Conditional Jump

  • je - Jump when Equal
  • jz - Jump when Zero
  • jne - Jump when Not Equal
  • jg - Jump when Greater Than
  • jge - Jump when Greater Than or Equal To
  • jl - Jump when Less Than
  • jle - Jump when Less Than or Equal To
mov $0x4, %eax  -->  eax = 4
mov $0x1, %edx  -->  edx = 1
cmp %eax, %edx  -->  msw = -3
je  HelloWorld  -->  don’t jump
jz  HelloWorld  -->  don’t jump
jne HelloWorld  -->  jump
jg  HelloWorld  -->  don’t jump
jge HelloWorld  -->  don’t jump
jl  HelloWorld  -->  jump
jle HelloWorld  -->  jump

call, ret - Subroutine call and Return
call instruction will push current code location onto stack.
ret will pop the code location and perform an unconditional jump to it.

call HelloWorld
ret

Calling Conventions and Stack Frames

To allow programs written by different programmers to be able to call one another, and to simplify the use of subroutines in general, programmers and compilers typically adopt a common calling convention. The calling convention is a protocol about how the call and return from subroutines, such as which registers are for arguments and which register stores the return value. With that, a programmer does not need to read the definition of a subroutine to determine how to pass parameters to that subroutine and how to get the return value from it.

Although we are not writing assembly, knowing the calling convention helps us to understand typical assembly code more easily. In this section, we will learn about the widely used C language calling convention.

These are the steps to call a subroutine:

  1. Passing parameters a. On 32-bit systems, all arguments are pushed onto the stack in reverse order b. On 64-bit systems, the first 6 arguments are moved into registers in the following order: rdi, rsi, rdx, rcx, r8, r9. Then the rest of the arguments are pushed onto the stack in reverse order
  2. Invode call / callq, which will push the return address onto the stack, then jump to the subroutine (the callee)
  3. The callee pushes the value of rbp / ebp onto the stack, then copies the value ofrsp / esp into rbp / ebp
  4. The callee allocates local variables by making space on the stack
  5. When the callee returns, it leaves the return value in rax / eax, copies rbp / ebp back to rsp / esp, then pops the saved rbp / ebp and transfer control to the caller by invoking ret.

This is the stack frame of a subroutine that has 3 parameters and 3 local variables: A single stack frame

Note that the stack grows from higher addresses to lower addresses. The figure above is for a 32-bit system, so each cell in the stack is 32-bit wide, therefore the memory addresses are 4 bytes apart. (Use of e-prefixed registers also shows that it is 32-bit.)

Below is the assembly of a program that adds three hardcoded numbers and prints out the result, compiled in 32-bit.

0804840b <add_three>:
 804840b:       55                      push   %ebp
 804840c:       89 e5                   mov    %esp,%ebp
 804840e:       83 ec 10                sub    $0x10,%esp
 8048411:       8b 45 08                mov    0x8(%ebp),%eax
 8048414:       89 45 f4                mov    %eax,-0xc(%ebp)
 8048417:       8b 45 0c                mov    0xc(%ebp),%eax
 804841a:       89 45 f8                mov    %eax,-0x8(%ebp)
 804841d:       8b 45 10                mov    0x10(%ebp),%eax
 8048420:       89 45 fc                mov    %eax,-0x4(%ebp)
 8048423:       8b 55 f4                mov    -0xc(%ebp),%edx
 8048426:       8b 45 f8                mov    -0x8(%ebp),%eax
 8048429:       01 c2                   add    %eax,%edx
 804842b:       8b 45 fc                mov    -0x4(%ebp),%eax
 804842e:       01 d0                   add    %edx,%eax
 8048430:       c9                      leave
 8048431:       c3                      ret

08048432 <main>:
 8048432:       8d 4c 24 04             lea    0x4(%esp),%ecx
 8048436:       83 e4 f0                and    $0xfffffff0,%esp
 8048439:       ff 71 fc                pushl  -0x4(%ecx)
 804843c:       55                      push   %ebp
 804843d:       89 e5                   mov    %esp,%ebp
 804843f:       51                      push   %ecx
 8048440:       83 ec 14                sub    $0x14,%esp
 8048443:       6a 03                   push   $0x3
 8048445:       6a 02                   push   $0x2
 8048447:       6a 01                   push   $0x1
 8048449:       e8 bd ff ff ff          call   804840b <add_three>
 804844e:       83 c4 0c                add    $0xc,%esp
 8048451:       89 45 f4                mov    %eax,-0xc(%ebp)
 8048454:       83 ec 08                sub    $0x8,%esp
 8048457:       ff 75 f4                pushl  -0xc(%ebp)
 804845a:       68 00 85 04 08          push   $0x8048500
 804845f:       e8 7c fe ff ff          call   80482e0 <printf@plt>
 8048464:       83 c4 10                add    $0x10,%esp
 8048467:       b8 00 00 00 00          mov    $0x0,%eax
 804846c:       8b 4d fc                mov    -0x4(%ebp),%ecx
 804846f:       c9                      leave
 8048470:       8d 61 fc                lea    -0x4(%ecx),%esp
 8048473:       c3                      ret

Here's the 64-bit version of the same program. Take note that the arguments are passed using registers rather than pushing them onto the stack.

0000000000400526 <add_three>:
  400526:       55                      push   %rbp
  400527:       48 89 e5                mov    %rsp,%rbp
  40052a:       89 7d ec                mov    %edi,-0x14(%rbp)
  40052d:       89 75 e8                mov    %esi,-0x18(%rbp)
  400530:       89 55 e4                mov    %edx,-0x1c(%rbp)
  400533:       8b 45 ec                mov    -0x14(%rbp),%eax
  400536:       89 45 f4                mov    %eax,-0xc(%rbp)
  400539:       8b 45 e8                mov    -0x18(%rbp),%eax
  40053c:       89 45 f8                mov    %eax,-0x8(%rbp)
  40053f:       8b 45 e4                mov    -0x1c(%rbp),%eax
  400542:       89 45 fc                mov    %eax,-0x4(%rbp)
  400545:       8b 55 f4                mov    -0xc(%rbp),%edx
  400548:       8b 45 f8                mov    -0x8(%rbp),%eax
  40054b:       01 c2                   add    %eax,%edx
  40054d:       8b 45 fc                mov    -0x4(%rbp),%eax
  400550:       01 d0                   add    %edx,%eax
  400552:       5d                      pop    %rbp
  400553:       c3                      retq

0000000000400554 <main>:
  400554:       55                      push   %rbp
  400555:       48 89 e5                mov    %rsp,%rbp
  400558:       48 83 ec 10             sub    $0x10,%rsp
  40055c:       ba 03 00 00 00          mov    $0x3,%edx
  400561:       be 02 00 00 00          mov    $0x2,%esi
  400566:       bf 01 00 00 00          mov    $0x1,%edi
  40056b:       e8 b6 ff ff ff          callq  400526 <add_three>
  400570:       89 45 fc                mov    %eax,-0x4(%rbp)
  400573:       8b 45 fc                mov    -0x4(%rbp),%eax
  400576:       89 c6                   mov    %eax,%esi
  400578:       bf 14 06 40 00          mov    $0x400614,%edi
  40057d:       b8 00 00 00 00          mov    $0x0,%eax
  400582:       e8 79 fe ff ff          callq  400400 <printf@plt>
  400587:       b8 00 00 00 00          mov    $0x0,%eax
  40058c:       c9                      leaveq
  40058d:       c3                      retq

Hands-on Exercises

Under ./build/ there are three programs whose names has the word "secret" in them. Try disassemble them and figure out what they do. Both 32-bit and 64-bit binaries are provided. The source code are in the ./src/ folder but please do not look at them before you attempt.

If you do not know how to disassemble a binary yet, one way to do it is:

$ objdump -d <binary_file>

Resources

References