CTranslationAp.xml

<appendix id="ctranslationap">
<title>C Idioms in Assembly Language</title>

<!--

Copyright 2002 Jonathan Bartlett

Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.1 or any later version published by the Free Software
Foundation; with no Invariant Sections, with no Front-Cover Texts,
and with no Back-Cover Texts.  A copy of the license is included in fdl.xml

-->

<para>
<indexterm><primary>C programming language</primary></indexterm>
This appendix is for C programmers learning assembly language.  It is meant to
give a general idea about how C constructs can be implemented in assembly 
language.
</para>

<simplesect>
<title>If Statement</title>

<para>
In C, an if statement<indexterm><primary>if statement</primary></indexterm> consists of three parts - the condition<indexterm><primary>condition</primary></indexterm>, the true branch<indexterm><primary>true branch</primary></indexterm>, and the false branch<indexterm><primary>false branch</primary></indexterm>.  However, since assembly language is not
a block structured language<indexterm><primary>block structured language</primary></indexterm>, you have to work a little to implement
the block-like nature of C.  For example, look at the following C code:
</para>

<programlisting>
if(a == b)
{
	/* True Branch Code Here */
}
else
{
	/* False Branch Code Here */
}

/* At This Point, Reconverge */
</programlisting>

<para>
In assembly language, this can be rendered as:
</para>

<programlisting>
	#Move a and b into registers for comparison
	movl a, %eax
	movl b, %ebx

	#Compare
	cmpl %eax, %ebx

	#If True, go to true branch
	je true_branch

false_branch:  #This label is unnecessary, 
               #only here for documentation
	#False Branch Code Here

	#Jump to recovergence point
	jmp reconverge


true_branch:
	#True Branch Code Here


reconverge:
	#Both branches recoverge to this point
</programlisting>

<para>
As you can see, since assembly language is linear, the blocks have to
jump around each other.  Recovergence is handled by the programmer, not
the system.
</para>

<para>
A case statement<indexterm><primary>case statement</primary></indexterm> is written just like a sequence of if statements.
</para>

</simplesect>

<simplesect>
<title>Function Call</title>

<para>
A function call<indexterm><primary>function call</primary></indexterm> in assembly language simply requires pushing the arguments
to the function onto the stack in <emphasis>reverse</emphasis> order, and
issuing a <literal>call<indexterm><primary>call</primary></indexterm></literal> instruction.  After calling, the arguments
are then popped back off of the stack.  For example, consider
the C code:
</para>

<programlisting>
	printf("The number is %d", 88);
</programlisting>

<para>
In assembly language, this would be rendered as:
</para>

<programlisting>
	.section .data
	text_string:
	.ascii "The number is %d\0"

	.section .text
	pushl $88
	pushl $text_string
	call  printf
	popl  %eax
	popl  %eax      #%eax is just a dummy variable,
	                #nothing is actually being done 
	                #with the value.  You can also 
	                #directly re-adjust %esp to the
	                #proper location.
</programlisting>

</simplesect>

<simplesect>
<title>Variables and Assignment</title>

<para> 
<indexterm><primary>global variables</primary></indexterm> 
<indexterm><primary>static variables</primary></indexterm> 
<indexterm><primary>local variables</primary></indexterm> 
Global and static variables are declared using
<literal>.data</literal> or <literal>.bss</literal> entries.  Local
variables are declared by reserving space on the stack at the
beginning of the function.  This space is given back at the end of the
function.  
</para>

<para>
Interestingly, global variables are accessed differently than local variables
in assembly language.  Global variables are accessed using direct addressing<indexterm><primary>direct addressing mode</primary></indexterm>,
while local variables are accessed using base pointer addressing<indexterm><primary>base pointer addressing mode</primary></indexterm>.  For
example, consider the following C code:
</para>

<programlisting>
int my_global_var;

int foo()
{
	int my_local_var;

	my_local_var = 1;
	my_global_var = 2;

	return 0;
}
</programlisting>

<para>
This would be rendered in assembly language as:
</para>

<programlisting>
	.section .data
	.lcomm my_global_var, 4

	.type foo, @function
foo:
	pushl %ebp            #Save old base pointer
	movl  %esp, $ebp      #make stack pointer base pointer
	subl  $4, %esp        #Make room for my_local_var
	.equ my_local_var, -4 #Can now use my_local_var to 
	                      #find the local variable


	movl  $1, my_local_var(%ebp)
	movl  $2, my_global_var

	movl  %ebp, %esp      #Clean up function and return
	popl  %ebp
	ret
</programlisting>

<para>
What may not be obvious is that accessing the global variable takes fewer
machine cycles than accessing the local variable.  However, that may not
matter because the stack is more likely to be in physical memory<indexterm><primary>physical memory</primary></indexterm> (instead of swap)
than the global variable is.
</para>

<para>
Also note that in the C programming language, after the compiler loads a value into a register, that value will likely
stay in that register until that register is needed for something else.  It
may also move registers.  For example, if you have a variable 
<literal>foo</literal>, it may start on the stack, but the compiler will
eventually move it into registers for processing.  If there aren't many
variables in use, the value may simply stay in the register until it is
needed again.  Otherwise, when that register is needed for something else,
the value, if it's changed, is copied back to its corresponding memory 
location.  In C, you can use the keyword <literal>volatile<indexterm><primary>volatile</primary></indexterm></literal> to
make sure all modifications and references to the variable are done to
the memory location itself, rather than a register copy of it, in case
other processes, threads, or hardware may be modifying the value while
your function is running.
</para>

</simplesect>

<simplesect>
<title>Loops</title>

<para>
Loops<indexterm><primary>loops</primary></indexterm> work a lot like if statements in assembly language - the blocks are
formed by jumping around.  In C, a while loop consists of a loop body, and a
test to determine whether or not it is time to exit the loop. A for loop
is exactly the same, with optional initialization and counter-increment 
sections.  These can simply be moved around to make a while loop<indexterm><primary>while loop</primary></indexterm>.
</para>

<para>
In C, a while loop looks like this:
</para>

<programlisting>
	while(a &lt; b)
	{
		/* Do stuff here */
	}

	/* Finished Looping */
</programlisting>

<para>
This can be rendered in assembly language like this:
</para>

<programlisting>
loop_begin:
	movl  a, %eax
	movl  b, %ebx
	cmpl  %eax, %ebx
	jge   loop_end

loop_body:
	#Do stuff here
	
	jmp loop_begin

loop_end:
	#Finished looping
</programlisting>

<para>
The x86 assembly language has some direct support for looping as well.
The &ecx-indexed; register can be used as a counter that 
<emphasis>ends</emphasis> with zero.  The <literal>loop<indexterm><primary>loop</primary></indexterm></literal> instruction
will decrement &ecx; and jump to a specified address unless &ecx; is zero.
For example, if you wanted to execute a statement 100 times, you would
do this in C:
</para>

<programlisting>
	for(i=0; i &lt; 100; i++)
	{
		/* Do process here */
	}
</programlisting>

<para>
In assembly language it would be written like this:
</para>

<programlisting>
loop_initialize:
	movl $100, %ecx
loop_begin:
	#
	#Do Process Here
	#

	#Decrement %ecx and loops if not zero
	loop loop_begin 

rest_of_program:
	#Continues on to here
</programlisting>

<para>
One thing to notice is that the <literal>loop</literal> instruction 
<emphasis>requires you to be counting backwards to zero</emphasis>.  If
you need to count forwards or use another ending number, you should use
the loop form which does not include the <literal>loop</literal> instruction.
</para>

<para>
For really tight loops of character string operations, there is also the 
<literal>rep<indexterm><primary>rep</primary></indexterm></literal> instruction, but we will leave learning about
that as an exercise to the reader.
</para>

</simplesect>

<simplesect>
<title>Structs</title>

<para>
Structs<indexterm><primary>structs</primary></indexterm> are simply descriptions of memory blocks.  For example,
in C you can say:
</para>

<programlisting>
struct person {
	char firstname[40];
	char lastname[40];
	int age;
};
</programlisting>

<para>
This doesn't do anything by itself, except give you ways of intelligently 
using 84 bytes of data.  You can do basically the same thing using
<literal>.equ<indexterm><primary>.equ</primary></indexterm></literal> directives in assembly language.  Like this:
</para>

<programlisting>
	.equ PERSON_SIZE, 84
	.equ PERSON_FIRSTNAME_OFFSET, 0
	.equ PERSON_LASTNAME_OFFSET, 40
	.equ PERSON_AGE_OFFSET, 80
</programlisting>

<para>
When you declare a variable of this type, all you are doing is reserving 84
bytes of space.  So, if you have this in C:
</para>

<programlisting>
void foo()
{
	struct person p;

	/* Do stuff here */
}
</programlisting>

<para>
In assembly language you would have:
</para>

<programlisting>
foo:
	#Standard header beginning
	pushl %ebp
	movl %esp, %ebp

	#Reserve our local variable
	subl $PERSON_SIZE, %esp 
	#This is the variable's offset from %ebp
	.equ P_VAR, 0 - PERSON_SIZE

	#Do Stuff Here

	#Standard function ending
	movl %ebp, %esp
	popl %ebp
	ret
</programlisting>

<para>
To access structure members, you just have to use base pointer addressing<indexterm><primary>base pointer addressing mode</primary></indexterm> with 
the offsets defined above.  For example,
in C you could set the person's age like this:
</para>

<programlisting>
	p.age = 30;
</programlisting>

<para>
In assembly language it would look like this:
</para>

<programlisting>
	movl $30, P_VAR + PERSON_AGE_OFFSET(%ebp)
</programlisting>

</simplesect>

<simplesect>
<title>Pointers</title>

<para>
Pointers<indexterm><primary>pointers</primary></indexterm> are very easy.  Remember, pointers are simply the
address<indexterm><primary>address</primary></indexterm> that a value resides at.  Let's start by taking a
look at global variables.  For example:
</para>

<programlisting>
int global_data = 30;
</programlisting>

<para>
In assembly language, this would be:
</para>

<programlisting>
	.section .data
global_data:
	.long 30
</programlisting>

<para>
Taking the address of this data in C:
</para>

<programlisting>
	a = &amp;global_data;
</programlisting>

<para>
Taking the address of this data in assembly language:
</para>

<programlisting>
	movl $global_data, %eax
</programlisting>

<para>
You see, with assembly language, you are almost always accessing memory 
through pointers.  That's what direct addressing is.  To get the pointer
itself, you just have to go with immediate mode addressing<indexterm><primary>immediate mode addressing</primary></indexterm>.
</para>

<para>
Local variables<indexterm><primary>local variables</primary></indexterm> are a little more difficult, but not much.  Here is
how you take the address of a local variable in C:
</para>

<programlisting>
void foo()
{
	int a;
	int *b;

	a = 30;

	b = &amp;a;

	*b = 44;
}
</programlisting>

<para>
The same code in assembly language:
</para>

<programlisting>
foo:
	#Standard opening
	pushl %ebp
	movl  %esp, %ebp

	#Reserve two words of memory
	subl  $8, $esp
	.equ A_VAR, -4
	.equ B_VAR, -8

	#a = 30
	movl $30, A_VAR(%ebp)

	#b = &amp;a
	movl $A_VAR, B_VAR(%ebp)
	addl %ebp, B_VAR(%ebp)

	#*b = 30
	movl B_VAR(%ebp), %eax
	movl $30, (%eax)

	#Standard closing
	movl %ebp, %esp
	popl %ebp
	ret
</programlisting>

<para>
As you can see, to take the address of a local variable, the address has to
be computed the same way the computer computes the addresses in base 
pointer addressing.  There is an easier way - the processor provides the
instruction <literal>leal<indexterm><primary>leal</primary></indexterm></literal>, which stands for "load effective address<indexterm><primary>effective address</primary></indexterm>".
This lets the computer compute the address, and then load it wherever you want.
So, we could just say:
</para>

<programlisting>
	#b = &amp;a
	leal A_VAR(%ebp), %eax
	movl %eax, B_VAR(%ebp)
</programlisting>

<para>
It's the same number of lines, but a little cleaner.  Then, to use
this value, you simply have to move it to a general-purpose register
and use indirect addressing<indexterm><primary>indirect addressing mode</primary></indexterm>, as shown in the example above.
</para>

</simplesect>

<simplesect>
<title>Getting GCC to Help</title>

<para>
One of the nice things about GCC<indexterm><primary>GCC</primary></indexterm> is its ability to spit out assembly
language code.  To convert a C language file to assembly, you can simply
do:
</para>

<programlisting>
gcc -S file.c
</programlisting>

<para>
The output will be in <filename>file.s</filename>.  It's not the most
readable output - most of the variable names have been removed and replaced
either with numeric stack locations or references to automatically-generated
labels.  To start with, you probably want to turn off optimizations with
<literal>-O0</literal> so that the assembly language output will follow
your source code better.
</para>

<para>
Something else you might notice is that GCC reserves more stack space
for local variables than we do, and then AND's &esp-indexed;
<footnote>
<para>
Note that different versions of GCC do this differently.  
</para>
</footnote>
This is to increase memory and cache efficiency by double-word aligning 
variables.
</para>


<para>
Finally, at the end of functions, we usually do the following instructions
to clean up the stack before issuing a <literal>ret<indexterm><primary>ret</primary></indexterm></literal> instruction:
</para>

<programlisting>
	movl %ebp, %esp
	popl %ebp
</programlisting>

<para>
However, GCC output will usually just include the instruction 
<literal>leave<indexterm><primary>leave</primary></indexterm></literal>.  This instruction is simply the combination
of the above two instructions.  We do not use <literal>leave</literal>
in this text because we want to be clear about exactly what is happening
at the processor level.
</para>

<para>
I encourage you to take a C program you have written and compile it
to assembly language and trace the logic.  Then, add in optimizations
and try again.  See how the compiler chose to rearrange your program
to be more optimized, and try to figure out why it chose the arrangement
and instructions it did.
</para>

</simplesect>

</appendix>