<chapter id="linking">
<title>Sharing Functions with Code Libraries</title>
<!--
Copyright 2002 Jonathan Bartlett
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.1 or any later version published by the Free Software
Foundation; with no Invariant Sections, with no Front-Cover Texts,
and with no Back-Cover Texts. A copy of the license is included in fdl.xml
-->
<para>
By now you should realize that the computer has to do a lot
of work even for simple tasks. Because of that, you have to
do a lot of work to write the code for a computer to do even
simple tasks. In addition, programming tasks are usually not
very simple. Therefore, we need a way to make this process
easier on ourselves. There are several ways to do this, including:
<itemizedlist>
<listitem><para>Write code in a high-level language instead of assembly language</para></listitem>
<listitem><para>Have lots of pre-written code that you can cut and paste into your own programs</para></listitem>
<listitem><para>Have a set of functions<indexterm><primary>functions</primary></indexterm> on the system that is shared among all programs that wish to use it</para></listitem>
</itemizedlist>
All three of these are usually used to some degree in any given project. The
first option
will be explored further in <xref linkend="highlevellanguages" />. The
second option is useful but it suffers from some drawbacks, including:
</para>
<itemizedlist>
<listitem><para>Code that is copied often has to be heavily modified to fit the surrounding code.</para></listitem>
<listitem><para>Every program containing the copied code has the same code in it, thus wasting a lot of space.</para></listitem>
<listitem><para>If a bug is found in any of the copied code it has to be fixed in every application program.</para></listitem>
</itemizedlist>
<para>
Therefore, the second option is usually used sparingly. It is usually
only used in cases where you copy and paste skeleton code<indexterm><primary>skeleton code</primary></indexterm>
for a specific type of task, and add in your program-specific
details. The third option is the one that is used most often. It
involves having a central repository of shared code. Then, instead of each program wasting
space storing its own copy of the same functions, programs can simply point to the
<emphasis>dynamic libraries</emphasis> <indexterm><primary>shared libraries</primary></indexterm> <indexterm><primary>dynamic libraries</primary></indexterm>
which contain the functions they need. If a bug is found in
one of these functions, it only has to be fixed within the single function library
file, and all applications which use it are automatically updated. The
main drawback of this approach is that it creates some dependency problems,
including:
</para>
<itemizedlist>
<listitem><para>If multiple applications are all using the same file, how do we know when it is safe to delete the file? For example, if three applications are sharing a file of functions and two of the programs are deleted, how does the system know that there still exists an application that uses that code, and therefore it shouldn't be deleted?</para></listitem>
<listitem><para>Some programs inadvertently rely on bugs within shared functions. Therefore, if upgrading the shared functions fixes a bug that a program depended on, it could cause that application to cease functioning.</para></listitem>
</itemizedlist>
<para>
These problems lead to what is known as "DLL hell".
However, it is generally assumed that the advantages outweigh the disadvantages.
</para>
<para>
In programming, these shared code files are referred to as
<emphasis>shared libraries</emphasis> <indexterm><primary>shared libraries</primary></indexterm>, <emphasis>dynamic libraries</emphasis> <indexterm><primary>dynamic libraries</primary></indexterm>,
<emphasis>shared objects<indexterm><primary>shared objects</primary></indexterm></emphasis>,
<emphasis>dynamic-link libraries<indexterm><primary>dynamic-link libraries</primary></indexterm></emphasis>,
<emphasis>DLLs<indexterm><primary>DLLs</primary></indexterm></emphasis>, or <emphasis>.so files</emphasis>.<footnote><para>Each of these terms has a slightly different meaning, but most people use them interchangeably anyway. Specifically, this chapter will cover dynamic libraries, but not shared libraries. Shared libraries are dynamic libraries which are built using <emphasis>position-independent code</emphasis><indexterm><primary>position-independent code</primary></indexterm> (often abbreviated PIC<indexterm><primary>PIC</primary></indexterm>) which is outside the scope of this book. However, shared libraries and dynamic libraries are used in the same way by users and programs; the linker just links them differently.</para></footnote> We will refer to all of these as <emphasis>dynamic libraries</emphasis>.
</para>
<sect1>
<title>Using a Dynamic Library</title>
<para>
The program we will examine here is simple - it writes the
characters <literal>hello world</literal> to the screen and
exits. The regular program, <filename>helloworld-nolib.s</filename>,
looks like this:
</para>
<programlisting>
&helloworld-nolib-s;
</programlisting>
<para>
That's not too long. However, take a look at how short
<filename>helloworld-lib.s</filename>, which uses a library, is:
</para>
<programlisting>
&helloworld-lib-s;
</programlisting>
<para>
It's even shorter!
</para>
<para>
Now, building programs which use dynamic libraries<indexterm><primary>dynamic libraries</primary></indexterm> is a little
different than normal. You can build the first program normally
by doing this:
</para>
<programlisting>
as helloworld-nolib.s -o helloworld-nolib.o
ld helloworld-nolib.o -o helloworld-nolib
</programlisting>
<para>
However, in order to build the second program, you have to do this:
</para>
<programlisting>
as helloworld-lib.s -o helloworld-lib.o
ld -dynamic-linker /lib/ld-linux.so.2 \
-o helloworld-lib helloworld-lib.o -lc
</programlisting>
<para>
Remember, the backslash at the end of the first line of the <literal>ld</literal> command simply means that the command continues on the next line. The option
<literal>-dynamic-linker<indexterm><primary>-dynamic-linker</primary></indexterm>
/lib/ld-linux.so.2</literal> allows our
program to be linked to libraries. This builds the executable so that
before executing, the operating system will load the program
<filename>/lib/ld-linux.so.2</filename> to load in external libraries
and link them with the program. This program is known as a
<emphasis>dynamic linker<indexterm><primary>dynamic linker</primary></indexterm></emphasis>.
</para>
<para>
The <literal>-lc</literal> option
says to link to the <literal>c</literal> library, named
<filename>libc.so</filename> on GNU/Linux systems. Given a library
name, <literal>c</literal> in this case (usually library names are
longer than a single letter), the GNU/Linux linker prepends
the string <literal>lib</literal> to the beginning of the library name and
appends <literal>.so</literal> to the end of it to form the library's
filename. This library contains many functions to automate
all types of tasks. The two we are using are
<literal>printf<indexterm><primary>printf</primary></indexterm></literal>, which prints strings, and
<literal>exit<indexterm><primary>exit</primary></indexterm></literal>, which exits the program.
</para>
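<para>
The naming rule above can be sketched in a few lines of C. This is only
an illustration of how the name given to <literal>-l</literal> maps to a
filename. <literal>library_filename</literal> is a hypothetical helper, not
part of the linker or the C library, and the real linker considers other
candidates as well (static <literal>.a</literal> archives, for example).
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds the library filename for the name
   given to -l by prepending "lib" and appending ".so".
   Returns 0 on success, or -1 if the result does not fit in buf. */
int library_filename(const char *name, char *buf, size_t size)
{
    assert(name != NULL && buf != NULL);

    /* "lib" (3) + name + ".so" (3) + terminating null character (1) */
    if (strlen(name) + 7 > size)
        return -1;
    sprintf(buf, "lib%s.so", name);
    return 0;
}
```
]]></programlisting>
<para>
Under these assumptions, passing <literal>c</literal> produces
<filename>libc.so</filename>, and passing <literal>record</literal>
produces <filename>librecord.so</filename>.
</para>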
<para>
Notice that the symbols <literal>printf</literal> and <literal>exit</literal>
are simply referred to by name within the program. In previous chapters, the
linker would resolve all of the names to memory addresses, and the
names would be thrown away. When using dynamic linking<indexterm><primary>dynamic linking</primary></indexterm>, the name itself
resides within the executable, and is resolved by the dynamic linker when the
program is run. When the program is run by the user,
the dynamic linker<indexterm><primary>dynamic linker</primary></indexterm> loads the
dynamic libraries<indexterm><primary>dynamic libraries</primary></indexterm> listed in our link statement,
and then finds all of the function and variable names that were named by
our program but not found at link time, and matches them up with corresponding
entries in the shared libraries it loads. It then replaces all of the names
with the addresses which they are loaded at. This sounds time-consuming. It
is to a small degree, but it only happens once - at program startup time.
</para>
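<para>
The matching of names to addresses described above can be pictured with a
toy model in C. This is a sketch for illustration only: the table, the
symbol names, and <literal>resolve_symbol</literal> are all hypothetical,
and the real dynamic linker works with the tables stored in the executable
and library files rather than with a C array.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One exported symbol: its name and the address it was loaded at. */
struct symbol {
    const char *name;
    void *address;
};

/* Stand-ins for data that a loaded library might export. */
static int library_counter = 42;
static char library_greeting[] = "hello world\n";

/* Toy "library": the table of names that gets searched. */
static const struct symbol toy_library[] = {
    { "counter",  &library_counter },
    { "greeting", library_greeting },
};

/* Match a name the program left unresolved against the library's
   entries. Returns the address, or NULL if the name is not found. */
void *resolve_symbol(const char *name)
{
    size_t count = sizeof toy_library / sizeof toy_library[0];
    size_t i;

    assert(name != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(toy_library[i].name, name) == 0)
            return toy_library[i].address;
    }
    return NULL;
}
```
]]></programlisting>
<para>
Once a name has been looked up, the program only needs the address. That is
why the lookup cost is paid once rather than on every use.
</para>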
</sect1>
<sect1>
<title>How Dynamic Libraries Work</title>
<para>
In our first programs, all of the code was contained within the
source file. Such programs are called
<emphasis>statically-linked executables<indexterm><primary>statically-linked</primary></indexterm></emphasis>, because
they contain all of the necessary functionality for the program
that isn't handled by the kernel. In the programs we wrote in
<xref linkend="records" />, we used both our main program file
and files containing routines used by multiple programs. In these cases,
we combined all of the code together using the linker at link-time, so it
was still statically-linked.
However, in the <literal>helloworld-lib</literal> program, we started
using dynamic libraries. When you use dynamic libraries, your program is
then <emphasis>dynamically-linked<indexterm><primary>dynamically-linked</primary></indexterm></emphasis>, which means that not all of the code
needed to run the program is actually contained within the program file
itself, but in external libraries.
</para>
<para>
When we put the <literal>-lc</literal> on the command to
link the <literal>helloworld-lib</literal> program, it told the linker
to use the <literal>c</literal> library (<filename>libc.so</filename>) to look up any symbols<indexterm><primary>symbols</primary></indexterm> that
weren't already defined in <filename>helloworld-lib.o</filename>. However,
it doesn't actually add any code to our program; it just notes in
the program where to look.
When the <literal>helloworld-lib</literal> program begins, the file
<filename>/lib/ld-linux.so.2<indexterm><primary>/lib/ld-linux.so.2</primary></indexterm></filename> is loaded first. This is
the dynamic linker<indexterm><primary>dynamic linker</primary></indexterm>. It looks at our <literal>helloworld-lib</literal>
program and sees that it needs the <literal>c</literal> library
to run. So, it searches for a file called <filename>libc.so</filename>
in the standard places (listed in <filename>/etc/ld.so.conf<indexterm><primary>/etc/ld.so.conf</primary></indexterm></filename>
and in the contents of the <literal>LD_LIBRARY_PATH<indexterm><primary>LD_LIBRARY_PATH</primary></indexterm></literal> environment variable),
then looks in it for all the needed symbols (<literal>printf</literal>
and <literal>exit</literal> in this
case), and then loads the library into the program's virtual memory.
Finally, it replaces all instances of <literal>printf</literal> and <literal>exit</literal> in the
program with the actual locations of those functions in the
library.
</para>
<para>
Run the following command:
<programlisting>
ldd<indexterm><primary>ldd</primary></indexterm> ./helloworld-nolib
</programlisting>
It should report back <literal>not a dynamic executable</literal>.
This is just like we said - <literal>helloworld-nolib</literal> is a
statically-linked executable. However, try this:
<programlisting>
ldd ./helloworld-lib
</programlisting>
It will report back something like
<programlisting>
libc.so.6 => /lib/libc.so.6 (0x4001d000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
</programlisting>
The numbers in parentheses may be different on your system. This
means that the program <literal>helloworld-lib</literal> is
linked to <filename>libc.so.6</filename> (the <literal>.6</literal>
is the version number), which is found at <filename>/lib/libc.so.6</filename>,
and <filename>/lib/ld-linux.so.2</filename> is found at
<filename>/lib/ld-linux.so.2</filename>. These libraries have to be loaded
before the program can be run. If you are interested, run the <literal>ldd</literal>
program on various programs that are on your Linux distribution, and see what
libraries they rely on.
</para>
</sect1>
<sect1>
<title>Finding Information about Libraries</title>
<para>
Okay, so now that you know about libraries, the question is, how
do you find out what libraries you have on your system and what
they do? Well, let's skip that question for a minute and ask another
question: How do programmers describe functions to each other in their
documentation? Let's
take a look at the function <literal>printf</literal>. Its calling interface<indexterm><primary>calling interface</primary></indexterm>
(usually referred to as a <emphasis>prototype<indexterm><primary>prototype</primary></indexterm></emphasis>) looks like this:
</para>
<programlisting>
int printf(char *string, ...);
</programlisting>
<para>
In Linux, functions<indexterm><primary>functions</primary></indexterm> are described in the C programming language<indexterm><primary>C programming language</primary></indexterm>.
In fact, most Linux programs are written in C. That is why most documentation
and binary compatibility is defined using the C language. The interface to
the <literal>printf</literal> function above is described using the C programming language.
</para>
<para>
This definition means that there is a function <literal>printf</literal>.
The things inside the parentheses are the function's parameters<indexterm><primary>parameters</primary></indexterm> or arguments.
The first parameter here is <literal>char *string</literal>. This means
there is a parameter named <literal>string</literal> (the name isn't
important, except to use for talking about it), which has a
type <literal>char *</literal>. <literal>char<indexterm><primary>char</primary></indexterm></literal>
means that it wants a single-byte character. The <literal>*<indexterm><primary>*</primary></indexterm></literal> after it means
that it doesn't actually want a character as an argument, but instead
it wants the address of a character or sequence of characters. If you look
back at our <literal>helloworld-lib</literal> program, you will notice that
the function call looked like this:
</para>
<programlisting>
pushl $hello
call printf
</programlisting>
<para>
So, we pushed the address of the <literal>hello</literal> string, rather
than the actual characters. You might notice that we didn't push the length
of the string. <literal>printf<indexterm><primary>printf</primary></indexterm></literal> found
the end of the string because we ended it with a null character<indexterm><primary>null character</primary></indexterm>
(<literal>\0</literal>). Many functions work that way, especially C language
functions. The <literal>int<indexterm><primary>int</primary></indexterm></literal> before the function name tells what
type of value the function will return in &eax-indexed; when it returns.
<literal>printf</literal> will return an <literal>int</literal>
when it's through. Now, after the <literal>char *string</literal>, we
have a series of periods, <literal>...<indexterm><primary>...</primary></indexterm></literal>. This means that it
can take an indefinite number of additional arguments after the string.
Most functions can only take a specified number of arguments. <literal>printf</literal>,
however, can take many. It will look into the
<literal>string</literal> parameter, and everywhere it sees the characters
<literal>%s</literal>, it will look for another string from the stack to insert, and
everywhere it sees <literal>%d</literal> it will look for a number
from the stack to insert. This is best described using an example:
</para>
<programlisting>
&printf-example-s;
</programlisting>
<para>
Type it in with the filename <filename>printf-example.s</filename>,
and then do the following commands:
</para>
<programlisting>
as printf-example.s -o printf-example.o
ld printf-example.o -o printf-example -lc \
-dynamic-linker /lib/ld-linux.so.2
</programlisting>
<para>
Then run the program with <literal>./printf-example</literal>, and it
should say this:
</para>
<programlisting>
Hello! Jonathan is a person who loves the number 3
</programlisting>
<para>
Now, if you look at the code, you'll see that we actually push
the format string last, even though it's the first parameter listed. You
always push a function's parameters in reverse order.<footnote>
<para>
Parameters are pushed in reverse order because
of functions which take a variable number of parameters, like
<literal>printf</literal>. The parameters pushed in last
will be in a known position relative to the top of the stack. The
program can then use these parameters to determine where on the stack
the additional arguments are, and what type they are. For example,
<literal>printf</literal> uses the format string to determine how many
other parameters are being sent. If we pushed the known arguments first,
we wouldn't be able to tell where they were on the stack.
</para></footnote>
You may be wondering
how the <literal>printf<indexterm><primary>printf</primary></indexterm></literal> function knows how many parameters
there are. Well, it searches through your string, and counts how
many <literal>%d</literal>s and <literal>%s</literal>s it finds,
and then grabs that number of parameters from the stack. If the
parameter matches a <literal>%d</literal>, it treats it as a number,
and if it matches a <literal>%s</literal>, it treats it as a pointer
to a null-terminated string. <literal>printf</literal> has
many more features than this, but these are the most-used ones.
So, as you can see, <literal>printf</literal> can make output a
lot easier, but it also has a lot of overhead, because it has to
count the number of characters in the string, look through it for all of the
control characters it needs to replace, pull them off the stack,
convert them to a suitable representation (numbers have to be converted
to strings, etc), and stick them all together appropriately.
</para>
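<para>
The way a function can use its format string to decide how many extra
arguments to fetch can be sketched in C with the standard
<literal>stdarg.h</literal> macros. <literal>mini_format</literal> is a
hypothetical, heavily simplified stand-in, not the real
<literal>printf</literal>: it understands only <literal>%s</literal> and
<literal>%d</literal> and writes into a buffer instead of the screen.
</para>
<programlisting><![CDATA[
```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified formatter: understands only %s and %d.
   Writes at most size - 1 characters into out (size must be at
   least 1) and returns how many extra arguments it consumed. */
int mini_format(char *out, size_t size, const char *format, ...)
{
    va_list args;
    size_t used = 0;
    int consumed = 0;
    const char *p;

    va_start(args, format);
    for (p = format; *p != '\0' && used + 1 < size; p++) {
        if (p[0] == '%' && (p[1] == 's' || p[1] == 'd')) {
            char number[32];
            const char *piece;
            size_t len;

            if (p[1] == 's') {
                /* %s: the next argument is a pointer to a string */
                piece = va_arg(args, const char *);
            } else {
                /* %d: the next argument is a number to convert */
                sprintf(number, "%d", va_arg(args, int));
                piece = number;
            }
            len = strlen(piece);
            if (len > size - 1 - used)
                len = size - 1 - used;   /* truncate to fit */
            memcpy(out + used, piece, len);
            used += len;
            consumed++;
            p++;                         /* skip the s or d */
        } else {
            out[used++] = *p;            /* ordinary character */
        }
    }
    va_end(args);
    out[used] = '\0';
    return consumed;
}
```
]]></programlisting>
<para>
Notice that nothing tells <literal>mini_format</literal> how many arguments
were really passed. It trusts the format string, so a format string with
more <literal>%s</literal> or <literal>%d</literal> markers than arguments
would make it read whatever happens to come next.
</para>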
<para>
We've seen how to use the C programming language<indexterm><primary>C programming language</primary></indexterm> prototypes<indexterm><primary>prototypes</primary></indexterm> to
call library functions. To use them effectively, however, you need to know
several more of the possible data types for reading function prototypes. Here are
the main ones:
</para>
<variablelist>
<varlistentry>
<term><literal>int<indexterm><primary>int</primary></indexterm></literal></term>
<listitem><para>
An <literal>int</literal> is an integer number (4 bytes on an x86 processor).
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>long<indexterm><primary>long</primary></indexterm></literal></term>
<listitem><para>
A <literal>long</literal> is also an integer number (4 bytes on an x86 processor).
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>long long<indexterm><primary>long long</primary></indexterm></literal></term>
<listitem><para>
A <literal>long long</literal> is an integer number that's larger than a <literal>long</literal> (8 bytes on an x86 processor).
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>short<indexterm><primary>short</primary></indexterm></literal></term>
<listitem><para>
A <literal>short</literal> is an integer number that's shorter than an <literal>int</literal> (2 bytes on an x86 processor).
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>char<indexterm><primary>char</primary></indexterm></literal></term>
<listitem><para>
A <literal>char</literal> is a single-byte integer number. This is mostly used for storing
character data, since ASCII strings usually are represented with one byte per character.
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>float<indexterm><primary>float</primary></indexterm></literal></term>
<listitem><para>
A <literal>float</literal> is a floating-point number (4 bytes on an x86 processor).
Floating-point numbers will be explained in more depth in <xref linkend="floatingpoint" />.
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>double<indexterm><primary>double</primary></indexterm></literal></term>
<listitem><para>
A <literal>double</literal> is a floating-point number that is larger than a float
(8 bytes on an x86 processor).
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>unsigned<indexterm><primary>unsigned</primary></indexterm></literal></term>
<listitem><para>
<literal>unsigned</literal> is a modifier used with any of the integer types above which
keeps them from being used as signed quantities. The difference between signed
and unsigned numbers will be discussed in <xref linkend="countingchapter" />.
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>*<indexterm><primary>*</primary></indexterm></literal></term>
<listitem><para>
An asterisk (<literal>*</literal>) is used to denote that the data isn't an actual
value, but instead is a pointer to a location holding the given
value (4 bytes on an x86 processor).
So, let's say in memory location <literal>my_location</literal> you have the number 20 stored.
If the prototype said to pass an <literal>int</literal>, you would
use direct addressing mode<indexterm><primary>direct addressing mode</primary></indexterm> and do <literal>pushl my_location</literal>.
However, if the prototype said to pass an <literal>int *</literal>,
you would do <literal>pushl $my_location</literal> - an immediate mode<indexterm><primary>immediate mode addressing</primary></indexterm> push of
the address that the value resides in. In addition to indicating the address of
a single value, pointers can also be used to
pass a sequence of consecutive locations, starting with the one pointed to by the
given value. This is called an array<indexterm><primary>array</primary></indexterm>.
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>struct<indexterm><primary>struct</primary></indexterm></literal></term>
<listitem><para>
A <literal>struct</literal> is a set of data items that have been put together under
a name. For example you could declare:
<programlisting>
struct teststruct {
int a;
char *b;
};
</programlisting>
and any time you ran into <literal>struct teststruct</literal> you
would know that it is actually two words right next to each other,
the first being an integer, and the second a pointer to a character
or group of characters. You rarely see structs passed
as arguments to functions. Instead, you usually see pointers to
structs passed as arguments. This is because passing structs to functions
is fairly complicated, since they can take up so many storage locations.
</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>typedef<indexterm><primary>typedef</primary></indexterm></literal></term>
<listitem><para>
A <literal>typedef</literal> basically allows you to rename a type. For example, I can
do <literal>typedef int myowntype;</literal> in a C program, and
any time I typed <literal>myowntype</literal>, it would be just
as if I typed <literal>int</literal>. This can get kind of
annoying, because you have to look up what all of the typedefs and
structs in a function prototype really mean. However, <literal>typedef</literal>s
are useful for giving types more meaningful and descriptive names.
</para></listitem>
</varlistentry>
</variablelist>
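<para>
The difference between passing a value, passing a pointer to a value, and
passing a pointer to a struct can be sketched in C as follows. The function
names are hypothetical and exist only for illustration. The struct is the
<literal>teststruct</literal> declared above.
</para>
<programlisting><![CDATA[
```c
/* Same layout as the example above: an integer followed by a
   pointer to a character or group of characters. */
struct teststruct {
    int a;
    char *b;
};

/* Takes an int: the caller passes a copy of the value itself. */
int add_one(int value)
{
    return value + 1;
}

/* Takes an int *: the caller passes the address of the value, so
   the function can change the caller's copy. */
void add_one_in_place(int *location)
{
    *location = *location + 1;
}

/* Takes a pointer to a struct instead of the struct itself, so only
   one address is passed no matter how large the struct is.
   Returns the character of b at index a. */
char character_at(const struct teststruct *t)
{
    return t->b[t->a];
}
```
]]></programlisting>
<para>
In the terms used above, calling <literal>add_one</literal> corresponds to
<literal>pushl my_location</literal>, while calling
<literal>add_one_in_place</literal> corresponds to
<literal>pushl $my_location</literal>.
</para>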
<note>
<title>Compatibility Note</title>
<para>
The listed sizes are for Intel-compatible (x86) machines. Other machines may
have different sizes. Also, even when parameters shorter than a word are
passed to functions, they are passed as longs on the stack.
</para>
</note>
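<para>
If you want to check the sizes on your own machine, a small C function like
the following can print them. <literal>print_type_sizes</literal> is a
hypothetical name chosen for this sketch. On a 32-bit x86 build the output
should agree with the list above. On a 64-bit Linux system you should expect
<literal>long</literal> and pointers to be reported as 8 bytes instead.
</para>
<programlisting><![CDATA[
```c
#include <stdio.h>

/* Prints the size in bytes of each basic type on the machine this
   is compiled for. Returns the number of lines printed. */
int print_type_sizes(FILE *out)
{
    int lines = 0;

    lines += (fprintf(out, "char       %lu\n", (unsigned long)sizeof(char)) > 0);
    lines += (fprintf(out, "short      %lu\n", (unsigned long)sizeof(short)) > 0);
    lines += (fprintf(out, "int        %lu\n", (unsigned long)sizeof(int)) > 0);
    lines += (fprintf(out, "long       %lu\n", (unsigned long)sizeof(long)) > 0);
    lines += (fprintf(out, "long long  %lu\n", (unsigned long)sizeof(long long)) > 0);
    lines += (fprintf(out, "float      %lu\n", (unsigned long)sizeof(float)) > 0);
    lines += (fprintf(out, "double     %lu\n", (unsigned long)sizeof(double)) > 0);
    lines += (fprintf(out, "pointer    %lu\n", (unsigned long)sizeof(char *)) > 0);
    return lines;
}
```
]]></programlisting>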
<para>
That's how to read function documentation. Now, let's get back to the
question of how to find out about libraries.
Most of your system libraries are in <filename>/usr/lib<indexterm><primary>/usr/lib</primary></indexterm></filename>
or <filename>/lib<indexterm><primary>/lib</primary></indexterm></filename>. If you just want to see what symbols they
define, run <literal>objdump<indexterm><primary>objdump</primary></indexterm> -T FILENAME</literal> where
<literal>FILENAME</literal> is the full path to the library. The output
of that isn't too helpful, though, for finding an interface that you might
need. Usually, you have to know what
library you want at the beginning, and then just read the documentation.
Most libraries have manuals or man pages for their functions. The web is the
best source of documentation for libraries. Most libraries from
the GNU project also have info pages on them, which are a little more
thorough than man pages.
</para>
</sect1>
<sect1>
<title>Useful Functions</title>
<para>
Several useful functions you will want to be aware of from the <literal>c</literal>
library include:
</para>
<itemizedlist>
<listitem><para><literal>size_t strlen<indexterm><primary>strlen</primary></indexterm> (const char *s)</literal> calculates the length of a null-terminated string, not counting the terminating null character.</para></listitem>
<listitem><para><literal>int strcmp<indexterm><primary>strcmp</primary></indexterm> (const char *s1, const char *s2)</literal> compares two strings alphabetically.</para></listitem>
<listitem><para><literal>char * strdup<indexterm><primary>strdup</primary></indexterm> (const char *s)</literal> takes the pointer to a string, and creates a new copy in a new location, and returns the new location.</para></listitem>
<listitem><para><literal>FILE * fopen<indexterm><primary>fopen</primary></indexterm> (const char *filename, const char *opentype)</literal> opens a managed, buffered file (allows easier reading and writing than using file descriptors directly).<footnote><para><literal>stdin</literal>, <literal>stdout</literal>, and <literal>stderr</literal> (all lower case) can be used in these programs to refer to the files of their corresponding file descriptors.</para></footnote><footnote><para><literal>FILE</literal> is a struct. You don't need to know its contents to use it. You only have to store the pointer and pass it to the relevant other functions.</para></footnote></para></listitem>
<listitem><para><literal>int fclose<indexterm><primary>fclose</primary></indexterm> (FILE *stream)</literal> closes a file opened with <literal>fopen</literal>.</para></listitem>
<listitem><para><literal>char * fgets<indexterm><primary>fgets</primary></indexterm> (char *s, int count, FILE *stream)</literal> fetches a line of characters (at most <literal>count - 1</literal> of them) into string <literal>s</literal>.</para></listitem>
<listitem><para><literal>int fputs<indexterm><primary>fputs</primary></indexterm> (const char *s, FILE *stream)</literal> writes a string to the given open file.</para></listitem>
<listitem><para><literal>int fprintf<indexterm><primary>fprintf</primary></indexterm> (FILE *stream, const char *template, ...)</literal> is just like <literal>printf</literal>, but it uses an open file rather than defaulting to using standard output.</para></listitem>
</itemizedlist>
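<para>
Several of these functions can be seen working together in the following
C sketch. The function name <literal>write_then_read</literal> and the idea
of writing a line and reading it back are assumptions made for the example.
Only the library calls themselves come from the list above.
</para>
<programlisting><![CDATA[
```c
#define _POSIX_C_SOURCE 200809L  /* makes strdup visible to strict compilers */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Writes line to the file at path, reads the first line back, and
   returns a newly allocated copy of it (the caller must free it).
   Returns NULL on any failure. */
char *write_then_read(const char *path, const char *line)
{
    char buffer[256];
    char *result = NULL;
    FILE *file;

    file = fopen(path, "w");             /* open for writing */
    if (file == NULL)
        return NULL;
    fputs(line, file);
    fprintf(file, " (%lu characters)\n", (unsigned long)strlen(line));
    fclose(file);

    file = fopen(path, "r");             /* reopen for reading */
    if (file == NULL)
        return NULL;
    if (fgets(buffer, sizeof buffer, file) != NULL)
        result = strdup(buffer);         /* copy out of the local buffer */
    fclose(file);
    return result;
}
```
]]></programlisting>
<para>
The copy made by <literal>strdup</literal> is needed because
<literal>buffer</literal> stops existing when the function returns.
</para>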
<para>
You can find the complete manual on this library by going to http://www.gnu.org/software/libc/manual/
</para>
</sect1>
<sect1>
<title>Building a Dynamic Library</title>
<para>
Let's say that we wanted to take all of our shared code from
<xref linkend="records" /> and build it into a dynamic library<indexterm><primary>dynamic library</primary></indexterm> to
use in our programs. The first thing we would do is assemble them
like normal:
</para>
<programlisting>
as write-record.s -o write-record.o
as read-record.s -o read-record.o
</programlisting>
<para>
Now, instead of linking them into a program, we want to link
them into a dynamic library. This changes our linker command
to this:
</para>
<programlisting>
ld -shared write-record.o read-record.o -o librecord.so
</programlisting>
<para>
This links both of these files together into a dynamic library<indexterm><primary>dynamic library</primary></indexterm>
called <filename>librecord.so</filename>. This file can now
be used for multiple programs. If we need to update the functions
contained within it, we can just update this one file and not
have to worry about which programs use it.
</para>
<para>
Let's look at how we would link against this library. To link
the <literal>write-records</literal> program, we would do the
following:
</para>
<programlisting>
as write-records.s -o write-records.o
ld -L . -dynamic-linker /lib/ld-linux.so.2 \
-o write-records -lrecord write-records.o
</programlisting>
<para>
In this command, <literal>-L .</literal> told the linker to look
for libraries in the current directory (it usually only searches the
<filename>/lib<indexterm><primary>/lib</primary></indexterm></filename> directory, the
<filename>/usr/lib<indexterm><primary>/usr/lib</primary></indexterm></filename>
directory, and a few others). As we've seen, the option
<literal>-dynamic-linker /lib/ld-linux.so.2</literal> specified the dynamic
linker. The option <literal>-lrecord</literal> tells the linker to search
for functions in the file named <filename>librecord.so</filename>.
</para>
<para>
Now the <literal>write-records</literal> program is built, but it will not
run. If we try it, we will get an error like the following:
</para>
<programlisting>
./write-records: error while loading shared libraries:
librecord.so: cannot open shared object file: No such
file or directory
</programlisting>
<para>
This is because, by default, the dynamic linker only searches
<filename>/lib</filename>, <filename>/usr/lib</filename>, and whatever
directories are listed in <filename>/etc/ld.so.conf<indexterm><primary>/etc/ld.so.conf</primary></indexterm></filename> for
libraries. In order to run the program, you either need to move the
library to one of these directories, or execute the following command:
</para>
<programlisting>
LD_LIBRARY_PATH=.
export LD_LIBRARY_PATH
</programlisting>
<indexterm><primary>LD_LIBRARY_PATH</primary></indexterm>
<para>
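Alternatively, if that gives you an error (for example, because your shell
is <literal>csh</literal> or <literal>tcsh</literal> rather than a
Bourne-style shell), do this instead: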
</para>
<programlisting>
setenv LD_LIBRARY_PATH .
</programlisting>
<para>
Now you can run <literal>write-records</literal> normally by typing
<literal>./write-records</literal>. Setting
<literal>LD_LIBRARY_PATH</literal> tells the dynamic linker to add whatever
paths you give it to the search path for dynamic libraries.
</para>
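<para>
If you would rather not change the environment of your whole shell session,
the variable can also be set for a single command. The following is a minimal
sketch, assuming that the standard <literal>env</literal> and
<literal>ldd</literal> utilities are available and that
<filename>librecord.so</filename> is still in the current directory:
</para>
<programlisting>
env LD_LIBRARY_PATH=. ldd ./write-records
env LD_LIBRARY_PATH=. ./write-records
</programlisting>
<para>
The first command asks <literal>ldd</literal> to report which file each
required library resolves to under that search path, and the second runs the
program with the same setting. Because <literal>env</literal> is a separate
program rather than shell syntax, this form should work whether or not your
shell understands <literal>export</literal>.
</para>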
<para>
For further information about dynamic linking, see the following sources
on the Internet:
</para>
<itemizedlist>
<listitem><para>The man page for <literal>ld.so</literal> contains a lot of information about how the Linux dynamic linker works.</para></listitem>
<listitem><para>http://www.benyossef.com/presentations/dlink/ is a great presentation on dynamic linking in Linux.</para></listitem>
<listitem><para>http://www.linuxjournal.com/article.php?sid=1059 and http://www.linuxjournal.com/article.php?sid=1060 provide a good introduction to the ELF<indexterm><primary>ELF</primary></indexterm> file format, with more detail available at http://www.cs.ucdavis.edu/~haungs/paper/node10.html</para></listitem>
<listitem><para>http://www.iecc.com/linker/linker10.html contains a great description of how dynamic linking works with ELF files.</para></listitem>
<listitem><para>http://linux4u.jinr.ru/usoft/WWW/www_debian.org/Documentation/elf/node21.html contains a good introduction to programming position-independent code for shared libraries under Linux.</para></listitem>
</itemizedlist>
</sect1>
<sect1>
<title>Review</title>
<sect2>
<title>Know the Concepts</title>
<itemizedlist>
<listitem><para>What are the advantages and disadvantages of shared libraries?</para></listitem>
<listitem><para>Given a library named 'foo', what would the library's filename be?</para></listitem>
<listitem><para>What does the <literal>ldd</literal> command do?</para></listitem>
<listitem><para>Let's say you had the files <filename>foo.o</filename> and <filename>bar.o</filename>, and you wanted to link them together and dynamically link them to the library 'kramer'. What would the linking command be to generate the final executable?</para></listitem>
<listitem><para>What is <emphasis>typedef</emphasis> for?</para></listitem>
<listitem><para>What are <emphasis>struct</emphasis>s for?</para></listitem>
<listitem><para>What is the difference between a data element of type <emphasis>int</emphasis> and <emphasis>int *</emphasis>? How would you access them differently in your program?</para></listitem>
<listitem><para>If you had an object file called <filename>foo.o</filename>, what would be the command to create a shared library called 'bar'?</para></listitem>
<listitem><para>What is the purpose of LD_LIBRARY_PATH?</para></listitem>
</itemizedlist>
</sect2>
<sect2>
<title>Use the Concepts</title>
<itemizedlist>
<listitem><para>Rewrite one or more of the programs from the previous chapters to print their results to the screen using <literal>printf</literal> rather than returning the result as the exit status code. Also, make the exit status code be 0.</para></listitem>
<listitem><para>Use the <literal>factorial</literal> function you developed in <xref linkend="recursivefunctions" /> to make a shared library. Then rewrite the main program so that it links with the library dynamically.</para></listitem>
<listitem><para>Rewrite the program above so that it also links with the 'c' library. Use the 'c' library's <literal>printf</literal> function to display the result of the <literal>factorial</literal> call.</para></listitem>
<listitem><para>Rewrite the <literal>toupper</literal> program so that it uses the <literal>c</literal> library functions for files rather than system calls.</para></listitem>
</itemizedlist>
</sect2>
<sect2>
<title>Going Further</title>
<itemizedlist>
<listitem><para>Make a list of all the environment variables used by the GNU/Linux dynamic linker.</para></listitem>
<listitem><para>Research the different types of executable file formats in use today and in the history of computing. Describe the strengths and weaknesses of each.</para></listitem>
<listitem><para>What kinds of programming are you interested in (graphics, databases, science, etc.)? Find a library for working in that area, and write a program that makes some basic use of that library.</para></listitem>
<listitem><para>Research the use of <literal>LD_PRELOAD</literal>. What is it used for? Try building a shared library that contains the <literal>exit</literal> function, and have it write a message to STDERR before exiting. Use <literal>LD_PRELOAD</literal> and run various programs with it. What are the results?</para></listitem>
</itemizedlist>
</sect2>
</sect1>
</chapter>