<section id="shlibs">
  <title>Supporting Shared Libraries</title>

  <para>
    When an executable is added to the image, we want any required shared
    libraries to be added automatically.  The <code>SharedLibraries</code>
    module determines which files are required.  This section discusses
    the features of kernel and compiler we need to be aware of in order
    to do this reliably.
  </para>

  <para>
    Linux executables today are in ELF format; it is defined in
    <ulink url="http://www.linuxbase.org/spec/book/ELF-generic/ELF-generic.html">
      <citetitle>
	Generic ELF Specification ELFVERSION</citetitle></ulink>,
    part of the Linux Standard Base.  This is based on part of the System
    V ABI: Tool Interface Standard (TIS), Executable and Linking Format
    (ELF) Specification.
  </para>

  <para>
    ELF has consequences in different parts of the system: in
    the link editor, which needs to merge ELF object files into ELF
    executables; in the kernel (<filename>fs/binfmt_elf.c</filename>),
    which has to place the executable in RAM and transfer control to it;
    and in the runtime loader, which is invoked when starting the
    application to load the necessary shared libraries into RAM.
    The idea is as follows.
  </para>

  <itemizedlist>
    <listitem>
      <para>
	Executables are in ELF format, with a type of either
	<code>ET_EXEC</code> (executable) or <code>ET_DYN</code> (shared
	library; yes, you can execute those.)  There are other types of
	ELF file (core files for example) but you can't execute them.
      </para>
    </listitem>

    <listitem>
      <para>
	These files contain two kinds of headers: program headers and
	section headers.  Program headers define segments of the file that
	the kernel should store consecutively in RAM; section headers define
	parts of the file that should be treated by the link editor
	as a single unit.  Program headers normally point to a group
	of adjacent sections.
      </para>
    </listitem>

    <listitem>
      <para>
	The program may be linked statically or dynamically (with shared
	libraries).
	If it's statically linked, the kernel loads the relevant segments,
	then transfers control to the program's entry point in userland,
	which eventually calls <code>main()</code>.
      </para>
    </listitem>

    <listitem>
      <para>
	If it's dynamically linked, one of the program headers has type
	<code>PT_INTERP</code>.  It points to a segment that contains
	the name of a (static) executable; this executable is loaded in
	RAM together with the segments of the dynamic executable.
      </para>
    </listitem>

    <listitem>
      <para>
	The kernel then transfers control to the userland
	interpreter, passing program headers and related info in the
	auxiliary vector, which follows <code>envp</code> on the initial
	stack much like a fourth argument to <code>main()</code>.
      </para>
    </listitem>

    <listitem>
      <para>
	There's one interesting twist: one of the segments loaded
	into RAM (<filename>linux-gate.so</filename>) does not
	come from the executable, but is a piece of kernel mapped
	into user space.  It contains a subroutine that the kernel
	provides to do a system call; the idea is that this way,
	the C library does not have to know which calling convention
	for system calls is supported by the kernel and optimal for
	the current hardware.  The link editor knows nothing about
	this, only the interpreter knows that the kernel can pass the
	address of this subroutine together with the program headers.
	<footnote>
	  <para>
	    For more info on the kernel-supplied shared library for
	    system calls, see
	
	    <ulink url="http://lwn.net/Articles/18411/">
	      <citetitle>LWN: How to speed up system calls</citetitle></ulink>,
	    <ulink url="http://lwn.net/Articles/30258/">
	      <citetitle>LWN: Patch: i386 vsyscall DSO implementation</citetitle></ulink>,
	    <ulink url="http://www.uwsg.iu.edu/hypermail/linux/kernel/0306.2/0674.html">
	      <citetitle>LKML: common name for the kernel DSO</citetitle></ulink>.
	  </para>
	</footnote>
      </para>
    </listitem>

    <listitem>
      <para>
	The interpreter interprets the <code>.dynamic</code> section of
	the dynamic executable.  This is a table containing various types
	of info; if the type is <code>DT_NEEDED</code>, the info is the
	name of a shared library that is needed to run the executable.
	Normally, this is just the basename of the library, without a
	directory.
      </para>
    </listitem>

    <listitem>
      <para>
	The interpreter searches <code>LD_LIBRARY_PATH</code> for the
	library and loads the first working version it finds, using a
	breadth-first search.  Once everything is loaded, the interpreter
	hands over control to <code>main()</code> in the executable.
      </para>
    </listitem>

    <listitem>
      <para>
	Except that that's not how it really works: the path that glibc
	uses depends on whether threads are supported, and klibc can
	function as a <code>PT_INTERP</code> but will not load additional
	libraries.
      </para>
    </listitem>
  </itemizedlist>
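
  <para>
    The hand-over from kernel to userland described above can be observed
    from inside a running process.  The following C fragment is a minimal
    sketch, not code taken from <application>yaird</application>.  It
    assumes a Linux system with glibc 2.16 or later, which provides
    <code>getauxval()</code>; the function names
    <code>count_own_phdrs</code> and <code>has_kernel_dso</code> are
    invented for this example.  It inspects the program headers and the
    kernel DSO address that the kernel passed along with the arguments
    and environment.
  </para>

  <programlisting><![CDATA[
```c
#include <elf.h>
#include <link.h>      /* ElfW() selects Elf32_* or Elf64_* for this build */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval(), assumed available (glibc 2.16+) */

/* Count this process's own program headers of the given type, using the
 * program header table that the kernel advertises via AT_PHDR/AT_PHNUM. */
static size_t count_own_phdrs(unsigned long wanted_type)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)getauxval(AT_PHDR);
    size_t phnum = (size_t)getauxval(AT_PHNUM);
    size_t count = 0;

    if (phdr == NULL)
        return 0;
    for (size_t i = 0; i < phnum; i++)
        if (phdr[i].p_type == wanted_type)
            count++;
    return count;
}

/* Return 1 if the kernel mapped its system call DSO into this process,
 * 0 otherwise.  AT_SYSINFO_EHDR holds the address of its ELF header. */
static int has_kernel_dso(void)
{
    return getauxval(AT_SYSINFO_EHDR) != 0;
}
```
]]></programlisting>

  <para>
    In a dynamically linked program,
    <code>count_own_phdrs(PT_INTERP)</code> would be expected to return 1;
    in a statically linked one, 0.
  </para>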

  <para>
    The <application>ldd</application> command finds the pathnames
    of shared libraries used by an executable.  This works
    only for glibc: it invokes the interpreter
    with the executable as argument plus an environment variable that
    tells it to print the pathnames rather than load them.  For other
    C libraries, there's no guaranteed correct way to find the path of
    shared libraries.
  </para>

  <para>
    Update: <application>ldd</application> also works for another 
    C library, uclibc, unless you disable that support while building
    the library by unsetting <code>LDSO_LDD_SUPPORT</code>.
  </para>
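
  <para>
    With glibc, the environment variable in question is
    <code>LD_TRACE_LOADED_OBJECTS</code>.  To illustrate how the resulting
    output could be consumed, here is a small sketch in C.  It is not the
    code <application>yaird</application> uses, and the function name
    <code>ldd_line_path</code> is invented for this example.  It assumes
    lines of the form <literal>name =&gt; /path (0xaddress)</literal> or
    <literal>/path (0xaddress)</literal>, and pathnames without blanks.
  </para>

  <programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Copy the absolute pathname found on one line of ldd output into out.
 * Returns 1 on success; returns 0 if the line names no file (the kernel
 * DSO, "not found", "not a dynamic executable") or if out is too small. */
static int ldd_line_path(const char *line, char *out, size_t outsz)
{
    const char *arrow = strstr(line, " => ");
    const char *p = arrow ? arrow + 4 : line;
    size_t len;

    p += strspn(p, " \t");        /* skip indentation or padding */
    if (*p != '/')
        return 0;                 /* no absolute pathname on this line */
    len = strcspn(p, " \t\n");    /* pathname ends before " (0x...)" */
    if (len + 1 > outsz)
        return 0;
    memcpy(out, p, len);
    out[len] = '\0';
    return 1;
}
```
]]></programlisting>

  <para>
    A line for <filename>linux-gate.so</filename> has no pathname after
    the arrow, so the sketch skips it; that is the desired outcome, since
    the kernel DSO is not a file that can be copied to the image.
  </para>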

  <para>
    Thus, to figure out what goes on the initial RAM image, first try
    <application>ldd</application>.  If that gives an answer, good.
    Otherwise, use a helper program to find <code>PT_INTERP</code> and
    <code>DT_NEEDED</code>.  If there's only <code>PT_INTERP</code>, good,
    add it to the image.  If there are <code>DT_NEEDED</code> libraries
    as well, and they have relative rather than absolute pathnames,
    we can't determine the full path, so don't generate an image.
  </para>
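
  <para>
    The decision procedure above can be sketched as a small C function.
    This is an illustration only: the names <code>lib_plan</code> and
    <code>plan_libraries</code> are invented here.  The handling of
    <code>DT_NEEDED</code> entries that are all absolute, namely adding
    them to the image as they are, is an assumption, since the text above
    only spells out the relative case.
  </para>

  <programlisting><![CDATA[
```c
#include <stddef.h>

enum lib_plan {
    PLAN_USE_LDD,     /* ldd printed pathnames: use them as they are */
    PLAN_USE_HELPER,  /* add PT_INTERP, if any, plus absolute DT_NEEDED */
    PLAN_REFUSE       /* a DT_NEEDED name is relative: generate no image */
};

/* ldd_answered: nonzero if ldd produced a list of pathnames.
 * needed:       the DT_NEEDED strings found by the helper program.
 * n_needed:     number of entries in needed (may be 0). */
static enum lib_plan plan_libraries(int ldd_answered,
                                    const char *const *needed,
                                    size_t n_needed)
{
    if (ldd_answered)
        return PLAN_USE_LDD;
    for (size_t i = 0; i < n_needed; i++)
        if (needed[i][0] != '/')
            return PLAN_REFUSE;   /* full path cannot be determined */
    return PLAN_USE_HELPER;
}
```
]]></programlisting>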

  <para>
    There are a number of options to build a helper to extract the relevant
    information from the executable:
    <itemizedlist>
      <listitem>
	<para>
	  Build it in Perl.  The problem here is that unpacking 64-bit
	  integers is an optional part of the language.
	</para>
      </listitem>

      <listitem>
	<para>
	  Build a wrapper around <application>objdump</application> or
	  <application>readelf</application>.  The drawback is that
	  these programs are not part of a minimal Linux distribution:
	  depending on them in <application>yaird</application> would
	  increase the footprint.
	</para>
      </listitem>

      <listitem>
	<para>
	  Build a C program using libbfd.  This is a library
	  intended to simplify working with object files.  Drawbacks
	  are that it adds complexity that is not necessary in our
	  context since it supports multiple executable formats;
	  furthermore, at least in Debian it is treated as internal
	  to the gcc tool chain, complicating packaging the tool.
	</para>
      </listitem>

      <listitem>
	<para>
	  Build a C program based on <filename>elf.h</filename>.
	  This turns out to be easy to do.
	</para>
      </listitem>

    </itemizedlist>
  </para>

  <para>
    <application>Yaird</application> uses the last approach listed.
  </para>
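
  <para>
    To give an impression of what an <filename>elf.h</filename> based
    helper involves, here is a minimal sketch.  It is not the helper
    shipped with <application>yaird</application>; the function names are
    invented for this example.  It assumes the whole file has been read
    into memory, and it handles only 64-bit ELF files in the byte order of
    the host.  It finds the <code>PT_INTERP</code> string and counts the
    <code>DT_NEEDED</code> entries.
  </para>

  <programlisting><![CDATA[
```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Copy and validate the ELF header; returns 1 for an executable ELF64. */
static int read_ehdr(const unsigned char *img, size_t size, Elf64_Ehdr *eh)
{
    if (size < sizeof *eh)
        return 0;
    memcpy(eh, img, sizeof *eh);
    return memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0
        && eh->e_ident[EI_CLASS] == ELFCLASS64
        && (eh->e_type == ET_EXEC || eh->e_type == ET_DYN);
}

/* Copy program header i into *ph; returns 0 if it lies outside the image. */
static int read_phdr(const unsigned char *img, size_t size,
                     const Elf64_Ehdr *eh, size_t i, Elf64_Phdr *ph)
{
    size_t off;

    if (eh->e_phentsize != sizeof *ph || eh->e_phoff > size)
        return 0;
    off = (size_t)eh->e_phoff + i * sizeof *ph;
    if (off > size || size - off < sizeof *ph)
        return 0;
    memcpy(ph, img + off, sizeof *ph);
    return 1;
}

/* Return the PT_INTERP pathname inside the image, or NULL if there is
 * none or the file is malformed. */
static const char *elf64_interp(const unsigned char *img, size_t size)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;

    if (!read_ehdr(img, size, &eh))
        return NULL;
    for (size_t i = 0; i < eh.e_phnum; i++) {
        if (!read_phdr(img, size, &eh, i, &ph))
            return NULL;
        if (ph.p_type != PT_INTERP)
            continue;
        if (ph.p_filesz == 0 || ph.p_offset > size
            || size - ph.p_offset < ph.p_filesz
            || img[ph.p_offset + ph.p_filesz - 1] != '\0')
            return NULL;
        return (const char *)(img + ph.p_offset);
    }
    return NULL;
}

/* Count DT_NEEDED entries: 0 without PT_DYNAMIC, -1 for a bad file. */
static long elf64_needed_count(const unsigned char *img, size_t size)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    Elf64_Dyn dyn;
    long count = 0;

    if (!read_ehdr(img, size, &eh))
        return -1;
    for (size_t i = 0; i < eh.e_phnum; i++) {
        if (!read_phdr(img, size, &eh, i, &ph))
            return -1;
        if (ph.p_type != PT_DYNAMIC)
            continue;
        if (ph.p_offset > size || size - ph.p_offset < ph.p_filesz)
            return -1;
        for (size_t off = 0; off + sizeof dyn <= ph.p_filesz;
             off += sizeof dyn) {
            memcpy(&dyn, img + ph.p_offset + off, sizeof dyn);
            if (dyn.d_tag == DT_NULL)
                break;            /* end of the dynamic table */
            if (dyn.d_tag == DT_NEEDED)
                count++;
        }
        return count;
    }
    return 0;
}
```
]]></programlisting>

  <para>
    Turning each <code>DT_NEEDED</code> entry into a name takes one more
    step that the sketch leaves out: the entry holds an offset into the
    string table named by <code>DT_STRTAB</code>, whose address has to be
    translated to a file offset through the <code>PT_LOAD</code> program
    headers.
  </para>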
</section>