How ELF libraries work

 
 
sps
      12-28-2009
Hi All

In an attempt to learn more about ELF files (in particular how the GOT
and PLT work), I compiled a very simple C module as below

#include <stdio.h>

int num1 = 32;
extern int num2;   /* defined elsewhere; resolved at load time */

void print1(void)
{
    printf("%d", num1);
    printf("%d", num2);
}

void print2(void)
{
    print1();
}

I then compiled it into a shared object (.so) file
gcc -c -fPIC print.c
gcc -shared -o libprint.so print.o

Using readelf and objdump, I examined the contents of the file.

readelf -a libprint.so
objdump -d libprint.so

I have mostly figured it out, but there are a couple of points that I
need clarified

a) Why are data objects and functions defined within the code module
indirectly accessed through the GOT? For example, the call to print1
is routed through the PLT. Why do this when you already know where
print1 is relative to the calling point? Is it just convenient to lump
everything in the GOT, or can these definitions be overridden?

b) There are two "mystery variables" which occur as the first and
second DWORD in the .data section. They have a RELATIVE relocation
applied to them. Because RELATIVE relocations do not relocate a
symbol, I don't know what these variables do. And in general, when do
you use RELATIVE relocations? I understand that you are just adding the
load address to the location, but what code semantics create this
relocation?

c) Why are there two entries in the GOT for __cxa_finalize? They seem
to be identical, except that one is declared GLOB_DAT and the other
JMP_SLOT. The same occurs for _Jv_RegisterClasses.

d) What is the initial stack size of a process in Linux?

e) When a function is to be resolved lazily, the PLT code pushes two
values on the stack before it jumps to the dynamic linker: the first
identifies the symbol to be resolved, and the second identifies the
calling module. How does this first value map to the symbol? It doesn't
seem to be the symbol index in dynsym, and I see no relation to
anything else.

f) Will a loader/dynamic linker only ever see GLOB_DAT, JMP_SLOT, COPY
and RELATIVE relocations? I assume the other relocation types only
apply to .o files. Is this correct? And if not, with what program
semantics would these other relocations appear?

Thanks for your answers.
 
BGB / cr88192
      12-28-2009

"sps" <(E-Mail Removed)> wrote in message
news:hhb87l$18h$(E-Mail Removed)...
> Hi All
>
> In an attempt to learn more about ELF files (in particular how the GOT
> and PLT work), I compiled a very simple C module as below
>


probably OT for CLC, but oh well...


> int num1 = 32;
> extern int num2;
>
> void print1()
> {
> printf("%d",num1);
> printf("%d",num2);
> }
>
> void print2()
> {
> print1();
>
> }
>
> I then compiled it into a shared object (.so) file
> gcc -c -fPIC print.c
> gcc -shared -o libprint.so print.o
>
> Using readelf and objdump, I examined the contents of the file.
>
> readelf -a libprint.so
> objdump -d libprint.so
>
> I have mostly figured it out, but there are a couple of points that I
> need clarified
>
> a) Why are data objects and functions defined within the code module
> indirectly accessed through the GOT. For example, the call to print1
> is routed through the PLT. Why do this when you already know where
> print1 is relative to the calling point? Is it just convenient to lump
> everything in the GOT, or can these definitions be overriden?
>


it is so that code can be position independent without needing internal
fixups.
it is also because, otherwise, the compiler would need to know which code is
internal or external (at compile time), or risk another overhead (having to
emit an additional indirect jump at link-time).

on Windows, the strategy is to instead assume local (first) and fall back to
an indirect jump, I guess assuming that local jumps are a lot more likely
than imports.

it is also worth noting that DLLs are not, as a general rule, position
independent, meaning that they have to be relocated if loaded to a
non-preferred address (commonly referred to as "rebasing").

for variables, issues get a little more ugly, which is why one can't
(generally) share global variables between DLLs.
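
to make the "can these be overridden?" part concrete: under the ELF rules a
global symbol with default visibility can be preempted by a definition in the
executable or in a library loaded earlier, so the compiler can't bind the call
to print1 directly. a rough sketch of the ways to opt out, assuming GCC/Clang
attribute syntax. print1_hidden and print1_local are names I made up, and the
functions return printf's result only so that the effect can be checked:

```c
#include <assert.h>
#include <stdio.h>

int num1 = 32;

/* Exported with default visibility: another module may preempt it,
   so calls from inside the library normally go through the PLT. */
int print1(void)
{
    return printf("%d", num1);
}

/* Hidden: not exported from the shared object, so an intra-library
   call can be emitted as a direct PC-relative call. */
__attribute__((visibility("hidden")))
int print1_hidden(void)
{
    return printf("%d", num1);
}

/* static: local to this translation unit, never preemptible. */
static int print1_local(void)
{
    return printf("%d", num1);
}

/* Returns the total number of characters printed by the three calls. */
int print2(void)
{
    return print1() + print1_hidden() + print1_local();
}
```

built with 'gcc -fPIC -shared' and viewed with 'objdump -d', I would expect
only the call to print1 to go through the PLT. compiling with
-fvisibility=hidden or linking with -Wl,-Bsymbolic are the usual library-wide
ways to get a similar effect.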


> b) There are two "mystery variables" which occur as the first and
> second DWORD in the .data section. They have a RELATIVE relocation
> applied to them. Because RELATIVE relocations do not relocate a
> symbol, I don't know what these variables do? And in general, when do
> use RELATIVE relocations. I understand that you are just adding the
> load address to the location, but what code semantics create this
> relocation?
>


I would have to go check, but I think those are self-reference pointers.
I am not certain, not having dug into ELF shared-object mechanics that
much (I know more at this point about PE/COFF DLLs...).
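
as for what C code produces a RELATIVE relocation: as far as I understand it,
any initialized data word holding the address of something that can't be
preempted (a static or hidden object) only needs "load base + offset", which
is exactly what RELATIVE does. a small sketch with invented names. my
recollection is that GCC's startup code (crtstuff) defines __dso_handle in
the same self-pointing way, which would account for one of the mystery words:

```c
#include <assert.h>
#include <stddef.h>

static int counter = 7;

/* Holds the address of a local object. The final value is only known
   once the load base is known, and no symbol lookup is needed. */
int *counter_ptr = &counter;

/* A hidden pointer initialized to its own address (self-reference). */
__attribute__((visibility("hidden")))
void *self_handle = &self_handle;

/* Reads the local object through the relocated pointer. */
int read_via_ptr(void)
{
    assert(counter_ptr != NULL);
    return *counter_ptr;
}

/* Returns 1 when the handle still points at itself. */
int handle_is_self(void)
{
    return self_handle == (void *)&self_handle;
}
```

compiling this with 'gcc -fPIC -shared' and running 'readelf -r' on the
result should show RELATIVE entries for the two pointer words.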


> c) Why are there two entries in the GOT for __cxa_finalize. They seem
> to be identical, except that one is declared GLOB_DAT and the other
> JMP_SLOT. The same occurs for __Jv_RegisterClasses.
>


these would appear to be related to g++ and GCJ.

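AFAIK, '__cxa_finalize' is part of the C++ ABI runtime support: it runs the
handlers registered through '__cxa_atexit' (static destructors and the like)
when the library is unloaded or the program exits. I don't think there is a
matching '__cxa_initialize'.

'_Jv_' is a prefix generally used for much of anything GCJ related.

as for the two entries, my guess is that the startup code both tests the
address of each (weak) symbol and then calls it: the address test needs a
GOT slot (GLOB_DAT), while the call goes through the PLT (JMP_SLOT).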


> d) What is the initial stack size of a process in Linux?
>


I think like 8MB or something...
note that this is not generally mapped in all at once, but the backing
memory gets paged into existence on write.

Windows defaults to 1MB (the reserve size the linker records in the
executable), and has slightly different behavior for paging in the stack
(one needs to be careful if grabbing too much stack memory at once).
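
the 8MB figure is the usual default soft limit rather than a fixed property
of the kernel, so it is easy to check on a given box. a minimal sketch using
POSIX getrlimit (stack_soft_limit and print_stack_limit are just names I
picked):

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Stores the soft stack limit (in bytes) in *out.
   Returns 0 on success, -1 if getrlimit fails. */
int stack_soft_limit(rlim_t *out)
{
    struct rlimit rl;

    assert(out != NULL);
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}

/* Prints the soft limit in KB, or "unlimited" when none is set. */
void print_stack_limit(void)
{
    rlim_t lim;

    if (stack_soft_limit(&lim) != 0) {
        perror("getrlimit");
        return;
    }
    if (lim == RLIM_INFINITY)
        puts("stack limit: unlimited");
    else
        printf("stack limit: %llu KB\n", (unsigned long long)lim / 1024);
}
```

'ulimit -s' in the shell reports the same soft limit, also in KB.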


> e) When the dynamic linker is called when a function is to be resolved
> (lazy linkage), before it jumps to the DL, it pushes two values on the
> stack: the first identifies the symbol to be resolved, and the second
> identifies the calling module. How does this first value pushed map to
> the symbol? It doesn't seem to be the symbol index in dynsym, and I
> see no other relation to anything else?
>


maybe a pointer?...
I really don't know on this one.
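
for what it's worth, my reading of the i386 System V ABI supplement is that
the first value pushed is a byte offset into the PLT relocation table
(.rel.plt, the table DT_JMPREL points at), not a symbol index. the symbol
index sits in the upper 24 bits of that entry's r_info. (on x86-64 the pushed
value is an entry index rather than a byte offset.) a small sketch with the
structure re-declared locally so it stands alone; rel32, REL_SYM, REL_TYPE
and symindex_from_push are illustrative names, not anything from glibc:

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as Elf32_Rel in the ELF spec, re-declared here so the
   sketch does not depend on <elf.h>. */
typedef struct {
    uint32_t r_offset;  /* address of the GOT slot to patch */
    uint32_t r_info;    /* (symbol index << 8) | relocation type */
} rel32;

#define REL_SYM(info)  ((info) >> 8)
#define REL_TYPE(info) ((info) & 0xffu)
#define JMP_SLOT_TYPE  7u   /* R_386_JMP_SLOT */

/* pushed: the value a PLT entry pushes, taken as a byte offset into
   the .rel.plt table. Returns the .dynsym index of the target symbol. */
uint32_t symindex_from_push(const rel32 *relplt, uint32_t pushed)
{
    assert(pushed % sizeof(rel32) == 0);
    const rel32 *r = &relplt[pushed / sizeof(rel32)];
    assert(REL_TYPE(r->r_info) == JMP_SLOT_TYPE);
    return REL_SYM(r->r_info);
}
```

so a 'push $0x8' in a PLT entry would select the second .rel.plt entry, and
'readelf -r' shows which symbol that entry names.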


> f) Will a loader/dynamic linker only ever see GLOB_DAT, JMP_SLOT, COPY
> and RELATIVE relocations? I assume the other relocation types only
> apply to .o files. IS this correct? And if not, with what program
> semantics would these other relocations appear?
>


you may want to check the ELF spec for this one...


in my case (for a custom DLL loader), I just implemented all of the
relocation types.
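
to give an idea of what that amounts to, here is a cut-down sketch of the
i386 REL cases, operating on a fake in-memory image. the type numbers and
formulas are the ones I remember from the i386 ABI (S = symbol value,
A = addend stored in place, P = address of the patched word, B = load base).
apply_rel386 and the RT_ names are made up, and R_386_COPY is left out. as
far as I know R_386_32 and R_386_PC32 can also reach the dynamic linker, for
example when a shared object is built without -fPIC and ends up with text
relocations.

```c
#include <assert.h>
#include <stdint.h>

/* i386 relocation type numbers, as I recall them from the psABI. */
enum {
    RT_32 = 1, RT_PC32 = 2, RT_GLOB_DAT = 6, RT_JMP_SLOT = 7, RT_RELATIVE = 8
};

/* Patches one 32-bit word of a fake image.
   image:     the object's contents viewed as 32-bit words
   base:      pretend load address (B)
   r_offset:  byte offset of the word to patch
   sym_value: resolved symbol address (S), ignored for RELATIVE
   Returns 0 on success, -1 for a type this sketch does not handle. */
int apply_rel386(uint32_t *image, uint32_t base, uint32_t r_offset,
                 unsigned type, uint32_t sym_value)
{
    assert(r_offset % 4 == 0);         /* sketch handles aligned words only */
    uint32_t *slot = &image[r_offset / 4];
    uint32_t addend = *slot;           /* REL format keeps A in the word */
    uint32_t place = base + r_offset;  /* P */

    switch (type) {
    case RT_32:       *slot = sym_value + addend;         return 0; /* S + A */
    case RT_PC32:     *slot = sym_value + addend - place; return 0; /* S + A - P */
    case RT_GLOB_DAT:
    case RT_JMP_SLOT: *slot = sym_value;                  return 0; /* S */
    case RT_RELATIVE: *slot = base + addend;              return 0; /* B + A */
    default:          return -1;       /* word left untouched */
    }
}
```

a real loader would walk the DT_REL and DT_JMPREL tables and look S up in the
symbol table; here the caller passes it in directly.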


> Thanks for your answers.



 
Beej Jorgensen
      12-29-2009
On 12/28/2009 01:33 PM, sps wrote:
> In an attempt to learn more about ELF files (in particular how the GOT
> and PLT work), I compiled a very simple C module as below


This is a repost from comp.os.linux.development.system 5 years ago:

http://linux.derkeiler.com/Newsgroup...5-10/0016.html

You need to be less obvious... but then you'd be on topic.

Happy New Year!
-Beej
 