Need to create a C lib - using C++ classes - is it possible

 
 
James Kanze
      05-25-2008
On May 25, 5:53 am, "Alf P. Steinbach" wrote:
> * Angus:
> > We have a lot of C++ code. And we need to now create a
> > library which can be used from C and C++. Given that we
> > have a lot of C++ code using classes how can we 'hide' the
> > fact that it is C++ from C compilers?


> The C++ code will need the C++ runtime library.


Good point. If you don't use any standard components from the
library, nor new, nor typeid, maybe not, but then what's the
point.

> Within the standards of C and C++ the only way to achieve that
> is to insist that the C code using the library is called from
> a C++ main program.


No. There are two separate issues involved here. Neither the C
nor the C++ standards say anything about how the compiler is
invoked; to get the C++ library with gcc, for example, you can
either invoke it as g++, or specify the library explicitly
(-lstdc++, with the normal Unix linkers). Formally, the C++
standard requires that main() be written and compiled in C++, or
you have undefined behavior. (I'm not sure, but that could also
be the case in C.) In practice, the reason for this is to
ensure correct initialization of variables with static lifetime;
if you have no static variables, it's possible that you won't
have a problem (but don't forget that std::cout, etc. are static
variables). And there too, the implementation could provide
other additional arrangements. (I'm not sure, but I think if
you compile with g++, rather than gcc, one of the effects is to
cause a different crt0 to be used, and that it is this crt0
which ensures the construction and destruction of static
variables. And it's also possible that some compilers add
special information to the object file when you compile a
program in C++, or perhaps only when you compile the program
with main, so that it will work. This trick can also be used to
ensure the additional libraries.)
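
A minimal sketch of the static-initialization concern, for illustration only. The names `Registry`, `g_registry`, `registry_ready` and `registry_name` are invented here, not taken from anyone's code. The object has static lifetime and a non-trivial constructor, so something in the implementation's start-up path has to run that constructor; the C caller only ever sees the two `extern "C"` functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical class with a non-trivial constructor (illustrative names only).
class Registry {
public:
    Registry() : name_("default"), ready_(true) {}
    bool ready() const { return ready_; }
    const std::string& name() const { return name_; }
private:
    std::string name_;
    bool ready_;
};

// Static lifetime, dynamic initialization. The storage is zero-initialized
// first, so ready_ stays false until the constructor has actually been run
// by the implementation's start-up machinery.
static Registry g_registry;

// C-callable query: 1 if the constructor has run, 0 otherwise.
extern "C" int registry_ready(void) {
    return g_registry.ready() ? 1 : 0;
}

// C-callable accessor. The assert fires if static initialization was skipped.
extern "C" const char* registry_name(void) {
    assert(g_registry.ready());
    return g_registry.name().c_str();
}
```

When the program is linked in a way the implementation supports, `registry_ready()` returns 1 from the first statement of main onwards. Whether that holds when main is compiled as C is exactly the implementation-specific part.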

> In Windows an alternative is to have the library as a DLL,
> because Windows DLLs are more decoupled.


That sort of works in Unix, as well, if the C++ standard library
is also a DLL. (Which is generally NOT recommended, of course.)

[...]
> The best is to forget that silly idea. Using C library from
> C++, OK. But C++ has additional requirements from runtime
> library, so other way, generally !OK, unless you're working at
> a low level where you wouldn't have to ask...


It's actually a frequent requirement, and the original poster's
question reflects one of the more common ways of migrating to
C++.

--
James Kanze (GABI Software)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
 
James Kanze
      05-25-2008
On May 25, 6:28 am, "Alf P. Steinbach" wrote:
> * Ian Collins:


> > Alf P. Steinbach wrote:
> >> * Angus:


> >>> We have a lot of C++ code. And we need to now create a
> >>> library which can be used from C and C++. Given that we
> >>> have a lot of C++ code using classes how can we 'hide' the
> >>> fact that it is C++ from C compilers?
> >> The C++ code will need the C++ runtime library.


> >> Within the standards of C and C++ the only way to achieve
> >> that is to insist that the C code using the library is
> >> called from a C++ main program.


> > Is it? In practice, the only constraint on any platform
> > I've used is that the application must be linked with the
> > C++ compiler driver.


> It's not clear exactly what you mean, but I'm guessing you
> mean using a C++ compiler and linker for the main program.


> In that case you have a C++ main program.


Or not. You can very easily compile main as a C program, but
use the C++ compiler driver (supposing such a thing exists) to
link.

In a very real sense, you're both wrong, of course: the correct
answer is that it depends on the implementation. Under Unix (or
at least Solaris and Linux), you do have a separate C++ compiler
driver (invoked by g++ or CC, rather than by gcc or cc), and
linking with that will ensure that the C++ standard libraries
are linked in. You can also invoke the linker directly, both
under Unix and under Windows, and ensure that whatever you want
is linked in however you want, but you generally have to know
very well what you are doing, what libraries are actually
required, what commands or whatever might be necessary in
addition to ensure support for e.g. exceptions or dynamic
initialization of static variables, etc., etc. And some of that
functionality may not even be accessible directly at the linker
interface; at least some implementations of C++ have generated
special code in main to ensure the initialization of statics,
for example (in which case, main does have to be compiled as
C++).

All in all, it's more complicated than just putting everything
in an `extern "C"', but most of all, it's very, very
implementation dependent.

> >> The best is to forget that silly idea. Using C library
> >> from C++, OK. But C++ has additional requirements from
> >> runtime library, so other way, generally !OK, unless you're
> >> working at a low level where you wouldn't have to ask...


> > Why is it generally not OK? Provided the restrictions
> > regarding exception leakage are met, there shouldn't be a
> > problem.


> E.g. static variables of class type, internal use of
> exceptions, use of standard library features that
> (implementation-specific) requires C++ runtime library,
> including internal use of exceptions in standard library; this
> is a FAQ, IIRC.


As a general rule, it's probably easier (and surer) if you
ensure that main is compiled as C++. With at least some
development systems, however, it's not absolutely necessary.
What is necessary, however, even if you compile main as C++, is
that you find out the implementation specific requirements, and
respect them.
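
One of those requirements, the "exception leakage" restriction mentioned above, can be sketched roughly as follows. This is only an illustration: the function `mylib_parse_port`, the helper `parse_port_impl` and the `MYLIB_*` codes are hypothetical, and the code assumes a C++11 compiler for `std::stoi` and `nullptr`. The idea is that every C-callable entry point catches everything and turns it into an error code.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical error codes for the C interface (illustrative only).
enum { MYLIB_OK = 0, MYLIB_BAD_ARG = 1, MYLIB_NO_MEMORY = 2, MYLIB_UNKNOWN = 3 };

// Ordinary C++ implementation: free to throw.
static int parse_port_impl(const std::string& text) {
    std::size_t used = 0;
    int value = std::stoi(text, &used);   // throws on non-numeric input
    if (used != text.size() || value < 1 || value > 65535) {
        throw std::invalid_argument("bad port: " + text);
    }
    return value;
}

// C-callable wrapper: no exception may propagate past this function.
extern "C" int mylib_parse_port(const char* text, int* out_port) {
    if (text == nullptr || out_port == nullptr) {
        return MYLIB_BAD_ARG;
    }
    try {
        *out_port = parse_port_impl(text);   // only assigned on success
        return MYLIB_OK;
    } catch (const std::bad_alloc&) {
        return MYLIB_NO_MEMORY;
    } catch (const std::logic_error&) {      // invalid_argument, out_of_range
        return MYLIB_BAD_ARG;
    } catch (...) {
        return MYLIB_UNKNOWN;
    }
}

// Usage as a C caller would see it (kept in C++ so the sketch is one file).
void example_port_usage() {
    int port = 0;
    assert(mylib_parse_port("8080", &port) == MYLIB_OK && port == 8080);
    assert(mylib_parse_port("http", &port) == MYLIB_BAD_ARG);
    assert(mylib_parse_port("70000", &port) == MYLIB_BAD_ARG);
}
```

This only covers exceptions; it does nothing about static initialization or about which libraries get linked in.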

 
James Kanze
      05-25-2008
On May 25, 6:41 am, Ian Collins wrote:
> Alf P. Steinbach wrote:


[...]
> > E.g. static variables of class type, internal use of
> > exceptions, use of standard library features that
> > (implementation-specific) requires C++ runtime library,
> > including internal use of exceptions in standard library;
> > this is a FAQ, IIRC.


> True, but linking the application with the C++ compiler driver
> takes care of these details.


Maybe with g++ and Sun CC, but it's certainly not a general
rule. I'm not too sure with regards to VC++, but I'm pretty
sure that at least in the past, Microsoft had a facility whereby
the compiler added information to the object file requesting
that the linker add certain libraries (and remember, unlike with
Unix linkers, the order of libraries doesn't matter with the
Microsoft linker). It's quite possible that Alf's description
applies to VC++.

> But as you say, this makes a C++ application, even if most of
> the code is C!


A source is either C or C++. You can compile an individual
source with either a C++ compiler or a C compiler. But an
application is a binary file, and doesn't have a language
(unless it's a shell script, in which case the language is
neither C nor C++).

 
James Kanze
      05-25-2008
On May 25, 1:34 pm, "Alf P. Steinbach" wrote:
> * James Kanze:
> > On May 25, 5:53 am, "Alf P. Steinbach" wrote:
> >> * Angus:
> >>> We have a lot of C++ code. And we need to now create a
> >>> library which can be used from C and C++. Given that we
> >>> have a lot of C++ code using classes how can we 'hide' the
> >>> fact that it is C++ from C compilers?


> >> The C++ code will need the C++ runtime library.


> > Good point. If you don't use any standard components from the
> > library, nor new, nor typeid, maybe not, but then what's the
> > point.


> >> Within the standards of C and C++ the only way to achieve that
> >> is to insist that the C code using the library is called from
> >> a C++ main program.


> > No. There are two separate issues involved here. Neither the C
> > nor the C++ standards say anything about how the compiler is
> > invoked; to get the C++ library with gcc, for example, you can
> > either invoke it as g++, or specify the library explicitly
> > (-lstdc++, with the normal Unix linkers). Formally, the C++
> > standard requires that main() be written and compiled in C++, or
> > you have undefined behavior.


> That last sentence contradicts the "No" at the start.


The "no" referred to the statement that "the only way to achieve
that [the presence of the C++ runtime library] is to insist that
the C code [...] is called from a C++ main program." IMHO, it's
probably a good idea to do so, and will make things easier, but
the whole business of invoking the linker and making sure you
get the right libraries is implementation defined.

> Anyway, this is a FAQ item,


> <url:http://www.parashift.com/c++-faq-lite/mixing-c-and-cpp.html#faq-32.1>
> "You must use your C++ compiler when compiling main()
> (e.g., for static initialization)"


> and as you note also it's stated by the Holy Standard that
> static variables may be (dynamically) initialized after entry
> of main(), which implicitly requires a C++ main().


Woah. There's a definite misunderstanding here. The C++
standard has a somewhat twisted explanation concerning how
static variables may be initialized after entering main, but it
is in practice unimplementable, and can effectively be ignored.
Static variables are initialized before entering main.

The issue here is what that actually means. Some compilers
(including CFront) do (or did) recognize the name main, and
generate special code for the function, which called the
function which did global initialization. Conceptually, this is
still "before entering main", since it is before any statement
you write in main will be executed. But of course, it *does*
require that main be compiled with the C++ compiler in order to
ensure static initialization.

But this is an implementation constraint, not a standard
constraint. The standard doesn't really say anything about how
you link C and C++ (or even how you link C++ with other C++).
Very few implementations today have this constraint. But they
have other constraints (invoke the linker with g++, rather than
gcc, for example). The whole point is that just about anything
you try to say about this issue is implementation defined.

> Even though that part of the standard is IMHO defective,
> talking about "after the first statement of main" instead of
> entry of main.


It's defective, because the constraints that it places on the
implementation in this case are impossible to meet. But the
"after the first statement in main" is very intentional; what
happens before, and where, can simply not be determined by a
conforming program.

> [snip]


> >> In Windows an alternative is to have the library as a DLL,
> >> because Windows DLLs are more decoupled.


> > That sort of works in Unix, as well, if the C++ standard
> > library is also a DLL. (Which is generally NOT recommended,
> > of course.)


> Well, the Windows situation is sort of opposite. Windows
> dynamic libraries are strongly decoupled modules. In
> particular, the OS provides automatic per-DLL initialization
> and cleanup calls, so a DLL is almost free to use whatever
> (the main problem with this scheme has to do with
> per-thread-per-DLL storage).


Both Unix and Windows do object specific initialization when you
dynamically load an object. There's no difference in them
there. Both also have many different options with regards to
what is or is not visible in the various "modules". The main
differences are, I think, that 1) all of the options in Windows
are compile and link time---you don't have any choices at load
time, and 2) symbols in the root are not available to
dynamically loaded objects under Windows, and are always
available to dynamically loaded objects under Unix. Other than
that, it's largely a question of which options you choose. (And
you certainly don't have per DLL dynamic storage under Windows
unless you want to. I know that the Windows applications where
I work don't have it.)

> > [...]
> >> The best is to forget that silly idea. Using C library from
> >> C++, OK. But C++ has additional requirements from runtime
> >> library, so other way, generally !OK, unless you're working at
> >> a low level where you wouldn't have to ask...


> > It's actually a frequent requirement, and the original poster's
> > question reflects one of the more common ways of migrating to
> > C++.


> Ouch.


> It's very backwards, in many ways: structurally,
> learning-wise, safety, simplicity. Just think about it. The
> programmer is trying to implement a type safe little part of
> the program in C++, but since using C as main language doesn't
> even manage to do this in a good way or learn the Right
> Things, then to top it off throws away all that hard-won type
> safety and language-enforced correctness by using this part
> only via a non-enforcing C language interface.


The company has a large application written in C. They're not
going to rewrite the whole thing. As subsystems get rewritten,
they're rewritten in C++. It's just good engineering.

> I guess with proper insulating abstractions, like XCOM, it
> could be better, but when the aim is to "migrate" to C++ I
> doubt such abstractions will be in place.


You don't need XCOM. You do need to provide two interfaces, a
C++ interface (which will be used by new code, written in C++),
and a C interface which is compatible with the previous C
interface. But then, you usually have to ensure backwards
compatibility anyway.

> However, technically it should be no big deal to write


> extern "C" int c_language_main( int, char*[] );


> int main( int argc, char* argv[] )
> {
> return c_language_main( argc, argv );
> }


> and compile that top-level as C++.


Technically, no. Practically, it depends. There may be very
good reasons for not doing so.

 
James Kanze
      05-26-2008
On May 25, 7:51 pm, "Alf P. Steinbach" wrote:
> * James Kanze:
> >> Anyway, this is a FAQ item,


> >> <url:http://www.parashift.com/c++-faq-lite/mixing-c-and-cpp.html#faq-32.1>
> >> "You must use your C++ compiler when compiling main()
> >> (e.g., for static initialization)"


> >> and as you note also it's stated by the Holy Standard that
> >> static variables may be (dynamically) initialized after
> >> entry of main(), which implicitly requires a C++ main().


> > Woah. There's a definite misunderstanding here. The C++
> > standard has a somewhat twisted explanation concerning how
> > static variables may be initialized after entering main, but
> > it is in practice unimplementable, and can effectively be
> > ignored. Static variables are initialized before entering
> > main.


> I agree, for different reasons, that this part of the standard
> is ungood and in fact pretty meaningless. However, it's
> there. And seems to survive into C++0x.


And no implementation actually tries to take advantage of it. I
don't know of any that defer initialization until after the
first statement in main.

> > The issue here is what that actually means. Some compilers
> > (including CFront) do (or did) recognize the name main, and
> > generate special code for the function, which called the
> > function which did global initialization. Conceptually, this is
> > still "before entering main", since it is before any statement
> > you write in main will be executed. But of course, it *does*
> > require that main be compiled with the C++ compiler in order to
> > ensure static initialization.


> > But this is an implementation constraint, not a standard
> > constraint. The standard doesn't really say anything about how
> > you link C and C++ (or even how you link C++ with other C++).
> > Very few implementations today have this constraint. But they
> > have other constraints (invoke the linker with g++, rather than
> > gcc, for example). The whole point is that just about anything
> > you try to say about this issue is implementation defined.


> >> Even though that part of the standard is IMHO defective,
> >> talking about "after the first statement of main" instead of
> >> entry of main.


> > It's defective, because the constraints that it places on
> > the implementation in this case are impossible to meet. But
> > the "after the first statement in main" is very intentional;


> No, I don't think it can be intentional.


It is.

The intent is to make dynamic linking conforming. Of course,
the wording still doesn't succeed in that, even with regards to
initialization, and there are many other places where dynamic
linking introduces undefined behavior, but as it happens, I
happened to campaign to get this statement removed, on the
grounds that it couldn't be implemented, and caused real
problems with existing code, and the reason given me as to why
it stayed was dynamic linking.

> int main()
> {
> return myCppMain();
> }


> "after the first statement in main" would here mean after the
> program's finished.


Or never, if nothing in the translation unit of the static
variable is ever used. Although you're right that "after the
first statement of main" really means after the first statement
of main has finished, which leads to some interesting problems
as well. The intent is very much "after you're into user
written code in main".

The original phrase (more or less) goes back to the ARM, and the
intent there was clearly to speed program start up, by not
requiring static initialisers to be executed until other code in
the module was needed, and thus, the module was paged in; at the
time, the specification of <iostream.h> required it to contain a
static variable with dynamic initialization, which meant that on
program start up, you'd get a page hit for every module which
included <iostream.h>. On the systems at the time, that could
be very noticeable.

The standard dropped the requirement concerning the static
variable; I don't know if this was intentional or through
oversight (when the previously single header was split up into
several distinct headers), but a lot of implementations (e.g.
g++, STL port) of <iostream> do define the static variable, even
if it is no longer required. And I don't notice complaints
about start up speed from them today. So maybe the issue isn't
relevant on today's machines (or maybe people concerned with
performance are simply using other compilers).

> The left brace is not a statement, and empty statements in C++
> have to be explicitly introduced via ";".


> My reading is that the intention is that initialization can
> occur after the left brace of the main function's function
> body, /before/ the first statement, but not later, just as in
> your CFront-example above. I can't make sense of anything
> else.


I didn't say it made sense. I said that it was the intent.
Conceptually, in the CFront-example, the initialization takes
place before entering main---at least with regards to anything
that a conforming program can tell. Or if you prefer, it is
what happens when you "execute" the opening left brace of main.
A sort of extended function prefix, so to speak---instead of
just setting up the local stack frame, the compiler generates
code to call the initializers, then set up the local stack
frame. There is explicit wording in the standard to allow
this, but it is elsewhere: the fact that, unlike in C, you are
not allowed to call main from your code.

> > what happens before, and where, can simply not be determined
> > by a conforming program.


> Huh?


There is no way a conforming program can determine whether the
initialization was the last thing before calling main in crt0
(or whatever the implementation calls its start-up code), or the
first thing in main (before any of your code is executed).

> >> [snip]

> > Both Unix and Windows do object specific initialization when
> > you dynamically load an object. There's no difference in
> > them there. Both also have many different options with
> > regards to what is or is not visible in the various
> > "modules". The main differences are, I think, that 1) all
> > of the options in Windows are compile and link time---you
> > don't have any choices at load time,


> What does this mean? What choices can be specified at load
> time for *nix shared library?


Whether the global symbols in the object are available when
loading other dynamic objects or not. (Specific implementations
of dlopen may have other options, but this basic choice is
specified by Posix.)

> > and 2) symbols in the root are not available to
> > dynamically loaded objects under Windows, and are always
> > available to dynamically loaded objects under Unix.


> I'm not sure what you mean here.


I'm not that sure of the terminology myself; by "root", I mean
the code loaded as the initial binary image, before any dynamic
objects have been loaded. When you load a dynamic object under
Unix (using dlopen), you must specify either RTLD_GLOBAL or
RTLD_LOCAL: in the first case, all of the globals in the dynamic
object become accessible to other dynamic objects, in the
second, no. But since it's the operating system which loads the
root, you can't specify anything there. Under Unix, the global
symbols in the root are available to all other dynamic objects,
as if it had been loaded specifying RTLD_GLOBAL. Under Windows,
if I understand correctly, the root is always loaded as if
RTLD_LOCAL had been specified (and the choice for other
dynamic objects is made when they are built, rather than when
they are loaded---but I'm not really that certain about anything
in the Windows world).
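
A rough sketch of that load-time choice, using only the POSIX calls from `<dlfcn.h>`. The function name `load_object`, its parameters and the plugin path are made up for the example, and on older glibc versions the program also has to be linked with `-ldl`.

```cpp
#include <cassert>
#include <dlfcn.h>   // POSIX: dlopen, dlclose, dlerror

// Load a shared object, deciding at load time whether its global symbols
// may be used to resolve references in objects loaded afterwards.
// `path` and `share_symbols` are illustrative names.
void* load_object(const char* path, bool share_symbols) {
    int flags = RTLD_NOW | (share_symbols ? RTLD_GLOBAL : RTLD_LOCAL);
    return dlopen(path, flags);   // null on failure; dlerror() has the reason
}

void example_dlopen_usage() {
    // A null path asks for a handle on the running program itself,
    // i.e. what is called the "root" in the text above.
    void* root = dlopen(nullptr, RTLD_NOW);
    assert(root != nullptr);
    dlclose(root);

    // A path that does not exist fails cleanly and leaves a message
    // for dlerror() to return.
    void* missing = load_object("./no-such-plugin.so", false);
    assert(missing == nullptr);
    assert(dlerror() != nullptr);
}
```

Nothing comparable is passed when the operating system loads the root itself, which is why its visibility can't be chosen this way.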

> But anyway, Windows DLLs enjoy a good degree of decoupling
> because there are two sets of symbols: symbols linked by the
> ordinary language-specific linker, which are only visible
> until the DLL has been created, and symbols linked by Windows
> loader, which are the subset of the former set that are
> explicitly exported or imported. All the rest, e.g. the DLL's
> usage of some runtime library, is hidden.


That's more or less true under Unix as well, depending on the
options. Of course, the linker under Unix isn't standardized,
and Posix allows an implementation to add any number of
additional options to dlopen, so different Unix systems will have
different capabilities here; Solaris, at least, offers the
possibility of exporting symbols on a symbol by symbol basis,
creating groups of dynamic objects, with symbols visible within
the group, but not elsewhere, and who knows what else.

Historically, Unix dynamic linking was developed to allow the
sharing of object files, and by default, it tries to behave as
much as possible like static linking. But in practice, it works
very well for things like plugins (where you want a maximum of
isolation) as well, and today is probably used more for this and
for versioning than for pure sharing.

> > Other than
> > that, it's largely a question of which options you choose. (And
> > you certainly don't have per DLL dynamic storage under Windows
> > unless you want to. I know that the Windows applications where
> > I work don't have it.)


> Not sure what you mean by "DLL dynamic storage", and even if I
> did understand that term I suspect that I wouldn't understand
> the complete sentence, ending with "unless you want to". What
> I wrote about was per-thread storage, and problems with that
> in the context of automatic initialization and cleanup calls
> from OS.


I thought you were talking about the common complaint that you
can't free memory in a different DLL than the one it was
allocated in. Which in fact depends on how you link; the
Windows specialists here have no trouble with it, for example.

[...]
> > The company has a large application written in C. They're
> > not going to rewrite the whole thing. As subsystems get
> > rewritten, they're rewritten in C++. It's just good
> > engineering.


> I'm not convinced that it is, in the sense of migration to
> C++.


> However, I think it could be good engineering in the sense of
> using C++ as a "restricted C", i.e. a C with more strict type
> checking.


No. Although that too is IMHO a good step. But a large
application will likely be organized into many sub-systems, and
in general, management prefers that when reworking one
sub-system, you not touch any other.

> A C program has a much more procedural structure than proper
> C++ code, and replacing parts with C++ means forcing use of
> C++ in procedural, non-OO mode.


It depends on the C. Long before I'd ever heard of C++, the way
I structured C was to define a struct and a set of functions
which manipulated it, and cross my fingers that no one
manipulated the struct other than with my functions. From what
I can see in third party libraries today, this seems to be more
or less common practice.
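
For readers who haven't seen that style, a small sketch of it, written so that it compiles as C++ (apart from `std::size_t` and the header names it is ordinary C). `IntStack` and the `intstack_*` functions are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>

// A fixed-capacity stack in the "struct plus functions" style.
// Nothing stops a caller from touching the members directly; that is
// the "cross my fingers" part.
struct IntStack {
    int data[16];
    std::size_t size;
};

void intstack_init(IntStack* s) { s->size = 0; }

// Returns 1 on success, 0 if the stack is full.
int intstack_push(IntStack* s, int value) {
    if (s->size == sizeof s->data / sizeof s->data[0]) return 0;
    s->data[s->size++] = value;
    return 1;
}

// Returns 1 and stores the top element in *out, or 0 if the stack is empty.
int intstack_pop(IntStack* s, int* out) {
    if (s->size == 0) return 0;
    *out = s->data[--s->size];
    return 1;
}

void example_stack_usage() {
    IntStack s;
    intstack_init(&s);
    assert(intstack_push(&s, 7) == 1);
    assert(intstack_push(&s, 9) == 1);
    int top = 0;
    assert(intstack_pop(&s, &top) == 1 && top == 9);
    assert(intstack_pop(&s, &top) == 1 && top == 7);
    assert(intstack_pop(&s, &top) == 0);   // now empty
}
```

Code organized like this maps fairly directly onto a class when the subsystem is later rewritten in C++.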

> >> I guess with proper insulating abstractions, like XCOM, it
> >> could be better, but when the aim is to "migrate" to C++ I
> >> doubt such abstractions will be in place.


> > You don't need XCOM. You do need to provide two interfaces, a
> > C++ interface (which will be used by new code, written in C++),
> > and a C interface which is compatible with the previous C
> > interface. But then, you usually have to ensure backwards
> > compatibility anyway.


> It's difficult to understand the first sentence here, which is
> seemingly a tautology. One must assume that what you mean is
> that "XCOM or similar technologies do not provide any
> significant advantage for ...", for what?


You have an existing interface, defined in C. You reimplement
the sub-system in C++, using classes, and defining an interface
which uses classes. You then implement the C interface,
forwarding to the new classes.
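
Roughly, and only as a sketch under assumptions: `MessageLog` stands in for the new C++ interface and the `msglog_*` functions for the old C one. All of these names are hypothetical, and a real project would put the C declarations in a header guarded with `#ifdef __cplusplus`.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// --- New C++ interface (hypothetical subsystem) ---
class MessageLog {
public:
    void append(const std::string& line) { lines_.push_back(line); }
    std::size_t count() const { return lines_.size(); }
private:
    std::vector<std::string> lines_;
};

// --- Existing C interface, kept source compatible ---
extern "C" {
    typedef struct msglog msglog;                 // opaque to C callers
    msglog* msglog_create(void);
    int msglog_append(msglog* log, const char* line);   // 0 on success
    unsigned long msglog_count(const msglog* log);
    void msglog_destroy(msglog* log);
}

// The opaque C type is just a holder for the C++ object.
struct msglog { MessageLog impl; };

extern "C" msglog* msglog_create(void) {
    return new (std::nothrow) msglog();   // null rather than std::bad_alloc
}

extern "C" int msglog_append(msglog* log, const char* line) {
    if (log == nullptr || line == nullptr) return -1;
    try {
        log->impl.append(line);
        return 0;
    } catch (...) {                       // nothing may escape into C code
        return -1;
    }
}

extern "C" unsigned long msglog_count(const msglog* log) {
    return log ? static_cast<unsigned long>(log->impl.count()) : 0UL;
}

extern "C" void msglog_destroy(msglog* log) { delete log; }

// One possible build with GCC (details are implementation specific):
//   g++ -c msglog.cpp
//   gcc -c main.c
//   g++ main.o msglog.o -o app    (or: gcc main.o msglog.o -lstdc++ -o app)

void example_msglog_usage() {
    msglog* log = msglog_create();
    assert(log != nullptr);
    assert(msglog_append(log, "started") == 0);
    assert(msglog_append(log, nullptr) == -1);
    assert(msglog_count(log) == 1UL);
    msglog_destroy(log);
}
```

New C++ code uses `MessageLog` directly; the existing C callers keep calling the `msglog_*` functions and never see a class.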

I'm not familiar with XCOM, but I don't think that this is
similar. (It's obviously a lot easier if the original
application did use some sort of isolation layer, like Corba or
XCOM. But most don't.)

> >> However, technically it should be no big deal to write


> >> extern "C" int c_language_main( int, char*[] );


> >> int main( int argc, char* argv[] )
> >> {
> >> return c_language_main( argc, argv );
> >> }


> >> and compile that top-level as C++.


> > Technically, no. Practically, it depends. There may be
> > very good reasons for not doing so.


> Such as?


Such as the fact that you don't have access to the main. It's
not part of the sub-system your group is responsible for.

 
James Kanze
      05-27-2008
On May 26, 1:19 pm, "Alf P. Steinbach" wrote:
> * James Kanze:
> > On May 25, 7:51 pm, "Alf P. Steinbach" wrote:


> >>>> However, technically it should be no big deal to write


> >>>> extern "C" int c_language_main( int, char*[] );


> >>>> int main( int argc, char* argv[] )
> >>>> {
> >>>> return c_language_main( argc, argv );
> >>>> }


> >>>> and compile that top-level as C++.


> >>> Technically, no. Practically, it depends. There may be
> >>> very good reasons for not doing so.


> >> Such as?


> > Such as the fact that you don't have access to the main. It's
> > not part of the sub-system your group is responsible for.


> Whoever's responsible should do it.


Whoever's responsible has other, more important things to do.
And may not even be available, if the library is the company's
product.

There's a tension here: basic software engineering says not to
change things that don't need changing, which argues against
replacing main. There are also very strong arguments in favor
of replacing it (which you've presented). When there are
arguments on both sides, it becomes a judgement call. Most of
the time, I'd go with your suggestion, but I recognize the fact
that there will be cases where it isn't practical.

 
James Kanze
      05-27-2008
On May 27, 8:31 am, Paavo Helde wrote:
> James Kanze wrote:
> > On May 25, 7:51 pm, "Alf P. Steinbach" wrote:
> >> * James Kanze:

> [...]


> > Whether the global symbols in the object are available when
> > loading other dynamic objects or not. (Specific implementations
> > of dlopen may have other options, but this basic choice is
> > specified by Posix.)


> >> > and 2) symbols in the root are not available to
> >> > dynamically loaded objects under Windows, and are always
> >> > available to dynamically loaded objects under Unix.


> >> I'm not sure what you mean here.


> > I'm not that sure of the terminology myself; by "root", I mean
> > the code loaded as the initial binary image, before any dynamic
> > objects have been loaded. When you load a dynamic object under
> > Unix (using dlopen), you must specify either RTLD_GLOBAL or
> > RTLD_LOCAL: in the first case, all of the globals in the dynamic
> > object become accessible to other dynamic objects, in the
> > second, no. But since it's the operating system which loads the
> > root, you can't specify anything there. Under Unix, the global
> > symbols in the root are available to all other dynamic objects,
> > as if it had been loaded specifying RTLD_GLOBAL. Under Windows,
> > if I understand correctly, the root is always loaded as if
> > RTLD_LOCAL had been specified (and the choice for other
> > dynamic objects is made when they are built, rather than when
> > they are loaded---but I'm not really that certain about anything
> > in the Windows world).


> In Windows, as you say, one decides separately for each
> function or class if it is visible to other modules
> ("exported") or not. Other modules can only be linked against
> exported symbols. In regard of dynamic linking, there is
> really no significant difference between "root" .EXE and other
> modules. The difference is more related to the typical way how
> one builds the whole program (final .EXE depending on a number
> of service DLL-s).


That's an interesting requirement. Who invokes this dynamic
linking, then, if not the executable?

> In Windows, the dynamic dependencies are resolved and checked by the
> static linker. The linker checks that all symbols are actually
> resolvable, and encodes the appropriate filename into the linked module.
> This pretty much means that the dependency graph of loadable modules is a
> directed one, without any cycles. For implicit loading of the whole
> program into memory the root module (final .EXE) has to be linked last,
> thus no implicitly loaded DLL can have symbols resolved by the .EXE
> module (modulo some file renaming trickery or out-of-date file versions,
> of course).


> However, this is different for plugin-style DLL-s, which are
> loaded explicitly during the run by the program itself. These
> DLL-s can easily make use of symbols defined by the .EXE
> module. In some cases I have seen the .EXE itself is only a
> thin envelope application providing the main GUI window frame,
> and 99% of the application functionality is provided by
> dynamically loaded plugins, incidentally linked against the
> .EXE module for using some shared service functions.


In other words, the only constraints are those due to order of
loading.

> To get this a bit more on-topic, I compare the Windows model
> with the Linux dynamic loader. There are both pros and cons:


> Pros:
> In a Windows build, if a module has been built successfully one
> knows that all its dependencies have been resolved. In Linux one can
> easily have a .so file with unresolved dependencies. This can be checked
> and avoided of course, with a little extra work in Makefiles.


> A dynamic symbol cannot be accidentally reseated into another
> module, which happens to export the same symbol.


That's true under Unix as well, IF the symbol in question is in
a module loaded with RTLD_LOCAL (or if the user takes other,
more specific, but platform dependent steps).
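
A minimal sketch of the choice made at load time under Unix, assuming a POSIX dlopen. The helper open_object and the LoadResult struct are hypothetical names; the RTLD_* flags and dlerror are the standard interface. On older glibc versions you have to link with -ldl.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical result type: the handle, plus the loader's message on failure.
struct LoadResult {
    void*       handle;
    std::string error;
};

// Hypothetical helper: opens a dynamic object and chooses whether its
// global symbols become visible to objects loaded later (RTLD_GLOBAL)
// or stay reachable only through the returned handle (RTLD_LOCAL).
LoadResult open_object(const char* path, bool export_symbols)
{
    int flags = RTLD_NOW | (export_symbols ? RTLD_GLOBAL : RTLD_LOCAL);
    dlerror();                              // clear any stale error state
    LoadResult result{dlopen(path, flags), std::string()};
    if (result.handle == nullptr) {
        const char* message = dlerror();
        result.error = message ? message : "unknown dlopen error";
    }
    return result;
}
```

With export_symbols set to false, a symbol in the loaded object cannot be picked up by accident from some other module; the caller has to go through dlsym on the handle.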

> Cons:
> One cannot have dependency cycles among dynamically loaded modules.
> Actually I think this is rather a "pro".


Yes and no. But if I understand your explanation correctly, you
can in fact have cycles under Windows as well, if the modules
are explicitly loaded (e.g. as plugins), and that's probably the
only time you'd want them. (The root uses, and thus depends on,
various plugins, and the plugins use common functionality in the
root.)

> A dynamic symbol cannot be willingly reseated into another module,
> making things like Electric Fence cumbersome to implement, and causing
> Microsoft to invent layers and layers of debugging interfaces and
> subsystems.


Rather a special case.

> If the modules are linked against different versions of C or C++
> runtime libraries, it becomes difficult to exchange objects using
> resources provided by these libraries (FILE*, malloc(), std::vector,
> etc.).


Any dynamic linking introduces any number of additional problems
with regards to version management, binary compatibility, etc.,
and it's better to avoid it except where the advantages outweigh
these costs. Basically, about the only implicit dynamic linking
which I'd use is that of the system ABI (to ensure portability
to different versions of the system) and low level libraries
which can be more or less assimilated to the system (data base
interface, etc.). Under Unix (my usual platform), I don't think
I've ever invoked dlopen except with RTLD_LOCAL, and except for
the bundled system libraries and Sybase, I don't use any
implicit dynamic linking. Under Windows, I rather suspect that
I'd do likewise---including dynamic linking for the system API
and low level functions like malloc. (But I'm not sure: it
partially depends on what is bundled with the OS, since you
don't want to have to deliver additional system level DLL's with
your application.)

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
 
 
James Kanze
Guest
Posts: n/a
 
      05-28-2008
On May 27, 7:48 pm, Paavo Helde <(E-Mail Removed)> wrote:
> James Kanze <(E-Mail Removed)> kirjutas:


[...]
> >> To get this a bit more on-topic, I compare the Windows model
> >> with the Linux dynamic loader. There are both pros and cons:


[...]
> >> Cons:
> >> One cannot have dependency cycles among dynamically loaded
> >> modules.
> >> Actually I think this is rather a "pro".


> > Yes and no. But if I understand your explanation correctly, you
> > can in fact have cycles under Windows as well, if the modules
> > are explicitly loaded (e.g. as plugins), and that's probably the
> > only time you'd want them. (The root uses, and thus depends on,
> > various plugins, and the plugins use common functionality in the
> > root.)


> No, I probably expressed myself badly. The plugins are
> compiled and linked after the root executable is ready. The
> root executable does not know anything about the plugins by
> itself. When started, it reads the names of needed plugin
> files from the registry or somewhere else, and loads them
> explicitly with the LoadLibrary() Windows SDK call. The plugins
> depend on the .EXE module - this is already loaded into
> memory, so everybody is happy.


The root module must know something about the plugins: an entry
point, for example. I think what you meant was that it only
knows it "symbolically", as a string constant in the code, and
not as some information in the .exe or the .dll format which the
system must exploit.
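
Roughly what I mean by "symbolically", as a sketch: the root holds nothing but two strings, and resolves them at run time. The function find_entry and the PluginEntry signature are assumptions for the sake of the example; LoadLibraryA/GetProcAddress and dlopen/dlsym are the real calls. The cast from void* to a function pointer is only conditionally supported by the C++ standard, but Posix requires it to work.

```cpp
#ifdef _WIN32
#  include <windows.h>
#else
#  include <dlfcn.h>
#endif

// Hypothetical entry point signature agreed on by the root and its plugins.
using PluginEntry = int (*)();

// Hypothetical helper: the root knows the plugin only by two strings,
// the file name and the name of the entry point.
// Returns a null pointer if either the library or the symbol is missing.
PluginEntry find_entry(const char* library_path, const char* entry_name)
{
#ifdef _WIN32
    HMODULE module = LoadLibraryA(library_path);
    if (module == nullptr) {
        return nullptr;
    }
    FARPROC symbol = GetProcAddress(module, entry_name);
    if (symbol == nullptr) {
        FreeLibrary(module);            // nothing usable in it: unload again
        return nullptr;
    }
    return reinterpret_cast<PluginEntry>(symbol);
#else
    void* handle = dlopen(library_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        return nullptr;
    }
    void* symbol = dlsym(handle, entry_name);
    if (symbol == nullptr) {
        dlclose(handle);                // nothing usable in it: unload again
        return nullptr;
    }
    return reinterpret_cast<PluginEntry>(symbol);
#endif
}
```

In real code you would keep the handle around in order to unload the plugin later; the sketch drops it once the lookup succeeds.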

[...]
> > Any dynamic linking introduces any number of additional
> > problems with regards to version management, binary
> > compatibility, etc., and it's better to avoid it except
> > where the advantages outweigh these costs. Basically, about
> > the only implicit dynamic linking which I'd use is that of
> > the system ABI (to ensure portability to different versions
> > of the system) and low level libraries which can be more or
> > less assimilated to the system (data base interface, etc.).
> > Under Unix (my usual platform), I don't think I've ever
> > invoked dlopen except with RTLD_LOCAL, and except for the
> > bundled system libraries and Sybase, I don't use any
> > implicit dynamic linking. Under Windows, I rather suspect
> > that I'd do likewise---including dynamic linking for the
> > system API and low level functions like malloc. (But I'm
> > not sure: it partially depends on what is bundled with the
> > OS, since you don't want to have to deliver additional
> > system level DLL's with your application.)


> Yes, there are lots of problems with dynamic linking. However,
> they provide a better modularity for the whole system, and at
> least on Windows, they provide a quite good additional layer
> of encapsulation.


The additional modularity and encapsulation are useful in some
cases: plugins, and versioning for system level ABI's come to
mind. When those advantages outweigh the cost (or the cost is
minimal, e.g. when the dynamic object you link is bundled with
the OS, or must be provided separately anyway because of
licensing issues), then fine. But they create serious problems
with regards to deployment and versioning.

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
 
 
James Kanze
Guest
Posts: n/a
 
      05-28-2008
On May 28, 7:00 pm, Paavo Helde <(E-Mail Removed)> wrote:
> James Kanze <(E-Mail Removed)> kirjutas:
> > The root module must know something about the plugins: an entry
> > point, for example. I think what you meant was that it only
> > knows it "symbolically", as a string constant in the code, and
> > not as some information in the .exe or the .dll format which the
> > system must exploit.


> There is a standard entry point called DllMain(). Windows will call that
> function when loading and unloading the DLL. In that function the DLL can
> register itself in the data structures provided by the main application in some way.
> In our case the DLL typically defines some new classes derived from the
> abstract base classes provided by the framework, and the registration
> involves base class pointers to the derived class objects.


> AFAIK Posix has similar entry and exit points called init() and fini().


At least in Posix, those aren't entry points in the usual sense;
dlopen will only return after init() returns. They do provide
the hook for the initialization of static data, however, and
having a static variable which registers with some sort of
registry in the root is a well established technique under Unix.
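
A minimal sketch of that technique, with made-up names throughout (Service, Registry, Registrar and Echo are all hypothetical). In a real system, the first three would live in the root and the anonymous namespace at the end in the plugin; they share one file here only so that the example stands alone.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Abstract interface the root publishes to its plugins.
struct Service {
    virtual ~Service() = default;
    virtual std::string name() const = 0;
};

// Registry owned by the root. The function-local static guarantees that
// it is constructed before the first registration, whatever the order
// of initialization of the other static objects.
class Registry {
public:
    using Factory = std::function<std::unique_ptr<Service>()>;

    static Registry& instance()
    {
        static Registry registry;
        return registry;
    }

    void add(const std::string& key, Factory factory)
    {
        factories_[key] = std::move(factory);
    }

    std::unique_ptr<Service> create(const std::string& key) const
    {
        auto it = factories_.find(key);
        if (it == factories_.end()) {
            return nullptr;
        }
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// An object of this type registers a factory from its constructor.
struct Registrar {
    Registrar(const std::string& key, Registry::Factory factory)
    {
        Registry::instance().add(key, std::move(factory));
    }
};

// Plugin side: defining the static object is the whole registration.
namespace {
struct Echo : Service {
    std::string name() const override { return "echo"; }
};

const Registrar echo_registrar(
    "echo", [] { return std::unique_ptr<Service>(new Echo); });
}
```

When the last part is compiled into a dynamic object, the constructor of echo_registrar runs as part of the static initialization triggered by dlopen, so by the time dlopen returns, the root can call Registry::instance().create("echo") without ever having looked up a symbol by name.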

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
 