Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Garbage collection problems

Reply
Thread Tools

Garbage collection problems

 
 
cr88192
Guest
Posts: n/a
 
      11-21-2007

"Paul Hsieh" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On Nov 20, 12:40 am, Philip Potter <(E-Mail Removed)> wrote:
>> jacob navia wrote:
>> > Keith Thompson wrote:
>> >> jacob navia wrote:
>> >>> Normal applications like 99.9999 C applications that can use the GC
>> >>> without any problems!

>>
>> >> Is 99.9999 intended to be an exaggeration, or do you really think the
>> >> number is that high (I assume it's a percentage)?

>>
>> > How many applications you have seen that write pointers into
>> > disk files, xor pointers or do such kind of nonsense?

>>
>> > I have never seen one.

>>
>> Jacob, don't quote figures you fabricated then use anecdotal evidence to
>> back them up. That's just not science.

>
> Lol! Can you point me to any other "science" being practiced in this
> newsgroup? Come on now, aside from some pointer arithmetic (which
> Boehm has not problem with)
>


it is the science of edge-cases and arguing...


>> [...] Unless you have done proper
>> research across a wide cross-section of application domains and coding
>> groups, the fact that you've never seen one means precisely zilch. (I
>> have seen precisely zero uses of ptrdiff_t - I guess it doesn't exist,
>> right?)

>
> The Better String Library uses ptrdiff_t to deal with aliasing in a
> conservative and portable way.
>
>> It doesn't matter how many people can use GC successfully, it matters
>> how many /can't/

>
> Jacob's response to you here is very on point. GC is being proposed
> to deal with the much worse problem of dealing with malloc and free.
> The problems that might happen due to misuse of GC will be a drop in
> the ocean compared with the rampant memory leaks C programmers create
> today.
>


yes.
I use to do this some before I discovered GC.

if I could not determine well enough if/when things needed to be freed, I
didn't bother.
I also tended to end up with apps that would randomly "explode" as well...


eventually, I started implementing custom memory managers, and later,
garbage collectors, as a way of battling this problem...

the general structure of these allocators tends to remain the same though:
I allocate memory in large "chunks" (say, 256kB or 1MB). these chunks were
then divided into a large number of objects, which were allocated and freed.

some of my early allocators used linked lists and best-fit, but tended to be
kind of slow (especially when I had mark/sweep GCs, as a long time during
sweep was used up positionally inserting every freed object into the free
list...).

later on (starting in about 2002 and since), pretty much all my allocators
have been cell-based (each block is a large array of fixed-size cells
managed by bitmaps).


>> [...] - and because the sort of problems that we're talking
>> about, the errors are of the intermittent runtime difficult-to-debug
>> variety. If I can't be absolutely sure that I can use your GC on my
>> standard C program, I won't bother. I'd have to go through the entire
>> program to make sure it doesn't do 'clever' stuff with pointers; or
>> start a new C-with-GC program from scratch (but I wouldn't do that
>> because I'd be stuck with one compiler on one platform).

>
> You are worried about non-opaque (and non-pointer offsetting) pointer
> manipulation? No matter what it is you have in mind, its almost
> certainly a violation of the standard. Writing pointers to a file
> then reading them back? That's extremely brittle coding that I would
> be very scared of -- disks can fail, and be accessed asynchronously by
> other processes.
>
> Real programs don't do this sort of nonsense. We're talking about a
> percentage that is way less than the coverage missed by a typical test
> suite for even the best software.
>


it is brittle, but I have heard of a few apps doing this kind of thing.
an example of this is some forms of persistent store, where part (or all) of
the heap is stored to a file.

of course, these kind of apps usually do a custom heap (a custom memory
manager over some large mmaped block of memory), always mmap the file in the
same place, and disallow any pointers to memory outside this region.


of course, this has little to do with whether or not the implementation has
GC, since such memory is completely outside the control of the GC anyways...

of course, this is only one of several ways to implement a persistent
store...


> This is why the Boehm collector exists, and is being used in real
> production code today.
>


yes, it is used, because it is usable...


> --
> Paul Hsieh
> http://www.pobox.com/~qed/
> http://bstring.sf.net/



 
Reply With Quote
 
 
 
 
cr88192
Guest
Posts: n/a
 
      11-21-2007

<(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On Nov 20, 7:50 am, jacob navia <(E-Mail Removed)> wrote:
>> Keith Thompson wrote:
>> > jacob navia wrote:
>> >> Normal applications like 99.9999 C applications that can use the GC
>> >> without any problems!

>>
>> > Is 99.9999 intended to be an exaggeration, or do you really think the
>> > number is that high (I assume it's a percentage)?

>>
>> How many applications you have seen that write pointers into
>> disk files, xor pointers or do such kind of nonsense?
>>
>> I have never seen one.

>
> Surely the point is that if you want to propose a *standardized* GC
> (which would be better done in comp.std.c), then what does and doesn't
> break it will need to be specified!
>
> You could either do this negatively:
> "As long as you don't xor pointers or write them to disk or ...
> (expand 'such kind of nonsense' here) or ... then your program can use
> GC safely"
>
> or positively:
> "You're allowed to store arbitrarily many pointers in arbitrarily deep
> linked sequences, add offsets to them, cast them(?), ..., and GC is
> guaranteed to work"
>
> Either way, it seems to me that you'll end up with an extremely long
> and unwieldy list of conditions if you want to cover all bases. If you
> can't express an idea cleanly, that might indicate that there's
> something wrong with it...
>


I don't think the list has to be "that" long or unweildy (after all, most of
the C standard would have to be filled with such lists in that case).

for example, here are a few general requirements for my GC:
1. pointers to GC'ed objects need to be located in GC-reachable memory to be
GC-reachable;
2. pointers need to be stored in aligned memory locations (this much is done
by the compiler);
3. a pointer is required to point somewhere inside the object being pointed
to;
4. an GC'ed object is not required to be retained if not GC-reachable at any
point in time.

and, provisions:
data, bss, and the main and gc-known thread stacks are reachable;
the search depth is only bounded by the ability to create a sufficiently
large trace stack;
....

so, no xor'ing, as this violates 3.
no files, strings, or malloc'ed memory, as this violates 1 and 4;
care needs be taken when mixing GC and byte-arrays, due to 2.

for example:
char a[4];
*((void **)a)=gcalloc(256);
is in potential violation of 2.

....


>>
>> --
>> jacob navia
>> jacob at jacob point remcomp point fr
>> logiciels/informatiquehttp://www.cs.virginia.edu/~lcc-win32



 
Reply With Quote
 
 
 
 
cr88192
Guest
Posts: n/a
 
      11-21-2007

"Tor Rustad" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> cr88192 wrote:
>> "Tor Rustad" <(E-Mail Removed)> wrote in message

>
> [...]
>
>>> IMO, the main domain of C is system and embedded development. Even if
>>> extending this domain by including communication, security development
>>> and DB engine development, a GC seems neither important, or of much
>>> interest.
>>>

>>
>> errm, are you trying to claim that all of us good old desktop-pc
>> developers are all off using stuff like Java and C# or something?...

>
> No, I say that C1X should focus on the main domain of C, and try to keep
> the language small and simple. The C99 variable-length arrays was a step
> too far. Adding GC to Standard C, would IMO be a major mistake.
>


I will agree to a point here:
those kind of arrays, IMO, do not belong within the main language.

but, GC is a runtime feature, and it is very sensible that it be left out
for embedded targets.


I was not arguing for standardized GC though, only that GC, itself, has
value.
where and for who it has value, are the people who use it.

to what extent standardization would make sense, would be in terms of a
"standardized" definition of the API and provided functionality and
semantics (said, "portable" GC), but I do not expect by any means that it be
somehow "required" (as is the case with Java and C#, where GC is an assumed
and fundamental part of the language).


>> must of us are in a land where we still want things like GC, but don't
>> feel like selling our souls to some proprietary VM framework to get it.

>
> Nobody is stopping you from using the Boehm GC.
>


what do you think I was arguing through most of the thread?...

this is what I was arguing, that using Boehm is a very valid and sensible
option (however, certain other people were opposing that GC has use in C
land, ever...).


> --
> Tor <(E-Mail Removed) | tr i-za-h a-z>




 
Reply With Quote
 
CBFalconer
Guest
Posts: n/a
 
      11-21-2007
jacob navia wrote:
>

.... snip ...
>
> This change doesn't look good for being able to use third party
> libraries compiled with a different compiler...


You can't count on that anyhow. In fact, you expect failure.

--
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
Try the download section.



--
Posted via a free Usenet account from http://www.teranews.com

 
Reply With Quote
 
CBFalconer
Guest
Posts: n/a
 
      11-21-2007
Paul Hsieh wrote:
>

.... snip ...
>
> It forces call stack uniformity (the EH needs to be able to crawl up
> the stack and unwind the stack in a uniform way.) So for example,


To bring things back to reality, there is no reason to have a stack
in any C system.

--
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
Try the download section.


--
Posted via a free Usenet account from http://www.teranews.com

 
Reply With Quote
 
cr88192
Guest
Posts: n/a
 
      11-21-2007

"CBFalconer" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> jacob navia wrote:
>>

> ... snip ...
>>
>> This change doesn't look good for being able to use third party
>> libraries compiled with a different compiler...

>
> You can't count on that anyhow. In fact, you expect failure.
>


theoretically, yes.

in practicality, one is forced to deal with this issue.

for example, between windows and linux, there are any number of compilers
you can compile code with, and they can make dlls and static libraries.

if one makes their compiler incompatible, say, by using a different calling
convention, then they are unable to use really any code (static libraries,
DLLs, ...) for their platform, and the compiler becomes essentially useless.


so, on x86, there is a singular more or less mandatory calling convention
(cdecl), and on windows, another less-common but still mandatory convention
(stdcall).

on x86-64, there are 2 calling conventions, the SystemV x86-64 convention
(followed by linux and friends), and the Win64 calling convention (windows).

and, again, to be useful on these archs, one needs to support these
conventions (even as such, I have been vaguely tempted to just use my own
convention for x86-64, but this would require, of all horrors, stubbing, to
interface external code).

what convention would I use:
I would actually just "upgrade" the existing 32-bit x86 calling convention
to work on x86-64...
that or use the Win64 convention internally, which should not be too
difficult to support (would just need to come up with some means of creating
a Win64 to SysV adaptor interface...).

or such...


> --
> Chuck F (cbfalconer at maineline dot net)
> <http://cbfalconer.home.att.net>
> Try the download section.
>
>
>
> --
> Posted via a free Usenet account from http://www.teranews.com
>



 
Reply With Quote
 
cr88192
Guest
Posts: n/a
 
      11-21-2007

"CBFalconer" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Paul Hsieh wrote:
>>

> ... snip ...
>>
>> It forces call stack uniformity (the EH needs to be able to crawl up
>> the stack and unwind the stack in a uniform way.) So for example,

>
> To bring things back to reality, there is no reason to have a stack
> in any C system.
>


errm?...

what kind of 'reality' is it you are living in?...


I guess it is theoretically possible, but it would lead to incompatibility
and what reason is there that anyone should implement such a thing?...

I guess, maybe you mean, all the call-frames are allocated on the heap, but
then one has to totally re-engineer the calling conventions, recompile all
the libraries, and likely deal with a pretty damn major performance hit.


I guess it could allow things like scheme style continuations, lazy
evaluation, and simplify using fine-grained concurrency (no more "threads",
only 'workers' and 'work queues').


the fact that pretty much every arch has a stack, and those that don't, fake
it, this is telling of its relevance...



 
Reply With Quote
 
jacob navia
Guest
Posts: n/a
 
      11-21-2007
cr88192 wrote:
> "CBFalconer" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>> Paul Hsieh wrote:
>> ... snip ...
>>> It forces call stack uniformity (the EH needs to be able to crawl up
>>> the stack and unwind the stack in a uniform way.) So for example,

>> To bring things back to reality, there is no reason to have a stack
>> in any C system.
>>

>
> errm?...
>
> what kind of 'reality' is it you are living in?...
>
>


Mr Falconer has his own world, as many people here around.

A stack is not necessary, one complement or sign magnitude
representations rule the world, etc etc.

I think we have touched a trap representation


--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
 
Reply With Quote
 
cr88192
Guest
Posts: n/a
 
      11-21-2007

"jacob navia" <(E-Mail Removed)> wrote in message
news:fi13tb$rbr$(E-Mail Removed)...
> cr88192 wrote:
>> "CBFalconer" <(E-Mail Removed)> wrote in message
>> news:(E-Mail Removed)...
>>> Paul Hsieh wrote:
>>> ... snip ...
>>>> It forces call stack uniformity (the EH needs to be able to crawl up
>>>> the stack and unwind the stack in a uniform way.) So for example,
>>> To bring things back to reality, there is no reason to have a stack
>>> in any C system.
>>>

>>
>> errm?...
>>
>> what kind of 'reality' is it you are living in?...
>>
>>

>
> Mr Falconer has his own world, as many people here around.
>
> A stack is not necessary, one complement or sign magnitude
> representations rule the world, etc etc.
>
> I think we have touched a trap representation
>
>


yes, ok.


but, it is maybe interesting from a theoretical perspective, aka:
a C-implementation allocating all the call frames on a heap.

so, we want a call frame?
grab it from a free-list (or, somehow, allocate more if needed).

need to store locals? grab a chunk of memory for this and attach to call
frame.
need space for args? likewise, grab a chunk of memory, evaluate and store in
the args, and attach it to the (newly created) call frame of the function
being called.

....

afaik, this particular approach is used in many scheme implementations, for
the reason that it makes implementing call/cc really fast and easy (we can
jump outword to some enclosing call frame, do stuff, or jump right back into
a call frame we had previously jumped out of).

some people have used call/cc as a way of implementing multithreading from
within scheme itself, ...


however, such an idea, though possible, would result in something very
different than a traditional C compiler.

very likely, we would have to rework the C code into CPS (actually, this is
partly how the JIT backend for my last script VM worked, as it compiled the
single bytecode blocks into some number of CPS-linked chunks).

CPS:
http://en.wikipedia.org/wiki/Continuation_passing_style


maybe external functions, like API calls, could be handled by temporarily
assigning a conventional stack (this stack serving to disappear once the
call completes). an inward call would mostly call a stub which serves to
transfer from a more traditional stack-based call-frame to the heap-and-CPS
variant...


of course, in my case, something like this would require an almost
completely different implementation of my lower compiler...

of course, it would make a whole lot of sense if I ever really intended to
(effectively) support languages like scheme or haskell...


interesting? maybe.
practical? probably not.


> --
> jacob navia
> jacob at jacob point remcomp point fr
> logiciels/informatique
> http://www.cs.virginia.edu/~lcc-win32



 
Reply With Quote
 
jacob navia
Guest
Posts: n/a
 
      11-21-2007
cr88192 wrote:
> "jacob navia" <(E-Mail Removed)> wrote in message
> news:fi13tb$rbr$(E-Mail Removed)...
>> cr88192 wrote:
>>> "CBFalconer" <(E-Mail Removed)> wrote in message
>>> news:(E-Mail Removed)...
>>>> Paul Hsieh wrote:
>>>> ... snip ...
>>>>> It forces call stack uniformity (the EH needs to be able to crawl up
>>>>> the stack and unwind the stack in a uniform way.) So for example,
>>>> To bring things back to reality, there is no reason to have a stack
>>>> in any C system.
>>>>
>>> errm?...
>>>
>>> what kind of 'reality' is it you are living in?...
>>>
>>>

>> Mr Falconer has his own world, as many people here around.
>>
>> A stack is not necessary, one complement or sign magnitude
>> representations rule the world, etc etc.
>>
>> I think we have touched a trap representation
>>
>>

>
> yes, ok.
>
>
> but, it is maybe interesting from a theoretical perspective, aka:
> a C-implementation allocating all the call frames on a heap.
>
> so, we want a call frame?
> grab it from a free-list (or, somehow, allocate more if needed).
>
> need to store locals? grab a chunk of memory for this and attach to call
> frame.
> need space for args? likewise, grab a chunk of memory, evaluate and store in
> the args, and attach it to the (newly created) call frame of the function
> being called.
>


So, you are implementig the stack in software. Instead of doing a
single register subtract, you allocate memory, put that in the list
of active frames, etc etc!

That would be really bad for performance
> ...
>
> afaik, this particular approach is used in many scheme implementations, for
> the reason that it makes implementing call/cc really fast and easy (we can
> jump outword to some enclosing call frame, do stuff, or jump right back into
> a call frame we had previously jumped out of).
>


call/cc is completely UNREADABlE. You do not see in the program text
where the program is going, you have to figure out what is the current
continuation!!!!!

That was something completely NUTS!


> some people have used call/cc as a way of implementing multithreading from
> within scheme itself, ...
>


Fine... When it works. When it doesn't, I would not like to be
the one debugging that stuff.

>
>
> interesting? maybe.
> practical? probably not.


I agree with that!

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Memory problems (garbage collection) Carbon Man Python 6 04-29-2009 11:16 AM
IE6 Garbage Collection and general IE6 slowness problems timothytoe Javascript 4 06-03-2008 05:17 PM
Collection problems (create Collection object, add data to collection, bind collection to datagrid) yvind Isaksen ASP .Net 1 05-18-2007 09:24 AM
Garbage collection problems with a c++ wrapper for a module James S Python 2 07-20-2004 01:22 AM
Debbugging help! (.NET 1.1 Framework Garbage Collection Problems) Cheung, Jeffrey Jing-Yen ASP .Net 3 07-10-2003 07:29 PM



Advertisments