Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Re: Storage management techniques in C

Reply
Thread Tools

Re: Storage management techniques in C

 
 
BGB / cr88192
Guest
Posts: n/a
 
      12-24-2009

"Richard Harter" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
>
>
> In this thread I would like to explore storage allocation
> techniques in C programming.
>
> One of the performance issues of concern in C programming is the
> amount of time spent in allocating and deallocating storage. In
> measurements that I have made and that others have reported
> upwards of 25% of the time used is spent in the allocator - and
> it can be much higher, depending on the allocator design.
>
> The issue is simple; the more calls that are made to malloc and
> free, the greater the time cost of memory allocation. What is
> more, the likelihood of memory leaks goes up with the number of
> calls to the allocator.
>
> What should be done to reduce the cost of allocation?
>


<snip>

one thing that generally works fairly well IME for lots of objects all of a
particular type and size (a fairly common case IME), is to have a special
"free" function which doesn't actually free the object, but instead adds it
to a type-specific free-list.

when allocating an object, first one can check that the free-list is not
NULL, and if this is the case, simply grab the first item and update the
list to the next item.

if the list is empty, fall back to a more general purpose allocator.
periodically, one may also free the free list (for example, via a GC
callback, to maintain a memory quota, ...).

in general, this can notably speed up object allocation/deallocation for a
given object type.



 
Reply With Quote
 
 
 
 
BGB / cr88192
Guest
Posts: n/a
 
      12-31-2009

"Richard Harter" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On Thu, 24 Dec 2009 15:31:21 -0700, "BGB / cr88192"
> <(E-Mail Removed)> wrote:
>
>>
>>"Richard Harter" <(E-Mail Removed)> wrote in message
>>news:(E-Mail Removed)...

> <snip>
>>
>>one thing that generally works fairly well IME for lots of objects all of
>>a
>>particular type and size (a fairly common case IME), is to have a special
>>"free" function which doesn't actually free the object, but instead adds
>>it
>>to a type-specific free-list.
>>
>>when allocating an object, first one can check that the free-list is not
>>NULL, and if this is the case, simply grab the first item and update the
>>list to the next item.
>>
>>if the list is empty, fall back to a more general purpose allocator.
>>periodically, one may also free the free list (for example, via a GC
>>callback, to maintain a memory quota, ...).
>>
>>in general, this can notably speed up object allocation/deallocation for a
>>given object type.

>
> It's a good technique; I've used it myself. However there are
> situations where it doesn't work very well. See my reply to
> bartc.
>


cases where objects are used in bursts, partial solutions:
keep track of number of free items and start freeing if there are too many;
in some cases, maybe register a callback with the GC which can trigger bulk
freeing if memory is tight.

fragmentation:
allocate objects in lumps (granted, this may not always turn out well, and
may require special MM/GC support).

actually, I have not really had too many problems with fragmentation IME,
although I suspect this may be due to my usage of generally "unorthodox"
allocator algos (AKA: bitmap and cell-based allocators, which oddly almost
no one too far outside LISP-land tend to use...).

granted, fragmentation-related performance issues may be difficult to
detect, so possibly I might not have noticed them...


in general though, in my newer GC I keep cons cells and object-cells in
different places, though mostly this is more due to technical reasons.

cons cells are in one place, with a special cons-specific allocator (special
cases: a cons is always 2 pointers, and a cons is never more than 1 cell);
object cells are in another place, and use a different allocator (for small
objects, the allocator is specialized for small objects, as finding space
for larger objects via bitmap searches would be slow);
larger objects use free lists.

one can debate what the exact ideal cutoff is between a cell-based allocator
and a free-list allocator though, but IME it hasn't been too much of an
issue thus far.

a cell allocator has a small per-object overhead at the cost of a
linear-space overhead, and a free-list allocator has a higher per-object
overhead with no linear-space overhead.

for example, with masses of 10 byte objects, a 40 byte object header would
be a killer;
but, with multi-MB objects, a 6-8% space overhead would be a killer (say, if
for every 10MB there were 820kB of bitmaps...), whereas the 40 byte object
header would be insignificant.

hence, the use of different strategies depending on object size, ...


>



 
Reply With Quote
 
 
 
 
bartc
Guest
Posts: n/a
 
      01-01-2010
"BGB / cr88192" <(E-Mail Removed)> wrote in message
news:hhjcb5$pet$(E-Mail Removed)...

> for example, with masses of 10 byte objects, a 40 byte object header would
> be a killer;
> but, with multi-MB objects, a 6-8% space overhead would be a killer (say,
> if for every 10MB there were 820kB of bitmaps...), whereas the 40 byte
> object header would be insignificant.


What are the bitmaps used for?

With 1 bit per byte, the overhead on 10MB is about 1.25MB. But with 1 bit
per 16 bytes say, the overhead is some 80KB, some 0.8%.

--
Bartc

 
Reply With Quote
 
BGB / cr88192
Guest
Posts: n/a
 
      01-02-2010

"bartc" <(E-Mail Removed)> wrote in message
news:TAk%m.20822$(E-Mail Removed) om...
> "BGB / cr88192" <(E-Mail Removed)> wrote in message
> news:hhjcb5$pet$(E-Mail Removed)...
>
>> for example, with masses of 10 byte objects, a 40 byte object header
>> would be a killer;
>> but, with multi-MB objects, a 6-8% space overhead would be a killer (say,
>> if for every 10MB there were 820kB of bitmaps...), whereas the 40 byte
>> object header would be insignificant.

>
> What are the bitmaps used for?
>


space allocation and keeping track of which objects exist and where they
exist at, ...

this can be constrasted with the strategy of using the object-headers and
linked lists for memory management.
for example, with bitmaps, one need not actually even have object headers,
but in my case they exist mostly to contain semantic information (exact
object size, type, flags, ... but, given a lack of any embedded pointers, it
allows the entire header to be packed into 64 bits in my case...).

some of my early allocators in this style did not have object headers, but
eventually these headers became a part of the core MM/GC, as this made it so
that the information available in the header was available for every object
(eliminating the occasional "odd man out" which did not have a header).


for example, in a "traditional" MM, one could have a collection of
free-lists (of assorted object sizes), and find a memory block of sufficient
size from the needed list. in this case, the object would then likely be
migrated to a list of live objects, and have the size of the object managed
in the header via a size field.

so, a memory block might look something like:
struct MemBlock_s {
MemBlock *next_list; //next block in free-list / live-list
MemBlock *next_phys; //next block physically (so we can merge / coalesce
blocks)
size_t size; //object size
int flags; //some flags and stuff
....
//data starts here
};

for very small objects, this header is massive...


if the object is, instead, an external bit pattern, we don't need to keep
track of objects via such headers.
similarly, coalescing is "free", since freeing any 2 (or more) adjacent
objects will create a big hole in the bitmap, meaning a space which can hold
a bigger object (IOW: coalescing free space is free).

similarly, for GC it tends to be MUCH faster than when working with
free-lists.


the one, notable, cost with this strategy is that the performance turns to
crap if ones' objects are large, hence there is a good reason to manage
larger objects via a free list instead (these 2 strategies seem to work
fairly well when used in conjunction).

this works well, also, since large objects tend to be fairly rare vs small
objects, and the small number of large objects greatly reduces time used in
things like linked-list hopping.

at the same time, I can also speed up the large-object GC by using a sorted
array of large objects, so that I can locate objects quickly via a binary
search, ...


> With 1 bit per byte, the overhead on 10MB is about 1.25MB. But with 1 bit
> per 16 bytes say, the overhead is some 80KB, some 0.8%.
>


it is typically 8 bits per 16-bytes in my current GC (or 6.25%).
so, it is not "strictly" a bitmap in the 1-bit-per-item sense, but I lack a
better term.

in this setup:
2 bits: cell type (free/header/data/reserved);
2 bits: cell color (white/black/gray/extra_black);
3 bits: gc-type (conservative/atomic, 'GCP': defiled, ref-count/many)
1 bit: (?...)

sadly, I can't really remember how all this works, as I didn't exactly
document it so well, and getting this much required going back and trying to
decipher a lot of the evil bit-twiddly...

earlier versions of the GC (from years ago) used 4 bits per cell, but this
was expanded to 8 bits both to improve performance and provide a few more
bits for other uses.

a bunch more info goes on in the object header (more bit-twiddly).

'GCP' was an ill-advised feature (an attempt at pointer-based precise
collection to be used in-conjunction with conservative collection), which
has mostly died back (not part of the main API anymore), but it first added
ref-counting (in this incarnation, another past "sibling" GC had also added
ref-counting earlier).

there is also conservative ref-counting which is implemented via 4 bits
located in the object header, which is now used instead.


> --
> Bartc



 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Re: Storage management techniques in C bartc C Programming 2 12-26-2009 12:26 AM
Why is enterprise storage so much more expensive than personal storage? John Computer Support 4 03-17-2008 09:50 PM
How to access the external storage unit of storage router =?Utf-8?B?SWduYXRpdXM=?= Wireless Networking 4 11-06-2006 06:40 AM
Difference b/w storage class and storage class specifier sarathy C Programming 2 07-17-2006 05:06 PM



Advertisments