Velocity Reviews

Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   C Programming (http://www.velocityreviews.com/forums/f42-c-programming.html)
-   -   Re: Storage management techniques in C (http://www.velocityreviews.com/forums/t709725-re-storage-management-techniques-in-c.html)

bartc 12-24-2009 11:35 AM

Re: Storage management techniques in C
 

"Richard Harter" <cri@tiac.net> wrote in message
news:4b331a37.190331578@text.giganews.com...

> One of the performance issues of concern in C programming is the
> amount of time spent in allocating and deallocating storage. In
> measurements that I have made and that others have reported
> upwards of 25% of the time used is spent in the allocator - and
> it can be much higher, depending on the allocator design.


You're talking about C's malloc(), realloc() and free() functions?

> The issue is simple; the more calls that are made to malloc and
> free, the greater the time cost of memory allocation. What is
> more, the likelihood of memory leaks goes up with the number of
> calls to the allocator.
>
> What should be done to reduce the cost of allocation?


I would like more detailed figures than the 25% you mentioned before
creating complicated new strategies.

How much in each of malloc, realloc and free? What proportion of the calls,
or of the total time spent, is in different sized allocations (i.e. are most
of the calls related to allocating/deallocating fairly small blocks)? What
proportion of calls are for the same size allocation? And so on.

> (1) Multiple copies


> (2) LIFO storage


> (3) FIFO storage


> (4) Persistent scratch space


> (5) Limited duration space


> Perhaps there are other techniques that people might like to
> mention.


I think there are many other patterns.

For example, large numbers of smallish allocations, all of the same size
(similar to your (1), but probably much bigger). This can easily be handled
by allocating one big block (which can be augmented if needed) and
allocating/deallocating from that, using free lists. This would be very
fast.
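A minimal sketch of that scheme (all the names here are mine, not from any particular library): carve one big malloc'd slab into fixed-size cells and thread a free list through the cells themselves, so allocation and deallocation are each a single pointer operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CELL_SIZE  32          /* the one fixed allocation size */
#define CELL_COUNT 1024        /* cells per slab                */

typedef struct pool {
    unsigned char *slab;       /* one big block from malloc         */
    void *free_list;           /* head of the chain of free cells   */
} pool;

static int pool_init(pool *p)
{
    p->slab = malloc((size_t)CELL_SIZE * CELL_COUNT);
    if (!p->slab) return -1;
    p->free_list = NULL;
    /* Thread the free list through the slab: each free cell's first
       bytes hold a pointer to the next free cell. */
    for (size_t i = 0; i < CELL_COUNT; i++) {
        void *cell = p->slab + i * CELL_SIZE;
        *(void **)cell = p->free_list;
        p->free_list = cell;
    }
    return 0;
}

static void *pool_alloc(pool *p)           /* O(1): pop the head   */
{
    void *cell = p->free_list;
    if (cell) p->free_list = *(void **)cell;
    return cell;
}

static void pool_free(pool *p, void *cell) /* O(1): push it back   */
{
    *(void **)cell = p->free_list;
    p->free_list = cell;
}
```

A production version would allocate another slab when the free list runs dry; this sketch just returns NULL.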

Another is where the allocations may be mixed size (but still smallish), but
do not need to be freed until the end of some processing step. Again,
allocate a large block (and add more as needed if difficult to estimate
total), then deallocate the large blocks at the end using perhaps just one
or two free() calls.
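That pattern is essentially a bump-pointer arena. A sketch (again, my own names; a real version would chain extra blocks when the first fills, as described above):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Bump-pointer arena: mixed-size allocations, all freed at once. */
typedef struct arena {
    unsigned char *base;
    size_t used, cap;
} arena;

static int arena_init(arena *a, size_t cap)
{
    a->base = malloc(cap);
    a->used = 0;
    a->cap  = cap;
    return a->base ? 0 : -1;
}

static void *arena_alloc(arena *a, size_t n)
{
    n = (n + 7) & ~(size_t)7;              /* keep 8-byte alignment    */
    if (a->used + n > a->cap) return NULL; /* would chain a block here */
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

static void arena_release(arena *a)        /* the single free() call   */
{
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}
```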

And another, where a block has to grow in size using realloc. Depending on
how efficient you think realloc might be, allocating spare capacity and
testing for that before needing to call realloc might help.
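The usual way to keep spare capacity is a doubling buffer: track length and capacity separately, and only call realloc when an append would overflow. A sketch (the struct and names are mine):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Growable byte buffer with spare capacity, so most appends never
   touch realloc. Doubling gives amortized O(1) appends. */
typedef struct buf {
    char  *data;
    size_t len, cap;
} buf;

static int buf_append(buf *b, const char *src, size_t n)
{
    if (b->len + n > b->cap) {             /* only now call realloc */
        size_t newcap = b->cap ? b->cap : 64;
        while (newcap < b->len + n) newcap *= 2;
        char *p = realloc(b->data, newcap);
        if (!p) return -1;
        b->data = p;
        b->cap  = newcap;
    }
    memcpy(b->data + b->len, src, n);      /* common case: just copy */
    b->len += n;
    return 0;
}
```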

But I'm sure these are all things that people will try as soon as they find
allocation is a bottleneck.

And myself, for small allocations (up to 1KB), I use my own allocator based
on a large allocated block or two. This is faster because the
allocation/deallocation functions take a short, fixed time (no searching for
things or moving stuff around). (However, if the large blocks involved
become mostly empty, this space cannot be used elsewhere, eg, to allocate a
4KB block.)

--
Bartc


Gene 12-24-2009 10:37 PM

Re: Storage management techniques in C
 
On Dec 24, 2:00 pm, c...@tiac.net (Richard Harter) wrote:
> On Thu, 24 Dec 2009 11:35:06 GMT, "bartc" <ba...@freeuk.com>
> wrote:
>
> >"Richard Harter" <c...@tiac.net> wrote in message
> >news:4b331a37.190331578@text.giganews.com...

>
> >> One of the performance issues of concern in C programming is the
> >> amount of time spent in allocating and deallocating storage. In
> >> measurements that I have made and that others have reported
> >> upwards of 25% of the time used is spent in the allocator - and
> >> it can be much higher, depending on the allocator design.

>
> >You're talking about C's malloc(), realloc() and free() functions?

>
> Essentially, yes. AFAIK the major open source implementations
> all use high quality allocators - this definitely wasn't true
> years ago. I've never looked at Microsoft's allocators, but I
> would be surprised if they weren't efficient.
>
> >> The issue is simple; the more calls that are made to malloc and
> >> free, the greater the time cost of memory allocation. What is
> >> more, the likelihood of memory leaks goes up with the number of
> >> calls to the allocator.

>
> >> What should be done to reduce the cost of allocation?

>
> >I would like more detailed figures than the 25% you mentioned before
> >creating complicated new strategies.

>
> >How much in each of malloc, realloc and free? What proportion of the calls,
> >or of the total time spent, is in different sized allocations (ie. are most
> >of the calls related to allocating/deallocating fairly small blocks)? What
> >proportion of calls are for the same size allocation? And so on.

>
> Sorry, I no longer have precise numbers at hand - they are buried
> away in industrial reports. In any case, case studies are only
> guides.
>
> The general pattern I have seen is that in the initial
> implementations of applications allocation costs are buried by
> the overall general inefficiency. Allocation costs only become
> significant after the worst inefficiencies have been carved away.
>
> As far as the relative cost of malloc, realloc, and free are
> concerned, that depends on the allocator implementation.
> Commonly the cost of malloc is about 50% more than the cost of
> free. *On average realloc is only slightly less than the cost of
> malloc+free.
>
> Most of the cases that I am familiar with the block sizes are
> mostly small and intermediate - same size allocation doesn't make
> much difference. YMMV.
>
> >> (1) Multiple copies

>
> >> (2) LIFO storage

>
> >> (3) FIFO storage

>
> >> (4) Persistent scratch space

>
> >> (5) Limited duration space

>
> >> Perhaps there are other techniques that people might like to
> >> mention.

>
> >I think there are many other patterns.

>
> >For example, large numbers of smallish allocations, all of the same size
> >(similar to your (1), but probably much bigger). This can easily be handled
> >by allocating one big block (which can be augmented if needed) and
> >allocating/deallocating from that, using free lists. This would be very
> >fast.

>
> That is (1). Free lists are problematic. What can happen is
> that the space becomes fragmented, i.e., the pointers jump all
> over the place. When this happens you can run into cache misses.
>
> Another issue is that if you have blooms (an explosion of
> allocated space that dies off rapidly) you end up with large
> amounts of dead space that can't be deallocated.
>
> That said, multiple copies with a free list is simple and
> (usually) fast. However, it is neither simpler nor faster than the
> "complicated" alternatives.
>
> >Another is where the allocations may be mixed size (but still smallish), but
> >do not need to be freed until the end of some processing step. Again,
> >allocate a large block (and add more as needed if difficult to estimate
> >total), then deallocate the large blocks at the end using perhaps just one
> >or two free() calls.

>
> That is (5).
>
> >And another, where a block has to grow in size using realloc. Depending on
> >how efficient you think realloc might be, allocating spare capacity and
> >testing for that before needing to call realloc might help.

>
> >But I'm sure these are all things that people will try as soon as they find
> >allocation is a bottleneck.

>
> >And myself, for small allocations (up to 1KB), I use my own allocator based
> >on a large allocated block or two. This is faster because the
> >allocation/deallocation functions take a short, fixed time (no searching for
> >things or moving stuff around). (However, if the large blocks involved
> >become mostly empty, this space cannot be used elsewhere, eg, to allocate a
> >4KB block.)

>
> Well done. It's a start.
>
> Thanks for the comments.
>
> Richard Harter, c...@tiac.net
> http://home.tiac.net/~cri, http://www.varinoma.com
> Infinity is one of those things that keep philosophers busy when they
> could be more profitably spending their time weeding their garden.



There's always the old GNU obstack idea, which combines a couple of
yours. An obstack is a big chunk of memory that can be allocated in
arbitrary-sized pieces, with a "mark-release" facility to free items
allocated within the obstack in LIFO order. Entire obstacks are freed
in any order automatically when all the items in them are released.

The idea is that a common pattern is to build a significant data
structure composed of small nodes, use it for its intended purpose,
then get rid of it. An example where this occurs is in a compiler,
where you're processing procedures or modules in order, building
syntax trees, optimization graphs, code blocks, etc. Some of these can
be thrown away as each procedure is completed. You'd allocate
such structures in an obstack.

Obstacks grow automatically by chaining more big chunks of the same
size. The chunks can in turn be managed efficiently because same-size
blocks don't fragment.
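For the curious, the mark/release discipline looks roughly like this. This is a simplified single-chunk sketch in the obstack spirit, not the real GNU <obstack.h> API (which chains chunks, as described above); all the names are mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Mark/release allocator: allocate by bumping a pointer; free
   everything allocated since a mark, in LIFO order, by resetting it. */
typedef struct obstk {
    unsigned char *chunk;
    size_t top, cap;
} obstk;

typedef size_t obstk_mark;                 /* a saved top-of-stack */

static int obstk_init(obstk *o, size_t cap)
{
    o->chunk = malloc(cap);
    o->top = 0;
    o->cap = cap;
    return o->chunk ? 0 : -1;
}

static void *obstk_alloc(obstk *o, size_t n)
{
    n = (n + 7) & ~(size_t)7;              /* keep alignment        */
    if (o->top + n > o->cap) return NULL;  /* real obstacks chain a
                                              new chunk here        */
    void *p = o->chunk + o->top;
    o->top += n;
    return p;
}

static obstk_mark obstk_get_mark(obstk *o) { return o->top; }

/* Release everything allocated since the mark. */
static void obstk_release(obstk *o, obstk_mark m) { o->top = m; }
```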


Gene 12-26-2009 12:26 AM

Re: Storage management techniques in C
 
On Dec 25, 3:08 pm, c...@tiac.net (Richard Harter) wrote:
> On Thu, 24 Dec 2009 14:37:49 -0800 (PST), Gene
>
> <gene.ress...@gmail.com> wrote:
> >On Dec 24, 2:00 pm, c...@tiac.net (Richard Harter) wrote:
> >> On Thu, 24 Dec 2009 11:35:06 GMT, "bartc" <ba...@freeuk.com>

>
> <snip>
>
> >There's always the old GNU obstack idea, which combines a couple of
> >yours. An obstack is a big chunk of memory that can be allocated in
> >arbitrary-sized pieces, with a "mark-release" facility to free items
> >allocated within the obstack in LIFO order. Entire obstacks are freed
> >in any order automatically when all the items in them are released.

>
> >The idea is that a common pattern is to build a significant data
> >structure composed of small nodes, use it for its intended purpose,
> >then get rid of it. An example where this occurs is in a compiler,
> >where you're processing procedures or modules in order, building
> >syntax trees, optimization graphs, code blocks, etc. Some of these can
> >be thrown away as each procedure is completed. You'd allocate
> >such structures in an obstack.

>
> >Obstacks grow automatically by chaining more big chunks of the same
> >size. The chunks can in turn be managed efficiently because same-size
> >blocks don't fragment.

>
> I've never seen obstacks, but I will look at them. I use a
> package I call storage pools (stgpool) that is similar in some
> respects. Significant differences:
>
> (1) There are no mark/release tags within a pool. The
> presumption is that all items in the pool can be deleted at the
> same time. Dead space is ignored until pool deletion time.
>
> (2) Pools use chained blocks; however block sizes start out
> small. Each block is larger than its predecessor.
>
> (3) Release strategy is separated from storage management; code
> using storage pools decides when to close a pool. *Typically this
> is done when some task or a set of tasks are completed - the pool
> is used for allocated storage needed within the task.
>
> I have the impression that my storage pools are substantially
> lighter weight than obstacks. Thanks for mentioning them.


Obstacks are actually very lightweight. Allocation is an "enough
space" check followed by a pointer increment, with the occasional
allocation of a new chunk when the current one fills.

Release checks whether the released address leaves any whole chunks
empty (a couple of pointer comparisons) and, if necessary, pushes the
tail chain onto a free list of chunks. The operation is O(n) for n
freed chunks.

Now that you mention the term storage pool, I'll mention that a
prudent thing to do is use different functions or perhaps macros to
allocate like items (instead of calling malloc directly in all cases)
so that experimenting with special-purpose allocators after profiling
is simplified.
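That advice can be as simple as one macro pair per node type, so that swapping in a pool allocator later is a one-line change at a single spot. A sketch (node_t and the macro names are hypothetical):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type used by some data structure. */
typedef struct node { struct node *next; int value; } node_t;

/* Route every node allocation through one macro pair. Today they
   just call malloc/free; after profiling, redirect them to a
   special-purpose allocator without touching any call site. */
#define NODE_NEW()   ((node_t *)malloc(sizeof(node_t)))
#define NODE_FREE(p) free(p)

static node_t *push(node_t *head, int v)
{
    node_t *n = NODE_NEW();
    if (!n) return head;
    n->value = v;
    n->next  = head;
    return n;
}
```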

In the Ada language, there is a built-in facility for per-class
storage allocation called "storage pools." I suppose there are no hard
real time programmers participating here, as another disadvantage of
most general purpose allocators is wide variation between best and
worst case run time behavior. I believe a major reason for storage
pools in Ada was to allow embedded programmers to substitute highly
deterministic algorithms to meet hard time constraints.


