sizeof([ALLOCATED MEMORY])

 
 
Keith Thompson
 
      05-05-2006
Howard Hinnant <(E-Mail Removed)> writes:
> In article <(E-Mail Removed)>,
> Keith Thompson <(E-Mail Removed)> wrote:

[snip xrealloc() idea]
>> This would allow an implementation to allocate more memory than you
>> ask for, while still permitting it to take back some of the extra
>> space if it needs it.

>
> I think that's an excellent idea. Perhaps there is some refinement
> still to be done here. But the basic idea is fundamentally a win-win.


Thanks. On the other hand, there's always the danger of creeping
featurism. The potential complexity of a memory allocation system is
nearly unbounded. For example, it might sometimes make sense to make
a request like:

I need a chunk of at least 300 bytes of data. I'll probably need
to expand it to 500 bytes later on, so reserve 500 if you can; if
not, I'll settle for 300. I *might* need to expand it to 4000
bytes eventually, so optimize for that case if you can. Tell me
how many bytes you were actually able to allocate. And I'm also
going to need to allocate a 10000 byte chunk; if you can't satisfy
that request immediately, let me know exactly what my options are for
freeing other chunks of allocated memory to make that allocation
possible.

(Substitute larger sizes and/or more allocations to make this scenario
realistic on a modern non-embedded system.)

In general, you want to optimize for your program's actual usage
pattern, without necessarily knowing in advance what that's going to
be. Just defining the data structures to describe what you want and
what the system can give you would be a daunting task, let alone
actually implementing it. There has to be a tradeoff between having a
simple interface and adding more functionality.

One could argue that C's current malloc/realloc/free interface is
the best tradeoff. A malloc/realloc/xrealloc/free interface would
certainly provide some options that don't currently exist, but it's
not clear that it would be worth the added complexity (or that it
wouldn't).
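
For concreteness, here is a minimal sketch of what such an interface might
look like and how a caller could use it. The name xrealloc() and its exact
semantics are the hypothetical ones discussed in this thread, not an
existing standard function, and the fallback growth factor is just an
arbitrary choice:

    #include <stdlib.h>

    /* Hypothetical: like realloc(), but guaranteed never to relocate the
       block.  Returns p on success, NULL if the block cannot be resized
       in place.  (Declaration only; no such function exists today.) */
    void *xrealloc(void *p, size_t size);

    /* Typical call pattern: try the cheap in-place resize first, then
       fall back to an ordinary realloc() with room to grow. */
    static char *grow_buffer(char *buf, size_t *cap, size_t needed)
    {
        if (xrealloc(buf, needed) != NULL) {
            *cap = needed;
            return buf;               /* resized in place, pointer unchanged */
        }
        size_t new_cap = needed + needed / 2;  /* leave headroom for growth */
        char *tmp = realloc(buf, new_cap);
        if (tmp == NULL)
            return NULL;              /* original buffer is still valid */
        *cap = new_cap;
        return tmp;
    }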

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
 
Howard Hinnant
Guest
Posts: n/a
 
      05-05-2006
In article <(E-Mail Removed)>,
Keith Thompson <(E-Mail Removed)> wrote:

> Just defining the data structures to describe what you want and
> what the system can give you would be a daunting task, let alone
> actually implementing it. There has to be a tradeoff between having a
> simple interface and adding more functionality.
>
> One could argue that the C's current malloc/realloc/free interface is
> the best tradeoff. A malloc/realloc/xrealloc/free interface would
> certainly provide some options that don't currently exist, but it's
> not clear that it would be worth the added complexity (or that it
> wouldn't).


Excellent points all.

I propose that a good saddle point between complexity and simplicity
lies somewhere near:

1. How much memory did you actually give me?

2. You gave me an N-byte block (and thanks). I'm interested in
expanding it (in place) to a minimum of (N+Nmin) bytes, and perhaps
to a maximum of (N+Nmax) bytes. Can you help me do that?

3. You gave me an N-byte block (and thanks). I no longer need a large
part of this memory and would like to recycle it. Can you shrink this
block in place such that it is no smaller than M bytes?

My proposal focuses more on the immediate need, and not so much on "what
I might change my mind to in the future". IMHO, the client is better
positioned to predict its own future needs; it should hide those details
from the allocation system and just ask it more specific questions.
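
To make the three questions concrete, here is a minimal sketch of what such
an interface might look like in C. All names and signatures are invented
for illustration; nothing like this exists in the standard library:

    #include <stddef.h>

    /* 1. How much memory did you actually give me?
          Returns the usable size, in bytes, of the block at p. */
    size_t block_size(void *p);

    /* 2. Try to expand the block in place to at least min_bytes, and at
          most max_bytes if that is convenient.  Returns the new usable
          size on success, or 0 if the block cannot be grown in place. */
    size_t expand_in_place(void *p, size_t min_bytes, size_t max_bytes);

    /* 3. Give back the tail of the block, keeping at least keep_bytes.
          Returns the new usable size (never smaller than keep_bytes). */
    size_t shrink_in_place(void *p, size_t keep_bytes);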

-Howard
 
Kelsey Bjarnason
 
      05-05-2006
On Thu, 04 May 2006 16:43:00 +0000, Howard Hinnant wrote:

>
>
> In article <(E-Mail Removed)>,
> Kelsey Bjarnason <(E-Mail Removed)> wrote:
>
>> [snips]
>>
>> On Thu, 04 May 2006 14:18:31 +0000, Howard Hinnant wrote:
>>
>> > The data structure as described above has an efficiency concern:
>> > Repeated growth in size can lead to quadratic expense due to continually
>> > reallocating for a slightly larger size.

>>
>> So don't do that. In simplest terms, don't add, multiply.

>
> Did you read my entire post or just stop here?
>
> -Howard


I read the entire thing, but it seemed little better than the original
post; it completely fails to grasp, despite the discussion of multipliers,
that they're the actual solution to the problem at hand, and it fails to
grasp the obvious point that if you need N bytes, allocate N bytes, don't
allocate n-m and pray.

It also failed to grasp a trivial solution if it's the allocations, rather
than the consumption, that is the problem: allocate once, into a single
large buffer, and dole out parcels of it yourself.
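
For reference, a minimal sketch of the kind of one-shot allocation described
here: grab one large buffer up front and dole out parcels of it yourself.
Illustrative only; a real arena would also handle resetting, growth, and the
platform's strictest alignment:

    #include <stdlib.h>

    /* alignment good enough for common object types; C11 code would use
       _Alignof(max_align_t) instead */
    #define ARENA_ALIGN sizeof(double)

    struct arena {
        unsigned char *base;   /* the single large buffer        */
        size_t size;           /* total bytes allocated up front */
        size_t used;           /* bytes handed out so far        */
    };

    static int arena_init(struct arena *a, size_t size)
    {
        a->base = malloc(size);
        a->size = size;
        a->used = 0;
        return a->base != NULL ? 0 : -1;
    }

    static void *arena_alloc(struct arena *a, size_t n)
    {
        /* round up so every parcel stays suitably aligned */
        size_t need = (n + ARENA_ALIGN - 1) & ~(ARENA_ALIGN - 1);
        if (need > a->size - a->used)
            return NULL;               /* arena exhausted */
        void *p = a->base + a->used;
        a->used += need;
        return p;
    }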

So, basically, it was a lot of text, no content. What part did I miss?


 
 
Robert Latest
 
      05-05-2006
On 2006-05-04, Keith Thompson <(E-Mail Removed)> wrote:

> If I were going to suggest a new feature to address this (perhaps
> straying off-topic a bit), I'd probably propose a new function similar
> to realloc(), except that it's guaranteed not to relocate the object.
> Let's call it xrealloc() (not a great name, but it will do for now).
> If realloc() would have expanded or contracted the object in place,
> xrealloc() is equivalent to realloc(). If realloc() would have
> relocated the object, xrealloc() fails. (We wouldn't require
> xrealloc() to succeed in *exactly* the same circumstances where
> realloc() would expand or shrink the object in place, but it would
> usually work out that way.)
>
> If I've allocated space for 300 bytes, and I find that I need 302
> bytes, I can call xrealloc(), asking for just 302 bytes. If this
> succeeds, I'm done, and the call was probably very cheap. If it
> fails, I can then do a more expensive realloc() call, requesting 400
> or 500 bytes to leave room for future growth. And of course existing
> code can ignore xrealloc() and continue to work as it always has.
>
> This would allow an implementation to allocate more memory than you
> ask for, while still permitting it to take back some of the extra
> space if it needs it.


Well, I really can't see much of a point in this. Either you need
memory or you don't. Your proposed xrealloc() thing would make
programs behave like they're at a bazaar: "I'd like 1000 bytes more
memory" -- "Sure, it's gonna cost you though. But you can have
300 cheap if you want". Now what can a sensible program make of
that information?

The real issue is that you can optimize memory usage only if you
know the memory management strategy of your implementation. I
can't think of a sensible way to learn much about this from
within a C program, let alone portably. Even if one existed -- if
memory performance were such a big issue with a particular
program, I wouldn't leave it up to the implementation to decide on
strategies but rather write platform-specific modules based on
solid knowledge of the underlying mechanisms.

Graphics programs come to mind as software that needs high-performance
memory handling. If you have ever used Adobe Photoshop
(Windows) and Gimp (Linux) on the same hardware, you know what I'm
talking about. Photoshop is incredibly fast on images that nearly
bring the system down when opened in Gimp. I don't think that
Windows' memory system is that much better; I think Photoshop has
a dedicated, highly optimized memory handling system built in.

robert
 
 
Howard Hinnant
 
      05-05-2006
In article <(E-Mail Removed)> ,
Robert Latest <(E-Mail Removed)> wrote:

> Well, I really can't see much of a point in this. Either you need
> memory or you don't. Your proposed xrealloc() thing would make
> programs behave like on a bazaar: "I'd like 1000 bytes more
> memory" -- "Sure, it's gonna cost you though. But you can have
> 300 cheap if you want". Now what can a sensible program make of
> that information?


I think I see the disconnect.

Some of us seem to be discussing writing custom code for a known task at
hand. Some of us are even discussing writing custom versions of malloc:

> It also failed to grasp a trivial solution if it's the allocations, rather
> than the consumption, that is the problem: allocate once, into a single
> large buffer, and dole out parcels of it yourself.


What I am attempting to communicate (and obviously not doing a good job
of it) is writing a reusable (or semi-reusable) library in C that models
a dynamic array. For the definition of dynamic array I'm using that
found in Wikipedia, which does a much better job of describing it than
I've done here:

http://en.wikipedia.org/wiki/Dynamic_array

If you go through the exercise of writing a reusable dynamic array in C
(as people do, e.g.
http://geta.life.uiuc.edu/~gary/prog...ca105b/Array.c ),
then you eventually end up with code that looks something like
"addToArray" found at the above link:

void
addToArray (Array array, void *element)
{
    <snip>
    if (array->elements == array->maxSize) {
        array->maxSize *= 2;
        array->array = (void **) realloc (array->array,
                                          array->maxSize * sizeof (void *));
        <snip>
    }
}

(a partial listing in an attempt to respect the author's GPL copyright).

What reasonable steps can the C standard take to help the average C
programmer write a more efficient version of "addToArray"? Common
implementations of malloc/realloc/free have properties that could easily
be exploited here if only those properties were exposed (e.g. how much
free memory currently resides after the allocation pointed to by
array->array?)
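
As an illustration of the kind of property being asked for: glibc, for
example, exposes the non-standard malloc_usable_size() query (glibc's
documentation allows using the excess bytes, though it discourages relying
on it). A minimal, non-portable sketch of the growth step using it might
look like this; the function name and growth policy are invented for the
example:

    #include <stdlib.h>
    #include <malloc.h>    /* glibc-specific: malloc_usable_size() */

    /* Grow an array of void* so it can hold at least `needed` elements.
       Returns 0 on success, -1 on allocation failure.  Sketch only. */
    static int grow(void ***arr, size_t *cap, size_t needed)
    {
        size_t have = malloc_usable_size(*arr);   /* non-standard query */
        size_t new_cap;
        void **tmp;

        if (have >= needed * sizeof **arr) {
            *cap = have / sizeof **arr;   /* the space is already there */
            return 0;
        }
        new_cap = (*cap > 0) ? *cap * 2 : 8;
        if (new_cap < needed)
            new_cap = needed;
        tmp = realloc(*arr, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;                    /* original block still valid */
        *arr = tmp;
        *cap = new_cap;
        return 0;
    }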

The dynamic array data structure is so prevalent in software design and
use today that it warrants this kind of attention to detail.

-Howard
 
 
Ben Pfaff
 
      05-05-2006
Howard Hinnant <(E-Mail Removed)> writes:

> (a partial listing in an attempt to respect the author's GPL copyright).


The GPL[*] encourages distribution of source code, so it's a
little weird to consider a partial listing as a way of respecting
it.
[*] Which is a license, not a copyright.
--
"You call this a *C* question? What the hell are you smoking?" --Kaz
 
 
Ben Pfaff
 
      05-05-2006
Howard Hinnant <(E-Mail Removed)> writes:

> If you go through the exercise of writing a reusable dynamic array in C
> (as people do, e.g.
> http://geta.life.uiuc.edu/~gary/prog...ca105b/Array.c ),
> then you eventually end up with code that looks something like
> "addToArray" found at the above link:


I kind of like the x2nrealloc function that some GNU software
uses. It's a little more flexible and simpler to use than the
typical dynamic array. Here's the documentation:

/* If P is null, allocate a block of at least *PN such objects;
otherwise, reallocate P so that it contains more than *PN objects
each of S bytes. *PN must be nonzero unless P is null, and S must
be nonzero. Set *PN to the new number of objects, and return the
pointer to the new block. *PN is never set to zero, and the
returned pointer is never null.

Repeated reallocations are guaranteed to make progress, either by
allocating an initial block with a nonzero size, or by allocating a
larger block.

In the following implementation, nonzero sizes are doubled so that
repeated reallocations have O(N log N) overall cost rather than
O(N**2) cost, but the specification for this function does not
guarantee that sizes are doubled.

Here is an example of use:

     int *p = NULL;
     size_t used = 0;
     size_t allocated = 0;

     void
     append_int (int value)
     {
       if (used == allocated)
         p = x2nrealloc (p, &allocated, sizeof *p);
       p[used++] = value;
     }

This causes x2nrealloc to allocate a block of some nonzero size the
first time it is called.

To have finer-grained control over the initial size, set *PN to a
nonzero value before calling this function with P == NULL. For
example:

     int *p = NULL;
     size_t used = 0;
     size_t allocated = 0;
     size_t allocated1 = 1000;

     void
     append_int (int value)
     {
       if (used == allocated)
         {
           p = x2nrealloc (p, &allocated1, sizeof *p);
           allocated = allocated1;
         }
       p[used++] = value;
     }

*/
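
For readers curious how such a contract can be met, here is a minimal sketch
of a doubling reallocator with roughly this behavior. It is not gnulib's
actual implementation; overflow checking is omitted, and plain abort()
stands in for whatever out-of-memory policy an x-family function would
really use:

    #include <stdlib.h>

    void *
    x2nrealloc (void *p, size_t *pn, size_t s)
    {
      size_t n = *pn;

      if (p == NULL)
        {
          if (n == 0)
            n = 64 / s ? 64 / s : 1;   /* pick some nonzero initial count */
        }
      else
        n *= 2;                        /* grow geometrically */

      p = realloc (p, n * s);
      if (p == NULL)
        abort ();                      /* stand-in for the real OOM handler */
      *pn = n;
      return p;
    }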

--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
 
 
Ion Gaztañaga
 
      05-06-2006
Ben Pfaff wrote:
> Howard Hinnant <(E-Mail Removed)> writes:
>
>> If you go through the exercise of writing a reusable dynamic array in C
>> (as people do, e.g.
>> http://geta.life.uiuc.edu/~gary/prog...ca105b/Array.c ),
>> then you eventually end up with code that looks something like
>> "addToArray" found at the above link:

>
> I kind of like the x2nrealloc function that some GNU software
> uses. It's a little more flexible and simpler to use than the
> typical dynamic array. Here's the documentation:


The C reallocation feature, in my opinion, misses some important points
that make memory allocation suboptimal and that rule out some C++
use cases:

-> Not all objects stored in the buffer can be copied bitwise. A struct
can hold a pointer to itself, for example. This problem is obvious when
using C allocation functions to build C++ allocators for non-POD
objects. realloc() copies the data bitwise automatically, so we can't use
it for objects that can't be copied that way. We need an
allocation/reallocation function with an option to disable the data
copy.

-> We can't specify both a minimum size for allocation/reallocation and
a preferred size. We have to guess a reallocation size (for example,
doubling the size). Most of the time, we need to insert N extra
objects in a buffer holding S objects, and currently we call realloc
doubling the size -> S*2. However, maybe the current block can be
expanded to somewhere between S+N and S*2. Obviously, we prefer an
expansion to a reallocation:

allocate_at_least(p,
                  S + N,   /* minimum size   */
                  S * 2,   /* preferred size */
                  &allocated);

The meaning: if the current block can be expanded to at least S+N, do
it; otherwise try to allocate S*2; otherwise allocate S+N; otherwise
return 0. Checking for expansion is very cheap, and if the next block
is free and its size is between N and S, we can avoid the fragmentation
and reuse that space. This makes buffer expansion more likely, minimizes
allocations and improves performance.
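
A sketch of what such an interface and a typical call might look like in C,
with the semantics just described. The function name comes from this post;
nothing like it exists in standard C, and the exact failure and copying
behavior would still need to be pinned down. The wrapper and its element
type (double, as a stand-in) are invented for the example:

    #include <stddef.h>

    /* Hypothetical.  Try to expand the block at p in place to at least
       min_size bytes; if that is impossible, allocate a new block of
       preferred_size bytes, or of min_size bytes if preferred_size is
       not available.  The capacity actually obtained is stored in
       *allocated.  Returns the usable block (possibly p itself), or
       NULL if even min_size cannot be satisfied. */
    void *allocate_at_least(void *p, size_t min_size,
                            size_t preferred_size, size_t *allocated);

    /* Example: buf currently holds s doubles and we want room for n more. */
    static double *grow_for_insert(double *buf, size_t s, size_t n,
                                   size_t *got_bytes)
    {
        return allocate_at_least(buf,
                                 (s + n) * sizeof *buf,  /* minimum   */
                                 2 * s * sizeof *buf,    /* preferred */
                                 got_bytes);
    }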

-> In many realloc implementations, we can't expand backwards. Imagine
the following common situation, after some allocation/deallocation
operations:

| free1 | current_buffer | free2 |

free1 and free2 are not big enough to hold S+N elements, but free1 +
current_buffer + free2 is big enough. This reallocation is very fast
(surely constant time) in most implementations (for example using a
doubly linked list of memory blocks). Apart from this, locality is
improved since the previous block can be in the same memory page. Less
size overhead, less fragmentation, more locality and avoiding an
unneeded allocation. Unconditional backwards reallocation can rule out
any reallocation at all for complex C++ objects (for example, objects whose
constructor can throw) if we need the strong exception guarantee. So I
would make backwards expansion optional.

----

In many well-known applications, the overhead of memory allocation is the
biggest bottleneck. Minimizing unneeded allocations and exploiting in-place
expansion will reduce memory usage, improve locality and improve speed.

Regards,

Ion

 
 
Malcolm
 
      05-07-2006
"Howard Hinnant" <(E-Mail Removed)> wrote
>
> Some of us seem to be discussing writing custom code for a known task at
> hand. Some of us are even discussing writing custom versions of malloc:
>

I've got a whole armoury of memory allocation routines.
>
>> It also failed to grasp a trivial solution if it's the allocations,
>> rather
>> than the consumption, that is the problem: allocate once, into a single
>> large buffer, and dole out parcels of it yourself.

>
> What I am attempting to communicate (and obviously not doing a good job
> of it) is writing a reusable (or semi-reusable) library in C that models
> a dynamic array. For the definition of dynamic array I'm using that
> found in Wikipedia, which does a much better job of describing it than
> I've done here:
>
> http://en.wikipedia.org/wiki/Dynamic_array
>
> If you go through the exercise of writing a reusable dynamic array in C
> (as people do, e.g.
> http://geta.life.uiuc.edu/~gary/prog...ca105b/Array.c ),
> then you eventually end up with code that looks something like
> "addToArray" found at the above link:
>
> void
> addToArray (Array array, void *element)
> {
>     <snip>
>     if (array->elements == array->maxSize) {
>         array->maxSize *= 2;
>         array->array = (void **) realloc (array->array,
>                                           array->maxSize * sizeof (void *));
>         <snip>
>     }
> }
>
> (a partial listing in an attempt to respect the author's GPL copyright).
>
> What reasonable steps can the C standard take to help the average C
> programmer write a more efficient version of "addToArray"? Common
> implementations of malloc/realloc/free have properties that could easily
> be exploited here if only those properties were exposed (e.g. how much
> free memory currently resides after the allocation pointed to by
> array->array?)
>
> The dynamic array data structure is so prevalent in software design and
> use today that it warrants this kind of attention to detail.
>

Basically you don't do it like that.
The problem is that the generic AddToArray() function has an interface which
is too clunky considering the triviality of the underlying algorithm.

The answer is that the array will represent something in the real world,
which can be encapsulated.

e.g.

/* keep these opaque */
typedef struct
{
    char *name;
    char *title;
    float salary;
} EMPLOYEE;

typedef struct
{
    EMPLOYEE *employees;
    int Nemployees;
} PAYROLL;

Expose these

int getNemployees(PAYROLL *p);
void addemployee(PAYROLL *p, char *name, char *title);
void setsalary(PAYROLL *p, char *name, float salary);
void runpayroll(PAYROLL *p);

Now we need to decide at a general level what the program will do should it
run out of memory. Maybe we want to terminate with an error message, maybe
we want to pass back a flag, maybe we want to silently suppress the error.
The place to put this logic is after the call to realloc - messing
about with a general function means that we need to set error handling
strategies and so forth, and the whole thing becomes very difficult to use.
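
A minimal sketch of what addemployee() might then look like, with the policy
decision (here, terminate with a message) made right after the realloc call.
The body is invented for illustration and uses the declarations above;
strdup() is POSIX, not ISO C:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    void addemployee(PAYROLL *p, char *name, char *title)
    {
        EMPLOYEE *tmp = realloc(p->employees,
                                (p->Nemployees + 1) * sizeof *tmp);
        if (tmp == NULL) {
            /* the error policy is decided here, at the application level */
            fprintf(stderr, "out of memory adding employee %s\n", name);
            exit(EXIT_FAILURE);
        }
        p->employees = tmp;
        /* strdup() failures are ignored here for brevity */
        p->employees[p->Nemployees].name = strdup(name);
        p->employees[p->Nemployees].title = strdup(title);
        p->employees[p->Nemployees].salary = 0.0f;
        p->Nemployees++;
    }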

There could also be other errors, such as two employees with the same name,
or unrecognised job titles. Again, it makes sense to consider what to do at
more or less the same level.

Now let's say that our program runs too slowly because of the continuous
reallocation of a massive list of employees. Again, this is the level at which
to tackle the problem, and move to a tree or linked list structure, or maybe
build in extra capacity. Again, if we are running into efficiency problems,
the solution will be determined by the other characteristics of the problem
at hand - do we need to sort the employees, is it important to allow fast
random access, do we delete as well as add employees?

--
Website www.personal.leeds.ac.uk/bgy1mm
Programming goodies.


 
 
Howard Hinnant
 
      05-07-2006
In article <(E-Mail Removed)>,
"Malcolm" <(E-Mail Removed)> wrote:

> > The dynamic array data structure is so prevalent in software design and
> > use today that it warrants this kind of attention to detail.
> >

> Basically you don't do it like that.


Dismissing the usefulness of the dynamic array is unconvincing in the
extreme.

There are many useful data structures in modern software design. The
dynamic array is not only one of those useful data structures, it is one
of the most often used. That is not to say that it is a silver bullet
or anything like that. Indeed other data structures are often the right
choice, including the fixed-size array (even if the size is selected at
runtime).

But the dynamic array is an extremely useful data structure. It is not
always called "dynamic array". "Introduction to Algorithms" by Cormen,
Leiserson and Rivest (a very well respected book) refers to this data
structure as "dynamic table". Whatever you call it, it is a valuable
tool in the programmer's toolbox.

-Howard
 