Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > malloc


malloc

 
 
Noob
 
      02-17-2012
pete wrote:

> No.
> The standard says:
> void free(void *ptr);
> If ptr is a null pointer, no action occurs.


You missed his post scriptum.
 
James Kuyper
 
      02-17-2012
On 02/17/2012 11:05 AM, Joachim Schmitz wrote:
> Noob wrote:
>> Eric Sosman wrote:
>>
>>> The only allocator bug I have personally encountered was with
>>> a malloc() implementation that never, never returned NULL. When it
>>> ought to have returned NULL, it crashed the program instead ...

>>
>> Visual Studio 6 used to crash on free(NULL).

>
> Which seems OK, if malloc() never returns NULL.


That would be non-conforming; if malloc() successfully allocates memory,
"Each such allocation shall yield a pointer to an object disjoint from
any other object". If a program calls malloc(SIZE_MAX) often enough
without ever free()ing the memory, it must sooner or later either fail
with a null return value, or return a pointer to memory that is NOT
disjoint from the memory previously allocated, even in the very loose
(and IMO, nonconforming) sense that over-committing malloc()s use for
the term "allocate".

> free() only needs to take what malloc() gives, doesn't it?


No, there's an explicit requirement that passing a null pointer to
free() has no effect. Crashing doesn't qualify.
--
James Kuyper
 
Keith Thompson
 
      02-17-2012
Goran <(E-Mail Removed)> writes:
> On Feb 17, 11:26 am, Keith Thompson <(E-Mail Removed)> wrote:

[...]
>> I suspect he's referring to the malloc() implementation
>> on typical Linux systems, which overcommits memory by default.
>> It can allocate a large chunk of address space for which no actual
>> memory is available. The memory isn't actually allocated until
>> the process attempts to access it. If there isn't enough memory
>> available for the allocation, the "OOM killer" kills some process
>> (not necessarily the one that did the allocation).

>
> That crossed my mind, but what he said doesn't correspond with what
> happens: malloc does return something and __doesn't__ crash the
> program. OOM killer kills the code upon an attempt to access that
> memory.


If you call malloc() and it overcommits, it won't crash the
program until you access the allocated memory. (The rationale for
overcommitting is that most programs don't actually use most of
the memory they allocate. I find that odd.)

> But given the way he explained it, it's possible that he's affected by
> OOM killer, and he forgot, or never knew, what really happened.


Given the post you're responding to, I would find it more likely that
he knows exactly what happened and didn't mention it. Reading his
more recent followup, apparently it was on VMS, not Linux, and was
likely a bug in DEC's C library.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Joshua Maurice
 
      02-17-2012
On Feb 17, 9:25 am, Keith Thompson <(E-Mail Removed)> wrote:
> If you call malloc() and it overcommits, it won't crash the
> program until you access the allocated memory. (The rationale for
> overcommitting is that most programs don't actually use most of
> the memory they allocate. I find that odd.)


Really? The explanation that I'm most familiar with is that most fork
calls are immediately followed by exec, and thus if you're low on
memory, then a large process cannot spawn a new process without
overcommit because the only process create primitive is fork, which
"copies" the memory space of the parent process.

I of course think this is a broken state of affairs for several
reasons. 1- Just introduce a new process-creation primitive that creates
a process from an executable file, with options for specifying
the env vars, the command line, the working dir, etc. Sure, it lacks
the full "power" of fork + exec, but it's a lot easier to use, and for
most uses of fork I expect this would suffice, and then we could
avoid this nasty overcommit problem. 2- I'm annoyed at the ease with
which resources can be leaked to child processes and the near
impossibility of doing anything portably about it.

However, due to discussions on this board, I've come to learn that OOM
situations are very hard to program for, and OOM on a common desktop
just can't be handled gracefully.
 
Joshua Maurice
 
      02-17-2012
On Feb 17, 12:46 pm, Paavo Helde <(E-Mail Removed)> wrote:
> BGB <(E-Mail Removed)> wrote in news:jhk2rh$19o$(E-Mail Removed):
>
> > On 2/16/2012 2:21 PM, Devil with the China Blue Dress wrote:
> >> In article <jhjr14$(E-Mail Removed)>, (E-Mail Removed) (Joe keane)
> >> wrote:
>
> >>> Bugs of memory allocation will make you mad.
>
> >>> Bugs *in* memory allocation will put you in the cuckoo people place.
>
> >> Which is why they are exceedingly rare. Nearly all allocation
> >> problems are due to the program storing outside array bounds.
>
> > which are annoyingly difficult to track down sometimes...
>
> > it would be nice if compilers would offer an option (as a debugging
> > feature) to optionally put in bounds checking for many operations
>
> Yes it is, and they are doing that. Use e.g. MSVC with iterator checking
> switched on (this is the default) and accessing e.g. a std::vector out of
> bounds will generate a runtime error. With gcc one can use MALLOC_CHECK_
> or -lmcheck; these also should catch some out-of-bounds access errors.
>
> For raw pointers it is more difficult; you can use something like
> ElectricFence or Valgrind on Linux, but it makes your program run many
> times slower and consume lots more memory, so this is just for
> debugging.


MSVC also has some debug options that try to catch writes past the
ends of allocated regions as well. IIRC, they overallocate, and put
special bit patterns at the start and end on allocation, and when
freed they check to see if those bit patterns are intact, raising a
fatal error or something if it finds a problem.
 
Kaz Kylheku
 
      02-17-2012
["Followup-To:" header set to comp.lang.c.]
On 2012-02-17, Keith Thompson <(E-Mail Removed)> wrote:
> Goran <(E-Mail Removed)> writes:
>> On Feb 17, 11:26 am, Keith Thompson <(E-Mail Removed)> wrote:

> [...]
>>> I suspect he's referring to the malloc() implementation
>>> on typical Linux systems, which overcommits memory by default.
>>> It can allocate a large chunk of address space for which no actual
>>> memory is available. The memory isn't actually allocated until
>>> the process attempts to access it. If there isn't enough memory
>>> available for the allocation, the "OOM killer" kills some process
>>> (not necessarily the one that did the allocation).

>>
>> That crossed my mind, but what he said doesn't correspond with what
>> happens: malloc does return something and __doesn't__ crash the
>> program. OOM killer kills the code upon an attempt to access that
>> memory.

>
> If you call malloc() and it overcommits, it won't crash the
> program until you access the allocated memory. (The rationale for
> overcommitting is that most programs don't actually use most of
> the memory they allocate. I find that odd.)


Odd or not, it is borne out empirically. Applications are physically
smaller than their virtual footprints.

It may be the case that C programs that malloc something usually use
the whole block.

But overcommitting is not implemented at the level of malloc, but
at the level of a lower level allocator like mmap.

If the system maps a large block to give you a smaller one, that large
block will not be all used immediately.

Another example is thread stacks. If you give each thread a one megabyte
stack and make 100 threads, that's 100 megs of virtual space. But that one
megabyte is a worst case that few, if any, of the threads will hit.

Programs with lots of threads on GNU/Linux have inflated virtual footprints
due to the generous default stack size.
 
glen herrmannsfeldt
 
      02-18-2012
In comp.lang.c++ Joshua Maurice <(E-Mail Removed)> wrote:

(snip)

> Really? The explanation that I'm most familiar with is that most fork
> calls are immediately followed by exec, and thus if you're low on
> memory, then a large process cannot spawn a new process without
> overcommit because the only process create primitive is fork, which
> "copies" the memory space of the parent process.


That was fixed about 20 years ago. Among others, there is vfork()

"vfork - spawn new process in a virtual memory efficient way"

A simple explanation is that vfork() tells the system that you
expect to call exec() next, and it can optimize for that case.

> I of course think this is a broken state of affairs for several
> reasons. 1- Just introduce a new create process primitive that creates
> a process from an executable file with a copy options for specifying
> the env vars, the command line, the working dir, etc.


(snip)

-- glen
 
Kaz Kylheku
 
      02-18-2012
On 2012-02-18, glen herrmannsfeldt <(E-Mail Removed)> wrote:
> In comp.lang.c++ Joshua Maurice <(E-Mail Removed)> wrote:
>
> (snip)
>
>> Really? The explanation that I'm most familiar with is that most fork
>> calls are immediately followed by exec, and thus if you're low on
>> memory, then a large process cannot spawn a new process without
>> overcommit because the only process create primitive is fork, which
>> "copies" the memory space of the parent process.

>
> That was fixed about 20 years ago. Among others, there is vfork()
>
> "vfork - spawn new process in a virtual memory efficient way"
>
> A simple explanation is that vfork() tells the system that you
> expect to call exec() next, and it can optimize for that case.


vfork is a dangerous hack which exposes the semantics of the optimization
of fork to the program.

A modern copy-on-write fork hides the semantics: the parent and child
spaces are shared, but appear duplicated.

A copy-on-write fork does have to account for the virtual space required
to duplicate the private mappings of the parent process, because those
will be copied physically if they are touched.

If you have a 500 megabyte process, of which 400 megabytes are private
mappings, and that process forks, the virtual layout of the system
increases by 400 megabytes. If overcommit is not allowed, that means
that the 400 megabytes has to be counted as physical memory.

Joshua is completely right here: forking is one of the use cases for
overcommit for this reason.
 
BGB
 
      02-18-2012
On 2/17/2012 3:51 PM, Joshua Maurice wrote:
> On Feb 17, 12:46 pm, Paavo Helde <(E-Mail Removed)> wrote:
>> BGB <(E-Mail Removed)> wrote in news:jhk2rh$19o$(E-Mail Removed):
>>
(snip)
>>
>> Yes it is, and they are doing that. Use e.g. MSVC with iterator checking
>> switched on (this is the default) and accessing e.g. a std::vector out of
>> bounds will generate a runtime error. With gcc one can use MALLOC_CHECK_
>> or -lmcheck; these also should catch some out-of-bounds access errors.
>>
>> For raw pointers it is more difficult; you can use something like
>> ElectricFence or Valgrind on Linux, but it makes your program run many
>> times slower and consume lots more memory, so this is just for
>> debugging.

I was mildly aware of ElectricFence and Valgrind, and was also thinking
mostly of malloc'ed raw pointers and C-style arrays (when passed as
pointers). the idea would be if something like Valgrind were directly
integrated into the compiler/runtime as a debug option. this is mostly
what I was writing about.

as for bounds-checked collection types, yes, I have a few of those as
well, and also mostly use a custom memory manager (mostly for GC and
dynamic type-checking), which could (sadly) create issues (likely not
detect bounds violations) if this feature were available.


> MSVC also has some debug options that try to catch writes past the
> ends of allocated regions as well. IIRC, they overallocate, and put
> special bit patterns at the start and end on allocation, and when
> freed they check to see if those bit patterns are intact, raising a
> fatal error or something if it finds a problem.


this is partly what my memory allocators do, but may also under certain
cases scan the heap, detect corruption, and attempt to diagnose it.

a nifty tool I have used in some places is object-origin tracking, where
every time the allocator is accessed, it records where it was called
from, and will use this information (combined with some data-forensics)
to try to make an educated guess as to "who done it" (or, IOW, around
where the offending code might be).

although, sadly, there is no good way to implement a HW write barrier
for this (neither Windows nor Linux gives enough control over the
CPU to really make something like this workable, and even then it
would still likely be page-fault driven and slow).

I have some stuff for "software write barriers" (mostly needed for other
reasons), but not a lot of code uses them (unless forced into it),
partly due to the added awkwardness and performance overheads.

one option that could sort of work for larger arrays would be to put
unused pages between memory objects, such that going out of bounds is
more likely to trigger a page fault.


or such...
 
Joshua Maurice
 
      02-18-2012
On Feb 17, 7:04 pm, (E-Mail Removed) (Scott Lurndal) wrote:
> Joshua Maurice <(E-Mail Removed)> writes:
> >Really? The explanation that I'm most familiar with is that most fork
> >calls are immediately followed by exec, and thus if you're low on
> >memory, then a large process cannot spawn a new process without
> >overcommit because the only process create primitive is fork, which
> >"copies" the memory space of the parent process.

>
> You're confusing overcommit with copy-on-write. Fork uses COW[*] in
> which the parent and child share the physical pages until the child
> writes to one - at that point, they child gets a copy (and an allocation
> occurs which may fail at that point if memory and swap are exhausted).
>
> Overcommit was allowed to support sparse arrays which are common
> with some workloads.

[...]
>[*] COW came into general use in the SVR3.2/SVR4 timeframe. Linux has always
> used COW on fork. The only cost for the child is the page table
> (which actually can be quite a bit for a 1TB virtual address space using
> 4k pages - IIRC about 2GB just for page tables to map that much VA; makes
> 1G pages much more attractive (drops the page table size to 8k)).


So, I thought that if I turned overcommit off in Linux and tried
to fork from a large process with little available commit, the fork
would fail. (We're getting a little off topic, but I do not care.)

> (p.s. see 'posix_spawn').


posix_spawn solves the "COW" and fork memory problem, but posix_spawn
still has all of the same problems w.r.t. leaking resources to child
processes because its defined semantics are "as if fork followed by
exec".
 