Garbage collection

 
 
Tom Wright
 
      03-21-2007
Hi all

I suspect I may be missing something vital here, but Python's garbage
collection doesn't seem to work as I expect it to. Here's a small test
program which shows the problem on python 2.4 and 2.5:

$ python2.5
Python 2.5 (release25-maint, Dec 9 2006, 15:33:01)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>


(at this point, Python is using 15MB)

>>> a = range(int(1e7))
>>>


(at this point, Python is using 327MB)

>>> a = None
>>>


(at this point, Python is using 251MB)

>>> import gc
>>> gc.collect()

0
>>>


(at this point, Python is using 252MB)


Is there something I've forgotten to do? Why is Python still using such a
lot of memory?


Thanks!

--
I'm at CAMbridge, not SPAMbridge
 
Thinker
 
      03-21-2007
Tom Wright wrote:
> Hi all
>
> I suspect I may be missing something vital here, but Python's garbage
> collection doesn't seem to work as I expect it to. Here's a small test
> program which shows the problem on python 2.4 and 2.5:

................ skip .....................
> (at this point, Python is using 252MB)
>
>
> Is there something I've forgotten to do? Why is Python still using such a
> lot of memory?
>
>
> Thanks!
>

How do you know the amount of memory used by Python?
ps, top or something?

--
Thinker Li
http://heaven.branda.to/~thinker/GinGin_CGI.py

 
Tom Wright
 
      03-21-2007
Thinker wrote:
> How do you know amount of memory used by Python?
> ps, top or something?


$ ps up `pidof python2.5`
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
tew24    26275  0.0 11.9 257592 243988 pts/6    S+   13:10   0:00 python2.5
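
The VSZ and RSS columns can also be read from inside the interpreter. This is a minimal sketch, assuming Linux (it reads /proc/self/status, which other systems lack) and a modern Python 3. The helper names parse_status and current_memory_kb are made up for illustration:

```python
def parse_status(text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status text."""
    wanted = {"VmSize": "vsz_kb", "VmRSS": "rss_kb"}
    result = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Lines look like "VmRSS:     243988 kB"
            result[wanted[key]] = int(rest.split()[0])
    return result


def current_memory_kb():
    """Return this process's VSZ and RSS in kB (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_status(f.read())
```

Calling current_memory_kb() before and after each step of the test session should give the same kind of figures as ps, without leaving the prompt.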

"VSZ" is "Virtual Memory Size" (ie. total memory used by the application)
"RSS" is "Resident Set Size" (ie. non-swapped physical memory)


--
I'm at CAMbridge, not SPAMbridge
 
skip@pobox.com
 
      03-21-2007
Tom> I suspect I may be missing something vital here, but Python's
Tom> garbage collection doesn't seem to work as I expect it to. Here's
Tom> a small test program which shows the problem on python 2.4 and 2.5:

Tom> (at this point, Python is using 15MB)

>>> a = range(int(1e7))
>>> a = None
>>> import gc
>>> gc.collect()

0

Tom> (at this point, Python is using 252MB)

Tom> Is there something I've forgotten to do? Why is Python still using
Tom> such a lot of memory?

You haven't forgotten to do anything. Your attempts at freeing memory are
being thwarted (in part, at least) by Python's int free list. I believe the
int free list remains after the 10M individual ints' refcounts drop to zero.
The large storage for the list is grabbed in one gulp and thus mmap()d I
believe, so it is reclaimed by being munmap()d, hence the drop from 320+MB
to 250+MB.
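
To illustrate the free-list idea, here is a toy model in pure Python. It is not CPython's actual C implementation; the class name and the slot counts are invented. Freed slots are kept for reuse rather than handed back, so the footprint stays at its high-water mark:

```python
class ToyFreeList:
    """Toy allocator that recycles freed slots but never gives them back."""

    def __init__(self):
        self.free_slots = []      # slots available for reuse
        self.slots_from_os = 0    # total slots ever requested from the "OS"

    def allocate(self):
        if self.free_slots:
            return self.free_slots.pop()   # reuse: no new memory needed
        self.slots_from_os += 1            # grow the pool
        return self.slots_from_os - 1

    def release(self, slot):
        # The slot goes back on the free list, not back to the "OS".
        self.free_slots.append(slot)


pool = ToyFreeList()
slots = [pool.allocate() for _ in range(1000)]
for s in slots:
    pool.release(s)

print(pool.slots_from_os)      # 1000: footprint stays at its peak
print(len(pool.free_slots))    # 1000: every slot is idle but still held
```

A second burst of 1000 allocations would be served entirely from the free list, which is the speed benefit the design is after.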

I haven't looked at the int free list or obmalloc implementations in a while,
but if the free list does return any of its memory to the system it probably
just calls the free() library function. Whether or not the system actually
reclaims any memory from your process depends on the details of the
malloc/free implementation. That is, the behavior is outside Python's
control.
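
On Linux with glibc, the allocator can be asked explicitly to hand free heap pages back to the kernel with malloc_trim(). This is a hedged sketch using ctypes, assuming glibc is the C library in use (the helper name trim_heap is hypothetical). It cannot release memory that Python itself is still holding on a free list, and it may release nothing at all:

```python
import ctypes
import ctypes.util


def trim_heap():
    """Ask glibc to return free heap memory to the kernel.

    Returns True if some memory was released, False if not, and None
    when malloc_trim() is unavailable (for example, not glibc).
    """
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return None
    try:
        libc = ctypes.CDLL(libc_name)
        malloc_trim = libc.malloc_trim
    except (OSError, AttributeError):
        return None
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    # Argument 0: keep no extra padding at the top of the heap.
    return bool(malloc_trim(0))
```

Whether this changes the figures reported by ps depends on how fragmented the heap is, so treat it as an experiment rather than a fix.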

Skip
 
Thinker
 
      03-21-2007
Tom Wright wrote:
> Thinker wrote:
>> How do you know amount of memory used by Python? ps, top or
>> something?
>
> $ ps up `pidof python2.5`
> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
> tew24 26275 0.0 11.9 257592 243988 pts/6 S+ 13:10 0:00 python2.5
>
> "VSZ" is "Virtual Memory Size" (ie. total memory used by the
> application) "RSS" is "Resident Set Size" (ie. non-swapped physical
> memory)
>

That is the amount of memory allocated by the process, not the amount
in use by the Python interpreter. It is managed by the C library's
malloc(). When you free a block of memory with free(), it is only
returned to the C library for later reuse; the C library does not
always return the memory to the kernel.

Since a modern OS has virtual memory, inactive memory will be paged
out when more physical memory is needed. It doesn't hurt much if you
have enough swap space.

What you get from the ps command is the memory allocated by the process;
it doesn't mean that all of it is being used by the Python interpreter.

--
Thinker Li
http://heaven.branda.to/~thinker/GinGin_CGI.py

 
Tom Wright
 
      03-21-2007
skip@pobox.com wrote:
> You haven't forgotten to do anything. Your attempts at freeing memory are
> being thwarted (in part, at least) by Python's int free list. I believe
> the int free list remains after the 10M individual ints' refcounts drop to
> zero. The large storage for the list is grabbed in one gulp and thus
> mmap()d I believe, so it is reclaimed by being munmap()d, hence the drop
> from 320+MB to 250+MB.
>
> I haven't looked at the int free list or obmalloc implementations in
> awhile, but if the free list does return any of its memory to the system
> it probably just calls the free() library function. Whether or not the
> system actually reclaims any memory from your process is dependent on the
> details of the malloc/free implementation's details. That is, the behavior
> is outside Python's control.


Ah, thanks for explaining that. I'm a little wiser about memory allocation
now, but am still having problems reclaiming memory from unused objects
within Python. If I do the following:

>>>

(memory use: 15 MB)
>>> a = range(int(4e7))

(memory use: 1256 MB)
>>> a = None

(memory use: 953 MB)

...and then I allocate a lot of memory in another process (eg. open a load
of files in the GIMP), then the computer swaps the Python process out to
disk to free up the necessary space. Python's memory use is still reported
as 953 MB, even though nothing like that amount of space is needed. From
what you said above, the problem is in the underlying C libraries, but is
there anything I can do to get that memory back without closing Python?

--
I'm at CAMbridge, not SPAMbridge
 
skip@pobox.com
 
      03-21-2007

Tom> ...and then I allocate a lot of memory in another process (eg. open
Tom> a load of files in the GIMP), then the computer swaps the Python
Tom> process out to disk to free up the necessary space. Python's
Tom> memory use is still reported as 953 MB, even though nothing like
Tom> that amount of space is needed. From what you said above, the
Tom> problem is in the underlying C libraries, but is there anything I
Tom> can do to get that memory back without closing Python?

Not really. I suspect the unused pages of your Python process are paged
out, but that Python has just what it needs to keep going. Memory
contention would be a problem if your Python process wanted to keep that
memory active at the same time as you were running GIMP. I think the
process's resident size is more important here than virtual memory size (as
long as you don't exhaust swap space).

Skip
 
Tom Wright
 
      03-21-2007
skip@pobox.com wrote:
> Tom> ...and then I allocate a lot of memory in another process (eg. open
> Tom> a load of files in the GIMP), then the computer swaps the Python
> Tom> process out to disk to free up the necessary space. Python's
> Tom> memory use is still reported as 953 MB, even though nothing like
> Tom> that amount of space is needed. From what you said above, the
> Tom> problem is in the underlying C libraries, but is there anything I
> Tom> can do to get that memory back without closing Python?
>
> Not really. I suspect the unused pages of your Python process are paged
> out, but that Python has just what it needs to keep going.


Yes, that's what's happening.

> Memory contention would be a problem if your Python process wanted to keep
> that memory active at the same time as you were running GIMP.


True, but why does Python hang on to the memory at all? As I understand it,
it's keeping a big lump of memory on the int free list in order to make
future allocations of large numbers of integers faster. If that memory is
about to be paged out, then surely future allocations of integers will be
*slower*, as the system will have to:

1) page out something to make room for the new integers
2) page in the relevant chunk of the int free list
3) zero all of this memory and do any other formatting required by Python

If Python freed (most of) the memory when it had finished with it, then all
the system would have to do is:

1) page out something to make room for the new integers
2) zero all of this memory and do any other formatting required by Python

Surely Python should free the memory if it's not been used for a certain
amount of time (say a few seconds), as allocation times are not going to be
the limiting factor if it's gone unused for that long. Alternatively, it
could mark the memory as some sort of cache, so that if it needed to be
paged out, it would instead be de-allocated (thus saving the time taken to
page it back in again when it's next needed)


> I think the process's resident size is more important here than virtual
> memory size (as long as you don't exhaust swap space).


True in theory, but the computer does tend to go rather sluggish when paging
large amounts out to disk and back. Surely the use of virtual memory
should be avoided where possible, as it is so slow? This is especially
true when the contents of the blocks paged out to disk will never be read
again.


I've also tested similar situations on Python under Windows XP, and it shows
the same behaviour, so I think this is a Python and/or GCC/libc issue,
rather than an OS issue (assuming Python for linux and Python for windows
are both compiled with GCC).

--
I'm at CAMbridge, not SPAMbridge
 
Steve Holden
 
      03-21-2007
Tom Wright wrote:
> skip@pobox.com wrote:
>> Tom> ...and then I allocate a lot of memory in another process (eg. open
>> Tom> a load of files in the GIMP), then the computer swaps the Python
>> Tom> process out to disk to free up the necessary space. Python's
>> Tom> memory use is still reported as 953 MB, even though nothing like
>> Tom> that amount of space is needed. From what you said above, the
>> Tom> problem is in the underlying C libraries, but is there anything I
>> Tom> can do to get that memory back without closing Python?
>>
>> Not really. I suspect the unused pages of your Python process are paged
>> out, but that Python has just what it needs to keep going.

>
> Yes, that's what's happening.
>
>> Memory contention would be a problem if your Python process wanted to keep
>> that memory active at the same time as you were running GIMP.

>
> True, but why does Python hang on to the memory at all? As I understand it,
> it's keeping a big lump of memory on the int free list in order to make
> future allocations of large numbers of integers faster. If that memory is
> about to be paged out, then surely future allocations of integers will be
> *slower*, as the system will have to:
>
> 1) page out something to make room for the new integers
> 2) page in the relevant chunk of the int free list
> 3) zero all of this memory and do any other formatting required by Python
>
> If Python freed (most of) the memory when it had finished with it, then all
> the system would have to do is:
>
> 1) page out something to make room for the new integers
> 2) zero all of this memory and do any other formatting required by Python
>
> Surely Python should free the memory if it's not been used for a certain
> amount of time (say a few seconds), as allocation times are not going to be
> the limiting factor if it's gone unused for that long. Alternatively, it
> could mark the memory as some sort of cache, so that if it needed to be
> paged out, it would instead be de-allocated (thus saving the time taken to
> page it back in again when it's next needed)
>

Easy to say. How do you know the memory that's not in use is in a
contiguous block suitable for return to the operating system? I can
pretty much guarantee it won't be. CPython doesn't use a relocating
garbage collection scheme, so objects always stay at the same place in
the process's virtual memory unless they have to be grown to accommodate
additional data.
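
The point about contiguity can be made concrete with a toy model. The page size and object counts are invented, and real allocators are more complicated. The assumption is that memory can only go back to the OS a whole page at a time, and that a page holding even one live object is pinned because objects cannot be moved:

```python
def releasable_pages(live_slots, total_slots, slots_per_page):
    """Count pages that contain no live objects and so could be returned."""
    total_pages = (total_slots + slots_per_page - 1) // slots_per_page
    pinned = {slot // slots_per_page for slot in live_slots}
    return total_pages - len(pinned)


TOTAL = 10_000
PER_PAGE = 100

# Free everything: all 100 pages are empty and could be returned.
print(releasable_pages([], TOTAL, PER_PAGE))          # 100

# Keep just 1% of the objects, one per page: no page can be returned.
survivors = range(0, TOTAL, PER_PAGE)
print(releasable_pages(survivors, TOTAL, PER_PAGE))   # 0
```

So a process can be 99% garbage by object count and still be unable to shrink, which is why a relocating collector would be needed to compact the survivors first.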
>
>> I think the process's resident size is more important here than virtual
>> memory size (as long as you don't exhaust swap space).

>
> True in theory, but the computer does tend to go rather sluggish when paging
> large amounts out to disk and back. Surely the use of virtual memory
> should be avoided where possible, as it is so slow? This is especially
> true when the contents of the blocks paged out to disk will never be read
> again.
>

Right. So all we have to do is identify those portions of memory that
will never be read again and return them to the OS. That should be easy.
Not.
>
> I've also tested similar situations on Python under Windows XP, and it shows
> the same behaviour, so I think this is a Python and/or GCC/libc issue,
> rather than an OS issue (assuming Python for linux and Python for windows
> are both compiled with GCC).
>

It's probably a dynamic memory issue. Of course if you'd like to provide
a patch to switch it over to a relocating garbage collection scheme
we'll all await it with bated breath

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Recent Ramblings http://holdenweb.blogspot.com

 
Dennis Lee Bieber
 
      03-21-2007
On Wed, 21 Mar 2007 15:32:17 +0000, Tom Wright
declaimed the following in comp.lang.python:

>
> True, but why does Python hang on to the memory at all? As I understand it,
> it's keeping a big lump of memory on the int free list in order to make
> future allocations of large numbers of integers faster. If that memory is
> about to be paged out, then surely future allocations of integers will be
> *slower*, as the system will have to:
>

It may not just be that free list -- which on a machine with lots of
RAM may never be paged out anyway [mine (XP) currently shows: physical
memory total/available/system: 2095196/1355296/156900K, commit charge
total/limit/peak: 514940/3509272/697996K (limit includes page/swap file
of 1.5GB)] -- it could easily just be that the OS or runtime just
doesn't return memory to the OS until a process/executable image exits.
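
If memory really only goes back when a process exits, one workaround is to do the memory-hungry step in a short-lived child process and keep only the result. This is a minimal sketch with the standard subprocess module, assuming Python 3.7 or later; the function name and the summing workload are illustrative stand-ins for whatever needs the big list:

```python
import subprocess
import sys


def sum_in_child(n):
    """Build a large list in a child interpreter and return only its sum."""
    code = "import sys; print(sum(list(range(int(sys.argv[1])))))"
    result = subprocess.run(
        [sys.executable, "-c", code, str(n)],
        capture_output=True,
        text=True,
        check=True,
    )
    # The child has exited here, so the OS has reclaimed all of its memory.
    return int(result.stdout)
```

The parent interpreter never holds the big list, so its own footprint should stay close to where it started.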
--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/
 