multithreading app memory consumption

 
 
Roman Petrichev
10-22-2006
Hi folks.
I've just run into a very nasty memory consumption problem.
I have a multithreaded app with 150 threads that all run the same
function: through urllib2 it just fetches a web page's HTML and
assigns it to a local variable. On the next iteration the variable is
overwritten with the next page's code. At any moment the total size of
the variables containing page code is no more than 15MB (I've devised a
tricky way to measure this). But within the first 30 minutes all of the
system memory (512MB) is consumed and MemoryErrors start arising.
Why is that, and how can I keep the memory consumption within, say,
400MB? Maybe there is a memory leak somewhere?
Thnx

The test app code:


Q = Queue.Queue()
for i in rez: #rez length - 5000
    Q.put(i)


def checker():
    while True:
        try:
            url = Q.get()
        except Queue.Empty:
            break
        try:
            opener = urllib2.urlopen(url)
            data = opener.read()
            opener.close()
        except:
            sys.stderr.write('ERROR: %s\n' % traceback.format_exc())
            try:
                opener.close()
            except:
                pass
            continue
        print len(data)


for i in xrange(150):
    new_thread = threading.Thread(target=checker)
    new_thread.start()
 
Dennis Lee Bieber
10-23-2006
On Mon, 23 Oct 2006 03:31:28 +0400, Roman Petrichev
declaimed the following in comp.lang.python:

> Hi folks.
> I've just run into a very nasty memory consumption problem.
> I have a multithreaded app with 150 threads that all run the same
> function: through urllib2 it just fetches a web page's HTML and
> assigns it to a local variable.
[...]

How much stack space gets allocated for 150 threads?

>
> Q = Queue.Queue()
> for i in rez: #rez length - 5000


Can't be the "test code" as you don't show the imports or where
"rez" is defined.
--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/
 
Roman Petrichev
10-23-2006
Dennis Lee Bieber wrote:

> How much stack space gets allocated for 150 threads?

Actually I don't know. How can I find that out?

>> Q = Queue.Queue()
>> for i in rez: #rez length - 5000

>
> Can't be the "test code" as you don't show the imports or where
> "rez" is defined.

Isn't it clear that "rez" is just a list of 5000 URLs? I can't post it
here, but believe me, none of the pages are big - "at any moment the
total size of the variables containing page code is no more than 15MB".

Regards

 
Istvan Albert
10-23-2006
Roman Petrichev wrote:

>         try:
>             url = Q.get()
>         except Queue.Empty:
>             break


This code will never raise the Queue.Empty exception. Only a
non-blocking get does:

url = Q.get(block=False)

As mentioned before you should post working code if you expect people
to help.
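
For illustration, a minimal corrected loop might look like this (a rough
sketch only; the stand-in URLs are invented, and the urllib2 work from
the original post would replace the print):

import Queue

Q = Queue.Queue()
for u in ('http://example.com/a', 'http://example.com/b'):  # stand-in URLs
    Q.put(u)

def checker():
    while True:
        try:
            # block=False makes get() raise Queue.Empty once the queue is
            # drained, so the worker thread can actually exit.
            url = Q.get(block=False)
        except Queue.Empty:
            break
        print url  # fetch and measure here, as in the original checker()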

i.

 
Dennis Lee Bieber
10-23-2006
On Mon, 23 Oct 2006 12:07:47 +0400, Roman Petrichev
declaimed the following in comp.lang.python:


> Actually I don't know. How can I find that out?


Unfortunately I don't know of any utility for finding stack sizes --
it may be somewhere in the OS documentation (I'm sure there is a default
size somewhere). Though I don't expect 150 threads to use more than 1MB
total...

Even if these are using all physical memory, I'd just expect some of
the threads to start paging out to disk.
--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/
 
Neil Hodgson
10-23-2006
Roman Petrichev:
> Dennis Lee Bieber wrote:
>> How much stack space gets allocated for 150 threads?

> Actually I don't know. How can I find that out?


On Linux, each thread will often be allocated 10 megabytes of stack.
This can be viewed and altered with the ulimit command.
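
For what it's worth, the same limit can also be read from inside Python
via the resource module (a minimal sketch; POSIX only):

import resource

# Soft and hard stack limits in bytes - the value 'ulimit -s' reports in
# kilobytes.  resource.RLIM_INFINITY means unlimited.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print soft, hard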

Neil
 
Bryan Olson
10-24-2006
Roman Petrichev wrote:
> Hi folks.
> I've just run into a very nasty memory consumption problem.
> I have a multithreaded app with 150 threads

[...]
>
> The test app code:
>
>
> Q = Queue.Queue()
> for i in rez: #rez length - 5000
>     Q.put(i)
>
>
> def checker():
>     while True:
>         try:
>             url = Q.get()
>         except Queue.Empty:
>             break
>         try:
>             opener = urllib2.urlopen(url)
>             data = opener.read()
>             opener.close()
>         except:
>             sys.stderr.write('ERROR: %s\n' % traceback.format_exc())
>             try:
>                 opener.close()
>             except:
>                 pass
>             continue
>         print len(data)
>
>
> for i in xrange(150):
>     new_thread = threading.Thread(target=checker)
>     new_thread.start()


Don't know if this is the heart of your problem, but there's no
limit to how big "data" could be, after

data = opener.read()

Furthermore, you keep it until "data" gets overwritten the next
time through the loop. You might try restructuring checker() to
make data local to one iteration, as in:

def checker():
    while True:
        if not onecheck():    # stop once the queue has been drained
            break

def onecheck():
    try:
        url = Q.get()
    except Queue.Empty:
        return False
    try:
        opener = urllib2.urlopen(url)
        data = opener.read()          # page body is local to this call and
        opener.close()                # freed as soon as onecheck() returns
        print len(data)
    except:
        sys.stderr.write('ERROR: %s\n' % traceback.format_exc())
        try:
            opener.close()
        except:
            pass
    return True
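
A lighter-weight variant of the same idea (just a sketch, combined with
the non-blocking get() suggested earlier; it assumes the imports and the
Q queue from the original post) is to keep the single checker() function
but drop the reference to the page body before blocking on the next URL:

def checker():
    while True:
        try:
            url = Q.get(block=False)
        except Queue.Empty:
            break
        try:
            opener = urllib2.urlopen(url)
            data = opener.read()
            opener.close()
        except:
            sys.stderr.write('ERROR: %s\n' % traceback.format_exc())
            continue
        print len(data)
        del data  # release the page body so it can be collected right away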


--
--Bryan
 
Bryan Olson
10-24-2006
Dennis Lee Bieber wrote:
> How much stack space gets allocated for 150 threads?


In Python 2.5, each thread will be allocated

thread.stack_size()

bytes of stack address space. Note that address space is
not physical memory, nor even virtual memory. On modern
operating systems, the memory gets allocated as needed,
and 150 threads is not a problem.
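
As a concrete sketch (assuming Python 2.5 and the 150-thread setup from
the original post), the per-thread reservation can be set explicitly
before the threads are spawned:

import threading

# Ask for a 256 KiB stack for every thread created after this call;
# 0, the default, means "use the platform default".
threading.stack_size(256 * 1024)

for i in xrange(150):
    threading.Thread(target=checker).start()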


--
--Bryan
 
Dennis Lee Bieber
10-24-2006
On Mon, 23 Oct 2006 23:25:47 GMT, Neil Hodgson
declaimed the following in comp.lang.python:

>
> On Linux, each thread will often be allocated 10 megabytes of stack.
> This can be viewed and altered with the ulimit command.
>

If true, 151 (main + 150 threads) wants 1.5GB... Sounds a bit high
to me... 1MB each I could see...
--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/
 
Andrew MacIntyre
10-24-2006
Bryan Olson wrote:

> In Python 2.5, each thread will be allocated
>
> thread.stack_size()
>
> bytes of stack address space. Note that address space is
> not physical memory, nor even virtual memory. On modern
> operating systems, the memory gets allocated as needed,
> and 150 threads is not a problem.


Just a note that [thread|threading].stack_size() returns 0 to indicate
the platform default, and that value will always be returned unless an
explicit value has previously been set.

The Posix thread platforms (those that support programmatic setting of
this parameter) have the best support for sanity checking the requested
size - the value gets checked when actually set, rather than when the
thread creation is attempted.

The platform default thread stack sizes I can recall are:
    Windows: 1MB (though this may be affected by linker options)
    Linux:   1MB or 8MB depending on threading library and/or distro
    FreeBSD: 64kB
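
A short sketch of that behaviour (exact minimum sizes and error messages
are platform and build dependent):

import thread

# Reports 0 until an explicit size has been set.
print thread.stack_size()

try:
    # On the Posix thread platforms an invalid size (below the 32kB
    # minimum on current CPython builds) is rejected here, at set time,
    # rather than when a thread is created.
    thread.stack_size(16 * 1024)
except ValueError, e:
    print 'rejected:', e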

--
-------------------------------------------------------------------------
Andrew I MacIntyre "These thoughts are mine alone..."
Web: http://www.andymac.org/ | Australia
 