Velocity Reviews

Kim Petersen 07-10-2003 12:34 PM

Memory leak ??
 
Is this a memory leak, the malloc/free implementation, the GC kicking in
late, a known bug, or something else?

Using python-2.2.2-26 on RH9 (shrike) x86 -fully patched

The following program slowly eats up more and more memory when run on
large datasets... can anyone tell what the trouble is?

I've run it up to 240000 record sets so far, and it eats about 0.1% of my
memory per 1000 records (the exact amount doesn't really matter, does it?).

--
Med Venlig Hilsen / Regards

Kim Petersen - Kyborg A/S (Udvikling)
IT - Innovationshuset
Havneparken 2
7100 Vejle
Tlf. +4576408183 || Fax. +4576408188

#!/usr/bin/python
#
# Created: 13:32 10/07-2003 by Kim Petersen <kp@kyborg.dk>
#
# $Id$
from __future__ import generators
import gzip
import re

# Raw string so the regex backslashes reach re.compile() unchanged.
err1=re.compile(r"^'ERROR:\s+(.*?)' in '(.*)'\s*$")

def iterator(file):
   # Yield lines one at a time, reading about 1000 bytes' worth per call.
   while 1:
      buffer=file.readlines(1000)
      if not buffer:
         return   # end of file: stop the generator
      for line in buffer:
         yield line

def getrec(lines):
   # Collect lines up to the next blank line (or end of input).
   result=[]
   while 1:
      try:
         line=lines.next().rstrip()
      except StopIteration:
         break
      if not line: break
      result.append(line)
   if not result: return None
   # The last line of a record is the dataset, parsed with eval().
   (error,dataset)=(result[:-1],eval(result[-1]))
   error=''.join(error)[16:]
   return error,dataset

if __name__ == "__main__":
   import sys

   lines=iterator(gzip.open("error.txt.gz"))
   i=0
   while 1:
      if (i%1000)==0:
         # Progress counter, rewritten in place every 1000 records.
         sys.stdout.write("%-10.10d\r" % (i,))
         sys.stdout.flush()
      rec=getrec(lines)
      if not rec: break
      (errline,dataset)=rec
      if not err1.match(errline):
         # Report error lines that do not match the expected pattern.
         sys.stdout.write("%s\n" % (errline,))
         sys.stdout.write("%-10.10d\r" % (i,))
         sys.stdout.flush()
      i+=1
   sys.stdout.write("%-10.10d\n" % (i,))
   sys.stdout.flush()

# Local Variables:
# tab-width: 3
# py-indent-offset: 3
# End:



A.M. Kuchling 07-10-2003 02:42 PM

Re: Memory leak ??
 
On Thu, 10 Jul 2003 14:34:05 +0200,
Kim Petersen <kp@kyborg.dk> wrote:
> Using python-2.2.2-26 on RH9 (shrike) x86 -fully patched
>
> The following program slowly eats up more and more memory when run on
> large datasets... can anyone tell what the trouble is?


Your code uses eval(), which is pretty heavyweight because it has to
tokenize, parse, and then evaluate the string. There have been a few memory
leaks in eval(), and perhaps you're running into one of them. Try using
int() or float() to convert strings to numbers instead of eval. As a bonus,
your program will be faster and much more secure (could an attacker tweak
your logfiles so you end up eval()ing os.unlink('/etc/passwd')?).

In general, using eval() is almost always a mistake; few programs need to
take arbitrary expressions as input.
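
A minimal sketch of the int()/float() suggestion above, assuming each field is a plain decimal number. The helper name to_number is hypothetical and does not appear in the thread.

```python
def to_number(text):
    """Convert a numeric string to int or float without eval()."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        # Not an integer literal; float() raises ValueError for non-numbers.
        return float(text)


print(to_number("42"))
print(to_number("3.5"))
```

Anything that is not a number, such as a function call smuggled into the log, raises ValueError instead of being executed.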

--amk

Kim Petersen 07-11-2003 08:24 AM

Re: Memory leak ?? [resolved - thank you]
 
A.M. Kuchling wrote:
> On Thu, 10 Jul 2003 14:34:05 +0200,
> Kim Petersen <kp@kyborg.dk> wrote:
>
>>Using python-2.2.2-26 on RH9 (shrike) x86 -fully patched
>>
>>The following program slowly eats up more and more memory when run on
>>large datasets... can anyone tell what the trouble is?
>
> Your code uses eval(), which is pretty heavyweight because it has to
> tokenize, parse, and then evaluate the string. There have been a few memory
> leaks in eval(), and perhaps you're running into one of them. Try using
> int() or float() to convert strings to numbers instead of eval. As a bonus,
> your program will be faster and much more secure (could an attacker tweak
> your logfiles so you end up eval()ing os.unlink('/etc/passwd')?).


Thank you very much - it was eval().

This solved my trouble (calling get_list instead of eval). Is there a
more generic or efficient way to read a list/expression? (I know this
one will fail for some strings, for instance):

import traceback

def get_value(text):
   # Convert one item's text to None, a string, a complex, a float or an int.
   text=text.strip()
   if text.lower()=='none':
      return None
   elif text[0] in ['"',"'"]:
      return text[1:-1]   # quoted string: drop the quotes
   elif text[-1]=='j':
      return complex(text)
   elif '.' in text or 'e' in text.lower():
      return float(text)
   else:
      return int(text)

def get_list(text):
   # Parse "(a, b, c)" into a tuple or "[a, b, c]" into a list.
   try:
      text=text.strip()
      if text[0]=='(':
         robj=tuple
      else:
         robj=list
      items=text[1:-1].split(', ')
      return robj(map(get_value,items))
   except Exception:
      traceback.print_exc()
      print(text)
      return []
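
One more generic option, sketched here on the assumption that a later Python is available: ast.literal_eval (in the standard library since Python 2.6, so not in the 2.2.2 used in this thread) parses literal expressions such as strings, numbers, tuples, lists, dicts and None, and rejects anything else. The wrapper name parse_dataset and the empty-list fallback, which mirrors get_list, are assumptions made for illustration.

```python
import ast


def parse_dataset(text):
    """Parse a Python literal such as "(1, 'a', None)" without eval()."""
    try:
        return ast.literal_eval(text.strip())
    except (ValueError, SyntaxError):
        # Not a plain literal (for example a function call):
        # use the same fallback as get_list.
        return []


print(parse_dataset("(1, 2.5, 'a, b', None)"))
print(parse_dataset("os.unlink('/etc/passwd')"))
```

Unlike splitting on ', ', this handles quoted strings that themselves contain a comma, and a function call in the input is rejected, not run.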



