Velocity Reviews > Newsgroups > Programming > Python > memory error


memory error

 
 
Bart Nessux
Guest
 
      06-02-2004
def windows():
    import os
    excludes = ['hiberfil.sys', 'ipnathlp.dll', 'helpctr.exe', 'etc',
                'etc', 'etc']
    size_list = []
    for root, dirs, files in os.walk('/'):
        total = [x for x in files if x not in excludes]
        for t in total:
            s = file(os.path.join(root, t))
            size = s.read()
            size_list.append(size)
            s.close()

windows()

The above function crashes with a memory error on Windows XP Pro at the
'size = s.read()' line. Page File usage (normally ~120 MB) will rise to
300+ MB and pythonw.exe will consume about 200 MB of actual ram right
before the crash. The machine has 512 MB of ram and is doing nothing
else while running the script.

I've written the script several ways, all with the same result. I've
noticed that a binary read 'rb' consumes almost twice as much physical
memory and causes the crash to happen quicker, but that's about it.

Initially, I wanted to use Python to open every file on the system (that
could be opened), read its contents to get its size, append each size to
a list, and then sum the list. Basically, attempt to add up all of the
bytes on the machine's disk drive.

Any ideas on what I'm doing wrong or suggestions on how to do this
differently?

Thanks,

Bart
 
Heiko Wundram
Guest
 
      06-02-2004
On Wednesday, 2 June 2004 at 15:11, Bart Nessux wrote:
> size = s.read()


You read the complete content of the file here. size will not contain the
length of the file, but the complete file data. What you want is either
len(s.read()) (which is sloooooooooow) or, better, os.path.getsize().

> size_list.append(size)


This appends the complete file contents to the list, which should explain
the memory usage you're seeing...

HTH!

Heiko.
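Heiko's os.path.getsize() suggestion can be sketched like this (a minimal, modern-Python sketch; the walk root, the function name, and the skip-on-error handling are assumptions, not part of the original post):

```python
import os

def total_disk_usage(top):
    """Sum file sizes under `top` without reading any file contents."""
    total = 0
    for root, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            try:
                # getsize() reads the size from file metadata, so memory
                # usage stays flat no matter how large the files are.
                total += os.path.getsize(path)
            except OSError:
                pass  # locked or permission-denied files are skipped
    return total
```

Because nothing is ever read into memory, this avoids the crash entirely.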

 
Benjamin Niemann
Guest
 
      06-02-2004
Bart Nessux wrote:
> def windows():
>     import os
>     excludes = ['hiberfil.sys', 'ipnathlp.dll', 'helpctr.exe', 'etc',
>                 'etc', 'etc']
>     size_list = []
>     for root, dirs, files in os.walk('/'):
>         total = [x for x in files if x not in excludes]
>         for t in total:
>             s = file(os.path.join(root, t))
>             size = s.read()
>             size_list.append(size)
>             s.close()
>
> windows()
>
> The above function crashes with a memory error on Windows XP Pro at the
> 'size = s.read()' line. Page File usage (normally ~120 MB) will rise to
> 300+ MB and pythonw.exe will consume about 200 MB of actual ram right
> before the crash. The machine has 512 MB of ram and is doing nothing
> else while running the script.
>
> I've written the script several ways, all with the same result. I've
> noticed that a binary read 'rb' consumes almost twice as much physical
> memory and causes the crash to happen quicker, but that's about it.
>
> Initially, I wanted to use Python to open every file on the system (that
> could be opened) and read the contents so I'd know the size of the file
> and then add all of the reads to a list that I'd sum up. Basically
> attempt to add up all bytes on the machine's disk drive.
>
> Any ideas on what I'm doing wrong or suggestions on how to do this
> differently?

You're building a list containing the *contents* of all your files.
If you really need to use read(), use "size = len(s.read())", but this
still requires reading and holding one complete file at a time in memory
(and probably chokes when it stumbles over your divx collection).
I think using os.stat() would be better...
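Benjamin's os.stat() idea, as a minimal sketch (the function name is made up for illustration; st_size comes from the file's metadata, so the file is never opened or read):

```python
import os

def size_via_stat(path):
    """Return a file's size in bytes from its metadata, without opening it."""
    return os.stat(path).st_size
```

This is what os.path.getsize() does under the hood, so either spelling sidesteps the memory problem.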
 
 
fishboy
Guest
 
      06-02-2004
On Wed, 02 Jun 2004 09:11:13 -0400, Bart Nessux
<(E-Mail Removed)> wrote:

>def windows():
>    import os
>    excludes = ['hiberfil.sys', 'ipnathlp.dll', 'helpctr.exe', 'etc',
>                'etc', 'etc']
>    size_list = []
>    for root, dirs, files in os.walk('/'):
>        total = [x for x in files if x not in excludes]
>        for t in total:
>            s = file(os.path.join(root,t))
>            size = s.read()
>            size_list.append(size)
>            s.close()
>
>windows()


Yeah, what the other guys said about os.stat and os.path.getsize.
Also, if you really want to read the actual file into memory, just get
small chunks and add those up.

Like (untested):

numberofbytes = 0
CHUNKSIZE = 4096
for root, dirs, files in os.walk('/'):
    for name in files:
        if name not in excludes:
            f = file(os.path.join(root, name))
            while 1:
                s = f.read(CHUNKSIZE)
                if not s:
                    f.close()
                    break
                numberofbytes += len(s)

This way you never have more than 4k of data in memory at once
(well, it might be 8k; I don't know enough about the internals to tell
you when the previous 's' is garbage collected).
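The chunked-read loop above can be wrapped into a self-contained function (a sketch using the modern open() spelling rather than the old file() builtin; the function name and default chunk size are illustrative):

```python
def file_size_by_reading(path, chunksize=4096):
    """Sum a file's bytes by reading fixed-size chunks, so that at most
    one chunk is held in memory at a time."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunksize)
            if not chunk:  # empty bytes means end of file
                break
            total += len(chunk)
    return total
```

Reading the whole file just to measure it is still far slower than asking the filesystem via os.path.getsize(), but this keeps memory usage flat if you do need the bytes themselves.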

><{{{*>


 
 
 
 




