Velocity Reviews > Newsgroups > Programming > Python > urllib2 script slowing and stopping


urllib2 script slowing and stopping

 
 
Dantium
Guest
Posts: n/a
 
      10-11-2010
I have a small script that reads several CSV files in a directory and
puts the data in a DB using Django.

There are about 1.7 million records in 120 CSV files, I am running the
script on a VPS with about 512mb of memory python 2.6.5 on ubuntu
10.04.

The script gets slow and seems to lock up after about 870000 records.
Running top shows that the memory is all being used up by the python
process; is there some way I can improve this script?


class Command(BaseCommand):

    def handle(self, *args, **options):
        count = 0
        d = urllib2.urlopen(postcode_dir).read()
        postcodefiles = re.findall('<a href="(.*?\.csv)">', d)
        nprog = 0

        for n in range(nprog, len(postcodefiles)):
            fl = postcodefiles[n]
            print 'Processing %d %s ...' % (n, fl)
            s = urllib2.urlopen(postcode_dir + fl)
            c = csv.reader(s.readlines())
            for row in c:
                postcode = row[0]
                location = Point(map(float, row[10:12]))
                Postcode.objects.create(code=postcode,
                                        location=location)
                count += 1
                if count % 10000 == 0:
                    print "Imported %d" % count
            s.close()
            nprog = n + 1



Thanks

-Dan
 
 
 
 
 
Ian
Guest
Posts: n/a
 
      10-11-2010
On Oct 11, 2:48 pm, Dantium <(E-Mail Removed)> wrote:
> I have a small script that reads several CSV files in a directory and
> puts the data in a DB using Django.
>
> There are about 1.7 million records in 120 CSV files, I am running the
> script on a VPS with about 512mb of memory python 2.6.5 on ubuntu
> 10.04.
>
> The script gets slow and seems to lock up after about 870000 records.
> Running top shows that the memory is all being used up by the python
> process; is there some way I can improve this script?


Probably you have "DEBUG = True" in your Django settings.py file. In
debug mode, Django records every query that is executed in
django.db.connection.queries. To fix it, either disable debug mode or
just have your script go in and clear out that list from time to time.

HTH,
Ian
 
 
 
 
 
Dantium
Guest
Posts: n/a
 
      10-11-2010
On Oct 11, 10:07 pm, Ian <(E-Mail Removed)> wrote:
> On Oct 11, 2:48 pm, Dantium <(E-Mail Removed)> wrote:
>
> > I have a small script that reads several CSV files in a directory and
> > puts the data in a DB using Django.

>
> > There are about 1.7 million records in 120 CSV files, I am running the
> > script on a VPS with about 512mb of memory python 2.6.5 on ubuntu
> > 10.04.

>
> > The script gets slow and seems to lock up after about 870000 records.
> > Running top shows that the memory is all being used up by the python
> > process; is there some way I can improve this script?

>
> Probably you have "DEBUG = True" in your Django settings.py file. In
> debug mode, Django records every query that is executed in
> django.db.connection.queries. To fix it, either disable debug mode or
> just have your script go in and clear out that list from time to time.
>
> HTH,
> Ian


Yeah, thanks, that helped!

It was still running really low on memory by the end, but they
all got added.
 