RE: Populating a dictionary, fast [SOLVED SOLVED]
> > You can download the list of keys from here, it's 43M gzipped:
> > http://www.sendspace.com/file/9530i7
> >
> > and see it take about 45 minutes with this:
> >
> > $ cat cache-keys.py
> > #!/usr/bin/python
> > v = {}
> > for line in open('keys.txt'):
> >     v[long(line.strip())] = True
> >
>
> It takes about 20 seconds for me. It's possible it's related to
> int/long unification - try using Python 2.5. If you can't switch to
> 2.5, try using string keys instead of longs.

Yes, this was it. It ran *very* fast on Python v2.5.

Terribly on v2.4, v2.3.

(I thought I had already evaluated v2.5 but I see now that the server
with 2.5 on it invokes 2.3 for 'python'.)

Thanks!
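For anyone retracing the fix above, here is a minimal sketch, assuming Python 3 (where `long` has been merged into `int`) and a small synthetic key list standing in for keys.txt. It first reports which interpreter is actually running, then times dict population with integer keys and with string keys. The helper names `populate` and `timed_populate` are hypothetical, not from the original script.

```python
import sys
import time


def populate(lines, key_type=int):
    """Build a dict keyed by key_type(line.strip()) for each input line."""
    v = {}
    for line in lines:
        v[key_type(line.strip())] = True
    return v


def timed_populate(lines, key_type=int):
    """Return (dict, elapsed_seconds) for one population run."""
    start = time.perf_counter()
    v = populate(lines, key_type)
    return v, time.perf_counter() - start


if __name__ == "__main__":
    # Confirm which interpreter 'python' really invokes on this machine.
    print("interpreter:", sys.executable, sys.version.split()[0])

    # Synthetic stand-in for keys.txt (assumed data, not the original file).
    sample = ["%d\n" % (n * 7919) for n in range(200000)]
    for key_type in (int, str):
        v, elapsed = timed_populate(sample, key_type)
        print(key_type.__name__, len(v), "keys in %.3f s" % elapsed)
```

To test against a real key file, pass `open('keys.txt')` as `lines` instead of the synthetic list.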
Re: Populating a dictionary, fast [SOLVED SOLVED]
On Nov 12, 12:46 pm, "Michael Bacarella" <m...@gpshopper.com> wrote:
> > It takes about 20 seconds for me. It's possible it's related to
> > int/long unification - try using Python 2.5. If you can't switch to
> > 2.5, try using string keys instead of longs.
>
> Yes, this was it. It ran *very* fast on Python v2.5.

Um. Is this the take away from this thread? Longs as dictionary
keys are bad? Only for older versions of Python?

This could be a problem for people like me who build
lots of structures using seek values, which are longs, as done in
http://nucular.sourceforge.net and http://bplusdotnet.sourceforge.net
and elsewhere.

Someone please summarize.

-- Aaron Watters

===
http://www.xfeedme.com/nucular/pydis...=white%20trash
Re: Populating a dictionary, fast [SOLVED SOLVED]
Aaron Watters <aaron.watters@gmail.com> writes:
> On Nov 12, 12:46 pm, "Michael Bacarella" <m...@gpshopper.com> wrote:
>>
>> > It takes about 20 seconds for me. It's possible it's related to
>> > int/long unification - try using Python 2.5. If you can't switch to
>> > 2.5, try using string keys instead of longs.
>>
>> Yes, this was it. It ran *very* fast on Python v2.5.
>
> Um. Is this the take away from this thread? Longs as dictionary
> keys are bad? Only for older versions of Python?

It sounds like Python 2.4 (and previous versions) had a bug when
populating large dicts on 64-bit architectures.

> Someone please summarize.

Yes, that would be good.
Re: Populating a dictionary, fast [SOLVED SOLVED]
On Wed, 14 Nov 2007 18:16:25 +0100, Hrvoje Niksic wrote:
> Aaron Watters <aaron.watters@gmail.com> writes:
>
>> On Nov 12, 12:46 pm, "Michael Bacarella" <m...@gpshopper.com> wrote:
>>>
>>> > It takes about 20 seconds for me. It's possible it's related to
>>> > int/long unification - try using Python 2.5. If you can't switch to
>>> > 2.5, try using string keys instead of longs.
>>>
>>> Yes, this was it. It ran *very* fast on Python v2.5.
>>
>> Um. Is this the take away from this thread? Longs as dictionary keys
>> are bad? Only for older versions of Python?
>
> It sounds like Python 2.4 (and previous versions) had a bug when
> populating large dicts on 64-bit architectures.

No, I found very similar behaviour with Python 2.5.

>> Someone please summarize.
>
> Yes, that would be good.

On systems with multiple CPUs or 64-bit systems, or both, creating and/or
deleting a multi-megabyte dictionary in recent versions of Python (2.3,
2.4, 2.5 at least) takes a LONG time, of the order of 30+ minutes,
compared to seconds if the system only has a single CPU. Turning garbage
collection off doesn't help.

--
Steven.
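The claim that turning garbage collection off doesn't help can be checked directly. This is a rough sketch, assuming Python 3 and CPython's `gc` module. `build_and_delete` and `measure` are hypothetical helper names, and the key count is an arbitrary choice to be scaled to available memory.

```python
import gc
import time


def build_and_delete(n):
    """Create a dict with n integer keys, then delete it.

    Returns (create_seconds, delete_seconds).
    """
    start = time.perf_counter()
    d = {i: i for i in range(n)}
    created = time.perf_counter()
    del d  # drops the last reference, so CPython frees the dict here
    deleted = time.perf_counter()
    return created - start, deleted - created


def measure(n, use_gc):
    """Run build_and_delete with the cyclic collector enabled or disabled."""
    was_enabled = gc.isenabled()
    if use_gc:
        gc.enable()
    else:
        gc.disable()
    try:
        return build_and_delete(n)
    finally:
        # Restore the collector to whatever state it was in before.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()
```

Calling `measure(5_000_000, use_gc=True)` and `measure(5_000_000, use_gc=False)` on the same machine gives two pairs of timings to compare.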
Re: Populating a dictionary, fast [SOLVED SOLVED]
On Nov 14, 6:26 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> >> Someone please summarize.
>
> > Yes, that would be good.
>
> On systems with multiple CPUs or 64-bit systems, or both, creating and/or
> deleting a multi-megabyte dictionary in recent versions of Python (2.3,
> 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minutes,
> compared to seconds if the system only has a single CPU. Turning garbage
> collection off doesn't help.
>
> --
> Steven.

criminy... Any root cause? patch?

btw, I think I've seen this, but I think you need
to get into 10s of megs or more before it becomes
critical.

Note: I know someone will say "don't scare off the newbies"
but in my experience most Python programmers are highly
experienced professionals who need to know this sort of thing.
The bulk of the newbies are either off in VB land
or struggling with java.

-- Aaron Watters

===
http://www.xfeedme.com/nucular/pydis...EXT=silly+walk
Re: Populating a dictionary, fast [SOLVED SOLVED]
On Nov 14, 6:26 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On systems with multiple CPUs or 64-bit systems, or both, creating and/or
> deleting a multi-megabyte dictionary in recent versions of Python (2.3,
> 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minutes,
> compared to seconds if the system only has a single CPU. Turning garbage
> collection off doesn't help.

Fwiw, Testing on a 2 cpu 64 bit machine with 1gb real memory
I consistently run out of real memory before I see this
effect, so I guess it kicks in for dicts
that consume beyond that. That's better than I feared
at any rate...

-- Aaron Watters

===
http://www.xfeedme.com/nucular/pydis...+nasty+windows
Re: Populating a dictionary, fast [SOLVED SOLVED]
On Nov 14, 2007 5:26 PM, Steven D'Aprano <steve@remove-this-cybersource.com.au> wrote:
> On Wed, 14 Nov 2007 18:16:25 +0100, Hrvoje Niksic wrote:
>
> > Aaron Watters <aaron.watters@gmail.com> writes:
> >
> >> On Nov 12, 12:46 pm, "Michael Bacarella" <m...@gpshopper.com> wrote:
> >>>
> >>> > It takes about 20 seconds for me. It's possible it's related to
> >>> > int/long unification - try using Python 2.5. If you can't switch
> >>> > to 2.5, try using string keys instead of longs.
> >>>
> >>> Yes, this was it. It ran *very* fast on Python v2.5.
> >>
> >> Um. Is this the take away from this thread? Longs as dictionary keys
> >> are bad? Only for older versions of Python?
> >
> > It sounds like Python 2.4 (and previous versions) had a bug when
> > populating large dicts on 64-bit architectures.
>
> No, I found very similar behaviour with Python 2.5.
>
> >> Someone please summarize.
> >
> > Yes, that would be good.
>
> On systems with multiple CPUs or 64-bit systems, or both, creating and/or
> deleting a multi-megabyte dictionary in recent versions of Python (2.3,
> 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minutes,
> compared to seconds if the system only has a single CPU. Turning garbage
> collection off doesn't help.
>

I can't duplicate this in a dual CPU (64 bit, but running in 32 bit
mode with a 32 bit OS) system. I added keys to a dict until I ran out
of memory (a bit over 22 million keys) and deleting the dict took
about 8 seconds (with a stopwatch, so not very precise, but obviously
less than 30 minutes).

>>> d = {}
>>> idx = 0
>>> while idx < 1e10:
...     d[idx] = idx
...     idx += 1
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
MemoryError
>>> len(d)
22369622
>>> del d
Re: Populating a dictionary, fast [SOLVED SOLVED]
On Nov 14, 6:26 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On systems with multiple CPUs or 64-bit systems, or both, creating and/or
> deleting a multi-megabyte dictionary in recent versions of Python (2.3,
> 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minutes,
> compared to seconds if the system only has a single CPU.

Please don't propagate this nonsense. If you see this then the problem
exists between the chair and monitor.

There is nothing wrong with either creating or deleting dictionaries.

i.
Re: Populating a dictionary, fast [SOLVED SOLVED]
On Nov 15, 2:11 pm, Istvan Albert <istvan.alb...@gmail.com> wrote:
> There is nothing wrong with either creating or deleting
> dictionaries.

I suspect what happened is this: on 64 bit machines the data
structures for creating dictionaries are larger (because pointers take
twice as much space), so you run into memory contention issues sooner
than on 32 bit machines, for similar memory sizes. If there is
something deeper going on please correct me, I would very much like to
know.

-- Aaron Watters

===
http://www.xfeedme.com/nucular/pydis...T=alien+friend
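The pointer-size suspicion above is easy to inspect on a given build. This is a small sketch, assuming Python 3 on CPython; `pointer_bits` and `dict_container_bytes` are hypothetical helper names. Note that `sys.getsizeof` reports only the dict's own table, not the key and value objects it refers to, so the totals understate real memory use.

```python
import struct
import sys


def pointer_bits():
    """Width of a C pointer in this interpreter build, in bits."""
    return struct.calcsize("P") * 8


def dict_container_bytes(n):
    """Size of the dict's own table for n int keys (keys/values excluded)."""
    return sys.getsizeof(dict.fromkeys(range(n), True))


if __name__ == "__main__":
    print("pointer width:", pointer_bits(), "bits")
    for n in (1000, 100000, 1000000):
        print(n, "keys ->", dict_container_bytes(n), "bytes for the table alone")
```

Running the same script under a 32-bit and a 64-bit interpreter shows how much of the difference comes from pointer width alone.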
Re: Populating a dictionary, fast [SOLVED SOLVED]
Steven D'Aprano <steve@REMOVE-THIS-cybersource.com.au> writes:
>>> Someone please summarize.
>>
>> Yes, that would be good.
>
> On systems with multiple CPUs or 64-bit systems, or both, creating and/or
> deleting a multi-megabyte dictionary in recent versions of Python (2.3,
> 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minutes,
> compared to seconds if the system only has a single CPU.

Can you post minimal code that exhibits this behavior on Python 2.5.1?
The OP posted a lot of different versions, most of which worked just
fine for most people.