Will Python 3.0 remove the global interpreter lock (GIL)

 
 
Steven D'Aprano
 
      09-20-2007
On Wed, 19 Sep 2007 11:07:48 -0700, TheFlyingDutchman wrote:

> On Sep 19, 8:51 am, Steven D'Aprano <st...@REMOVE-THIS-
> cybersource.com.au> wrote:
>> On Tue, 18 Sep 2007 18:09:26 -0700, TheFlyingDutchman wrote:
>> > How much faster/slower would Greg Stein's code be on today's
>> > processors versus CPython running on the processors of the late
>> > 1990's?

>>
>> I think a better question is, how much faster/slower would Stein's code
>> be on today's processors, versus CPython being hand-simulated in a
>> giant virtual machine made of clockwork?
>>
>> --
>> Steven.

>
> Steven, You forgot this part:
>
> "And if you decide to answer, please add a true/false response to this
> statement - "CPython in the late 1990's ran too slow"'.



No, I ignored it, because it doesn't have a true/false response. It's a
malformed request. "Too slow" for what task? Compared to what
alternative? Fast and slow are not absolute terms, they are relative. A
sloth is "fast" compared to continental drift, but "slow" compared to the
space shuttle.

BUT even if we all agreed that CPython was (or wasn't) "too slow" in the
late 1990s, why on earth do you imagine that is important? It is no
longer the late 1990s, it is now 2007, and we are not using Python 1.4
any more.



--
Steven.
 
 
 
 
 
Steven D'Aprano
 
      09-20-2007
On Wed, 19 Sep 2007 15:59:59 -0700, TheFlyingDutchman wrote:

> Paul it's a pleasure to see that you are not entirely against
> complaints.


I'm not against complaints either, so long as they are well-thought out.
I've made a few of my own over the years, some of which may have been
less well-thought out than others.


> The very fastest Intel processor of the late 1990's that I found came
> out in October 1999 and had a speed around 783 MHz. Current fastest
> processors are something like 3.74 GHz, with larger caches. Memory is
> also faster and larger. It appears that someone running a non-GIL
> implementation of CPython today would have significantly faster
> performance than a GIL CPython implementation of the late 1990's.


That's an irrelevant comparison. It's a STUPID comparison. The two
alternatives aren't "non-GIL CPython on 2007 hardware" versus "GIL
CPython on 1999 hardware" because we aren't using GIL CPython on 1999
hardware, we're using it on 2007 hardware. *That's* the alternative to
the non-GIL CPython that you need to compare against.

Why turn your back on eight years of faster hardware? What's the point of
getting rid of the GIL unless it leads to faster code? "Get the speed and
performance of 1999 today!" doesn't seem much of a selling point in 2007.


> Correct me if I am wrong, but it seems that saying non-GIL CPython is
> too slow, while once valid, has become invalid due to the increase in
> computing power that has taken place.


You're wrong, because the finishing line has shifted -- performance we
were satisfied with in 1998 would be considered unbearable to work with
in 2007.

I remember in 1996 (give or take a year) being pleased that my new
computer allowed my Pascal compiler to compile a basic, bare-bones GUI
text editor in a mere two or four hours, because it used to take up to
half a day on my older computer. Now, I expect to compile a basic text
editor in minutes, not hours.

According to http://linuxreviews.org/gentoo/compiletimes/ the whole of
Openoffice-ximian takes around six hours to compile. Given the speed of
my 1996 computer, it would probably take six YEARS to compile something
of Openoffice's complexity.


As a purely academic exercise, we might concede that the non-GIL version
of CPython 1.5 running on a modern, dual-core CPU with lots of RAM will
be faster than CPython 2.5 running on an eight-year old CPU with minimal
RAM. But so what? That's of zero practical interest for anyone running
CPython 2.5 on a modern PC.

If you are running a 1999 PC, your best bet is to stick with the standard
CPython 1.5 including the GIL, because it is faster than the non-GIL
version.

If you are running a 2007 PC, your best bet is *still* to stick with the
standard CPython (version 2.5 now, not 1.5), because it will still be
faster than the non-GIL version (unless you have four or more processors,
and maybe not even then).

Otherwise, there's always Jython or IronPython.



--
Steven.
 
 
 
 
 
Paul Rubin
 
      09-20-2007
TheFlyingDutchman <(E-Mail Removed)> writes:
> The very fastest Intel processor of the late 1990's that I found came
> out in October 1999 and had a speed around 783 MHz. Current fastest
> processors are something like 3.74 GHz, with larger caches. Memory is
> also faster and larger. It appears that someone running a non-GIL
> implementation of CPython today would have significantly faster
> performance than a GIL CPython implementation of the late 1990's.
> Correct me if I am wrong, but it seems that saying non-GIL CPython is
> too slow, while once valid, has become invalid due to the increase in
> computing power that has taken place.


This reasoning is invalid. For one thing, disk and memory sizes and
network bandwidth have increased by a much larger factor than CPU speed
since the late 1990's. A big disk drive in 1999 was maybe 20 GB; today
it's 750 GB, almost 40x larger, way outstripping the 5x CPU MHz
increase. A fast business network connection was a 1.5 mbit/sec T-1
line; today it's often 100 mbit or more, again far outstripping CPU
MHz. If Python was just fast enough to firewall your T1 net
connection or index your 20 GB hard drive in 1999, it's way too slow to
do the same with today's net connections and hard drives, just because
of that change in the hardware landscape. We have just about stopped
seeing increases in CPU MHz: that 3.74 GHz speed was probably reached a
couple of years ago. We get CPU speed increases now through parallelism,
not MHz. Intel and AMD both have 4-core CPUs now and Intel has a
16-core chip coming. Python is at a serious disadvantage compared
with other languages if the other languages keep up with developments
and Python does not.

Also, Python in the late 90's was pitched as a "scripting language",
intended for small throwaway tasks, while today it's used for complex
applications, and the language has evolved accordingly. CPython is
way behind the times, not only because of the GIL, but because of its
slow bytecode interpreter, its non-compacting GC, etc. The platitude that
performance doesn't matter, that programmer time is more valuable than
machine time, etc. is at best an excuse for laziness. And more and
more often, in the application areas where Python is deployed, it's
just plain wrong. Take web servers: a big site like Google has
something like a half million of them. Even the comparatively wimpy
site where I work has a couple thousand. If each server uses 150
watts of power (plus air conditioning), then if making the software 2x
faster lets us shut down 1000 of them, the savings in electricity
bills alone is larger than my salary. Of course that doesn't include
environmental benefits, hardware and hosting costs, the costs and
headaches of administering that many boxes, etc. For a lot of Python
users, significant speedups are a huge win.
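The electricity arithmetic is easy to check. A back-of-envelope sketch (the $0.10/kWh price is my assumption; the other numbers are from the paragraph above):

```python
# Back-of-envelope check on the server-power claim.
servers = 1000                   # machines a 2x speedup lets us shut down
watts_each = 150 * 2             # 150 W per box, doubled for air conditioning
hours_per_year = 24 * 365
kwh_per_year = servers * watts_each * hours_per_year / 1000
dollars_per_year = kwh_per_year * 0.10   # assumed $0.10/kWh
print(round(dollars_per_year))   # 262800 -- roughly a quarter million a year
```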

However, I don't think fixing CPython (through GIL removal or anything
else) is the answer, and Jython isn't the answer either. Python's
future is in PyPy, or should be. Why would a self-respecting Python
implementation be written in (yikes) C or (yucch) Java, if it can be
written in Python? So I hope that PyPy's future directions include
true parallelism.
 
 
Steven D'Aprano
 
      09-20-2007
On Wed, 19 Sep 2007 19:14:39 -0700, Paul Rubin wrote:

> We get cpu speed increases now through parallelism, not mhz. Intel and
> AMD both have 4-core cpu's now and Intel has a 16-core chip coming.
> Python is at a serious disadvantage compared with other languages if the
> other languages keep up with developments and Python does not.


I think what you mean to say is that Python _will be_ at a serious
disadvantage if other languages keep up and Python doesn't. Python can't
be at a disadvantage _now_ because of what happens in the future.

Although, with the rapid take-up of multi-core CPUs, the future is
*really close*, so I welcome the earlier comment from Terry Reedy that
Guido has said he is willing to make changes to the CPython internals to
support multiprocessors, and that people have begun to investigate
practical methods of removing the GIL (as opposed to just bitching about
it for the sake of bitching).


> The platitude that performance doesn't matter


Who on earth says that? I've never heard anyone say that.

What I've heard people say is that _machine_ performance isn't the only
thing that needs to be maximized, or even the most important thing.
Otherwise we'd all be writing hand-optimized assembly language, and there
would be a waiting line of about five years to get access to the few
programmers capable of writing that hand-optimized assembly language.


> that programmer time is more valuable than machine time


Programmer time is more valuable than machine time in many cases,
especially when tasks are easily parallelisable across many machines.
That's why your "comparatively wimpy site" preferred to throw extra web
servers at the job of serving webpages rather than investing in smarter,
harder-working programmers to pull the last skerricks of performance out
of the hardware you already had.


> etc. is at best an excuse for laziness.


What are you doing about solving the problem? Apart from standing on the
side-lines calling out "Get yer lazy behinds movin', yer lazy bums!!!" at
the people who aren't even convinced there is a problem that needs
solving?


> And more and more often, in the
> application areas where Python is deployed, it's just plain wrong. Take
> web servers: a big site like Google has something like a half million of
> them. Even the comparatively wimpy site where I work has a couple
> thousand. If each server uses 150 watts of power (plus air
> conditioning), then if making the software 2x faster lets us shut down
> 1000 of them,


What on earth makes you think that would be anything more than a
temporary, VERY temporary, shutdown? My prediction is that the last of
the machines wouldn't have even been unplugged before management decided
that running twice as fast, or servicing twice as many people at the same
speed, is more important than saving on the electricity bill, and they'd
be plugged back in.


> the savings in electricity bills alone is larger than my
> salary. Of course that doesn't include environmental benefits, hardware
> and hosting costs, the costs and headaches of administering that many
> boxes, etc. For a lot of Python users, significant speedups are a huge
> win.


Oh, I wouldn't say "No thanks!" to a Python speed up. My newest PC has a
dual-core CPU (nothing cutting edge for me...) and while Python is faster
on it than it was on my old PC, it isn't twice as fast.

But Python speed ups don't come for free. For instance, I'd *really*
object if Python ran twice as fast for users with a quad-core CPU, but
twice as slow for users like me with only a dual-core CPU.

I'd also object if the cost of Python running twice as fast was for the
startup time to quadruple, because I already run a lot of small scripts
where the time to launch the interpreter is a significant fraction of the
total run time. If I wanted something like Java, that runs fast once it
is started but takes a LONG time to actually start, I know where to find
it.

I'd also object if the cost of Python running twice as fast was for Guido
and the rest of the Python-dev team to present me with their wages bill
for six months of development. I'm grateful that somebody is paying their
wages, but if I had to pay for it myself it wouldn't be done. It simply
isn't that important to me (and even if it was, I couldn't afford it).

Now there's a thought... given that Google:

(1) has lots of money;
(2) uses Python a lot;
(3) already employs both Guido and (I think...) Alex Martelli and
possibly other Python gurus;
(4) is not shy in investing in Open Source projects;
(5) and most importantly uses technologies that need to be used across
multiple processors and multiple machines

one wonders if Google's opinion of where core Python development needs to
go is the same as your opinion?



--
Steven.
 
 
TheFlyingDutchman
 
      09-20-2007
On Sep 19, 8:54 pm, Steven D'Aprano <st...@REMOVE-THIS-
cybersource.com.au> wrote:
> On Wed, 19 Sep 2007 19:14:39 -0700, Paul Rubin wrote:



>
> > etc. is at best an excuse for laziness.

>
> What are you doing about solving the problem? Apart from standing on the
> side-lines calling out "Get yer lazy behinds movin', yer lazy bums!!!" at
> the people who aren't even convinced there is a problem that needs
> solving?


He's trying to convince the developers that there is a problem. That
is not the same as your strawman argument.

>
> > And more and more often, in the
> > application areas where Python is deployed, it's just plain wrong. Take
> > web servers: a big site like Google has something like a half million of
> > them. Even the comparatively wimpy site where I work has a couple
> > thousand. If each server uses 150 watts of power (plus air
> > conditioning), then if making the software 2x faster lets us shut down
> > 1000 of them,

>
> What on earth makes you think that would be anything more than a
> temporary, VERY temporary, shutdown? My prediction is that the last of
> the machines wouldn't have even been unplugged before management decided
> that running twice as fast, or servicing twice as many people at the same
> speed, is more important than saving on the electricity bill, and they'd
> be plugged back in.
>

Plugging back in 1000 servers would be preferable to buying and
plugging in 2000 new servers, which is what would occur if the software
in this example had not been sped up 2x and management had still
desired a 2x speed-up in system performance, as you suggest.

 
 
TheFlyingDutchman
 
      09-20-2007
On Sep 19, 5:08 pm, "Terry Reedy" <(E-Mail Removed)> wrote:
> "Terry Reedy" <(E-Mail Removed)> wrote in message


This is a little confusing because Google Groups does not show the
original post you are replying to (it is not uncommon for them to lose
a post in a thread while still reflecting its existence in the
total-posts number that they display).


>
> This assumes that comparing versions of 1.5 is still relevant. As far as I
> know, his patch has not been maintained to apply against current Python.
> This tells me that no one to date really wants to dump the GIL at the cost
> of half Python's speed. Of course not. The point of dumping the GIL is to
> use multiprocessors to get more speed! So with two cores and extra
> overhead, Stein-patched 1.5 would not even break even.
>
> Quad (and more) cores are a different matter. Hence, I think, the
> resurgence of interest.


I am confused about the benefits/disadvantages of the "GIL removal".
Is it correct that the GIL is preventing CPython from having threads?

Is it correct that the only issue with the GIL is the prevention of
being able to do multi-threading?

If you only planned on writing single-threaded applications, would GIL
removal have no benefit?

Can threading have a performance benefit on a single-core machine
versus running multiple processes?
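To make those questions concrete, here is a minimal toy sketch (current CPython, nothing to do with Stein's patch) of what the GIL does and does not prevent: threads exist and run fine, CPU-bound threads just don't execute bytecode in parallel, while I/O-bound threads genuinely overlap their waits even on one core:

```python
import threading
import time

# CPU-bound work: under the GIL only one thread executes Python
# bytecode at a time, so two such threads give no speed-up on two cores.
def count(n):
    total = 0
    for i in range(n):
        total += i
    return total

results = {}

def worker(name, n):
    results[name] = count(n)

# The threads DO exist and DO run -- the GIL does not prevent
# threading, it only serialises bytecode execution.
t1 = threading.Thread(target=worker, args=("a", 100_000))
t2 = threading.Thread(target=worker, args=("b", 100_000))
t1.start(); t2.start()
t1.join(); t2.join()
print(results["a"] == results["b"])  # True

# I/O-bound work is different: sleeping (like a blocking socket read)
# releases the GIL, so ten threads sleeping 0.1s each finish together
# in roughly 0.1s, not 1s -- a real win even on a single-core machine.
start = time.time()
threads = [threading.Thread(target=time.sleep, args=(0.1,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
print(elapsed < 0.5)  # True: the threads overlapped their waits
```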

> So now this question for you: "CPython 2.5 runs too slow in 2007: true or
> false?"


I guess I gotta go with Steven D'Aprano - both true and false
depending on your situation.

> If you answer false, then there is no need for GIL removal.


OK, I see that.

> If you answer true, then cutting its speed for 90+% of people is bad.


OK, seems reasonable, assuming that multi-threading cannot be
implemented without a performance hit on single-threaded applications.
Is that a computer science maxim - that giving an interpreted language
multi-threading will always negatively impact the performance of
single-threaded applications?

>
> | Most people are not currently bothered by the GIL and would not want its
> | speed halved.
>
> And another question: why should such people spend time they do not have to
> make Python worse for themselves?
>

Saying they don't have time to make a change, any change, is always
valid in my book. I cannot argue against that. Ditto for them saying
they don't want to make a change with no explanation. But it seems if
they make statements about why a change is not good, then it is fair
to make a counter-argument. I do agree with the theme of Steven
D'Aprano's comments in that it should be a cordial counter-argument
and not a demand.


 
 
TheFlyingDutchman
 
      09-20-2007

On Sep 19, 5:08 pm, "Terry Reedy" <(E-Mail Removed)> wrote:
> "Terry Reedy" <(E-Mail Removed)> wrote in message
>


>
> This assumes that comparing versions of 1.5 is still relevant. As far as I
> know, his patch has not been maintained to apply against current Python.
> This tells me that no one to date really wants to dump the GIL at the cost
> of half Python's speed. Of course not. The point of dumping the GIL is to
> use multiprocessors to get more speed! So with two cores and extra
> overhead, Stein-patched 1.5 would not even break even.


Is the only point in getting rid of the GIL to allow multi-threaded
applications?

Can't multiple threads also provide a performance boost versus
multiple processes on a single-core machine?

>
> So now this question for you: "CPython 2.5 runs too slow in 2007: true or
> false?"


Ugh, I guess I have to agree with Steven D'Aprano - it depends.

>
> If you answer false, then there is no need for GIL removal.


OK, I can see that.

> If you answer true, then cutting its speed for 90+% of people is bad.


OK, have to agree. Sounds like it could be a good candidate for a
fork. One question - is it a computer science maxim that an
interpreter that implements multi-threading will always be slower when
running single-threaded apps?

>
> And another question: why should such people spend time they do not have to
> make Python worse for themselves?


I can't make an argument for someone doing something for free that
they don't have the time for. Ditto for doing something for free that
they don't want to do. But it does seem that if they give a reason for
why it's the wrong thing to do, it's fair to make a counter-argument.
Although I agree with Steven D'Aprano's theme in that it should be a
cordial rebuttal and not a demand.


 
 
Hendrik van Rooyen
 
      09-20-2007

"Steven D'Aprano" <steve@REMOVEau> wrote:

>
> I think a better question is, how much faster/slower would Stein's code
> be on today's processors, versus CPython being hand-simulated in a giant
> virtual machine made of clockwork?


This obviously depends on whether or not the clockwork is orange

- Hendrik

 
 
Bruno Desthuilliers
 
      09-20-2007
Ben Finney wrote:
(snip)
> One common response to that is "Processes are expensive on Win32". My
> response to that is that if you're programming on Win32 and expecting
> the application to scale well, you already have problems that must
> first be addressed that are far more fundamental than the GIL.


Lol ! +1 QOTW !

 
 
Paul Boddie
 
      09-20-2007
On 20 Sep, 00:59, TheFlyingDutchman <(E-Mail Removed)> wrote:
>
> Paul it's a pleasure to see that you are not entirely against
> complaints.


Well, it seems to me that I'm usually the one making them.

> The very fastest Intel processor of the late 1990's that I found came
> out in October 1999 and had a speed around 783 MHz. Current fastest
> processors are something like 3.74 GHz, with larger caches.


True, although you're paying silly money for a 3.8 GHz CPU with a
reasonable cache. However, as always, you can get something not too
far off for a reasonable sum. When I bought my CPU two or so years
ago, there was a substantial premium for as little as 200 MHz over the
3.0 GHz CPU I went for, and likewise a 3.4 GHz CPU can be had for
a reasonable price these days in comparison to the unit with an extra
400 MHz.

Returning to the subject under discussion, though, one big difference
between then and now is the availability of dual core CPUs, and these
seem to be fairly competitive on price with single cores, although the
frequencies of each core are lower and you have to decide whether you
believe the AMD marketing numbers: is a dual 2.2 GHz core CPU "4200+"
or not, for example? One can argue whether it's better to have two
cores, especially for certain kinds of applications (and CPython,
naturally), but if I were compiling lots of stuff, the ability to do a
"make -j2" and have a decent speed-up would almost certainly push me
in the direction of multicore units, especially if the CPU consumed
less power. And if anyone thinks all this parallelism is just
hypothetical, they should take a look at distcc to see a fairly clear
roadmap for certain kinds of workloads.

> Memory is also faster and larger. It appears that someone running a non-GIL
> implementation of CPython today would have significantly faster
> performance than a GIL CPython implementation of the late 1990's.
> Correct me if I am wrong, but it seems that saying non-GIL CPython is
> too slow, while once valid, has become invalid due to the increase in
> computing power that has taken place.


Although others have picked over these arguments, I can see what
you're getting at: even if we take a fair proportion of the increase
in computing power since the late 1990s, rather than 100% of it,
CPython without the GIL would still be faster and have more potential for
further speed increases in more parallel architectures, rather than
running as fast as possible on a "sequential" architecture where not
even obscene amounts of money will buy you significantly better
performance. But I don't think it's so interesting to consider this
situation as merely a case of removing the GIL and using lots of
threads.

Let us return to the case I referenced above: even across networks,
where the communications cost is significantly higher than that of
physical memory, distributed compilation can provide a good
performance curve. Now I'm not arguing that every computational task
can be distributed in such a way, but we can see that some
applications of parallelisation are mature, even mainstream. There are
also less extreme cases: various network services can be scaled up
relatively effectively by employing multiple processes, as is the UNIX
way; some kinds of computation can be done in separate processes and
the results collected later on - we do this with relational databases
all the time. So, we already know that monolithic multithreaded
processes are not the only answer. (Java put an emphasis on extensive
multithreading and sandboxing because of the requirements of running
different people's code side-by-side on embedded platforms with
relatively few operating system conveniences, as well as on Microsoft
Windows, of course.)

If the programmer cost in removing the GIL and maintaining a GIL-free
CPython ecosystem is too great, then perhaps it is less expensive to
employ other, already well-understood mechanisms instead. Of course,
there's no "global programmer lock", so everyone interested in doing
something about removing the GIL, writing their own Python
implementation, or whatever they see to be the solution can freely do
so without waiting for someone else to get round to it. Like those
more readily parallelisable applications mentioned above, more stuff
can get done provided that everyone doesn't decide to join the same
project. A lesson from the real world, indeed.

Paul

 