Will Python 3.0 remove the global interpreter lock (GIL)

 
 
Bruno Desthuilliers
 
      09-20-2007
TheFlyingDutchman wrote:
(snip)

> I am confused about the benefits/disadvantages of the "GIL removal".
> Is it correct that the GIL is preventing CPython from having threads?
>
> Is it correct that the only issue with the GIL is the prevention of
> being able to do multi-threading?


http://docs.python.org/lib/module-thread.html
http://docs.python.org/lib/module-threading.html

 
 
 
 
 
Chris Mellon
 
      09-20-2007
On 9/19/07, TheFlyingDutchman <(E-Mail Removed)> wrote:
> On Sep 19, 5:08 pm, "Terry Reedy" <(E-Mail Removed)> wrote:
> > "Terry Reedy" <(E-Mail Removed)> wrote in message

>
> This is a little confusing because google groups does not show your
> original post (not uncommon for them to lose a post in a thread - but
> somehow still reflect the fact that it exists in the total-posts
> number that they display) that you are replying to.
>
>
> >
> > This assumes that comparing versions of 1.5 is still relevant. As far as I
> > know, his patch has not been maintained to apply against current Python.
> > This tells me that no one to date really wants to dump the GIL at the cost
> > of half Python's speed. Of course not. The point of dumping the GIL is to
> > use multiprocessors to get more speed! So with two cores and extra
> > overhead, Stein-patched 1.5 would not even break even.
> >
> > Quad (and more) cores are a different matter. Hence, I think, the
> > resurgence of interest.

>
> I am confused about the benefits/disadvantages of the "GIL removal".
> Is it correct that the GIL is preventing CPython from having threads?
>


No. Python has threads, and they're wrappers around true OS level
system threads. What the GIL does is prevent *Python* code in those
threads from running concurrently.
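
To make that concrete, here is a minimal sketch (timings are illustrative, not from the thread): two CPU-bound jobs run no faster split across two threads than run back to back in one, because on CPython only one thread executes Python bytecode at a time.

import threading
import time

def count_down(n):
    # Pure-Python CPU-bound loop; the thread running it holds the GIL
    # except when the interpreter periodically switches threads.
    while n > 0:
        n -= 1

N = 10000000

# Run the work twice in a single thread.
start = time.time()
count_down(N)
count_down(N)
serial = time.time() - start

# Run the same work split across two threads.
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
start = time.time()
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.time() - start

print("serial:   %.2fs" % serial)
print("threaded: %.2fs" % threaded)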

> Is it correct that the only issue with the GIL is the prevention of
> being able to do multi-threading?
>


This sentence doesn't parse in a way that makes sense.

> If you only planned on writing single-threaded applications would GIL-
> removal have no benefit?
>


Yes.

> Can threading have a performance benefit on a single-core machine
> versus running multiple processes?
>


A simple question with a complicated answer. With the qualifier "can",
I have to say yes to be honest, although you will only see absolute
performance increases on a single core from special-purpose APIs that
call into C code anyway - and the GIL doesn't affect those, so GIL
removal won't have an effect on the scalability of those operations.

Pure CPU bound threads (all pure Python code) will not increase
performance on a single core (there's CPU level concurrency that can,
but not OS level threads). You can improve *perceived* performance
this way (latency at the expense of throughput), but not raw
performance.

Very, very few operations are CPU bound these days, and even fewer of
the ones where Python is involved. The largest benefit to the desktop
user of multiple cores is the increase in cross-process performance
(multitasking), not in single applications.

Servers vary more widely. However, in general, there's not a huge
benefit to faster threading when you can use multiple processes
instead. Python is not especially fast in terms of pure CPU time, so
if you're CPU bound anyway, moving your CPU-bound code into C (or
something else) is likely to reap far more benefits - and it sidesteps
the GIL in the process.
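
As a sketch of the "multiple processes instead" route: the multiprocessing module (added to the standard library in Python 2.6, so after this thread was written) runs each worker in its own interpreter, each with its own GIL.

# Processes instead of threads: each worker is a separate interpreter
# with its own GIL, so the two jobs can genuinely use two cores at once.
from multiprocessing import Pool

def count_down(n):
    while n > 0:
        n -= 1
    return n

if __name__ == "__main__":
    pool = Pool(processes=2)
    results = pool.map(count_down, [10000000, 10000000])
    pool.close()
    pool.join()
    print(results)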

In short, I think any problem that would be directly addressed by
removing the GIL is better addressed by other solutions.

> > So now this question for you: "CPython 2.5 runs too slow in 2007: true or
> > false?"

>
> I guess I gotta go with Steven D'Aprano - both true and false
> depending on your situation.
>
> > If you answer false, then there is no need for GIL removal.

>
> OK, I see that.
>
> > If you answer true, then cutting its speed for 90+% of people is bad.

>
> OK, seems reasonable, assuming that multi-threading cannot be
> implemented without a performance hit on single-threaded applications.
> Is that a computer science maxim - giving an interpreted language
> multi-threading will always negatively impact the performance of
> single-threaded applications?
>


It's not a maxim, per se - it's possible to have lockless concurrency,
although when you do this it's more like the shared nothing process
approach - but in general, yes. The cost of threading is the cost of
the locking needed to ensure safety, and the amount of locking is
proportional to the amount of shared state. Most of the common uses of
threading in the real world do not improve absolute performance and
won't no matter how many cores you use.
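
To make the "locking is proportional to shared state" point concrete, a small illustrative sketch: a counter shared across threads has to be locked on every update, so the safety cost scales with how often the shared state is touched.

import threading

counter = 0                      # shared, mutable state
counter_lock = threading.Lock()  # the price of sharing it

def add_many(n):
    global counter
    for _ in range(n):
        # Every update of the shared state pays for acquiring and
        # releasing the lock; more sharing means more locking.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(100000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 400000 - correct, but the locking is pure overhead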

> >
> > | Most people are not currently bothered by the GIL and would not want its
> > | speed halved.
> >
> > And another question: why should such people spend time they do not have to
> > make Python worse for themselves?
> >

> Saying they don't have time to make a change, any change, is always
> valid in my book. I cannot argue against that. Ditto for them saying
> they don't want to make a change with no explanation. But it seems if
> they make statements about why a change is not good, then it is fair
> to make a counter-argument. I do agree with the theme of Steven
> D'Aprano's comments in that it should be a cordial counter-argument
> and not a demand.
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>

 
 
 
 
 
Grant Edwards
 
      09-20-2007
On 2007-09-20, TheFlyingDutchman <(E-Mail Removed)> wrote:

> Is the only point in getting rid of the GIL to allow multi-threaded
> applications?


That's the main point.

> Can't multiple threads also provide a performance boost versus
> multiple processes on a single-core machine?


That depends on the algorithm, the code, and the
synchronization requirements.

> OK, have to agree. Sounds like it could be a good candidate
> for a fork. One question - is it a computer science maxim that
> an interpreter that implements multi-threading will always be
> slower when running single threaded apps?


I presume you're referring to Amdahl's law.

http://en.wikipedia.org/wiki/Amdahl's_law

Remember there are reasons other than speed on a
multi-processor platform for wanting to do multi-threading.
Sometimes it just maps onto the application better than
a single-threaded solution.
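
For reference, Amdahl's law says that if a fraction p of a program can run in parallel on n processors, the overall speedup is 1 / ((1 - p) + p / n). A quick illustrative calculation:

def amdahl_speedup(p, n):
    # p: fraction of the program that can be parallelised
    # n: number of processors
    return 1.0 / ((1.0 - p) + p / float(n))

# A program that is only 50% parallelisable can never run more than
# 2x faster, no matter how many cores you throw at it.
for n in (2, 4, 16, 1000):
    print("%4d cores -> %.2fx" % (n, amdahl_speedup(0.5, n)))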

--
Grant Edwards                   grante at visi.com
Yow! I want you to MEMORIZE the collected poems of
EDNA ST VINCENT MILLAY ... BACKWARDS!!
 
 
Paul Rubin
 
      09-20-2007
Steven D'Aprano <(E-Mail Removed)> writes:
> That's why your "comparatively wimpy site" preferred to throw extra web
> servers at the job of serving webpages rather than investing in smarter,
> harder-working programmers to pull the last skerricks of performance out
> of the hardware you already had.


The compute intensive stuff (image rendering and crunching) has
already had most of those skerricks pulled out. It is written in C
and assembler (not by us). Only a small part of our stuff is written
in Python: it just happens to be the part I'm involved with.

> But Python speed ups don't come for free. For instance, I'd *really*
> object if Python ran twice as fast for users with a quad-core CPU, but
> twice as slow for users like me with only a dual-core CPU.


Hmm. Well if the tradeoff were selectable at python configuration
time, then this option would certainly be worth doing. You might not
have a 4-core cpu today but you WILL have one soon.

> What on earth makes you think that would be anything more than a
> temporary, VERY temporary, shutdown? My prediction is that the last of
> the machines wouldn't have even been unplugged


Of course that example was a reductio ad absurdum. In reality they'd
use the speedup to compute 2x as much stuff, rather than ever powering
any servers down. Getting the extra computation is more valuable than
saving the electricity. It's just easier to put a dollar value on
electricity than on computation in an example like this. It's also
the case for our specific site that our server cluster is in large
part a disk farm and not just a compute farm, so even if we sped up
the software infinitely we'd still need a lot of boxes to bolt the
disks into and keep them spinning.

> Now there's a thought... given that Google:
>
> (1) has lots of money;
> (2) uses Python a lot;
> (3) already employs both Guido and (I think...) Alex Martelli and
> possibly other Python gurus;
> (4) is not shy in investing in Open Source projects;
> (5) and most importantly uses technologies that need to be used across
> multiple processors and multiple machines
>
> one wonders if Google's opinion of where core Python development needs to
> go is the same as your opinion?


I think Google's approach has been to do cpu-intensive tasks in other
languages, primarily C++. It would still be great if they put some
funding into PyPy development, since I think I saw something about the
EU funding being interrupted.
 
 
Paul Rubin
 
      09-20-2007
"Chris Mellon" <(E-Mail Removed)> writes:
> No. Python has threads, and they're wrappers around true OS level
> system threads. What the GIL does is prevent *Python* code in those
> threads from running concurrently.


Well, C libraries can release the GIL if they are written for thread
safety, but as far as I know, most don't release it. For example I
don't think cElementTree releases the GIL, and it's a huge CPU
consumer in some of the code I run, despite being written in C pretty
carefully. Also, many of the most basic builtin types (such as dicts)
are implemented in C and don't release the GIL.
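
One way to check empirically whether a particular C-level call releases the GIL is to time it serially and in threads. The sketch below uses zlib.compress as the candidate (I believe it releases the GIL around the C compression loop, but the same test works for any call you suspect):

# Compare serial and threaded wall-clock time on a multi-core machine.
import threading
import time
import zlib

data = b"x" * (16 * 1024 * 1024)

def work():
    for _ in range(10):
        zlib.compress(data)

start = time.time()
work(); work()
serial = time.time() - start

t1 = threading.Thread(target=work)
t2 = threading.Thread(target=work)
start = time.time()
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.time() - start

print("serial:   %.2fs" % serial)
print("threaded: %.2fs" % threaded)
# Roughly half the serial time means the call releases the GIL while
# the C code runs; roughly the same time means it holds the GIL.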

> Very, very few operations are CPU bound these days, and even fewer of
> the ones where Python is involved. The largest benefit to the desktop
> user of multiple cores is the increase in cross-process performance
> (multitasking), not in single applications.


If you add up all the CPU cycles being used by Python everywhere,
I wonder how many of them are on desktops and how many are on servers.

> Python is not especially fast in terms of pure CPU time, so
> if you're CPU bound anyway, moving your CPU-bound code into C (or
> something else) is likely to reap far more benefits - and it sidesteps
> the GIL in the process.


If moving code into C is so easy, why not move all the code there
instead of just the CPU-bound code? Really, coding in C adds a huge
cost in complexity and unreliability. Python makes life a lot better
for developers, and so reimplementing Python code in C should be seen
as a difficult desperation measure rather than an easy way to get
speedups. Therefore, Python's slowness is a serious weakness and not
just a wart with an easy workaround.

> In short, I think any problem that would be directly addressed by
> removing the GIL is better addressed by other solutions.


It does sound like removing the GIL from CPython would have very high
costs in more than one area. Is my hope that Python will transition
from CPython to PyPy overoptimistic?
 
 
Chris Mellon
 
      09-20-2007
On 20 Sep 2007 07:43:18 -0700, Paul Rubin
<"http://phr.cx"@nospam.invalid> wrote:
> Steven D'Aprano <(E-Mail Removed)> writes:
> > That's why your "comparatively wimpy site" preferred to throw extra web
> > servers at the job of serving webpages rather than investing in smarter,
> > harder-working programmers to pull the last skerricks of performance out
> > of the hardware you already had.

>
> The compute intensive stuff (image rendering and crunching) has
> already had most of those skerricks pulled out. It is written in C
> and assembler (not by us). Only a small part of our stuff is written
> in Python: it just happens to be the part I'm involved with.
>


That means that this part is also unaffected by the GIL.

> > But Python speed ups don't come for free. For instance, I'd *really*
> > object if Python ran twice as fast for users with a quad-core CPU, but
> > twice as slow for users like me with only a dual-core CPU.

>
> Hmm. Well if the tradeoff were selectable at python configuration
> time, then this option would certainly be worth doing. You might not
> have a 4-core cpu today but you WILL have one soon.
>
> > What on earth makes you think that would be anything more than a
> > temporary, VERY temporary, shutdown? My prediction is that the last of
> > the machines wouldn't have even been unplugged

>
> Of course that example was a reductio ad absurdum. In reality they'd
> use the speedup to compute 2x as much stuff, rather than ever powering
> any servers down. Getting the extra computation is more valuable than
> saving the electricity. It's just easier to put a dollar value on
> electricity than on computation in an example like this. It's also
> the case for our specific site that our server cluster is in large
> part a disk farm and not just a compute farm, so even if we sped up
> the software infinitely we'd still need a lot of boxes to bolt the
> disks into and keep them spinning.
>


I think this is instructive, because it's pretty typical of GIL
complaints. Someone gives an example where the GIL is limiting, but
upon inspection it turns out that the actual bottleneck is elsewhere,
that the GIL is being sidestepped anyway, and that the supposed
benefits of removing the GIL wouldn't materialize because the problem
space isn't really as described.

> > Now there's a thought... given that Google:
> >
> > (1) has lots of money;
> > (2) uses Python a lot;
> > (3) already employs both Guido and (I think...) Alex Martelli and
> > possibly other Python gurus;
> > (4) is not shy in investing in Open Source projects;
> > (5) and most importantly uses technologies that need to be used across
> > multiple processors and multiple machines
> >
> > one wonders if Google's opinion of where core Python development needs to
> > go is the same as your opinion?

>
> I think Google's approach has been to do cpu-intensive tasks in other
> languages, primarily C++. It would still be great if they put some
> funding into PyPy development, since I think I saw something about the
> EU funding being interrupted.
> --


At the really high levels of scalability, such as across a server
farm, threading is useless. The entire point of threads, rather than
processes, is that you've got shared, mutable state. A shared nothing
process (or Actor, if you will) model is the only one that makes sense
if you really want to scale because it's the only one that allows you
to distribute over machines. The fact that it also scales very well
over multiple cores (better than threads, in many cases) is just
gravy.

The only hard example I've seen given of the GIL actually limiting
scalability is on single server, high volume Django sites, and I don't
think that the architecture of those sites is very scalable anyway.
 
 
Paul Rubin
 
      09-20-2007
"Chris Mellon" <(E-Mail Removed)> writes:
> > The compute intensive stuff (image rendering and crunching) has
> > already had most of those skerricks pulled out. It is written in C
> > and assembler

> That means that this part is also unaffected by the GIL.


Right, it was a counterexample against the "speed doesn't matter"
meme, not specifically against the GIL. And that code is fast because
someone undertook comparatively enormous effort to code it in messy,
unsafe languages instead of Python, because Python is so slow.

> At the really high levels of scalability, such as across a server
> farm, threading is useless. The entire point of threads, rather than
> processes, is that you've got shared, mutable state. A shared nothing
> process (or Actor, if you will) model is the only one that makes sense
> if you really want to scale because it's the only one that allows you
> to distribute over machines. The fact that it also scales very well
> over multiple cores (better than threads, in many cases) is just
> gravy.


In reality you want to organize the problem so that memory intensive
stuff is kept local, and that's where you want threads, to avoid the
communications costs of serializing stuff between processes, either
between boxes or between cores. If communications costs could be
ignored there would be no need for gigabytes of ram in computers.
We'd just use disks for everything. As it is, we use tons of ram,
most of which is usually twiddling its thumbs doing nothing (as DJ
Bernstein put it) because the cpu isn't addressing it at that instant.
The memory just sits there waiting for the cpu to access it. We
actually can get better-than-linear speedups by designing the hardware
to avoid this. See:
http://cr.yp.to/snuffle/bruteforce-20050425.pdf
for an example.

> The only hard example I've seen given of the GIL actually limiting
> scalability is on single server, high volume Django sites, and I don't
> think that the architecture of those sites is very scalable anyway.


The stuff I'm doing now happens to work ok with multiple processes but
would have been easier to write with threads.
 
 
Terry Reedy
 
      09-20-2007

"Paul Rubin" <"http://phr.cx"@NOSPAM.invalid> wrote in message
news:(E-Mail Removed)...
| funding into PyPy development, since I think I saw something about the
| EU funding being interrupted.

As far as I know, the project was completed and promised funds paid. But I
don't know of any major follow-on funding, which I am sure they could use.



 
 
Terry Reedy
 
      09-20-2007

"Paul Rubin" <"http://phr.cx"@NOSPAM.invalid> wrote in message
news:(E-Mail Removed)...
| It does sound like removing the GIL from CPython would have very high
| costs in more than one area. Is my hope that Python will transition
| from CPython to PyPy overoptimistic?

I presume you mean 'will the leading edge reference version transition...
Or more plainly, "will Guido switch to PyPy for further development of
Python?" I once thought so, but 1) Google sped the arrival of Py3.0 by
hiring Guido with a major chunk of time devoted to Python development, so
he started before PyPy was even remotely ready (and it still is not); and
2) PyPy did not focus only or specifically on being a CPython replacement
but became an umbrella for a variety of experiments (including, for
instance, a Scheme frontend).



 