
Re: Questions about GIL and web services from a n00b

 
 
Chris H
 
      04-15-2011
On 4/15/11 1:03 PM, Tim Wintle wrote:
> On Fri, 2011-04-15 at 12:33 -0400, Chris H wrote:
>> 1. Are you sure you want to use python because threading is not good
>> due to the Global Interpreter Lock (GIL)? Is this really an issue for
>> multi-threaded web services as seems to be indicated by the articles
>> from a Google search? If not, how do you avoid this issue in a
>> multi-threaded process to take advantage of all the CPU cores
>> available?
>
> Is the limiting factor CPU?
>
> If it isn't (i.e. you're blocking on IO to/from a web service) then the
> GIL won't get in your way.
>
> If it is, then run as many parallel *processes* as you have cores/CPUs
> (assuming you're designing an application that can have multiple
> instances running in parallel so that you can run over multiple servers
> anyway).
>
> Tim Wintle


Great question. At this point there isn't a limiting factor, but yes,
the concern is that CPU will become one in the future, with lots of
threads handling many simultaneous transactions.

Chris
 
 
 
 
 
Raymond Hettinger
 
      04-16-2011
> > Is the limiting factor CPU?
> >
> > If it isn't (i.e. you're blocking on IO to/from a web service) then the
> > GIL won't get in your way.
> >
> > If it is, then run as many parallel *processes* as you have cores/CPUs
> > (assuming you're designing an application that can have multiple
> > instances running in parallel so that you can run over multiple servers
> > anyway).
>
> Great question. At this point, there isn't a limiting factor, but yes
> the concern is around CPU in the future with lots of threads handling
> many simultaneous transactions.


In the Python world, the usual solution to high transaction loads is
to use event-driven processing (using an async library such as
Twisted) rather than multi-threading, which doesn't scale well in
any language.
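
A minimal sketch of the event-driven style, assuming Python 3's standard asyncio module in place of Twisted (the idea is the same, the API is not). The names handle_request and serve_all are made up for illustration, and the sleep stands in for real network I/O:

```python
import asyncio


async def handle_request(request_id, delay):
    # Simulated I/O wait (e.g. a database or upstream web service call).
    # While this coroutine waits, the event loop runs the others.
    await asyncio.sleep(delay)
    return f"request {request_id} done"


async def serve_all(n_requests, delay=0.05):
    # All requests wait concurrently on a single thread.
    tasks = [handle_request(i, delay) for i in range(n_requests)]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(serve_all(100))
    print(len(results), "requests handled on a single thread")
```

The point of the design is that a waiting request costs one small coroutine object rather than one operating-system thread.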

Also, the usual way to take advantage of multiple cores is to run
multiple Python interpreters in separate processes.
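
One way to sketch this on a single machine is the standard multiprocessing module. The function names and the prime-counting workload below are hypothetical stand-ins for whatever CPU-bound work the service does:

```python
import multiprocessing


def count_primes(limit):
    """CPU-bound work: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, processes=None):
    # processes=None lets Pool default to the number of CPUs.
    # Each worker is a separate interpreter process with its own GIL.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    print(count_primes_parallel([20000] * 4))
```

The same shape carries over to several server processes behind a load balancer; the pool is just the smallest version of it.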

Threading is really only an answer if you need to share data between
threads, have only limited scaling needs, and are I/O bound
rather than CPU bound.
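
A small sketch of that case: a handful of threads doing blocking I/O and writing into one shared dictionary. The names fetch and fetch_all are illustrative, and time.sleep stands in for a blocking network call (CPython releases the GIL while a thread sleeps or blocks on I/O):

```python
import threading
import time


def fetch(name, results, lock, delay=0.01):
    time.sleep(delay)  # simulated blocking I/O
    with lock:
        # Shared data lives in one process, so a lock is enough
        # to coordinate the writers.
        results[name] = len(name)


def fetch_all(names):
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=fetch, args=(name, results, lock))
        for name in names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(fetch_all(["alpha", "beta", "gamma"]))
```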


Raymond

 
 
 
 
 
David Cournapeau
 
      04-16-2011
On Sat, Apr 16, 2011 at 10:05 AM, Raymond Hettinger <(E-Mail Removed)> wrote:
>> > Is the limiting factor CPU?
>> >
>> > If it isn't (i.e. you're blocking on IO to/from a web service) then the
>> > GIL won't get in your way.
>> >
>> > If it is, then run as many parallel *processes* as you have cores/CPUs
>> > (assuming you're designing an application that can have multiple
>> > instances running in parallel so that you can run over multiple servers
>> > anyway).
>>
>> Great question. At this point, there isn't a limiting factor, but yes
>> the concern is around CPU in the future with lots of threads handling
>> many simultaneous transactions.
>
> In the Python world, the usual solution to high transaction loads is
> to use event-driven processing (using an async library such as
> Twisted) rather than using multi-threading which doesn't scale well in
> any language.


My experience is that if you are CPU bound, asynchronous programming
in Python can be more of a curse than a blessing, mostly because of the
need to insert "scheduling points" in the right places to avoid
blocking, and because profiling becomes that much harder in something
like Twisted.
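
To make the "scheduling points" issue concrete, here is a rough sketch using Python 3's asyncio rather than Twisted (an assumption, chosen because it is in the standard library). All names are hypothetical. A CPU-heavy coroutine shares the loop with a ticker; the ticker only gets to run when the heavy loop explicitly yields:

```python
import asyncio


async def cpu_heavy(n, yield_every):
    total = 0
    for i in range(n):
        total += i * i
        if yield_every and i % yield_every == 0:
            # Scheduling point: hand control back to the event loop.
            await asyncio.sleep(0)
    return total


async def ticker(state):
    # Stands in for every other request the loop should be serving.
    while not state["done"]:
        state["ticks"] += 1
        await asyncio.sleep(0)


async def run_with_ticker(n, yield_every):
    state = {"ticks": 0, "done": False}
    tick_task = asyncio.create_task(ticker(state))
    total = await cpu_heavy(n, yield_every)
    state["done"] = True
    await tick_task
    return total, state["ticks"]


if __name__ == "__main__":
    print(asyncio.run(run_with_ticker(10000, 1000)))  # ticker gets turns
    print(asyncio.run(run_with_ticker(10000, None)))  # ticker is starved
```

With no scheduling points the ticker never runs until the computation has finished, which is the blocking behaviour described above. Picking a good value for yield_every is the part that has to be tuned by hand.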

It depends, of course, on the application, but designing from the ground
up with the idea of running multiple processes seems to be the
most natural way of scaling; this does not prevent using async in
each process. This has its own issues, though (e.g. in terms of
administration and monitoring).

Chris, the Tornado documentation mentions a simple way to get multiple
processes on one box: http://www.tornadoweb.org/documentation (the
section mentioning nginx for load balancing). The principle is quite
common and is applicable to most frameworks (the solution is not
specific to Tornado).

cheers.

David
 
 
Aahz
 
      04-16-2011
In article <(E-Mail Removed)>,
Raymond Hettinger <(E-Mail Removed)> wrote:
>
>Threading is really only an answer if you need to share data between
>threads, if you only have limited scaling needs, and are I/O bound
>rather than CPU bound


Threads are also useful for user interaction (i.e. GUI apps).

I think that "limited scaling" needs to be defined, too; CherryPy
performs pretty well, and the blocking model does simplify development.

One problem that my company has run into with threading is that it's not
always obvious where you'll hit GIL blocks. For example, one would
think that pickle.loads() releases the GIL, but it doesn't; you need to
use pickle.load() (and cStringIO if you want to do it in memory).
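
For reference, the two call forms look like this in Python 3, where io.BytesIO plays the role that cStringIO did in Python 2. This sketch only shows that both routes return the same object. It does not measure GIL behaviour, which may differ between interpreter versions, so treat the claim above as specific to the Python of the time. The function name is made up:

```python
import io
import pickle


def load_both_ways(obj):
    payload = pickle.dumps(obj)
    # File-like route: wrap the bytes in an in-memory file and use load().
    via_load = pickle.load(io.BytesIO(payload))
    # Direct route: hand the bytes straight to loads().
    via_loads = pickle.loads(payload)
    return via_load, via_loads


if __name__ == "__main__":
    print(load_both_ways({"key": [1, 2, 3]}))
```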
--
Aahz ((E-Mail Removed)) <*> http://www.pythoncraft.com/

"At Resolver we've found it useful to short-circuit any doubt and just
refer to comments in code as 'lies'. "
--Michael Foord paraphrases Christian Muirhead on python-dev, 2009-03-22
 
 
Chris Angelico
 
      04-16-2011
On Sun, Apr 17, 2011 at 12:44 AM, Aahz <(E-Mail Removed)> wrote:
> In article <(E-Mail Removed)>,
> Raymond Hettinger <(E-Mail Removed)> wrote:
>>
>>Threading is really only an answer if you need to share data between
>>threads, if you only have limited scaling needs, and are I/O bound
>>rather than CPU bound
>
> Threads are also useful for user interaction (i.e. GUI apps).


I agree; user interaction is effectively I/O on, usually, some sort of
event queue that collects from a variety of sources, with the
peculiarity that, in some GUI environments, the process's first thread
is somehow "special". But ultimately it's still a "worker thread" /
"interaction thread" model, which is quite a good one. The interaction
thread spends most of its time waiting for the user: maybe waiting for
STDIN, maybe waiting for a GUI event, maybe waiting on some I/O device
(a TCP socket comes to mind).
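
A bare-bones sketch of that model using the standard queue and threading modules. The names are illustrative, the list of inputs stands in for events arriving from the user, and squaring a number stands in for real work:

```python
import queue
import threading


def worker(jobs, results):
    # Worker thread: pull jobs until the sentinel arrives.
    while True:
        job = jobs.get()
        if job is None:
            break
        results.put((job, job * job))


def interact(inputs):
    jobs, results = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    # Interaction thread: hand each event to the worker and stay responsive.
    for value in inputs:
        jobs.put(value)
    jobs.put(None)  # sentinel: no more work
    thread.join()
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected


if __name__ == "__main__":
    print(interact([1, 2, 3]))
```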

Chris Angelico
 
 
Jean-Paul Calderone
 
      04-16-2011
On Apr 16, 10:44 am, (E-Mail Removed) (Aahz) wrote:
> In article <(E-Mail Removed)>,
> Raymond Hettinger <(E-Mail Removed)> wrote:
>
> >Threading is really only an answer if you need to share data between
> >threads, if you only have limited scaling needs, and are I/O bound
> >rather than CPU bound
>
> Threads are also useful for user interaction (i.e. GUI apps).


I suppose that's why most GUI toolkits use a multithreaded model.

Jean-Paul
 
 
Michael Torrie
 
      04-16-2011
On 04/16/2011 02:53 PM, Jean-Paul Calderone wrote:
> On Apr 16, 10:44 am, (E-Mail Removed) (Aahz) wrote:
>> In article <(E-Mail Removed)>,
>> Raymond Hettinger <(E-Mail Removed)> wrote:
>>
>>> Threading is really only an answer if you need to share data between
>>> threads, if you only have limited scaling needs, and are I/O bound
>>> rather than CPU bound
>>
>> Threads are also useful for user interaction (i.e. GUI apps).
>
> I suppose that's why most GUI toolkits use a multithreaded model.


Many GUI toolkits are single-threaded. And in fact with GTK and MFC you
can't (or shouldn't) call GUI calls from a thread other than the main
GUI thread. That's not to say GUI programs don't use threads and put
the GUI in its own thread. But GUI toolkits are often *not*
multithreaded. They are, however, often asynchronous, which is often
more cost-effective than multi-threaded.
 
 
sturlamolden
 
      04-17-2011
On Apr 16, 4:59 am, David Cournapeau <(E-Mail Removed)> wrote:

> My experience is that if you are CPU bound, asynchronous programming
> in python can be more a curse than a blessing, mostly because the
> need to insert "scheduling points" at the right points to avoid
> blocking and because profiling becomes that much harder in something
> like twisted.


I think Raymond's argument was that multi-threaded server design does
not scale well in any language. There is a reason that Windows I/O
completion ports use a pool of worker threads, and not one thread per
asynchronous I/O request. A multi-threaded design for a web service
will hit a scalability wall long before CPU saturation
becomes an issue.
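
The pool-of-workers idea can be sketched with the standard concurrent.futures module. This is not how I/O completion ports are implemented; it only illustrates a fixed number of threads serving many more requests. The names handle and serve and the pool size of 4 are arbitrary choices:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request_id):
    time.sleep(0.01)  # simulated I/O for one request
    return request_id, threading.current_thread().name


def serve(n_requests, max_workers=4):
    # A bounded pool: thread count stays fixed however many requests arrive.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle, range(n_requests)))


if __name__ == "__main__":
    results = serve(20)
    print(len(results), "requests served by",
          len({name for _, name in results}), "threads")
```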

 
 
sturlamolden
 
      04-17-2011
On Apr 17, 12:10 am, Michael Torrie <(E-Mail Removed)> wrote:

> Many GUI toolkits are single-threaded. And in fact with GTK and MFC you
> can't (or shouldn't) call GUI calls from a thread other than the main
> GUI thread.


Most of them (if not all?) have a single GUI thread, and a mechanism
by which to synchronize with the GUI thread.
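
A toy illustration of such a mechanism, with no real toolkit involved. FakeGui and its methods are entirely hypothetical; real toolkits each provide their own call for posting work to the GUI thread:

```python
import queue
import threading


class FakeGui:
    """Stand-in for a toolkit whose widgets belong to one thread."""

    def __init__(self):
        self.owner = threading.get_ident()
        self.pending = queue.Queue()
        self.label = ""

    def set_label(self, text):
        # Widgets may only be touched from the thread that owns the GUI.
        if threading.get_ident() != self.owner:
            raise RuntimeError("GUI call from the wrong thread")
        self.label = text

    def call_in_gui_thread(self, func, *args):
        # Safe from any thread: just queue the call.
        self.pending.put((func, args))

    def process_pending(self):
        # Run by the GUI thread, e.g. once per event-loop iteration.
        while True:
            try:
                func, args = self.pending.get_nowait()
            except queue.Empty:
                return
            func(*args)


def background_job(gui):
    gui.call_in_gui_thread(gui.set_label, "done")


if __name__ == "__main__":
    gui = FakeGui()
    job = threading.Thread(target=background_job, args=(gui,))
    job.start()
    job.join()
    gui.process_pending()
    print(gui.label)
```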

 