Velocity Reviews


Vishal 02-08-2011 09:39 AM

Idea for removing the GIL...
 
Hello,

This might sound crazy, and I don't know if it's even possible, but...

Is it possible for the Python process to create a copy of the
interpreter for each thread that is launched, with each thread
somehow bound to its own interpreter?

This will increase the Python process size for sure; however, data
sharing will remain just like it is in threads.

It "may" also allow the two threads to run in parallel, assuming
the processors of today can send independent instructions from the
same process to multiple cores.

Comments, suggestions, brush-offs are welcome :))

I heard that this has been tried before... any info about that?

Thanks and best regards,
Vishal Sapre

Adam Tauno Williams 02-08-2011 10:05 AM

Re: Idea for removing the GIL...
 
On Tue, 2011-02-08 at 01:39 -0800, Vishal wrote:
> Is it possible that the Python process, creates copies of the
> interpreter for each thread that is launched, and some how the thread
> is bound to its own interpreter ?
> and it "may" also allow the two threads to run in parallel, assuming
> the processors of today can send independent instructions from the
> same process to multiple cores?
> Comments, suggestions, brush offs are welcome :))


Yes, it is possible, and done. See the multiprocessing module. It
works very well.
<http://docs.python.org/library/multiprocessing.html>

It isn't exactly the same as threads, but provides many similar
constructs.
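
As a rough illustration of the multiprocessing approach mentioned above, here is a minimal sketch. The function names `cpu_bound` and `run_in_processes` are made up for this example; only `multiprocessing.Pool` and its `map` method come from the standard library. Each worker is a separate OS process with its own interpreter and its own GIL, so the work can run on several cores, but arguments and results are copied between processes rather than shared.

```python
import multiprocessing as mp


def cpu_bound(n):
    """Sum of squares below n: pure-Python work that would hold the GIL in a thread."""
    return sum(i * i for i in range(n))


def run_in_processes(inputs, workers=2):
    """Run cpu_bound over inputs in a pool of worker processes (hypothetical helper)."""
    # Each worker process has its own interpreter and therefore its own GIL.
    with mp.Pool(processes=workers) as pool:
        return pool.map(cpu_bound, inputs)


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this module.
    print(run_in_processes([10, 100, 1000]))
```

The `if __name__ == "__main__":` guard is needed on Windows in particular, where child processes are started fresh and import the main module again.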


Vishal 02-08-2011 12:34 PM

Re: Idea for removing the GIL...
 
On Feb 8, 3:05 pm, Adam Tauno Williams <awill...@whitemice.org> wrote:
> On Tue, 2011-02-08 at 01:39 -0800, Vishal wrote:
> > Is it possible that the Python process, creates copies of the
> > interpreter for each thread that is launched, and some how the thread
> > is bound to its own interpreter ?
> > and it "may" also allow the two threads to run in parallel, assuming
> > the processors of today can send independent instructions from the
> > same process to multiple cores?
> > Comments, suggestions, brush offs are welcome :))
>
> Yes, it is possible, and done. See the multiprocessing module. It
> works very well.
> <http://docs.python.org/library/multiprocessing.html>
>
> It isn't exactly the same as threads, but provides many similar
> constructs.


Hi,

Pardon my ignorance here, but 'multiprocessing' creates actual
processes using fork() or CreateProcess().
I was talking about a single process running multiple instances of the
interpreter, with each thread bound to its own interpreter.
So the GIL won't be an issue anymore: each interpreter has only one
thing to do, and that one thing holds the lock on its own interpreter.
Since it's still the same process, data sharing should happen just like
in threads.

Also, multiprocessing has issues on Windows (most probably because of
the way CreateProcess() functions...)

Thanks and best regards,
Vishal

Jean-Paul Calderone 02-08-2011 12:53 PM

Re: Idea for removing the GIL...
 
On Feb 8, 7:34 am, Vishal <vsapr...@gmail.com> wrote:
> On Feb 8, 3:05 pm, Adam Tauno Williams <awill...@whitemice.org> wrote:
>
> > On Tue, 2011-02-08 at 01:39 -0800, Vishal wrote:
> > > Is it possible that the Python process, creates copies of the
> > > interpreter for each thread that is launched, and some how the thread
> > > is bound to its own interpreter ?
> > > and it "may" also allow the two threads to run in parallel, assuming
> > > the processors of today can send independent instructions from the
> > > same process to multiple cores?
> > > Comments, suggestions, brush offs are welcome :))
> >
> > Yes, it is possible, and done. See the multiprocessing module. It
> > works very well.
> > <http://docs.python.org/library/multiprocessing.html>
> >
> > It isn't exactly the same as threads, but provides many similar
> > constructs.
>
> Hi,
>
> Pardon me for my ignorance here, but 'multiprocessing' creates actual
> processes using fork() or CreateProcess().
> I was talking of a single process, running multiple instances of the
> interpreter. Each thread, bound with its own interpreter.
> so the GIL wont be an issue anymore...each interpreter has only one
> thing to do, and that one thing holds the lock on its own interpreter.
> Since its still the same process, data sharing should happen just like
> in Threads.


CPython does support multiple interpreters in a single process.
However, you cannot have your cake and eat it too. If you create
multiple interpreters, then why do you think you'll be able to share
objects between them for free?

In what sense would you have *multiple* interpreters in that scenario?

You will need some sort of locking between the interpreters. Then
you're either back to the GIL or to some more limited form of sharing -
such as you might get with the multiprocessing module.

Jean-Paul

Robert Kern 02-08-2011 04:38 PM

Re: Idea for removing the GIL...
 
On 2/8/11 10:11 AM, Brian Curtin wrote:
> On Tue, Feb 8, 2011 at 06:34, Vishal <vsapre80@gmail.com> wrote:
>
> > Also, multiprocessing has issues on Windows (most probably because of
> > the way CreateProcess() functions...)
>
> Such as?


Unlike a UNIX fork, CreateProcess() does not have the same copy-on-write
semantics for initializing the memory of the new process. If you want to pass
data to the children, the data must be pickled and sent across the process
boundary. He's not saying that multiprocessing isn't useful at all on Windows,
just less useful for the scenarios he is considering here.
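
To make the pickling point concrete, here is a small single-process simulation. It does not call CreateProcess() or start any child; the helper `send_across_boundary` is a made-up name that only imitates the serialize-then-rebuild step applied to data passed between processes. What it shows is that the receiving side gets an independent copy, so changes made there are invisible to the sender.

```python
import pickle


def send_across_boundary(obj):
    """Imitate passing obj to another process: serialize it, then rebuild it."""
    return pickle.loads(pickle.dumps(obj))


data = {"values": [1, 2, 3]}
copy = send_across_boundary(data)

# Mutating the rebuilt object stands in for work done in the child process.
copy["values"].append(4)

print(data["values"])  # the sender's list is untouched
print(copy is data)    # the two objects are distinct
```

The cost of this copy grows with the size of the data, which is why the approach is less attractive when the goal is thread-like sharing of large structures.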

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco


Roy Smith 02-08-2011 04:52 PM

Re: Idea for removing the GIL...
 
In article <mailman.13.1297183120.1633.python-list@python.org>,
Robert Kern <robert.kern@gmail.com> wrote:

> Unlike a UNIX fork, CreateProcess() does not have the same copy-on-write
> semantics for initializing the memory of the new process. If you want to pass
> data to the children, the data must be pickled and sent across the process
> boundary. He's not saying that multiprocessing isn't useful at all on
> Windows, just less useful for the scenarios he is considering here.


Amen, brother! I used to work on a project that had a build system
which was very fork() intensive (lots of little Perl and shell scripts
driven by make). A full system build on a Linux box took 30-60 minutes.
Building the same code on Windows/Cygwin took about 12 hours. Identical
hardware (8-core, 16 gig Dell server, or something like that).

As far as we could tell, it was entirely due to how bad Windows was at
process creation.

Stefan Behnel 02-08-2011 05:05 PM

Re: Idea for removing the GIL...
 
Roy Smith, 08.02.2011 17:52:
> Robert Kern wrote:
>
>> Unlike a UNIX fork, CreateProcess() does not have the same copy-on-write
>> semantics for initializing the memory of the new process. If you want to pass
>> data to the children, the data must be pickled and sent across the process
>> boundary. He's not saying that multiprocessing isn't useful at all on
>> Windows, just less useful for the scenarios he is considering here.

>
> Amen, brother! I used to work on a project that had a build system
> which was very fork() intensive (lots of little perl and shell scripts
> driven by make). A full system build on a linux box took 30-60 minutes.
> Building the same code on windows/cygwin took about 12 hours. Identical
> hardware (8-core, 16 gig Dell server, or something like that).
>
> As far as we could tell, it was entirely due to how bad Windows was at
> process creation.


Unlikely. Since you mention cygwin, it was likely due to the heavy lifting
cygwin does in order to emulate fork() on Windows.

http://www.cygwin.com/faq/faq-nochun...l#faq.api.fork

Stefan


Adam Tauno Williams 02-08-2011 07:07 PM

Re: Idea for removing the GIL...
 
On Tue, 2011-02-08 at 11:52 -0500, Roy Smith wrote:
> In article <mailman.13.1297183120.1633.python-list@python.org>,
> Robert Kern <robert.kern@gmail.com> wrote:
> > Unlike a UNIX fork, CreateProcess() does not have the same copy-on-write
> > semantics for initializing the memory of the new process. If you want to pass
> > data to the children, the data must be pickled and sent across the process
> > boundary. He's not saying that multiprocessing isn't useful at all on
> > Windows, just less useful for the scenarios he is considering here.

> Amen, brother! I used to work on a project that had a build system
> which was very fork() intensive (lots of little perl and shell scripts


Comparing issues that are simply fork() to using "multiprocessing" is a
bit of a false comparison. multiprocessing provides a fairly large set
of information sharing techniques. Just-doing-a-fork isn't really using
multiprocessing - fork'ing scripts isn't at all an equivalent to using
threads.

> As far as we could tell, it was entirely due to how bad Windows was at
> process creation.


Nope. If you want performance DO NOT USE cygwin.


John Nagle 02-08-2011 07:49 PM

Re: Idea for removing the GIL...
 
On 2/8/2011 1:39 AM, Vishal wrote:
> Hello,
>
> This might sound crazy..and dont know if its even possible, but...
>
> Is it possible that the Python process, creates copies of the
> interpreter for each thread that is launched, and some how the thread
> is bound to its own interpreter ?
>
> This will increase the python process size...for sure, however data
> sharing will remain just like it is in threads.
>
> and it "may" also allow the two threads to run in parallel, assuming
> the processors of today can send independent instructions from the
> same process to multiple cores?


Won't work. You'd have two threads updating the same shared data
structures without locking. In CPython, there's a reference count
shared across threads, but no locking at the object level.
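
For readers unfamiliar with the reference count mentioned here, a minimal sketch follows. `sys.getrefcount` is a real CPython function; `refcount_delta` is a made-up helper. It only shows that every new reference to an object changes a counter stored on that object, which is the shared state that unsynchronized threads would race on.

```python
import sys


def refcount_delta():
    """Return how much an object's reference count grows when one alias is bound."""
    obj = object()
    before = sys.getrefcount(obj)
    alias = obj  # a second reference to the same object
    after = sys.getrefcount(obj)
    return after - before


print(refcount_delta())
```

The absolute values reported by `sys.getrefcount` vary between CPython versions, so the sketch compares two readings instead of relying on a specific number.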

The real reason for the GIL, though, is to support dynamic
code modification in multi-thread programs. It's the ability
to replace a function while it's being executed in another thread
that's hard to do without a global lock. If it were just a data-side
problem, local object locks, a lock at the allocator, and a
concurrent garbage collector would work.

John Nagle

Carl Banks 02-08-2011 09:20 PM

Re: Idea for removing the GIL...
 
On Feb 8, 11:49 am, John Nagle <na...@animats.com> wrote:
> The real reason for the GIL, though, is to support dynamic
> code modification in multi-thread programs. It's the ability
> to replace a function while it's being executed in another thread
> that's hard to do without a global lock. If it were just a data-side
> problem, local object locks, a lock at the allocator, and a
> concurrent garbage collector would work.


I realize that you believe that Python's hyper-dynamicism is the cause
of all evils in the world, but in this case you're not correct.

Concurrent garbage collectors work just fine in IronPython and Jython,
which are just as dynamic as CPython. I'm not sure why you think an
executing function would be considered inaccessible and subject to
collection. If you replace a function (code object, actually) in
another thread, it only deletes the reference from that namespace;
references on the executing stack still exist.
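
The point about replacing a function can be sketched as follows. All names here (`demo`, `old_task`, `new_task`, the `namespace` dict) are made up for illustration, and the dict merely stands in for a module namespace. One thread is paused inside the old function while the main thread rebinds the name; the running call still finishes with the old body, and only later lookups see the replacement.

```python
import threading


def demo():
    """Rebind a name while another thread is still executing the old function."""
    entered = threading.Event()   # set once the worker is inside old_task
    replaced = threading.Event()  # set once the name has been rebound
    namespace = {}                # stands in for a module namespace
    results = []

    def old_task():
        entered.set()
        replaced.wait(timeout=5)  # stay inside the old body during the swap
        return "old"

    def new_task():
        return "new"

    namespace["task"] = old_task

    def worker():
        # The lookup happens once; the running call keeps its own reference.
        results.append(namespace["task"]())

    t = threading.Thread(target=worker)
    t.start()
    entered.wait(timeout=5)
    namespace["task"] = new_task  # replace the function mid-execution
    replaced.set()
    t.join()

    results.append(namespace["task"]())  # a fresh lookup finds the replacement
    return results


print(demo())
```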

The real reason they never replaced the GIL is that fine-grained
locking is expensive with reference counting. The only way the cost
of finer-grained locking would be acceptable, then, is if they got rid
of the reference counting altogether, and that was considered too
drastic a change.
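
A toy model of why this is expensive, under loud assumptions: `LockedRefCount` is a made-up class, not how CPython stores reference counts. It only shows that with per-object locking every single reference change pays for a lock acquisition, and reference changes happen on nearly every operation.

```python
import threading


class LockedRefCount:
    """Toy per-object reference count guarded by its own lock (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.count = 1              # the creating reference
        self.lock_acquisitions = 0  # bookkeeping to make the overhead visible

    def incref(self):
        with self._lock:
            self.lock_acquisitions += 1
            self.count += 1

    def decref(self):
        """Drop one reference; return True when the object could be freed."""
        with self._lock:
            self.lock_acquisitions += 1
            self.count -= 1
            return self.count == 0


rc = LockedRefCount()
rc.incref()
rc.incref()
freed = rc.decref()
print(rc.count, rc.lock_acquisitions, freed)
```

With a single global lock, the same three updates would need no per-object locking at all, which is the trade-off described above.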


Carl Banks

