questions about multiprocessing

 
 
Vincent Ren
 
      03-05-2011
Hello everyone, I have recently been trying to learn Python's
multiprocessing, but as a beginner I got confused.

If I run the code below:

from multiprocessing import Pool
import urllib2
otasks = [
    'http://www.php.net'
    'http://www.python.org'
    'http://www.perl.org'
    'http://www.gnu.org'
]

def f(url):
    return urllib2.urlopen(url).read()

pool = Pool(processes = 2)
print pool.map(f, tasks)


I'll receive this message:

Traceback (most recent call last):
  File "<stdin>", line 14, in <module>
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 148, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 422, in get
    raise self._value
httplib.InvalidURL: nonnumeric port: ''



I run Python 2.6 on Ubuntu 10.10


Regards
Vincent


 
Philip Semanchuk
 
      03-05-2011

On Mar 4, 2011, at 11:08 PM, Vincent Ren wrote:

> Hello, everyone, recently I am trying to learn python's
> multiprocessing, but
> I got confused as a beginner.
>
> If I run the code below:
>
> from multiprocessing import Pool
> import urllib2
> otasks = [
>     'http://www.php.net'
>     'http://www.python.org'
>     'http://www.perl.org'
>     'http://www.gnu.org'
> ]
>
> def f(url):
>     return urllib2.urlopen(url).read()
>
> pool = Pool(processes = 2)
> print pool.map(f, tasks)


Hi Vincent,
I don't think that's the code you're running, because that code won't run. Here's what I get when I run the code you gave us:

Traceback (most recent call last):
  File "x.py", line 14, in <module>
    print pool.map(f, tasks)
NameError: name 'tasks' is not defined


When I change the name of "otasks" to "tasks", I get the nonnumeric port error that you reported.

Me, I would debug it by adding a print statement to f():
def f(url):
    print url
    return urllib2.urlopen(url).read()


Your problem isn't related to multiprocessing.

Good luck
Philip






 
Dennis Lee Bieber
 
      03-05-2011
On Fri, 4 Mar 2011 20:08:21 -0800 (PST), Vincent Ren
declaimed the following in
gmane.comp.python.general:

> Hello, everyone, recently I am trying to learn python's
> multiprocessing, but
> I got confused as a beginner.
>
> If I run the code below:
>
> from multiprocessing import Pool
> import urllib2


> otasks = [
>     'http://www.php.net'
>     'http://www.python.org'
>     'http://www.perl.org'
>     'http://www.gnu.org'
> ]
>

You've just defined a list with ONE element -- a string of:

"http://www.php.nethttp://www.python.orghttp://www.perl.orghttp://www.gnu.org"


Python concatenates adjacent string literals -- including those on
separate lines when they sit inside an open ( [ { structure.

You need to put commas after the closing quotes on those lines.
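
To make the effect concrete, here is a minimal sketch in Python 3 syntax (the thread itself uses Python 2.6). The names `broken` and `fixed` are chosen only for illustration; the URLs are the ones from the original post.

```python
# Adjacent string literals are joined at compile time, even across lines
# inside an open bracket, so this list holds ONE long string.
broken = [
    'http://www.php.net'
    'http://www.python.org'
    'http://www.perl.org'
    'http://www.gnu.org'
]

# With a comma after each closing quote, the list holds four separate URLs.
fixed = [
    'http://www.php.net',
    'http://www.python.org',
    'http://www.perl.org',
    'http://www.gnu.org',
]

print(len(broken))  # 1
print(len(fixed))   # 4
print(broken[0])    # the four URLs run together
```

Printing the list, or its length, before handing it to pool.map() is usually enough to spot this kind of mistake.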

> def f(url):
>     return urllib2.urlopen(url).read()
>
> pool = Pool(processes = 2)
> print pool.map(f, tasks)


And I'm presuming the others are correct -- and that should be

(f, otasks)

> httplib.InvalidURL: nonnumeric port: ''


No surprise... URL syntax expects a port number after the
second colon (the one that follows the host name), and with
concatenation you've got four colons in a single URL.
--
Wulfraed Dennis Lee Bieber AF6VN
HTTP://wlfraed.home.netcom.com/
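
As a rough illustration of the point about ports, the sketch below splits the concatenated URL with Python 3's `urllib.parse.urlsplit` (in Python 2 the same function lives in the `urlparse` module). The variable names are illustrative assumptions. The text after the host's colon is empty, which appears to be the `''` quoted in the "nonnumeric port" message.

```python
from urllib.parse import urlsplit  # Python 3; Python 2 has urlparse.urlsplit

# The single string that the comma-less list actually contains.
joined = ('http://www.php.net'
          'http://www.python.org'
          'http://www.perl.org'
          'http://www.gnu.org')

# Everything between the first '//' and the next '/' is the network location.
netloc = urlsplit(joined).netloc
host, _, port_text = netloc.partition(':')

print(netloc)              # www.php.nethttp:
print(repr(port_text))     # ''
print(joined.count(':'))   # 4
```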

 
Vincent Ren
 
      03-05-2011
Got it.
After adding the commas, it works (the 'o' was a typo when I posted,
sorry about that).

Thanks to all of you




 
Vincent Ren
 
      03-07-2011
I've run into a new problem, and searching on Google turned up no
useful information.


I want to download some images with multiprocessing.Pool.
In my class named Renren, I defined two methods:

def getPotrait(self, url):
    # get the current portrait of a friend on Renren.com
    try:
        r = urllib2.urlopen(url)
    except urllib2.URLError:
        print "Time out"
        return  # nothing to save if the request failed

    # the file name is the 'large_....jpg' part of the URL
    tmp = re.search(r'large_[\d\D]*\.jpg', url)
    image_name = tmp.group()

    img = r.read()
    output = open(image_name, 'wb')
    output.write(img)
    output.close()

def getLargePotraits(self):

    tasks = self.makeTaskList()
    pool = Pool(processes = 3)
    pool.map(self.getPotrait, tasks)


tasks is a list of image URLs; I want to download these images and
save them locally.

In another python file, I wrote this:

from renren import Renren

# get username and password for RenRen.com
username = raw_input('Email: ')
password = raw_input('Password: ')
print


a = Renren(username, password)
a.login()
a.getLargePotraits()



However, when I try to run this file, I receive an error message:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 225, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed


 
Jean-Michel Pichavant
 
      03-07-2011
Vincent Ren wrote:
> Hello, everyone, recently I am trying to learn python's
> multiprocessing, but
> I got confused as a beginner.


> [SNIP]
> httplib.InvalidURL: nonnumeric port: ''
>
> Regards
> Vincent
>
>

It's a mistake many beginners make; I don't understand why, but it's a
very common thing. RTFM should stand for "Read The Formidable (error)
Message" as well.
Your URL is invalid; check your URL definition.

JM
 
Vincent Ren
 
      03-07-2011
On Mar 7, 9:21 pm, Jean-Michel Pichavant <jeanmic...@sequans.com>
wrote:

> It's a mistake many beginners do, I don't understand why, but it's a
> very common thing. RTFM should stand for "Read The Formidable (error)
> Message" as well.
> Your url is invalid, check your url definition.
>
> JM


I've fixed that problem, but now I've got a new one:

PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

The details are listed in my previous post in this thread.
Thanks for your reply.
 
Robert Kern
 
      03-08-2011
On 3/7/11 3:27 PM, Vincent Ren wrote:
> On Mar 7, 9:21 pm, Jean-Michel Pichavant<jeanmic...@sequans.com>
> wrote:
>
>> It's a mistake many beginners do, I don't understand why, but it's a
>> very common thing. RTFM should stand for "Read The Formidable (error)
>> Message" as well.
>> Your url is invalid, check your url definition.
>>
>> JM

>
> I've fixed that problem. But I got a new one
>
> PicklingError: Can't pickle <type 'instancemethod'>: attribute
> lookup
> __builtin__.instancemethod failed
>
> The details were listed in my last post in this thread.
> Thanks for your reply


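I'm afraid his response applies to this as well: read the error message.
pool.map() has to pickle the function in order to send it to the
subprocesses, and instance methods can't be pickled here, so you can't pass
methods to pool.map() or any other such communication channel to your
subprocesses.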

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
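
One common way around the restriction described above is to move the worker into a plain module-level function, which pickle can locate by name. The following is a minimal sketch in Python 3 syntax, not the poster's actual code: the names `image_name_from_url`, `download_portrait` and `download_all` are hypothetical, and the `large_....jpg` naming pattern is assumed from the earlier post.

```python
import re
from multiprocessing import Pool
from urllib.request import urlopen  # Python 3 home of urllib2.urlopen


def image_name_from_url(url):
    """Return the 'large_....jpg' part of a URL, or None if it is absent."""
    match = re.search(r'large_.*?\.jpg', url)
    return match.group() if match else None


def download_portrait(url):
    """Module-level worker: pickled by name, so Pool can hand it to workers."""
    name = image_name_from_url(url)
    if name is None:
        return None
    with urlopen(url) as response, open(name, 'wb') as output:
        output.write(response.read())
    return name


def download_all(urls, processes=3):
    """Fan the URLs out over a pool of worker processes."""
    with Pool(processes=processes) as pool:
        return pool.map(download_portrait, urls)
```

A method such as getLargePotraits() could then call `download_all(self.makeTaskList())`. On platforms that start workers by spawning a fresh interpreter, the call that creates the pool should sit behind an `if __name__ == '__main__':` guard in the main script.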

 
Vincent Ren
 
      03-08-2011
Got it, thanks.
But what should I do if I want to improve the efficiency of my
program?

On Mar 8, 11:37 am, Robert Kern <robert.k...@gmail.com> wrote:

> I'm afraid his response applies to this as well: you can't pass methods to
> pool.map() or any other such communication channel to your subprocesses.



 
Benjamin Kaplan
 
      03-08-2011
On Mon, Mar 7, 2011 at 7:47 PM, Vincent Ren wrote:
> Got it, thanks.
> But what should I do if I want to improve the efficiency of my
> program?
>


Is there any particular reason you're using processes and not threads?
Functions that wait for stuff to happen in C land, such as I/O calls,
release the GIL, so threads can overlap while they wait. It's only stuff
that happens in Python land (i.e. manipulating Python objects) that
can't run in parallel.
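
A minimal sketch of the thread-based approach, in Python 3 syntax and under stated assumptions: `Downloader` is a hypothetical stand-in for the Renren class, and `time.sleep` stands in for the real `urlopen(url).read()` call so the example runs without network access. Because threads share memory, nothing is pickled and a bound method can be passed directly.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class Downloader(object):
    """Hypothetical stand-in for the Renren class in the thread."""

    def fetch(self, url):
        # time.sleep stands in for urlopen(url).read(); like real socket
        # I/O, it releases the GIL while it waits.
        time.sleep(0.2)
        return 'contents of ' + url

    def fetch_all(self, urls, workers=3):
        # No pickling is involved, so self.fetch can be mapped directly.
        with ThreadPoolExecutor(max_workers=workers) as executor:
            return list(executor.map(self.fetch, urls))


urls = ['http://example.com/a', 'http://example.com/b', 'http://example.com/c']
start = time.time()
results = Downloader().fetch_all(urls)
elapsed = time.time() - start
print(results)
print('elapsed: %.2f s' % elapsed)
```

With three workers the three simulated downloads wait at the same time, so the elapsed time should be close to one sleep rather than three. For Python 2.6, `multiprocessing.pool.ThreadPool` offers a similar map() interface backed by threads.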
 