Re: How to read webpage

 
 
MRAB
      08-01-2009
tarun wrote:
> Dear All,
> I want to read a webpage and copy its contents into a Word file. I
> tried the following code:
>
> import urllib2
> urllib2.urlopen("http://www.rediff.com/")
>
> *Error:-*
>
>     urllib2.urlopen("http://www.icicibank.com/")
>   File "C:\Python25\lib\urllib2.py", line 121, in urlopen
>     return _opener.open(url, data)
>   File "C:\Python25\lib\urllib2.py", line 374, in open
>     response = self._open(req, data)
>   File "C:\Python25\lib\urllib2.py", line 392, in _open
>     '_open', req)
>   File "C:\Python25\lib\urllib2.py", line 353, in _call_chain
>     result = func(*args)
>   File "C:\Python25\lib\urllib2.py", line 1100, in http_open
>     return self.do_open(httplib.HTTPConnection, req)
>   File "C:\Python25\lib\urllib2.py", line 1075, in do_open
>     raise URLError(err)
> urllib2.URLError: <urlopen error (11001, 'getaddrinfo failed')>
>

I've just tried it. I didn't get an exception, so your problem must be
elsewhere.
 
koranthala
      08-01-2009
On Aug 1, 6:52 pm, MRAB wrote:
> I've just tried it. I didn't get an exception, so your problem must be
> elsewhere.

Could it be that the website expects a real browser?
If so, spoof a browser (for example, by sending a browser-like
User-Agent header) and try fetching the site again.
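
For what it's worth, here is a minimal sketch of what "spoofing a browser" could look like. The User-Agent string and the helper names are made up for illustration, and the import fallback is only there so the same code runs on Python 2 (urllib2) and Python 3 (urllib.request). Note that this probably would not cure a 'getaddrinfo failed' error, since that failure happens before any HTTP request is sent.

```python
try:
    import urllib2 as request_lib          # Python 2, as used in this thread
except ImportError:
    import urllib.request as request_lib   # Python 3 equivalent

# Hypothetical User-Agent value, chosen only for illustration.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def build_browser_request(url, user_agent=BROWSER_UA):
    """Return a Request for url that carries a browser-like User-Agent."""
    return request_lib.Request(url, headers={"User-Agent": user_agent})


def fetch_as_browser(url, timeout=10):
    """Fetch url with the spoofed header and return the raw body bytes."""
    # The timeout argument of urlopen needs Python 2.6 or later.
    response = request_lib.urlopen(build_browser_request(url), timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()
```

Calling fetch_as_browser("http://www.rediff.com/") should then return the page body, provided the host name resolves in the first place.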
 
Jon Clements
      08-01-2009
On 1 Aug, 14:52, MRAB wrote:
> I've just tried it. I didn't get an exception, so your problem must be
> elsewhere.

I'm hoping this adds to MRAB's reply; it is intended, however, for the
OP.

Jeez, it's been a while since I've had to deal with sockets (directly,
anyway).
If memory serves, 'getaddrinfo failed' means the system couldn't
resolve the host name to an address.
So my best guess is that it's either a temporary glitch or an issue
with your routing.
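
A quick way to check the name-resolution theory without involving urllib2 at all is to ask the resolver directly. This is only a sketch; can_resolve is a made-up helper name. As far as I know, 11001 is the Winsock "host not found" code, which fits this reading of the traceback.

```python
import socket


def can_resolve(hostname, port=80):
    """Return True if the system resolver finds an address for hostname."""
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        # Same class of failure that urllib2 wraps in URLError.
        return False
```

If can_resolve("www.rediff.com") returns False on the OP's machine, the problem lies with DNS or the network setup rather than with the Python code.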

Jon.
 
 
catafest
      08-02-2009
Maybe your Python 2.5 installation isn't working properly?
I use Python 2.6, and this works for me:
import urllib
html = urllib.urlopen("http://www.rediff.com/").read()
print html
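
Since the OP wants the contents in a file, here is a rough sketch that saves the page to disk instead of printing it. It assumes that saving the raw HTML is good enough (Word can usually open a saved .html file); save_page and the output file name are made up for illustration.

```python
try:
    from urllib2 import urlopen          # Python 2, as used in this thread
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent


def save_page(url, path, timeout=10):
    """Download url, write the raw bytes to path, return the byte count."""
    # The timeout argument of urlopen needs Python 2.6 or later.
    response = urlopen(url, timeout=timeout)
    try:
        data = response.read()
    finally:
        response.close()
    # Binary mode keeps the bytes exactly as the server sent them.
    with open(path, "wb") as out:
        out.write(data)
    return len(data)
```

Usage would be something like save_page("http://www.rediff.com/", "rediff.html").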

If you need to authenticate, use urllib2 and build the request like
this:
>>> auth = urllib2.Request(auth_uri, authreq_data)
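
The post does not say what auth_uri and authreq_data contain, so the following is only a guess at one common case: a login form submitted by POST. The function name, the URL and the form field names in the usage note are hypothetical.

```python
try:
    from urllib import urlencode            # Python 2
    from urllib2 import Request
except ImportError:
    from urllib.parse import urlencode      # Python 3
    from urllib.request import Request


def build_auth_request(auth_uri, fields):
    """Return a POST Request carrying the url-encoded form fields."""
    # Assumption: the site expects an ordinary url-encoded form body.
    authreq_data = urlencode(fields).encode("ascii")
    # Passing data as the second argument turns the request into a POST.
    return Request(auth_uri, authreq_data)
```

For example, build_auth_request("http://example.com/login", {"user": "tarun"}) gives a request object that can then be passed to urlopen.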


On Aug 1, 4:52 pm, MRAB wrote:
> I've just tried it. I didn't get an exception, so your problem must be
> elsewhere.


 