Simple Question : files and URLLIB

 
 
Richard Shea
      10-14-2003
Hi - I'm new to Python. I've been trying to use URLLIB and the 'tidy'
function (part of the mx.tidy package). There's one thing I'm having
real difficulties understanding. When I did this ...

finA= urllib.urlopen('http://www.python.org/')
foutA=open('C:\\testout.html','w')
tidy(finA,foutA,None)

I get ...

Traceback (most recent call last):
File "<interactive input>", line 1, in ?
File "mx\Tidy\Tidy.py", line 38, in tidy
return mxTidy.tidy(input, output, errors, kws)
TypeError: inputstream must be a file object or string

.... what I don't understand is surely the result of a urllib is a file
object ? Isn't it ? To quote the manual at :

http://www.python.org/doc/current/li...le-urllib.html

"If all went well, a file-like object is returned". I can make the
tidy function happy by changing the code to read ...

finA= urllib.urlopen('http://www.python.org/').read()

.... I haven't had time to look into this properly yet but I suspect
finA is now a string not a file handle ?

Anyway if anyone can throw light on this I would be grateful.

thanks

richard.shea.
 
bromden
      10-14-2003
> "If all went well, a file-like object is returned". I can make the

File-like means having an interface similar to a file object's (methods
read, readline, etc.), but it is not a real file.

mxTidy.tidy most probably requires a real file to be passed;
just look into Tidy.py (line 38) and you'll know for sure.

--
bromden[at]gazeta.pl

 
Mark Carter
      10-14-2003
> finA= urllib.urlopen('http://www.python.org/').read()
>
> ... I haven't had time to look into this properly yet but I suspect
> finA is now a string not a file handle ?


Correct. If you do:
print type(finA)
you obtain the result:
<type 'str'>

If you do:
finA= urllib.urlopen('http://www.python.org/')
print type(finA)
then you obtain the result:
<type 'instance'>

Compare this with:
finA = open("blah", "w")
print type(finA)
which gives the result:
<type 'file'>

According to the docs on urlopen( url[, data[, proxies]]) :
"If all went well, a file-like object is returned."
So the answer would appear to be: "close, but no cigar".
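
To make the "file-like" versus "real file" distinction concrete, here is a minimal sketch. It is written for Python 3 rather than the Python 2 used in this thread, and it does not touch the network or mx.Tidy. `FakeResponse`, `strict_consume` and `duck_consume` are hypothetical names invented for illustration: the first stands in for whatever urlopen() returns, the second loosely imitates a type check like the one that raised the TypeError above, and the third shows the duck-typed alternative. How mxTidy actually performs its check is not shown in the thread, so treat the strict function as an assumption.

```python
import io


class FakeResponse:
    """Hypothetical stand-in for a urlopen() result: it offers read() and
    readline(), but it is not one of the built-in file types."""

    def __init__(self, text):
        self._buffer = io.StringIO(text)

    def read(self):
        return self._buffer.read()

    def readline(self):
        return self._buffer.readline()


def strict_consume(source):
    """Accept only strings or genuine io file objects (assumed behaviour,
    loosely modelled on the error message in the traceback)."""
    if isinstance(source, str):
        return source
    if isinstance(source, io.IOBase):
        return source.read()
    raise TypeError("inputstream must be a file object or string")


def duck_consume(source):
    """Accept strings or anything that has a read() method."""
    if isinstance(source, str):
        return source
    return source.read()


# The type-checking function rejects the file-like object ...
try:
    strict_consume(FakeResponse("<html></html>"))
    strict_rejected = False
except TypeError:
    strict_rejected = True

# ... the duck-typed one accepts it ...
duck_result = duck_consume(FakeResponse("<html></html>"))

# ... and calling read() first hands the strict function a plain string.
strict_result = strict_consume(FakeResponse("<html></html>").read())
```

The last line mirrors the `.read()` workaround from the original post: the strict function never sees the file-like object, only the string it produced.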
 
Terry Reedy
      10-14-2003

"Richard Shea" wrote in message:
> Hi - I'm new to Python. I've been trying to use URLLIB and the
> 'tidy' function (part of the mx.tidy package). There's one thing I'm having
> real difficulties understanding. When I did this ...
>
> finA= urllib.urlopen('http://www.python.org/')
> foutA=open('C:\\testout.html','w')
> tidy(finA,foutA,None)
>
> I get ...
>
> Traceback (most recent call last):
> File "<interactive input>", line 1, in ?
> File "mx\Tidy\Tidy.py", line 38, in tidy
> return mxTidy.tidy(input, output, errors, kws)
> TypeError: inputstream must be a file object or string
>
> ... what I don't understand is surely the result of a urllib is a
> file object ? Isn't it ? To quote the manual at :
>
> http://www.python.org/doc/current/li...le-urllib.html
>
> "If all went well, a file-like object is returned".


'file-like object' is different from 'file object'. From the urllib.py doc
string:
"The object returned by URLopener().open(file) will differ per
protocol. All you know is that is has methods read(), readline(),
readlines(), fileno(), close() and info()."

Why this is not good enough for mx.tidy is a question for its author.

> I can make the tidy function happy by changing the code to read ...
>
> finA= urllib.urlopen('http://www.python.org/').read()
>
> ... I haven't had time to look into this properly yet but I suspect
> finA is now a string not a file handle ?


Yes. So it meets the 'file or string' requirement.

Terry J. Reedy


 
Richard Shea
      10-14-2003
Thanks to everyone for the info/feedback. In particular I didn't know
you could do that ...

type(finA)

.... business (which shows you how new to Python I am probably) but
it'll come in handy.

As I think you realised, I had misunderstood exactly what urllib was
offering; however, the blah.read() approach is quite good enough. Just
out of curiosity though, if 'tidy' demanded a file (rather than being
prepared to take a string as it is), would the only sure approach be to
....

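strHTML = urllib.urlopen('http://www.python.org/').read()
f1 = open('C:\\workfile.html', 'w')
f1.write(strHTML)
f1.close()  # flush to disk before reading it back
f1 = open('C:\\workfile.html', 'r')  # reopen for reading
foutA = open('C:\\testout.html', 'w')  # tidy writes its output here
tidy(f1, foutA, None)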

.... that is to take the string that results from the read on urllib
file-like object and write it back out to a file ?

Just wondering ...

Thanks again for the information on my original question.

regards

richard.
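
For what it is worth, the "write it out to a real file first" idea can be sketched without hard-coding a path. This is only an illustration in Python 3 (the thread itself uses Python 2), and `save_to_disk` is a hypothetical helper name, not part of urllib or mx.Tidy. An `io.BytesIO` object stands in for the urlopen() result so the sketch runs without network access. Whether a tool that insists on a real file would then accept the reopened file is an assumption that would need checking against that tool.

```python
import io
import os
import shutil
import tempfile


def save_to_disk(file_like, suffix=".html"):
    """Copy everything from a file-like object into a new temporary file
    and return the path of that file."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    # os.fdopen wraps the descriptor from mkstemp in a normal file object;
    # leaving the with-block closes it and flushes the data to disk.
    with os.fdopen(fd, "wb") as out:
        shutil.copyfileobj(file_like, out)
    return path


# Stand-in for the object returned by urlopen(); an assumption for the demo.
response = io.BytesIO(b"<html><body>hello</body></html>")

path = save_to_disk(response)
try:
    # open() returns a genuine file object backed by the file on disk.
    with open(path, "rb") as real_file:
        is_real_file = isinstance(real_file, io.BufferedReader)
        contents = real_file.read()
finally:
    os.remove(path)  # clean up the temporary file
```

Using `tempfile.mkstemp` avoids guessing a writable location such as `C:\workfile.html`, at the cost of having to delete the file yourself afterwards.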





 