Q: urlopen() and "file:///c:/mypage.html" ??

 
 
MAK
 
      08-21-2003
I'm stumped.

I'm trying to use Python 2.3's urllib2.urlopen() to open an HTML file
on the local harddrive of my WinXP box.

If I were to use, say, Netscape to open this file, I'd specify it as
"file:///c:/mypage.html", and it would open it just fine. But
urlopen() won't accept it as a valid URL. I get an OSError exception
with the error message "No such file or directory:
'\\C:\\mypage.html'".

I've tried variations on the URL, such as "file://c:/mypage.html",
too, without luck. That one gives me a 'socket.gaierror' exception
with the message "'getaddrinfo failed'".

Upon diving into the code, I found that, in the first case, the third
'/' is left as part of the filename, and in the second case, it ends
up thinking that 'C:' is the hostname of the machine.

Can anyone point out the error of my ways?
Thanks.
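
For readers on Python 3, the split described above can be reproduced with urllib.parse.urlsplit. This is only a sketch: the thread is about Python 2.3's urllib2, whose internals may differ in detail, and the helper name describe() is made up for illustration.

```python
from urllib.parse import urlsplit


def describe(url):
    """Return the (netloc, path) pair that urlsplit extracts from url."""
    parts = urlsplit(url)
    return parts.netloc, parts.path


# Three slashes: the host slot is empty and the path keeps its leading '/'.
three_slashes = describe("file:///c:/mypage.html")

# Two slashes: 'c:' is taken as the host (netloc), not as a drive letter.
two_slashes = describe("file://c:/mypage.html")

print(three_slashes)
print(two_slashes)
```

The first result shows where the stray leading slash in the reported filename comes from; the second shows why a hostname lookup is attempted for 'c:'.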
 
 
 
 
 
Joe Francia
 
      08-21-2003
MAK wrote:
> I'm stumped.
>
> I'm trying to use Python 2.3's urllib2.urlopen() to open an HTML file
> on the local harddrive of my WinXP box.
>
> If I were to use, say, Netscape to open this file, I'd specify it as
> "file:///c:/mypage.html", and it would open it just fine. But
> urlopen() won't accept it as a valid URL. I get an OSError exception
> with the error message "No such file or directory:
> '\\C:\\mypage.html'".
>
> I've tried variations on the URL, such as "file://c:/mypage.html",
> too, without luck. That one gives me a 'socket.gaierror' exception
> with the message "'getaddrinfo failed'".
>
> Upon diving into the code, I found that, in the first case, the third
> '/' is left as part of the filename, and in the second case, it ends
> up thinking that 'C:' is the hostname of the machine.
>
> Can anyone point out the error of my ways?
> Thanks.


This works:

f = urllib2.urlopen(r'file:///c|\mypage.html')

But, if you're only opening local files, what's wrong with:

f = file(r'c:/mypage.html', 'r')

jf
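
A rough Python 3 comparison of the two approaches, assuming urllib.request.urlopen in place of urllib2.urlopen and open() in place of file(). The temporary file is a stand-in for c:/mypage.html so the sketch runs on any platform; Path.as_uri() is used here only to build a 'file' URL for the current OS.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlopen

# Create a throwaway HTML file (a stand-in for c:/mypage.html).
with tempfile.NamedTemporaryFile("wb", suffix=".html", delete=False) as tmp:
    tmp.write(b"<html><body>hello</body></html>")
    page = Path(tmp.name).resolve()

try:
    # Route 1: go through the URL machinery with a 'file' URL.
    with urlopen(page.as_uri()) as response:
        via_url = response.read()

    # Route 2: plain file access, no URL parsing involved.
    with open(page, "rb") as handle:
        via_open = handle.read()
finally:
    os.remove(page)

print(via_url == via_open)
```

If only local files are involved, the second route avoids the URL-syntax question altogether.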

 
 
 
 
 
Michael Geary
 
      08-22-2003
> MAK wrote:
> > I'm trying to use Python 2.3's urllib2.urlopen() to open an HTML
> > file on the local harddrive of my WinXP box.
> >
> > If I were to use, say, Netscape to open this file, I'd specify it as
> > "file:///c:/mypage.html", and it would open it just fine. But
> > urlopen() won't accept it as a valid URL. I get an OSError
> > exception with the error message "No such file or directory:
> > '\\C:\\mypage.html'".


Joe Francia wrote:
> This works:
>
> f = urllib2.urlopen(r'file:///c|\mypage.html')
>
> But, if you're only opening local files, what's wrong with:
>
> f = file(r'c:/mypage.html', 'r')


Just to add to that, the significant thing in the working example isn't that
it uses backslash instead of forward slash, but that it uses vertical bar
instead of colon. This works just as well:

f = urllib2.urlopen( 'file:///c|/mypage.html' )

-Mike
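
To illustrate the point that the drive separator matters and the slash direction does not, here is a small Python 3 sketch of a tolerant converter. The function file_url_to_windows_path() is hypothetical (it is not part of urllib or urllib2) and simply accepts '|' or ':' after the drive letter and either kind of slash.

```python
import re
from urllib.parse import unquote, urlsplit


def file_url_to_windows_path(url):
    """Hypothetical helper: map a local 'file' URL to a Windows-style path."""
    parts = urlsplit(url)
    if parts.scheme != "file" or parts.netloc not in ("", "localhost"):
        raise ValueError("not a local file URL: %r" % url)

    # Decode %xx escapes and normalise every separator to a backslash.
    path = unquote(parts.path).replace("/", "\\")

    # Expect '\c|\...' or '\c:\...'; drop the leading separator.
    match = re.match(r"^\\([A-Za-z])[|:](\\.*)?$", path)
    if match is None:
        raise ValueError("no drive letter in %r" % url)
    return match.group(1) + ":" + (match.group(2) or "\\")


for candidate in (r"file:///c|\mypage.html",
                  "file:///c|/mypage.html",
                  "file:///c:/mypage.html"):
    print(candidate, "->", file_url_to_windows_path(candidate))
```

All three spellings from this thread end up as the same path under these assumptions.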


 
 
MAK
 
      08-22-2003
Wow, thanks guys. A vertical bar instead of a colon... I'da never
figured on that...
 
 
John J. Lee
 
      08-22-2003
"Michael Geary" writes:
> > MAK wrote:

[...]
> > > If I were to use, say, Netscape to open this file, I'd specify it as
> > > "file:///c:/mypage.html", and it would open it just fine. But
> > > urlopen() won't accept it as a valid URL. I get an OSError
> > > exception with the error message "No such file or directory:
> > > '\\C:\\mypage.html'".

[...]
> f = urllib2.urlopen( 'file:///c|/mypage.html' )


Why does Python use a different syntax to the rest of the Windows
world?


John
 
 
Mike Brown
 
      08-22-2003
"John J. Lee" wrote:
> "Michael Geary" writes:
> > > MAK wrote:

> [...]
> > > > If I were to use, say, Netscape to open this file, I'd specify it as
> > > > "file:///c:/mypage.html", and it would open it just fine. But
> > > > urlopen() won't accept it as a valid URL. I get an OSError
> > > > exception with the error message "No such file or directory:
> > > > '\\C:\\mypage.html'".

> [...]
> > f = urllib2.urlopen( 'file:///c|/mypage.html' )

>
> Why does Python use a different syntax to the rest of the Windows
> world?


On Windows, if I open a local file in Netscape 4, the Location bar shows a
"file" URL with the "|". If I open a local file in Internet Explorer (or the
file Explorer with the Address bar turned on), the Address bar shows a
"file" URL with a ":". The resolver used by both Netscape and Explorer will
accept either one, if you type it in the address bar. So who is to say what
is canon? The 'file' URI scheme is, by definition, OS dependent. If the OS
likes the URL, then it's good enough.

For 4Suite running on Windows, we were thinking of making a Python wrapper
to the Windows resolver for maximum compatibility, but haven't gotten around
to it. For now, we avoid the bug-ridden urllib as much as we can, and do
some voodoo on 'file' URLs to convert them to OS-specific paths that are
safe to pass to open() on the (win32 or posix) OS we're running on. It's not
foolproof yet, and won't handle the colon case, but does a round-trip from
an OS path to a URI and back pretty well. See the UriToOsPath() and
OsPathToUri() work in the Ft.Lib.Uri module here:
http://cvs.4suite.org/cgi-bin/viewcv...viewcvs-markup
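
For illustration only, a round trip between an OS path and a 'file' URI might look roughly like the Python 3 sketch below. It is not the 4Suite code: the names os_path_to_uri() and uri_to_os_path() and the windows flag are invented here, only absolute local paths are considered, and the drive letter is written with a colon.

```python
from urllib.parse import quote, unquote, urlsplit


def os_path_to_uri(path, windows):
    """Hypothetical sketch: absolute OS path -> 'file' URI."""
    if windows:
        # 'c:\dir\f.html' -> '/c:/dir/f.html'
        path = "/" + path.replace("\\", "/")
    # Percent-encode everything except '/' and the drive-letter ':'.
    return "file://" + quote(path, safe="/:")


def uri_to_os_path(uri, windows):
    """Hypothetical sketch: 'file' URI -> OS path (inverse of the above)."""
    path = unquote(urlsplit(uri).path)
    if windows:
        # Drop the leading '/' and restore backslashes.
        path = path[1:].replace("/", "\\")
    return path


win_uri = os_path_to_uri(r"c:\my docs\page.html", windows=True)
posix_uri = os_path_to_uri("/tmp/my page.html", windows=False)
print(win_uri, "->", uri_to_os_path(win_uri, windows=True))
print(posix_uri, "->", uri_to_os_path(posix_uri, windows=False))
```

Passing the platform explicitly keeps the sketch testable on either OS; real code would more likely branch on os.name.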


 