Velocity Reviews - Computer Hardware Reviews

urllib and sites that require passwds

 
 
bob_smith_17280@hotmail.com
      12-23-2004
Hello,

I'm doing a small website survey as a consultant for a company that has
a large private LAN. Basically, I'm trying to determine how many web
sites there are on their network and what content the sites contain
(scary how they don't know this, but I suspect many companies are this
way).

Everything is going fine so far except for sites that require passwords
to be accessed. I don't want to view content on these sites; I only
want to note that they are password-protected, make a list of them and
move on. The problem is that urllib hangs waiting for a username/password
to be entered. Is there a graceful way to deal with this?
Many thanks,
Bob

 
Fuzzyman
      12-23-2004
Use urllib2, which will fail with an exception. You can trap this
exception and, using the code attribute of the exception object,
determine why it failed. The error code for 'authentication required'
is 401.

Off the top of my head:

    import urllib2

    theurl = 'http://www.example.com/'  # placeholder URL, not in the original post
    req = urllib2.Request(theurl)
    try:
        handle = urllib2.urlopen(req)
    except IOError, e:
        # An HTTPError carries a 'code' attribute; a plain URLError only has 'reason'.
        if not hasattr(e, 'code'):
            print 'The url appears to be invalid.'
            print e.reason
        else:
            if e.code == 401:
                print theurl, 'is protected with a password.'
            else:
                print 'We failed with error code', e.code

HTH

Regards,

Fuzzy
http://www.voidspace.org.uk/python/index.shtml
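
For anyone reading this on Python 3, where urllib2 was split into urllib.request and urllib.error, the same idea might look roughly like the sketch below. It has not been run against the OP's network. The names classify_url, opener and timeout are made up for illustration and are not part of the standard library; the opener argument exists only so the function can be exercised without a real server.

```python
import urllib.error
import urllib.request


def classify_url(url, opener=urllib.request.urlopen, timeout=10):
    """Return a short status string for url without prompting for a password.

    classify_url, opener and timeout are illustrative names (assumptions),
    not anything defined by urllib itself.
    """
    try:
        response = opener(url, timeout=timeout)
    except urllib.error.HTTPError as e:
        # The server answered with an error status; 401 means auth is required.
        if e.code == 401:
            return 'password protected'
        return 'HTTP error %d' % e.code
    except OSError as e:
        # URLError (a subclass of OSError) or a raw socket error/timeout:
        # no usable HTTP response came back.
        return 'unreachable: %s' % getattr(e, 'reason', e)
    response.close()
    return 'ok'
```

Calling this for each URL in the survey and keeping the ones that come back as 'password protected' should give the list the OP is after. HTTPError has to be caught before the more general OSError because it is a subclass of it.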

 
Fuzzyman
      12-23-2004
damn... I'm losing my leading spaces.... indentation should be obvious
anyway... (everything below except is indented at least one step).
Fuzzy

 
Ishwor
      12-23-2004
On 23 Dec 2004 06:46:50 -0800, Fuzzyman <> wrote:
> damn... I'm losing my leading spaces.... indentation should be obvious

We'll forgive you for that. It was from "top-of-your-head" ~

> anyway... (everything below except is indented at least one step).
> Fuzzy

It's nice that urllib2 returns an error code to process further. Doesn't
urllib do that?
Anyway, I wanted to know if there is any website similar to the CPAN
library website. I mean, I want to be able to find modules and stuff for
Python. It would be really great to know.

Thanks.

--
cheers,
Ishwor Gurung
 
Fuzzyman
      12-23-2004

Ishwor wrote:
> On 23 Dec 2004 06:46:50 -0800, Fuzzyman <> wrote:
> > damn... I'm losing my leading spaces.... indentation should be obvious
>
> We'll forgive you for that. It was from "top-of-your-head" ~

Hey - I put the indentation in there... it just got stripped out when
it was posted!

> > anyway... (everything below except is indented at least one step).
> > Fuzzy

> It's nice that urllib2 returns an error code to process further. Doesn't
> urllib do that?


The OP is saying that it hangs rather than returning an error. I
haven't tested it. In general urllib2.urlopen is much better than
urllib.urlopen. urllib has some other useful functions, though.
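
The post doesn't say which other functions are meant. Presumably they include helpers such as quote and urlencode, which lived in urllib (not urllib2) in Python 2 and moved to urllib.parse in Python 3. A small Python 3 sketch, with made-up paths and parameters:

```python
from urllib.parse import quote, unquote, urlencode

# Percent-encode a path that contains a space ('/' is left alone by default).
path = quote('/reports/q4 summary.html')

# Reverse the encoding to get the original text back.
original = unquote(path)

# Build a query string from a dict of parameters (spaces become '+').
query = urlencode({'q': 'python urllib', 'page': 2})

print(path)
print(original)
print(query)
```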

> Anyway, I wanted to know if there is any website similar to the CPAN
> library website. I mean, I want to be able to find modules and stuff for
> Python. It would be really great to know.
>


There is PyPI and the Vaults of Parnassus. Neither is really like
CPAN. There has been lots of talk about it recently - everyone agrees
we need one... but no one is offering the bandwidth or the code.

There are lots of modules available though - and usually not too hard
to track down.

Regards,

Fuzzy
http://www.voidspace.org.uk/python/index.shtml
> Thanks.
>
> --
> cheers,
> Ishwor Gurung


 