Velocity Reviews


Xah Lee 02-04-2005 07:01 PM

[perl-python] get web page programatically
# -*- coding: utf-8 -*-
# Python

# suppose you want to fetch a webpage.
from urllib import urlopen

# note the line
# from <library_name> import <function_name1,function_name2...>
# it reads the library and imports the function names
# to see available functions in a module one can use "dir"
# import urllib; print dir(urllib)
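
Editor's sketch: the post imports urlopen, but no call to it appears above. The code in this thread is Python 2; in Python 3 the same function lives in urllib.request. A minimal illustration, assuming Python 3, where the helper name `fetch` and the sample URL are hypothetical and not from the original post:

```python
from urllib.request import urlopen


def fetch(url, encoding="utf-8"):
    """Return the body at `url` decoded as text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode(encoding)


# A data: URL keeps the sketch runnable offline; an http:// or https://
# URL is passed in exactly the same way.
page = fetch("data:text/plain,hello%20world")
print(page)
```

The encoding is assumed to be UTF-8 here; a real page may declare something else in its headers.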

# for more about this module import syntax, see

# sometimes, when working with html pages, you need to create links.
# In a url, some chars need to be encoded.
# the "quote" function does it. "unquote" function reverses it. Very

from urllib import quote
print quote("~joe's home page")
print '' + quote("ménage à trois")
# (rely on the French to teach us interesting words)
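
Editor's sketch: the round trip through quote and unquote described above, written for Python 3, where both functions moved to urllib.parse. The variable names are illustrative only.

```python
from urllib.parse import quote, unquote

original = "ménage à trois"
encoded = quote(original)    # non-ASCII text is encoded as UTF-8 first
decoded = unquote(encoded)   # reverses the percent-encoding

print(encoded)  # m%C3%A9nage%20%C3%A0%20trois
print(decoded)  # ménage à trois
```

Spaces become %20 and each byte of the UTF-8 form of an accented letter gets its own %XX escape, which is why "é" turns into two escapes.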

# for more about the urllib module, see

In Perl, it's messy as usual. Long story short, the simplest way is to
use the Perl programs HEAD or GET in /usr/bin or /usr/local/bin. When
one of the networking modules is installed, Perl contaminates your bin
dirs with these programs. In the Unix shell, try
GET ''
which should do the job. HEAD is similar, for the HTTP
headers (assuming they are installed).

If you need more complexity, Perl has LWP::Simple and LWP::UserAgent to
begin with (there are a host of spaghetti others). Both of these need
to be installed separately. Perhaps consult your sysadmin. The last time I
used them was some 2 years ago, so the following code is untested, but
should be it. I don't recall which one can't do what. Your mileage may vary.

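use strict;
use warnings;
# use LWP::Simple;
use LWP::UserAgent;
# build a user agent and a GET request, then fetch the page
my $ua = LWP::UserAgent->new;
my $url = '';
my $request = HTTP::Request->new(GET => $url);
my $response = $ua->request($request);
die $response->status_line unless $response->is_success;
my $content = $response->content;
print $content;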

# Note the Perl code above. Many Perl modules sport an object-oriented
syntax, often concomitantly with a normal syntax version as well.
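
Editor's sketch: for comparison with the request and response objects in the Perl code, a rough Python 3 analogue using urllib.request.Request. The helper name `get`, the User-Agent string, and the sample URL are all hypothetical; this is an illustration, not a translation the original poster supplied.

```python
from urllib.request import Request, urlopen


def get(url, user_agent="example-fetcher/0.1"):
    """Send a GET request with a custom User-Agent and return the raw bytes."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request) as response:
        return response.read()


# A data: URL stands in for a real address so the sketch runs offline.
content = get("data:text/plain,hello")
print(content)
```

As with the Perl version, the request is built as an object first and then handed to the thing that performs it; passing a plain URL string to urlopen is the shorter form.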

this post is from the perl-python a-day mailing list. Please see


Dan Perl 02-04-2005 07:21 PM

Re: [perl-python] get web page programatically

"Xah Lee" wrote in message:
# note the line
# from <library_name> import <function_name1,function_name2...>
# it reads the library and import the function name
# to see available functions in a module one can use "dir"
# import urllib; print dir(urllib)

After about a month, this tutorial has finally reached the syntax of the
"import" statement!

And a word of advice to Python beginners: "print dir(urllib)" is not very
useful in the sense mentioned here (it prints all the names defined in the
module with no explanations, and those names are not only functions, BTW).
But "help(urllib)" is much more useful. Even with such a simple script,
this tutorial still managed to give some bad advice.
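
Editor's sketch of the point above, assuming Python 3 and using urllib.parse as the module under inspection: dir() returns every name in the module, only some of which are functions, while the docstrings that help() displays carry the explanations.

```python
import inspect
import urllib.parse

# dir() lists every name: functions, classes, constants, dunder attributes.
names = dir(urllib.parse)

# Keep only the names bound to plain functions.
functions = [name for name in names
             if inspect.isfunction(getattr(urllib.parse, name))]

print(len(names), "names in total,", len(functions), "of them functions")

# The docstring is what help() would show; print just its first line.
doc = inspect.getdoc(urllib.parse.quote) or ""
print(doc.split("\n")[0])
```

In an interactive session, help(urllib.parse) or help(urllib.parse.quote) shows the full documentation in a pager.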

Chris Mattern 02-04-2005 09:39 PM

Re: [perl-python] get web page programatically
Xah Lee wrote:


Just the standard warnings for any novices unfamiliar with Mr. Lee.
Mr. Lee's posts are regularly riddled with severe errors (I found
the assertion that LWP::Simple and LWP::UserAgent aren't part of
the standard base perl install a particularly amusing one in this
particular post). Please be advised that you should get your
perl information from accurate sources.
is an excellent place to start, with pointers to excellent Perl
books and even some readable for free online (notably Beginning
Christopher Mattern

"Which one you figure tracked us?"
"The ugly one, sir."
"...Could you be more specific?"

All times are GMT.