[perl-python] get web page programatically

 
 
Xah Lee
      02-04-2005
# -*- coding: utf-8 -*-
# Python

# suppose you want to fetch a webpage.
from urllib import urlopen
print urlopen('http://xahlee.org/Periodic_dosage_dir/_p2/russell-lecture.html').read()

# note the line
# from <library_name> import <function_name1, function_name2, ...>
# it loads the library and imports the named functions.
# to see the names available in a module, one can use "dir":
# import urllib; print dir(urllib)

# for more about this module import syntax, see
# http://python.org/doc/tut/node8.html
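
For readers on Python 3, where urlopen lives in urllib.request and print is a function, the fetch above might look roughly like the sketch below. The helper name `fetch` and the variable `public_names` are assumptions made for illustration; they are not from the original post.

```python
# Python 3 sketch of the fetch above. The helper name `fetch` is
# hypothetical; only urlopen() comes from the standard library.
import urllib.request
from urllib.request import urlopen


def fetch(url):
    """Return the raw bytes served at `url`."""
    with urlopen(url) as response:
        return response.read()


# dir() lists the names a module defines, as noted above
public_names = [name for name in dir(urllib.request) if not name.startswith("_")]

# usage (needs network access):
# page = fetch('http://xahlee.org/Periodic_dosage_dir/_p2/russell-lecture.html')
# print(page.decode('utf-8', 'replace'))
```

read() returns bytes in Python 3, so the usage lines decode the page before printing it.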

#---------------------
# sometimes when working with html pages, you need to create links.
# In a url, some chars need to be encoded.
# the "quote" function does it. the "unquote" function reverses it. Very nice.

from urllib import quote
print quote("~joe's home page")
print 'http://www.google.com/search?q=' + quote("ménage à trois")
# (rely on the French to teach us interesting words)

# for more about the urllib module, see
# http://python.org/doc/lib/module-urllib.html

----------------------------
in perl, it's messy as usual. Long story short, the simplest way is to
use the perl programs HEAD or GET in /usr/bin or /usr/local/bin. When
one of the networking modules is installed, perl contaminates your bin
dirs with these programs. In the unix shell,
GET 'http://yahoo.com/'
should do the job, assuming the programs are installed. HEAD works the
same way but fetches only the http headers.
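
For comparison, a headers-only request like the one the HEAD program sends can be sketched in Python 3 as follows. The helper name `head` and its 30 second default timeout are assumptions chosen for illustration.

```python
# Python 3 sketch of what the HEAD command does: ask for headers only.
# `head` is a hypothetical helper name.
from urllib.request import Request, urlopen


def head(url, timeout=30):
    """Send a HEAD request and return the response headers."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers


# usage (needs network access):
# for name, value in head('http://yahoo.com/').items():
#     print(name + ': ' + value)
```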

if you need more complexity, perl has LWP::Simple and LWP::UserAgent to
begin with. (there are a host of spaghetti others.) Both of these need
to be installed separately; perhaps consult your sys admin. The last
time i used them was some 2 years ago, so the following code is
untested, but it should be about right. I don't recall which one can't
do what. Your mileage may vary.

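use strict;
use warnings;
# use LWP::Simple;   # procedural alternative: print get($url);
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->timeout(120);   # give up after 120 seconds
my $url = 'http://yahoo.com/';
my $request = HTTP::Request->new('GET', $url);
my $response = $ua->request($request);
# stop with the http status line if the request failed
die $response->status_line unless $response->is_success;
my $content = $response->content();
print $content;
__END__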

# note the above perl code. Much perl code sports the Object
# Oriented syntax, often alongside a normal (procedural) version
# as well.
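
To put the two languages side by side, here is a rough Python 3 counterpart of the perl script above: a request object, a 120 second timeout, and the body returned as text. This is only a sketch; the function name `get_page` and the fallback to utf-8 when the server names no charset are assumptions.

```python
# Python 3 sketch mirroring the LWP::UserAgent script above.
# `get_page` is a hypothetical helper name.
from urllib.request import Request, urlopen


def get_page(url, timeout=120):
    """Fetch `url` and return its body as text.

    urlopen raises urllib.error.URLError (or its subclass HTTPError)
    when the request fails, so there is no status check here.
    """
    request = Request(url, method="GET")
    with urlopen(request, timeout=timeout) as response:
        # assumption: fall back to utf-8 if no charset is declared
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# usage (needs network access):
# print(get_page('http://yahoo.com/'))
```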

----------------
this post is from the perl-python a-day mailing list. Please see
http://xahlee.org/perl-python/python.html

Xah
(E-Mail Removed)
http://xahlee.org/PageTwo_dir/more.html

 
Chris Mattern
      02-04-2005
Xah Lee wrote:

<snip>

Just the standard warnings for any novices unfamiliar with Mr. Lee.
Mr. Lee's posts are regularly riddled with severe errors (I found
the assertion that LWP::Simple and LWP::UserAgent aren't part of
the standard base perl install a particularly amusing one in this
particular post). Please be advised that you should get your
perl information from accurate sources. http://learn.perl.org
is an excellent place to start, with pointers to excellent Perl
books and even some readable for free online (notably Beginning
Perl).
--
Christopher Mattern

"Which one you figure tracked us?"
"The ugly one, sir."
"...Could you be more specific?"
 
Gunnar Hjalmarsson
      02-04-2005
Chris Mattern wrote:
> Just the standard warnings for any novices unfamiliar with Mr. Lee.
> Mr. Lee's posts are regularly riddled with severe errors


Concur.

> (I found the assertion that LWP::Simple and LWP::UserAgent aren't
> part of the standard base perl install a particularly amusing one in
> this particular post).


I'm not sure about that, though. I thought that libwww-perl was not a
core package.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 