How can I keep LWP::UserAgent from adding the http-equiv strings from the Head section of the page?

 
 
CronJob
Guest
Posts: n/a
 
      03-18-2009
How can I keep LWP::UserAgent from adding the http-equiv strings from
the Head section of the page? When I run the program below,
the $headers variable contains three Content-Type: listings. One from
the actual HTTP header and one from the meta tag in the web page.

#!/usr/bin/perl -w

use LWP::UserAgent;
use HTML::Parse;
use HTML::Element;
use HTTP::Response;
use HTTP::Request;
use HTTP::Status;
use URI::URL;

my ($code, $desc, $headers, $body)=&makeRequest('GET', 'http://www.google.com');
print "The headers:\n$headers\n";
print "The body:\n$body\n";

sub makeRequest {
    my ($method, $path) = @_;
    # create a user agent object
    my $ua = new LWP::UserAgent;
    $ua->agent("Mozilla/4.0");

    # request a url
    my $request = new HTTP::Request($method, $path);
    # send the request; the result is an HTTP::Response object
    my $response = $ua->request($request);

    # use the generated error page as the body if there is an error
    # otherwise use the content of the response
    my $body=$response->content;
    my $code=$response->code;
    my $desc=HTTP::Status::status_message($code);
    my $headers=$response->headers_as_string;
    $body = $response->error_as_HTML if ($response->is_error);
    return ($code, $desc, $headers, $body);
}
 
 
 
 
 
CronJob
Guest
Posts: n/a
 
      03-19-2009
On Mar 18, 5:17 pm, Ben Morrow <(E-Mail Removed)> wrote:
> Quoth CronJob <(E-Mail Removed)>:
>
> > How can I keep LWP::UserAgent from adding the http-equiv strings from
> > the Head section of the page? When I run the following program below,
> > the $headers variable contains three Content-Type: listings. One from
> > the actual http header and one from the meta tag in the web page.

>
> See the ->parse_head method of LWP::UserAgent.
>
> You might want to try reading the docs of the modules you are using.
>
> Ben


Yes, I agree with you. Unfortunately for me, I find the form that is
used in the Perl documentation to be abstruse. I learn by working with
example code, not by reading abstract discussions of how code works
that do not contain working examples. Hopefully it will come to me
over time. I had the same issue with man pages years ago, but now it's
second nature. I appreciate your response and I will look through the
documentation carefully.
 
 
 
 
 
CronJob
Guest
Posts: n/a
 
      03-19-2009
Thank you Ben.

I ran 'perldoc LWP' and found:

The class name for the user agent is "LWP::UserAgent".
<snip>
The parse_head specifies whether we should initialize
response headers from the <head> section of HTML documents.

Running 'perldoc LWP::UserAgent' I see that:

$ua = LWP::UserAgent->new( %options )
    This method constructs a new "LWP::UserAgent" object and
    returns it. Key/value pair arguments may be provided to
    set up the initial state. The following options
    correspond to attribute methods described below:

       KEY                     DEFAULT
       -----------             --------------------
       parse_head              1


I now realize that the 1 is implicitly a boolean value, and hence that
0 should do the trick for me.
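
To see where the extra Content-Type values can come from without touching the network, here is a minimal sketch using HTML::HeadParser, the module that turns <head> contents into header fields. The HTML page and the ISO-8859-1 header value are made up for illustration, and how LWP wires the parser into a response may differ between versions, so treat this as a picture of the mechanism rather than of LWP's internals.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Headers;
use HTML::HeadParser;

# Hypothetical page, used only for illustration.
my $html = <<'HTML';
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Example</title>
</head><body>Hello</body></html>
HTML

# Stand-in for the header the server actually sent (assumed value).
my $headers = HTTP::Headers->new(
    'Content-Type' => 'text/html; charset=ISO-8859-1',
);

# The parser pushes every <meta http-equiv="..."> it finds onto the
# header object it was given, next to whatever is already there.
my $parser = HTML::HeadParser->new($headers);
$parser->parse($html);

# In list context, header() returns every value stored for the field.
my @types = $headers->header('Content-Type');

print scalar(@types), " Content-Type value(s):\n";
print "  $_\n" for @types;
```

With parse_head switched off, LWP skips this step, so only the header sent by the server remains.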

Working code:

#!/usr/bin/perl -w

use strict;
use LWP::UserAgent;
use HTML::Parse;
use HTML::Element;
use HTTP::Response;
use HTTP::Request;
use HTTP::Status;
use URI::URL;

my $ie7UAString = 'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)';
my ($code, $desc, $headers,$body) = &LWPUserAgentRequest('GET','http://www.google.com');
print "The headers:\n$headers\n";
print "The body:\n$body\n";

sub LWPUserAgentRequest {
    my ($method, $path) = @_;
    my $ua = new LWP::UserAgent;
    $ua->agent($ie7UAString);
    $ua->parse_head(0);
    my $request = new HTTP::Request($method, $path);
    my $response = $ua->request($request);
    my $body = $response->content;
    $body = $response->error_as_HTML if ($response->is_error);
    my $code = $response->code;
    my $desc = HTTP::Status::status_message($code);
    my $headers = $response->headers_as_string;
    return ($code, $desc, $headers, $body);
}
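
Since the perldoc excerpt lists parse_head among the constructor options, the same setting can presumably be given when the object is created instead of through a separate method call. A short sketch, which makes no request and only inspects the resulting object (the variable names are illustrative):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Pass the options as key/value pairs to the constructor.
my $ua = LWP::UserAgent->new(
    parse_head => 0,
    agent      => 'Mozilla/4.0',
);

# Called without arguments, the attribute methods return the current
# setting. parse_head is only tested for truth here, because the false
# value may print as an empty string.
my $state = $ua->parse_head ? 'on' : 'off';
my $agent = $ua->agent;

print "parse_head is $state\n";
print "agent is $agent\n";
```

Either form should behave the same; the constructor form just keeps the configuration in one place.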

 
 
J. Gleixner
Guest
Posts: n/a
 
      03-20-2009
CronJob wrote:
[...]
> Working code:
>
> #!/usr/bin/perl -w
>
> use strict;
> use LWP::UserAgent;
> use HTML::Parse;
> use HTML::Element;
> use HTTP::Response;
> use HTTP::Request;
> use HTTP::Status;
> use URI::URL;


Some minor tweaks..


Do you really need all of those?

>
> my $ie7UAString = 'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)';
> my ($code, $desc, $headers,$body) = &LWPUserAgentRequest('GET','http://www.google.com');


Remove the '&' in front of LWPUserAgentRequest.

If you add a '/' to the end of the URL, then the Web server doesn't
have to do it for you.
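
As a rough illustration of the trailing-slash point, the URI module (which LWP itself relies on) can show the difference between the URL as typed and its canonical form. This sketch only manipulates strings and makes no request:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;

# The URL as written in the script: there is no path component at all.
my $uri = URI->new('http://www.google.com');
my $path_as_typed = $uri->path;

# canonical() returns a normalized copy; for http URLs an empty path
# is filled in as "/".
my $canonical = $uri->canonical;

print "path as typed:  '$path_as_typed'\n";
print "canonical form: $canonical\n";
```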

> print "The headers:\n$headers\n";
> print "The body:\n$body\n";


You can call print once, with a list:

print "The headers:\n$headers\n",
"The body:\n$body\n";
>
> sub LWPUserAgentRequest {
> my ($method, $path) = @_;


Usually, it's nice to have a blank line after initializing
the input parameters.

> my $ua = new LWP::UserAgent;


my $ua = LWP::UserAgent->new();

> $ua->agent($ie7UAString);
> $ua->parse_head(0);
> my $request = new HTTP::Request($method, $path);


my $request = HTTP::Request->new( $method, $path );

> my $response = $ua->request($request);
> my $body = $response->content;
> $body = $response->error_as_HTML if ($response->is_error);


my $body = ( $response->is_error )
? $response->error_as_HTML
: $response->content;

> my $code = $response->code;
> my $desc = HTTP::Status::status_message($code);
> my $headers = $response->headers_as_string;


Ya don't really need $headers, you could just return
$response->headers_as_string, instead of $headers, below.

> return ($code, $desc, $headers, $body);
> }
>
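
Putting the last two suggestions together, here is a sketch that can be run offline. The summarize helper and the hand-built responses are hypothetical stand-ins for the real fetch; only the HTTP::Response and HTTP::Status calls are taken from the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Response;
use HTTP::Status qw(status_message);

# Hypothetical helper: turns a response into the same four-element list
# that LWPUserAgentRequest returns.
sub summarize {
    my ($response) = @_;

    # Pick the body in one expression instead of assign-then-overwrite.
    my $body = $response->is_error
        ? $response->error_as_HTML
        : $response->content;

    my $code = $response->code;

    # No temporary $headers: the string is returned directly.
    return ($code, status_message($code), $response->headers_as_string, $body);
}

# Hand-built responses stand in for a real request.
my $ok = HTTP::Response->new(
    200, 'OK', [ 'Content-Type' => 'text/plain' ], "hello\n",
);
my $missing = HTTP::Response->new(
    404, 'Not Found', [ 'Content-Type' => 'text/plain' ], "nope\n",
);

my @ok_parts      = summarize($ok);
my @missing_parts = summarize($missing);

print "$ok_parts[0] $ok_parts[1]\n$ok_parts[2]$ok_parts[3]";
print "$missing_parts[0] $missing_parts[1]\n$missing_parts[3]";
```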

 
 
Tad J McClellan
Guest
Posts: n/a
 
      03-20-2009
J. Gleixner <(E-Mail Removed)> wrote:
> CronJob wrote:



>> my $ua = new LWP::UserAgent;

>
> my $ua = LWP::UserAgent->new();


>> my $request = new HTTP::Request($method, $path);

>
> my $request = HTTP::Request->new( $method, $path );



Just in case you're wondering why this suggested change is
a Really Good Idea, see the "Indirect Object Syntax" section in:

perldoc perlobj


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
 
 
Eric Pozharski
Guest
Posts: n/a
 
      03-20-2009
On 2009-03-20, J. Gleixner <(E-Mail Removed)> wrote:
> CronJob wrote:

*SKIP*
>> print "The headers:\n$headers\n";
>> print "The body:\n$body\n";

>
> You can call print once, with a list:
>
> print "The headers:\n$headers\n",
> "The body:\n$body\n";


With such an outrageous number of newlines, I would suggest

print <<"EOT";
The headers:
$headers
The body:
$body
EOT

*CUT*

--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom
 