LWP user agent grabs the intermediate wait page after POST instead of the actual result page

 
 
bhabs
      02-12-2008
Hi,

I wrote a small LWP-based Perl program to search for air fares on a
travel website using POST.

#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use LWP::UserAgent;

my $web_browser = LWP::UserAgent->new();
# Follow redirects issued in response to POST, not only GET/HEAD.
push @{ $web_browser->requests_redirectable }, 'POST';
$web_browser->timeout(300);

my $web_response = $web_browser->post(
    'http://blabla.com/travel/InitialSearch.do',
    [
        'fromCity' => 'SFO',
        'toCIty'   => 'CVG',
        # .... the rest of the fields occur here
    ],
);

die "Error: ", $web_response->status_line()
    unless $web_response->is_success;

# content() returns the response body as a single string.
my $content = $web_response->content;
print $content;

When I print the content, I see the "intermediate" wait page (the one that
displays a progress bar using JavaScript; I matched the content against
"view source" in Internet Explorer).
I am unable to capture the final air fare page. It takes time for the
website to do the search and then display the air fare result page.
How do I make my program wait for the actual result instead of grabbing
the intermediate response?

Could anyone please help me on this?

Regards,
bhabs
 
 
Ben Morrow
      02-12-2008

Quoth Christian Winter <(E-Mail Removed)>:
> bhabs wrote:
> > I wrote a small LWP-based Perl program to search for air fares on a
> > travel website using POST.
> >
> [...code snipped]
> >
> > When I print the content, I see the "intermediate" wait page (the one that
> > displays a progress bar using JavaScript; I matched the content against
> > "view source" in Internet Explorer).
> > I am unable to capture the final air fare page. It takes time for the
> > website to do the search and then display the air fare result page.
> > How do I make my program wait for the actual result instead of grabbing
> > the intermediate response?
>
> You have to simulate what the browser does, and from your
> description, this is most likely a repeated AJAX request
> to the server. Analyze the behaviour of the JavaScript
> and see how it fetches the progress state and what it
> does once the result is calculated, then craft those
> actions yourself. Your best chance to see exactly what is going
> on in the background is with a network sniffer like Wireshark,
> or a browser plugin like Firefox's Live HTTP Headers.


Or http://www.research.att.com/sw/tools/wsp/, which will write a Perl
script to make the appropriate requests for you.

Ben
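
For illustration, the "craft those actions yourself" advice above could be sketched roughly as follows. This is only a sketch under assumptions, not the site's actual protocol: it assumes the wait page's JavaScript keeps re-requesting a single URL until the fares are ready, that the session is tracked with cookies, and that the wait page can be recognised by a marker string. The polled URL is deliberately not guessed; it is taken from the command line once you have found it with a sniffer or proxy. The names poll_until_ready and main, the default marker "progress", and the retry count and delay are all made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Call $fetch->() until the page it returns no longer matches $wait_re.
# Sleeps $delay seconds between attempts and gives up after $max_tries.
# Returns the first non-wait page, or undef if every attempt failed.
sub poll_until_ready {
    my ($fetch, $wait_re, $max_tries, $delay) = @_;
    for my $try (1 .. $max_tries) {
        my $page = $fetch->();
        return $page if defined $page && $page !~ $wait_re;
        sleep $delay if $delay && $try < $max_tries;
    }
    return;
}

# ASSUMPTION: $result_url is whatever URL the wait page really polls,
# as observed in the captured browser traffic. $wait_marker is any text
# that appears only on the wait page (hypothetical default: "progress").
sub main {
    my ($result_url, $wait_marker) = @_;
    require LWP::UserAgent;

    # ASSUMPTION: the site ties the search to a session cookie, so an
    # in-memory cookie jar is enabled.
    my $ua = LWP::UserAgent->new(cookie_jar => {}, timeout => 300);
    push @{ $ua->requests_redirectable }, 'POST';

    # Start the search exactly as in the original script.
    my $first = $ua->post(
        'http://blabla.com/travel/InitialSearch.do',
        [ 'fromCity' => 'SFO', 'toCIty' => 'CVG' ],  # plus the remaining fields
    );
    die "Error: ", $first->status_line, "\n" unless $first->is_success;

    my $wait_re = defined $wait_marker ? qr/\Q$wait_marker\E/i : qr/progress/i;

    # Re-request the result URL until the wait page is gone.
    my $page = poll_until_ready(
        sub {
            my $res = $ua->get($result_url);
            return $res->is_success ? $res->decoded_content : undef;
        },
        $wait_re, 30, 5,    # hypothetical limits: 30 tries, 5 seconds apart
    );
    die "Gave up waiting for the result page\n" unless defined $page;
    print $page;
}

# Only touch the network when a result URL is given on the command line.
main(@ARGV) if @ARGV;
```

Run it as `perl fares.pl RESULT_URL [WAIT_MARKER]`, where fares.pl is whatever you name the file. If the captured traffic shows something different, for example a separate status URL that returns a redirect target, the closure passed to poll_until_ready is the part to change.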

 
 
Tad J McClellan
      02-13-2008
Christian Winter <(E-Mail Removed)> wrote:
> bhabs wrote:
>> I wrote a small LWP-based Perl program to search for air fares on a
>> travel website using POST.
>>
> [...code snipped]
>>
>> When I print the content, I see the "intermediate" wait page (the one that
>> displays a progress bar using JavaScript; I matched the content against
>> "view source" in Internet Explorer).
>> I am unable to capture the final air fare page. It takes time for the
>> website to do the search and then display the air fare result page.
>> How do I make my program wait for the actual result instead of grabbing
>> the intermediate response?
>
> You have to simulate what the browser does, and from your
> description, this is most likely a repeated AJAX request
> to the server. Analyze the behaviour of the JavaScript
> and see how it fetches the progress state and what it
> does once the result is calculated, then craft those
> actions yourself. Your best chance to see exactly what is going
> on in the background is with a network sniffer like Wireshark,

I like the Web Scraping Proxy for this; it logs the traffic in
the form of LWP Perl code:

http://www.research.att.com/sw/tools/wsp/


> or a browser plugin like Firefox's Live HTTP Headers.



--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
 