Velocity Reviews

rockerd@gmail.com 07-18-2007 02:17 AM

LWP::Simple get (why it would stop working along with wget when fetch still works)
 
Hi Perl People,
Something recently changed on a site that I fetch and parse with
LWP::Simple.
Here is the thing: for the longest time I was using get() to grab an
HTTP page and store it in a scalar that I parsed later. Suddenly I
get an empty but defined scalar from: $html = get($url);

More: when I use fetch on a FreeBSD system, it pulls the page to text
without any problems, but when I use wget on a Linux system I get a
blank file. Everything used to work. I tried changing my User-Agent
header and have had no luck. The only thing I can see is that the
file has an unknown length, but I don't know what to do.

Thanks for the advice,
Rocker


Gunnar Hjalmarsson 07-18-2007 02:57 AM

Re: LWP::Simple get (why it would stop working along with wget when fetch still works)
 
rockerd@gmail.com wrote:
> Something recently changed on a site that I fetch and parse with
> LWP::Simple.
> Here is the thing: for the longest time I was using get() to grab an
> HTTP page and store it in a scalar that I parsed later. Suddenly I
> get an empty but defined scalar from: $html = get($url);


Maybe the web server rejects requests that identify themselves as
coming from Perl (LWP sends "libwww-perl/<version>" as its User-Agent
by default). :( You may want to try without sending a client identifier:

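use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->agent(''); # <- This line may make a difference
my $response = $ua->get('http://www.perl.org/');
# Show the HTTP status instead of silently printing an empty body
die $response->status_line, "\n" unless $response->is_success;
print $response->content;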

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Peter J. Holzer 07-21-2007 11:26 AM

Re: LWP::Simple get (why it would stop working along with wget when fetch still works)
 
On 2007-07-18 02:17, rockerd@gmail.com <rockerd@gmail.com> wrote:
> Something recently changed on a site that I fetch and parse with
> LWP::Simple.
> Here is the thing: for the longest time I was using get() to grab an
> HTTP page and store it in a scalar that I parsed later. Suddenly I
> get an empty but defined scalar from: $html = get($url);


Use LWP::Simple only if you are absolutely sure that you never need the
return code or headers. LWP::UserAgent is almost always the better
choice, especially if you have to handle errors or strange behaviour.
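
To make that concrete, here is a minimal sketch of how LWP::UserAgent can
show what the server actually sent back. It is only an illustration:
report_response() is a hypothetical helper name (not part of LWP), the
30-second timeout is an arbitrary assumption, and the URL is whatever you
pass on the command line.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Summarise an HTTP::Response so that an "empty but defined" body can be
# told apart from an error status or a missing Content-Length.
# report_response() is a hypothetical helper, not part of LWP.
sub report_response {
    my ($response) = @_;
    my $body = $response->content;
    my @lines = (
        'Status: ' . $response->status_line,
        'Content-Type: ' . ($response->header('Content-Type') // '(none)'),
        'Content-Length header: ' . ($response->header('Content-Length') // '(none)'),
        'Body bytes received: ' . length($body),
    );
    return join("\n", @lines) . "\n";
}

# Fetch only when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 30);   # timeout is an assumption
    my $response = $ua->get($url);
    print report_response($response);
    print $response->headers->as_string;           # all response headers
}
```

Run it with the problem URL as the only argument. If the status line is not
a 2xx code, or the body length is 0 while the headers look normal, that
already narrows down where to look; a missing Content-Length would match
the "unknown length" you mention.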


> More: when I use fetch on a FreeBSD system, it pulls the page to text
> without any problems, but when I use wget on a Linux system I get a
> blank file. Everything used to work. I tried changing my User-Agent
> header and have had no luck.


Is "a Linux system" the system where the script normally runs, and "a
FreeBSD system" a different system? It might be that the owner of the
site noticed that you are retrieving data automatically and is now
blocking your IP address.

hp


--
_ | Peter J. Holzer | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR | with an emu on his shoulder.
| | | hjp@hjp.at |
__/ | http://www.hjp.at/ | -- Sam in "Freefall"

