wrote in
news:b0833433-862e-4895-8002-:
> On May 15, 3:16 pm, "Gordon Etly" <ge...@bentsys-INVALID.com> wrote:
>> cha...@lonemerchant.com wrote:
>> > On May 15, 1:37 pm, Uri Guttman <u...@stemsystems.com> wrote:
>> > chadda <cha...@lonemerchant.com> writes:
>> > > i have to know if you could write this mess any slower? you are
>> > > doing everything possible to slow you down.
>> > I know I shouldn't criticize free help, but you seem to have some
>> > anger management issues.
....
>> As a simple answer, take a look at LWP::UserAgent
>> (http://search.cpan.org/~gaas/libwww-perl-5.812/lib/LWP/UserAgent.pm),
>> as a good start in the right direction.
....
> I just tried LWP, and now I can't get the code to work for the life of
> me. Here is what I attempted
As I mentioned elsewhere, all you need is LWP::Simple.
So, here is a fish for you:
C:\Temp> cat p.pl
#!/usr/bin/perl

use strict;
use warnings;

use HTML::TokeParser;
use LWP::Simple;

my ($input_file) = @ARGV;
die "No input file specified\n" unless defined $input_file;

open my $INPUT, '<', $input_file
    or die "Cannot open '$input_file': $!";

ID:
while ( my $id = <$INPUT> ) {
    chomp $id;
    my $url = make_url( $id );
    my $html = get $url;
    unless ( defined $html ) {
        warn "Error downloading from '$url'\n";
        next ID;
    }

    my $parser = HTML::TokeParser->new( \$html );

    TABLE:
    while ( my $token = $parser->get_tag('table') ) {
        if ( lc $token->[1]{id} eq 'product_details' ) {
            my $td = $parser->get_tag('td');
            last TABLE unless $td;
            my $cell = $parser->get_text('/td');
            my %data;
            while ( $cell =~ /\s*([^:]+?):\s+(\d+)\s+/g ) {
                $data{$1} = $2;
            }
            use Data::Dumper;
            print Dumper \%data;
        }
    }
}

sub make_url {
    return
        sprintf q{http://www.doba.com/members/catalog/%s.html}, $_[0];
}

__END__
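
If you want to play with the label/value extraction without hitting the network, here is a rough sketch that runs the same regex over a string. The sub name parse_pairs and the sample text are made up for illustration; I am only guessing at the spacing of the real cell text from the output shown in this post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect "Label: digits" pairs from a chunk of text
# using the same pattern as the inner loop of p.pl.
sub parse_pairs {
    my ($cell) = @_;
    my %data;
    while ( $cell =~ /\s*([^:]+?):\s+(\d+)\s+/g ) {
        $data{$1} = $2;
    }
    return \%data;
}

# Made-up sample shaped like the Dumper output in this post; the real
# page's text and whitespace are an assumption.
my $sample = "Product ID: 3308191\nItem ID: 3653992\n"
           . "SKU: 8930\nUPC: 896207999816\n";

my $data = parse_pairs($sample);
print "$_ => $data->{$_}\n" for sort keys %$data;
```

Note that the pattern insists on whitespace after each number, so a final value that runs straight into the end of the string is skipped.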
C:\Temp> timethis p list
$VAR1 = {
'Product ID' => '3308191',
'UPC' => '896207999816',
'Item ID' => '3653992',
'SKU' => '8930'
};
TimeThis : Command Line : p list
TimeThis : Start Time : Thu May 15 18:19:28 2008
TimeThis : End Time : Thu May 15 18:19:29 2008
TimeThis : Elapsed Time : 00:00:01.062
Comparing this to the overhead of an empty script:
C:\Temp> cat t.pl
#!/usr/bin/perl
use strict;
use warnings;
C:\Temp> timethis t
TimeThis : Command Line : t
TimeThis : Start Time : Thu May 15 18:20:38 2008
TimeThis : End Time : Thu May 15 18:20:38 2008
TimeThis : Elapsed Time : 00:00:00.218
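Subtracting the 0.218 seconds of startup overhead from the 1.062 second
total, it took 0.844 seconds to retrieve and parse the required
information. Of course, the fixed startup cost would be better amortized
if you ran a lot of these queries in a single invocation.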
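
To put rough numbers on the amortization, here is a back-of-the-envelope sketch using the two timings above. The sub name cost_per_id is made up, and the model assumes every query costs the same as the single one I measured, which is only a guess.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Figures taken from the two timethis runs in this post.
my $startup   = 0.218;    # empty script: interpreter startup only
my $per_query = 0.844;    # 1.062 total minus 0.218 startup

# Hypothetical helper: average wall-clock seconds per ID when $n IDs
# are processed by one invocation of the script.
# Assumption: each query costs the same as the single measured one.
sub cost_per_id {
    my ($n) = @_;
    return $startup / $n + $per_query;
}

printf "%4d IDs: %.3f s per ID\n", $_, cost_per_id($_) for 1, 10, 100;
```

The 0.844 figure also includes loading LWP::Simple and HTML::TokeParser, which happens once per run, so the real saving for a long list is probably larger than this simple model suggests.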
--
A. Sinan Unur <>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/