Problem using TableExtract 1.08

Darren Dunham
09-07-2003
I'm trying to pull some data out of a table that I retrieve from an HTML
page. I found HTML::TableExtract and it looks like it could do what I
want. Is there a better module I should be using or any known problems
with it?

However, I'm having trouble getting the "header" method to behave
exactly like I want. The first thing I did was to have it print all the
tables from the HTML so I could see what it finds. This is the relevant
bit of the code that does that...

# Examine all matching tables
foreach my $ts ($te->table_states) {
    print "Table (", join(',', $ts->coords), "):\n";
    print "MAP (", join(',', $ts->column_map), "):\n";
    foreach my $row ($ts->rows) {
        print join('<>', @$row), "\n";
    }
}

And then this is the stuff from the top of the table I want...

Table (1,8):
MAP (0,1,2,3,4,5,6,7,8):
[Blah blah blah useless data on top line]<><><><><><><>
Player<>AB<>H<>HR<>RBI<>R<>OPS<>SB<>BA
Abad, Andy 1B BOS<>0<>0<>0<>0<>0<>0.000<>0<>0.0000
[.. table continues]

So I tried the simple "header" extraction method. If I supply all the
headers from the line that begins "Player", then I get no data. If I
supply some of the headers, I sometimes get data. The strange thing is
that simply rearranging the order will change whether or not data is
returned.

my $te = HTML::TableExtract->new(
    headers => [ "Player", "AB", "H", "HR" ],
);

# perl extract
# (no data returned).

Change the above to this...

my $te = HTML::TableExtract->new(
    headers => [ "Player", "AB", "HR", "H" ],
);

# perl extract
Table (1,8):
MAP (0,1,3,2):
Abad, Andy 1B BOS<>0<>0<>0
[... more data...]
#

So the order of the headers matters (which I don't think it should).
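
For anyone who wants to try this, a minimal self-contained sketch of the
test might look like the following. The inline HTML is a made-up
stand-in for the real page (just the header row and one data row), and
extract_rows is a hypothetical helper name, not part of the module. It
runs the same extraction with both header orders so the results can be
compared side by side.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical stand-in for the real page: only the header row and one
# data row from the table shown above.
my $html = <<'END_HTML';
<html><body>
<table>
<tr><td>Player</td><td>AB</td><td>H</td><td>HR</td><td>RBI</td><td>R</td><td>OPS</td><td>SB</td><td>BA</td></tr>
<tr><td>Abad, Andy 1B BOS</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0.000</td><td>0</td><td>0.0000</td></tr>
</table>
</body></html>
END_HTML

# Hypothetical helper: parse $html with the given headers and return
# every row (as an array ref) from every table that matched.
sub extract_rows {
    my ($html, @headers) = @_;
    my $te = HTML::TableExtract->new(headers => \@headers);
    $te->parse($html);
    $te->eof;
    # table_states is the method name used with 1.08 in this post;
    # later releases document the same thing as tables().
    return map { $_->rows } $te->table_states;
}

# Same headers, two different orders.
foreach my $headers ([qw(Player AB H HR)], [qw(Player AB HR H)]) {
    my @rows = extract_rows($html, @$headers);
    print "headers (", join(',', @$headers), "): ",
          scalar(@rows), " row(s)\n";
    foreach my $row (@rows) {
        # Empty cells may come back undefined, so print them as ''.
        print join('<>', map { defined $_ ? $_ : '' } @$row), "\n";
    }
}
```

If both orders print the same number of rows, the problem does not
reproduce with the installed version of the module.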

Is there any way to use the "headers" to simply select a table (rather
than relying on it being in position (1,8), for example) but then
return all of its columns? Or is there something I'm doing wrong here?

Thanks.

--
Darren Dunham
Unix System Administrator Taos - The SysAdmin Company
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
 