Velocity Reviews

Michael Goerz 10-06-2006 07:08 AM

parsing event handler and object data
 
Hi,

I'm having some trouble with the event based HTML parser module
HTML::Parser. See the attached example code. The problem is this:

The event handlers seem to be completely self-contained, they only get
the parameters that are passed to them by the parser. However, I'd like
them to access variables from a higher scope, such as object data from
the class I'm using the HTML parser in. I suppose the same problem
arises with other event-based parsers, too. What's the right way to do
something like this?

Thanks,
Michael


Brian McCauley 10-06-2006 04:38 PM

Re: parsing event handler and object data
 


On Oct 6, 8:08 am, Michael Goerz <new...@8439.e4ward.com> wrote:
> Hi,
>
> I'm having some trouble with the event based HTML parser module
> HTML::Parser. See the attached example code. The problem is this:
>
> The event handlers seem to be completely self-contained, they only get
> the parameters that are passed to them by the parser. However, I'd like
> them to access variables from a higher scope, such as object data from
> the class I'm using the HTML parser in.


What you are saying is you want to pass the HTML::Parser a callback
that calls back to an object method rather than just to a subroutine.

> I suppose the same problem arises with other event-based parsers, too.


Or any API with callbacks.

Your question is, in fact, almost a FAQ, but it's perhaps not immediately
obvious that this is the case.

The FAQ in question is "How can I pass/return a {Function, FileHandle,
Array, Hash, Method, Regex}?". One way of looking at it is that you are
asking "How can I pass a Method?"
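As a rough illustration of "passing a method" to a callback API, here is a minimal sketch in core Perl. The each_item() function and the Counter class are hypothetical stand-ins invented for this example; they are not part of HTML::Parser or any other module. The idea is to wrap the method call in a closure that captures the object.

```perl
use strict;
use warnings;

# Hypothetical stand-in for any callback-taking API: it calls $callback
# once for each item, passing only the item itself.
sub each_item {
    my ($callback, @items) = @_;
    $callback->($_) for @items;
}

{
    package Counter;    # hypothetical class, for illustration only

    sub new  { my $class = shift; return bless { seen => [] }, $class }
    sub note { my ($self, $item) = @_; push @{ $self->{seen} }, $item }
}

my $counter = Counter->new;

# The closure captures $counter, so the API never needs to know about it.
each_item( sub { $counter->note(@_) }, qw(html head body) );

print join(", ", @{ $counter->{seen} }), "\n";    # prints "html, head, body"
```

The API still sees an ordinary code reference; the object travels along inside the closure.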

> What's the right way to do something like this?


This is Perl! There's more than one right way.

> Content-Type: application/x-perl; name="TestParser.pm"


text/plain please (or simply inline your text).

[ code slightly simplified for illustrative purposes, the OP's code was
an excellent *minimal* but *complete* illustration of his point ]

> sub parse{
> my $self = shift;
> my $p = HTML::Parser->new( api_version => 3,
> start_h => [\&start, "tagname, attr"],
> );
> $p->parse_file($self->{infile});
>}
>
> sub start{
> # Doesn't work, how do I access variables from a higher scope?
> $self->{tagname} = shift;
>}


Right. There are three approaches that spring to mind.

1) Move start() inside the lexical scope of parse() so that $self is
in scope. This is slightly complicated by the fact that Perl doesn't
have proper named nested subs but does have anonymous closures.

2) Call start() as a method using a small closure as a shim.

3) Use package variables for the data that you want to share between
multiple lexical scopes.

Note: Solution 3 is considered dirty by some. It is generally the
easiest to debug, unless you are interfacing to an object that will
persist beyond the context in which it is created, in which case it
becomes the hardest to debug.

In your code the HTML::Parser object will only exist while parse() is
on the stack. Furthermore, if parse() is called reentrantly then you
can be sure that the HTML::Parser object from the outer instance of
parse() will never try to call back during the execution of the inner
parse(). Only because these two conditions are met is it safe to opt
for solution 3.
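To make those two conditions concrete, here is a small sketch in core Perl of how a local()ized package variable behaves. The run() and handler() subs and the $current variable are hypothetical stand-ins for parse(), start() and $self; no HTML::Parser is involved. It shows that a reentrant call gets its own value and the outer value comes back afterwards, but a callback that fires after the call has returned no longer sees anything.

```perl
use strict;
use warnings;

our $current;    # package variable shared with the callback
my @log;

# Stand-in for start(): records whatever $current holds when it is called.
sub handler { push @log, defined $current ? $current : 'undef' }

# Hypothetical stand-in for parse(): sets $current for the duration of the
# call, optionally recurses, then fires the callback.
sub run {
    my ($name, $inner) = @_;
    local $current = $name;
    run($inner) if defined $inner;    # reentrant call
    handler();                        # sees this call's value again
    return \&handler;                 # callback escapes the dynamic scope
}

my $escaped = run('outer', 'inner');
$escaped->();    # called after run() returned: local has been undone

print "@log\n";    # prints "inner outer undef"
```

The last entry is the hard-to-debug case: if the object holding the callback outlives the call that set the variable, the callback silently sees the wrong (here undefined) value.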


# Solution 1

sub parse {
    my $self = shift;
    my $start = sub {
        $self->{tagname} = shift;
    };

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [ $start, "tagname, attr" ],
    );
    $p->parse_file($self->{infile});
}

# Solution 2

sub parse {
    my $self = shift;
    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [ sub { $self->start(@_) }, "tagname, attr" ],
    );
    $p->parse_file($self->{infile});
}

sub start {
    my $self = shift; # We're now a method
    $self->{tagname} = shift;
}

# Solution 3

our $self;

sub parse {
    local $self = shift;
    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [ \&start, "tagname, attr" ],
    );
    $p->parse_file($self->{infile});
}

sub start {
    $self->{tagname} = shift;
}


