raga wrote:
> From the link given here :
> http://search.cpan.org/~kmacleod/lib...oc/PerlSAX.pod
> Perl sax seems to split the characters call for a single entity.
> Though this is wierd.(not sure if there is a genuine reason) it is
> fine.. as all belong to same entity, we can simply append all the
> characters calls.
The URL you provide says this:
"The Parser will call this method to report each chunk of character
data. SAX parsers may return all contiguous character data in a single
chunk, or they may split it into several chunks;"
> However ,sadly it just calls the characters api with an unwanted
> space.
> Eg: i've tag < tag1>mynameisrs</tag>
That isn't well-formed XML and so can't be parsed.
1. You have a space in front of the first tag name.
2. You open tag1 but close tag.
> it calls characters("myname") characters(" ") characters("isrs") ,
> It is not atall predictible why it is doing this way.
In my experience it is always sufficiently predictable. Probably your
mynameisrs data is split over several lines and you've not written your
handler to take this into account.
$ cat sax.pl
#!/usr/local/bin/perl
use strict;
use warnings;
use XML::Parser::PerlSAX;
my $xml="<tag>mynameisrs</tag>";
my $handler = MyHandler->new();
my $parser = XML::Parser::PerlSAX->new(Handler=>$handler);
$parser->parse($xml);
package MyHandler;
use strict;
use warnings;
use Data::Dumper;
sub new {
    my $type = shift;
    return bless {}, $type;
}
my $current_element = '';
sub start_element {
    my ($self, $element) = @_;
    $current_element = $element->{Name};
    print "Start: <$current_element>\n";
}
sub end_element {
    my ($self, $element) = @_;
    print "End: \n";
}
sub characters {
    my ($self, $characters) = @_;
    my $text = $characters->{Data};
    print "Characters: '$text'\n";
}
1;
$ perl sax.pl
Start: <tag>
Characters: 'mynameisrs'
End:
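
If your parser does hand you the data in several chunks, the usual approach is to append each chunk to a buffer in characters() and only use the text in end_element(). Below is a minimal sketch of that idea, not a drop-in solution. BufferingHandler and its buffer/texts fields are names I made up for illustration, and the parser is simulated by calling the handler methods directly with the same hash shapes PerlSAX passes (Name, Data), so it runs without XML::Parser::PerlSAX installed. It also assumes the element has no child elements; nested elements would need a stack of buffers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package BufferingHandler;   # hypothetical name, for illustration only

sub new {
    my $type = shift;
    # buffer: text collected for the element currently open
    # texts:  completed text of every element seen so far
    return bless { buffer => '', texts => [] }, $type;
}

sub start_element {
    my ($self, $element) = @_;
    $self->{buffer} = '';                      # start collecting afresh
}

sub characters {
    my ($self, $characters) = @_;
    $self->{buffer} .= $characters->{Data};    # append; never assume one call
}

sub end_element {
    my ($self, $element) = @_;
    push @{ $self->{texts} }, $self->{buffer};
    print "Text of <$element->{Name}>: '$self->{buffer}'\n";
    $self->{buffer} = '';
}

package main;

# Simulate a parser that splits the character data into three chunks.
my $handler = BufferingHandler->new();
$handler->start_element({ Name => 'tag' });
$handler->characters({ Data => $_ }) for ('myna', 'meis', 'rs');
$handler->end_element({ Name => 'tag' });
```

This prints "Text of <tag>: 'mynameisrs'" however the data was chunked. To use it with a real parser, pass an instance as Handler, as in sax.pl above. Note that chunks are joined verbatim, so any whitespace or newline that is really in the document will still appear in the result.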
--
RGB