Copy characterdata from XML file to XML file

 
 
Eric van Oorschot
      12-07-2005
Hi,

I'm writing a Perl script that has to copy a block of data (node numbers
and coordinates) from one XML-formatted file into another XML file.
I'm using XML::Parser to extract the data and XML::Writer to write the
data into the second file.

This does not work, since some of the numbers are corrupted after being
read by XML::Parser. Below I have copied a small bit that shows how the
data is corrupted. It always happens at the same line(s) of data.

67 2.9005093479606E+000 3.6637104002418E-001 7.9522656092442E-001
68 2.8852994122583E+000 3.5353599488296E-001 7.7516591265738E-001
69 2.9109259023248E+000 3.5272037818926E-001 8.1765470045
602E-001
70 2.9014248453522E+000 3.4032368974452E-001 7.9417266267164E-001
71 2.8849923984542E+000 3.2706829720117E-001 7.7537618002780E-001

My Perl script (I am not an experienced Perl programmer) is shown below.
The error occurs in the sub 'ReadCharacterData'. In this subroutine the
data is read and copied into a hash %tables. When this hash is written to
the output file, the corruption shown above appears.

If anyone has an idea, or needs more info, please reply.

Regards,

Eric


use XML::Parser;
use IO::File;
use Switch ;
use XML::Writer;

my $fmsfile = shift ; # fms output file
my $reffile = shift ; # Exchange output deck
my $outfile = shift ; # Output file

die "Cannot find fms output file \"$fmsfile\""
unless -f $fmsfile;

die "Cannot find xml input deck \"$reffile\""
unless -f $reffile;

my $output = new IO::File(">$outfile");
my $writer = new XML::Writer( OUTPUT => $output, UNSAFE => 1 );

#
# Find tmax in fms file
#
my $tmax = 0.00 ;
open ( IN, $fmsfile ) ;
while ( <IN> ) {
if ( /TIME/ ) {
( $dum, $dum, $dum, $ti ) = split /\s+/ ;
$tmax = $ti if ( $ti > $tmax ) ;
}
}
close (IN) ;

$tag = "";

my %tables ; # hash with coordinates from fms file
my $model ; # name of the FE model
my $i = 0 ; #
# Readfile to create hash of the coordinate tables
#
my $parser = new XML::Parser;

$parser->setHandlers( Char => \&ReadCharacterData,
Default => \&default);
print "Reading fms file ($fmsfile)\n" ;
$parser->parsefile($fmsfile);

## Check info read in fms file
#foreach $i ( keys %tables ) {
# print "Table $i\n",$tables{$i},"\n End table $i\n\n";
# }

my $coords = 0 ;

#
# Read reffile and replace coordinate tables with data from fms file
#
my $bparser = new XML::Parser;
$bparser->setHandlers( XMLDecl => \&XmlDecl,
Doctype => \&DocType,
Start => \&startElement,
End => \&endElement,
Char => \&characterData,
CdataStart => \&cdatastart,
CdataEnd => \&cdataend,
Default => \&default);
print "Reading ($reffile) and writing ($outfile) \n" ;
$bparser->parsefile($reffile);

$writer->end() ;

#
########################################################################
#

sub XmlDecl {
my( $parseinst, $version, $encoding, $standalone ) = @_;
$writer->xmlDecl( $encoding, $standalone );
}

sub DocType {
my( $parseinst, $name, $sysid, $pub, $internal ) = @_;
$writer->doctype( $name, $pub, $sysid );
}

sub startElement {
# Reading xml data
my( $parseinst, $element, %attrs ) = @_;
SWITCH: {
if ($element eq "FE_MODEL") {
$model = $attrs{'NAME'} ;
$tag = "DEFINE";
# print "FE model $model\n" ;
last SWITCH;
}
if ($element eq "TABLE" && $attrs{'TYPE'} =~ /COORDINATE/ ) {
$coords = 1 ;
# print "$coords - TABLE COORDINATES\n" ;
}
last SWITCH ;
}
$writer -> startTag( $element , %attrs );
}

sub endElement {

my( $parseinst, $element ) = @_;
$coords = 0 ;
$writer -> endTag( $element ) ;
}

sub ReadCharacterData {
my( $parseinst, $data ) = @_;
SWITCH: {
if ( $data =~ /^\s*$/ ) {
last ;
};
if ( $data =~ /TIME/ ) {
( $dum, $dum, $dum, $ti ) = split /\s+/, $data ;
# print "Timepoint ", $ti, "\n" ;
last ;
} ;
if ( $data =~ /FE MODEL/ ) {
($dum, $dum, $dum, $dum, $dum ) = split /\s+/, $data ;
( $txt = $dum ) =~ s/\/.*\/// ; # strip system numbering
# print $txt, "\n" ;
$tables{$txt} = ' ' ;
last ;
} ;
if ( $ti == $tmax ) {
# print $data ;
$tables{$txt} .= $data . "\n" ;
last ;
} ;
}
}


sub characterData {
my( $parseinst, $data ) = @_;
if ( $writer->within_element('FE_MODEL') && $writer->within_element('TABLE') && $coords && $data !~ /^\s*$/ ) {
$writer -> characters ( $tables{$model} ) ;
$tables{$model} = ' ' ; #empty table
}
elsif ( ! $coords ) {
# print "Coords $coords : $data";
$writer -> characters( $data ) ;
}
}

sub cdatastart {
$writer -> raw( "<![CDATA[\n" );
}
sub cdataend {
$writer -> raw( "]]>\n" );
}
sub default {

# do nothing, but stay quiet

}


 
John Bokma
      12-07-2005
Eric van Oorschot wrote:

> Hi,
>
> I'm writing a Perl script that has to copy a block of data (node
> numbers and coordinates) from one XML-formatted file into another XML
> file. I'm using XML::Parser to extract the data and XML::Writer to
> write the data into the second file.
>
> This does not work, since some of the numbers are corrupted after
> being read by XML::Parser. Below I have copied a small bit that shows
> how the data is corrupted. It always happens at the same line(s) of
> data.
>
> 67 2.9005093479606E+000 3.6637104002418E-001 7.9522656092442E-001
> 68 2.8852994122583E+000 3.5353599488296E-001 7.7516591265738E-001
> 69 2.9109259023248E+000 3.5272037818926E-001 8.1765470045
> 602E-001
> 70 2.9014248453522E+000 3.4032368974452E-001 7.9417266267164E-001
> 71 2.8849923984542E+000 3.2706829720117E-001 7.7537618002780E-001


Be aware that character data is *not* delivered to your handler in one
big string; the documentation says so as well. The Char handler might get
called several times for a single run of text, and a quick glance at your
script suggests that it does not allow for this. What you should do:

when a start element is found: reset the global string buffer
when character data is found: append it to the global buffer
when an end element is found: process the global string buffer
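
A minimal sketch of those three steps, for illustration only. The sample document, the variable names ($buffer, $calls, @tables) and the choice to collect only TABLE elements are assumptions of mine, not taken from the original script. It also ignores nested elements: resetting the buffer at every start tag discards any text the parent had collected so far.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample document; the element name TABLE is borrowed from
# the script above, the content is shortened and made up for illustration.
my $xml = <<'XML';
<TABLE>
67 2.9005093479606E+000 3.6637104002418E-001
68 2.8852994122583E+000 3.5353599488296E-001
</TABLE>
XML

my $buffer = '';   # collects character data for the current element
my $calls  = 0;    # how many times the Char handler fired
my @tables;        # complete text of each finished TABLE element

my $parser = XML::Parser->new(
    Handlers => {
        # Start of an element: reset the buffer.
        Start => sub { $buffer = ''; },
        # Character data may arrive in several pieces: append, never assign.
        Char  => sub { my ($p, $data) = @_; $buffer .= $data; $calls++; },
        # End of an element: the buffer now holds the complete text.
        End   => sub {
            my ($p, $element) = @_;
            push @tables, $buffer if $element eq 'TABLE';
            $buffer = '';
        },
    },
);
$parser->parse($xml);

print "Char handler calls: $calls\n";
print $tables[0];
```

Because the text is only looked at in the End handler, it does not matter where the parser happens to cut the character data; splitting the finished buffer into lines gives whole numbers again.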

HTH,


--
John Small Perl scripts: http://johnbokma.com/perl/
Perl programmer available: http://castleamber.com/
I ploink googlegroups.com

 
robic0@yahoo.com
      12-08-2005

I have to agree with John Bokma: you need a better strategy
for capturing content data. Separate the processing
from the event handling as much as possible.
You can't rely on the parser to return content in one piece
or exactly as it appears in the source. And you can't
really tell where content data begins and ends
without handling the start and end events as well.

Print out the XML with indents so you can see
what is being sent to the handlers. A form
like this may help:

my $RD_xml = '';        # raw XML text of the chunk being captured
my $last_content = '';  # character data collected for the current element
my $special_tag = 0;    # true while inside SPECIAL_TAG_ELEMENT
my $capturing_Is_part_of_larger_xml = 0;  # true while inside CAPTURE_ALL_OF_ME

## ProcessContent and ProcessXmlChunk stand for your own routines.
## Register the handlers with, for example:
##   $parser->setHandlers( Start => \&default_start_handler,
##                         Char  => \&default_content_handler,
##                         End   => \&default_end_handler );

sub default_start_handler
{
    my ($p, $element, %atts) = @_;
    $element = uc($element);
    $last_content = '';

    ## Check for start of singular tag data capture
    ## -----------------------------------------------
    if ($element eq 'SPECIAL_TAG_ELEMENT')
        { $special_tag = 1; }

    ## Check for start of XML chunk data capture
    ## -----------------------------------------------
    if ($element eq 'CAPTURE_ALL_OF_ME') {
        $RD_xml = '';
        $capturing_Is_part_of_larger_xml = 1;
    }
    if ($capturing_Is_part_of_larger_xml)
        { $RD_xml .= $p->original_string; }
}

sub default_content_handler
{
    my ($p, $str) = @_;

    ## Use original for entities, in case of a reparse
    ## --------------------------------------------
    $str = $p->original_string;

    ## Capture what is necessary. Last content is
    ## always captured by default. Do not trim here:
    ## one run of text may arrive in several pieces, and
    ## trimming each piece could glue two numbers together.
    ## -----------------------------------------------
    $last_content .= $str;
    $RD_xml .= $str if ($capturing_Is_part_of_larger_xml);
}

sub default_end_handler
{
    my ($p, $element) = @_;
    $element = uc($element);

    ## Handle singular capture of special tag data
    ## ---------------------------------------------
    if ($element eq 'SPECIAL_TAG_ELEMENT') {
        ## Remove leading/trailing space, newline, tab,
        ## now that the content is complete
        $last_content =~ s/^[\x20\n\t]+//;
        $last_content =~ s/[\x20\n\t]+$//;
        ProcessContent ($last_content) if ($special_tag);
        $special_tag = 0;
    }
    $last_content = '';

    ## Handle larger capture XML chunks
    ## -----------------------------------
    if ($element eq 'CAPTURE_ALL_OF_ME') {
        if ($capturing_Is_part_of_larger_xml) {
            $RD_xml .= $p->original_string;
            ProcessXmlChunk ($RD_xml);
        }
        $RD_xml = '';
        $capturing_Is_part_of_larger_xml = 0;
    }
}
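
The suggestion above to print the XML with indents can be sketched roughly like this. It is only an illustration: the input string and the names $tracer, $depth and @trace are made up, and the output format is my own choice. Each handler call becomes one line, indented by nesting depth, so a text node that arrives in several pieces shows up as several Char lines.

```perl
use strict;
use warnings;
use XML::Parser;

my $depth = 0;   # current nesting level
my @trace;       # one line per handler call

my $tracer = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($p, $element) = @_;
            push @trace, ('  ' x $depth) . "<$element>";
            $depth++;
        },
        Char => sub {
            my ($p, $data) = @_;
            (my $shown = $data) =~ s/\n/\\n/g;   # make newlines visible
            push @trace, ('  ' x $depth) . "Char [$shown]";
        },
        End => sub {
            my ($p, $element) = @_;
            $depth--;
            push @trace, ('  ' x $depth) . "</$element>";
        },
    },
);

# Hypothetical input, made up for illustration.
$tracer->parse("<FE_MODEL><TABLE>69 2.91E+000\n70 2.90E+000</TABLE></FE_MODEL>");

print "$_\n" for @trace;
```

Running the same tracer over the real fms file with parsefile should show where the coordinate lines are being cut.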

 
robic0@yahoo.com
Guest
Posts: n/a
 
      12-08-2005

Eric van Oorschot wrote:
> Hi,
>
> I'm writing a Perl script that has to copy a block of data (nodes numbers
> and coordinates) from one XML formatted file into another XML file.
> I'm using XML:arser to extract the data and XML::Writer to write the
> data into the second file.
>
> This does not work, since some of the numbers are corrupted after being
> read by XML:arser. Below I have copied a small bit that shows how the
> data is corrupted. It always happens at the same line(s) of data.
>
> 67 2.9005093479606E+000 3.6637104002418E-001 7.9522656092442E-001
> 68 2.8852994122583E+000 3.5353599488296E-001 7.7516591265738E-001
> 69 2.9109259023248E+000 3.5272037818926E-001 8.1765470045
> 602E-001
> 70 2.9014248453522E+000 3.4032368974452E-001 7.9417266267164E-001
> 71 2.8849923984542E+000 3.2706829720117E-001 7.7537618002780E-001
>
> My Perl script (I am not an experienced Perl programmer) is shown below.
> The error occurs in the sub 'ReadCharacterData'. In this subroutine the
> data is read and copied into a hash %tables. When writing this hash in the
> output file the error shown above is found.
>
> If anyone has an idea, or needs more info, please reply.
>
> Regards,
>
> Eric
>
>
> use XML:arser;
> use IO::File;
> use Switch ;
> use XML::Writer;
>
> my $fmsfile = shift ; # fms output file
> my $reffile = shift ; # Exchange output deck
> my $outfile = shift ; # Output file
>
> die "Cannot find fms output file \"$xmlfile\""
> unless -f $fmsfile;
>
> die "Cannot find xml input deck \"$reffile\""
> unless -f $reffile;
>
> my $output = new IO::File(">$outfile");
> my $writer = new XML::Writer( OUTPUT => $output, UNSAFE => 1 );
>
> #
> # Find tmax in fms file
> #
> my $tmax = 0.00 ;
> open ( IN, $fmsfile ) ;
> while ( <IN> ) {
> if ( /TIME/ ) {
> ( $dum, $dum, $dum, $ti ) = split /\s+/ ;
> $tmax = $ti if ( $ti > $tmax ) ;
> }
> }
> close (IN) ;
>
> $tag = "";
>
> my %tables ; # hash with coordinates from fms file
> my $model ; # naam van het FE model
> my $i = 0 ; #
> # Readfile to create hash of the coordinate tables
> #
> my $parser = new XML:arser;
>
> $parser->setHandlers( Char => \&ReadCharacterData,
> Default => \&default);
> print "Reading fms file ($fmsfile)\n" ; $parser->parsefile($fmsfile);
>
> ## Check info read in fms file
> #foreach $i ( keys %tables ) {
> # print "Table $i\n",$tables{$i},"\n End table $i\n\n";
> # }
>
> my $coords = 0 ;
>
> #
> # Read reffile and replace coordinate tables with data from fms file
> #
> my $bparser = new XML:arser;
> $bparser->setHandlers( XMLDecl => \&XmlDecl,
> Doctype => \&DocType,
> Start => \&startElement,
> End => \&endElement,
> Char => \&characterData,
> CdataStart => \&cdatastart,
> CdataEnd => \&cdataend,
> Default => \&default);
> print "Reading ($reffile) and writing ($outfile) \n" ;
> $bparser->parsefile($reffile);
>
> $writer->end() ;
>
> #
> ################################################## ######################
> #
>
> sub XmlDecl {
> my( $parseinst, $version, $encoding, $standalone ) = @_;
> $writer->xmlDecl( $encoding, $standalone );
> }
>
> sub DocType {
> my( $parseinst, $name, $sysid, $pub, $internal ) = @_;
> $writer->doctype( $name, $pub, $sysid );
> }
>
> sub startElement {
> # Reading xml data
> my( $parseinst, $element, %attrs ) = @_;
> SWITCH: {
> if ($element eq "FE_MODEL") {
> $model = $attrs{'NAME'} ;
> $tag = "DEFINE";
> # print "FE model $model\n" ;
> last SWITCH;
> }
> if ($element eq "TABLE" && $attrs{'TYPE'} =~ /COORDINATE/ ) {
> $coords = 1 ;
> # print "$coords - TABLE COORDINATES\n" ;
> }
> last SWITCH ;
> }
> $writer -> startTag( $element , %attrs );
> }
>
> sub endElement {
>
> my( $parseinst, $element ) = @_;
> $coords = 0 ;
> $writer -> endTag( $element ) ;
> }
>
> sub ReadCharacterData {
> my( $parseinst, $data ) = @_;
> SWITCH: {
> if ( $data =~ /^\s*$/ ) {
> last ;
> };
> if ( $data =~ /TIME/ ) {
> ( $dum, $dum, $dum, $ti ) = split /\s+/, $data ;
> # print "Timepoint ", $ti, "\n" ;
> last ;
> } ;
> if ( $data =~ /FE MODEL/ ) {
> ($dum, $dum, $dum, $dum, $dum ) = split /\s+/, $data ;
> ( $txt = $dum ) =~ s/\/.*\/// ; # strip system numbering
> # print $txt, "\n" ;
> $tables{$txt} = ' ' ;
> last ;
> } ;
> if ( $ti == $tmax ) {
> # print $data ;
> $tables{$txt} .= $data . "\n" ;
> last ;
> } ;
> }
> }
>
>
> sub characterData {
> my( $parseinst, $data ) = @_;
> if ( $writer->within_element('FE_MODEL') && $writer->within_element('TABLE') && $coords && $data != /^\s*$/ ) {
> $writer -> characters ( $tables{$model} ) ;
> $tables{$model} = ' ' ; #empty table
> }
> elsif ( ! $coords ) {
> # print "Coords $coords : $data";
> $writer -> characters( $data ) ;
> }
> }
>
> sub cdatastart {
> $writer -> raw( "<![CDATA[\n" );
> }
> sub cdataend {
> $writer -> raw( "]]>\n" );
> }
> sub default {
>
> # do nothing, but stay quiet
>
> }


2nd try to send.
I agree with John Bokma: you need the start and end handlers as well.
Something like this:

my $RD_xml = '';
my $last_content = '';
my $capturing_Is_part_of_larger_xml = 0;
my $special_tag = 0;

## ProcessContent and ProcessXmlChunk below stand for your own subs

sub default_start_handler
{
    my ($p, $element, %atts) = @_;
    $element = uc($element);
    $last_content = '';

    ## Check for start of singular tag data capture
    ## -----------------------------------------------
    if ($element eq 'SPECIAL_TAG_ELEMENT')
    { $special_tag = 1; }

    ## Check for start of XML chunk data capture
    ## -----------------------------------------------
    if ($element eq 'CAPTURE_ALL_OF_ME') {
        $RD_xml = '';
        $capturing_Is_part_of_larger_xml = 1;
    }
    if ($capturing_Is_part_of_larger_xml)
    { $RD_xml .= $p->original_string; }
}

sub default_content_handler
{
    my ($p, $str) = @_;

    ## Use original for entities, in case of a reparse
    ## --------------------------------------------
    $str = $p->original_string;

    ## Remove leading/trailing space, newline, tab
    ## if you want to do this now....
    ## -----------------------------------------------
    $str =~ s/^[\x20\n\t]+//; $str =~ s/[\x20\n\t]+$//;

    ## Capture what is necessary. Last content is
    ## always captured by default
    ## -----------------------------------------------
    if (length ($str) > 0) {
        $last_content .= $str;
        $RD_xml .= $str if ($capturing_Is_part_of_larger_xml);
    }
}

sub default_end_handler
{
    my ($p, $element) = @_;
    $element = uc($element);

    ## Handle singular capture of special tag data
    ## ---------------------------------------------
    if ($element eq 'SPECIAL_TAG_ELEMENT') {
        ProcessContent ($last_content) if ($special_tag);
        $special_tag = 0;
    }
    $last_content = '';

    ## Handle larger capture XML chunks
    ## -----------------------------------
    if ($element eq 'CAPTURE_ALL_OF_ME') {
        if ($capturing_Is_part_of_larger_xml) {
            $RD_xml .= $p->original_string;
            ProcessXmlChunk ($RD_xml);
        }
        $RD_xml = '';
        $capturing_Is_part_of_larger_xml = 0;
    }
}
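
To show why the start and end handlers matter here: the XML::Parser documentation notes that a single run of character data may generate several calls to the Char handler. The quoted ReadCharacterData appends "\n" after every call, so a number that straddles two calls is probably what ends up broken across two lines. Below is a minimal sketch of the buffering idea, assuming the table is plain newline-separated text. The names on_start, on_char, on_end, $buffer and @rows are made up for illustration, and the handlers are driven by hand instead of by a real parser so the sketch runs without any modules.

```perl
use strict;
use warnings;

# Hypothetical handler names. With XML::Parser they would be registered as
#   $parser->setHandlers( Start => \&on_start, End => \&on_end, Char => \&on_char );
my $buffer = '';    # text collected for the element that is currently open
my @rows;           # complete lines, split only after the element has ended

sub on_start {
    my ( $p, $element, %attrs ) = @_;
    $buffer = '';                   # new element: start with an empty buffer
}

sub on_char {
    my ( $p, $text ) = @_;
    $buffer .= $text;               # append only, never add a "\n" here
}

sub on_end {
    my ( $p, $element ) = @_;
    # The element is complete, so it is now safe to split into lines
    push @rows, grep { /\S/ } split /\n/, $buffer;
    $buffer = '';
}

# Simulate the parser delivering one text node in two pieces,
# with the split falling in the middle of a number
on_start( undef, 'TABLE' );
on_char( undef, "69 2.9109259023248E+000 3.5272037818926E-001 8.1765470045" );
on_char( undef, "602E-001\n70 2.9014248453522E+000 3.4032368974452E-001 7.9417266267164E-001\n" );
on_end( undef, 'TABLE' );

print "$_\n" for @rows;
```

Because the text is only split into lines in the end handler, it does not matter where the parser happens to cut the character data.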

 
 
 
 
John Bokma
Guest
Posts: n/a
 
      12-08-2005
robic0@yahoo.com wrote:

> I agree with John Bokma,


Something is broken on your side, or mine, but I am seeing this message for
the 6th time or so.

--
John Small Perl scripts: http://johnbokma.com/perl/
Perl programmer available: http://castleamber.com/
I ploink googlegroups.com

 
 
robic0
Guest
Posts: n/a
 
      12-08-2005
On 8 Dec 2005 05:20:17 GMT, John Bokma wrote:

>robic0@yahoo.com wrote:
>
>> I agree with John Bokma,

>
>Something is broke on your side, or mine, but I see this message the 6th
>time or so .

Yup, thanks to Google (again)..
 