Velocity Reviews


Sara 01-17-2005 08:43 AM

validate XML file content
 
Hi all,

I have just started using the XML::* modules for validating XML files,
and I am trying to work out which kind of module ('tree' or 'stream')
fits my requirement: read data (a specific node list) from a simple
external XML file, chosen by user input, and use it to validate a
source XML file, more specifically the content of that file's elements.

For instance, the content of the element 'figure' should start with a
string matching the regex /^Fig\. \d+\b/. The external file would hold
the expected format of each element's content as a regex. I plan to
use XML::XPath to read the external XML file, but I am still undecided
about what to use for validating the source XML file, because of the
following points.

Is there a better way of writing the following piece of code, in terms
of ease of maintenance and, secondarily, code size?

sub start_element {
    my ($self, $element) = @_;

    if ($element->{Name} eq 'body') {
        ....
    }
    elsif ($element->{Name} eq 'head') {
        ....
    }
}
If the element 'head' is removed or renamed, this code has to be
changed. If the handler were independent of the element names, that
change would not be needed.
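
One common way to get that independence is a dispatch table: a hash
that maps each element name to the subroutine that handles it, so
adding, removing or renaming an element only touches the table. The
following is a minimal sketch, not a definitive implementation. The
package name MyHandler, the %on_start table and the 'seen' list are
hypothetical names for illustration, and the events are fed in by hand
here; a real handler would be registered with a SAX parser instead.

```perl
use strict;
use warnings;

package MyHandler;    # hypothetical handler class

# Dispatch table: element name => code to run when that element starts.
my %on_start = (
    body => sub { my ($self, $el) = @_; push @{ $self->{seen} }, 'body' },
    head => sub { my ($self, $el) = @_; push @{ $self->{seen} }, 'head' },
);

sub new { return bless { seen => [] }, shift }

sub start_element {
    my ($self, $element) = @_;

    # Look the element up; names without an entry are simply skipped.
    my $code = $on_start{ $element->{Name} } or return;
    $code->($self, $element);
}

package main;

# Simulate three start_element events (an assumption for this demo).
my $h = MyHandler->new;
$h->start_element({ Name => $_ }) for qw(head body title);
print join(',', @{ $h->{seen} }), "\n";
```

With this layout start_element itself never mentions 'head' or 'body',
so renaming an element means editing one key in the table.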

Is XML::Checker the only module on CPAN that can
1. check whether the ID of an element was defined
2. get the number of times the ID was referenced?
I would prefer not to write less-optimized code if someone has already
done this in a far better way.

Finally, can someone please help me with, or point me to help on,
using namespaces in SAX?

Thanks,
Sara

Peroli 01-17-2005 10:21 AM

Re: validate XML file content
 
Hi Sara,
Since you are just starting out with XML: some of the XML::* modules
(for example XML::SAX::PurePerl) are pure-Perl parsers. If you need
performance, use the XML::LibXML module. It is a binding to the C
library libxml2 and is more robust.
Consider the following XML document (I think this is what you expect):
<root>
  <image>
    <name>IMG_5000.gif</name>
    <size>5000</size>
  </image>
</root>

use strict;
use warnings;
use XML::LibXML ();

my $xmlfile = "somefile.xml";

# parse_file() dies on malformed XML, so trap the error with eval
my $xmlDom = eval { XML::LibXML->new()->parse_file($xmlfile) };
die "can't parse $xmlfile\nError: $@\n" if $@;

# visit every <image> under <root> and test the text of its <name> child
foreach my $image ($xmlDom->documentElement->findnodes('/root/image')) {
    if ($image->findvalue('name') =~ /^IMG_/) {
        # do something
    }
}

Doing this in SAX would require a different strategy. If you are a
newbie, I suggest starting with DOM, because it makes the whole
problem a lot easier to visualize.

Peroli Sivaprakasam


sa_ravenone@yahoo.com 01-18-2005 05:31 AM

Re: validate XML file content
 
Hi,
Thanks for the reply, Peroli, and sorry for not making things clear.
>Peroli wrote:
>Since you are a starter with XML, XML::* modules are pure perl

I am not a beginner in XML or in Perl, but I am certainly a newbie at
using modules to process XML.

>foreach ($xmlDom->documentElement->findnodes('/root/image')) {
>    if($_->findvalue('name') =~ /^IMG_/) {
>        #dosomething
>    }
>}

Thanks again for the easy-to-follow example. It almost matches what I
had in mind, but don't we have to repeat the same loop for every
element that needs to be validated?
Is there a shorter way to do this, by associating each XML element with
its corresponding validation subroutine?
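
One way to avoid repeating the loop is to keep the rules in a hash
keyed by element name and walk the document once. The following is a
rough sketch under stated assumptions, not a definitive implementation:
the %rule_for table, the validate_doc() subroutine and the sample XML
are hypothetical, and in practice the rules would be loaded from the
external XML file instead of being hard-coded.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical rule table: element name => regex its content must match.
my %rule_for = (
    figure => qr/^Fig\. \d+\b/,
    name   => qr/^IMG_/,
);

# Walk every element once and check only those that have a rule.
sub validate_doc {
    my ($doc) = @_;
    my @errors;
    for my $node ($doc->findnodes('//*')) {
        my $rule = $rule_for{ $node->nodeName } or next;
        my $text = $node->textContent;
        push @errors,
            sprintf("<%s> has unexpected content '%s'", $node->nodeName, $text)
            unless $text =~ $rule;
    }
    return @errors;
}

# Sample document (an assumption for this demo).
my $doc = XML::LibXML->load_xml(string => <<'XML');
<root>
  <figure>Fig. 1 Overview</figure>
  <figure>Figure 2 Details</figure>
  <image><name>IMG_5000.gif</name></image>
</root>
XML

print "$_\n" for validate_doc($doc);
```

The values in the table could just as well be code references when a
plain regex is not enough for a given element.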

>Doing this in SAX would require a different strategy. If you are a
>newbie, I suggest starting with DOM, because it makes the whole
>problem a lot easier to visualize.


I also need to check IDs, IDREFs and IDREF content. For that I need to
process the IDs before I check the IDREFs. Since IDs and IDREFs can
appear in any order in the document, will this work with SAX?
Thanks for all the clarifications.
Sara.
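
On the ordering question, one possible approach in a stream handler is
to record every ID and every IDREF as it goes by and to compare the two
sets only when the document ends, so the order in which they appear
stops mattering. The following is a minimal sketch under stated
assumptions: the IdChecker package and the attrs() helper are
hypothetical, the attributes are assumed to be literally named 'id' and
'idref' (in a real document ID-ness comes from the DTD), and the events
are simulated by hand using the attribute layout of Perl SAX 2.

```perl
use strict;
use warnings;

package IdChecker;    # hypothetical handler class

sub new { return bless { defined => {}, refs => {} }, shift }

# Record IDs and IDREFs as they stream past; no checking yet.
sub start_element {
    my ($self, $el) = @_;
    for my $attr (values %{ $el->{Attributes} || {} }) {
        if    ($attr->{Name} eq 'id')    { $self->{defined}{ $attr->{Value} }++ }
        elsif ($attr->{Name} eq 'idref') { $self->{refs}{ $attr->{Value} }++ }
    }
}

# At the end every ID is known, so dangling references can be found.
sub end_document {
    my ($self) = @_;
    my @dangling = sort grep { !$self->{defined}{$_} } keys %{ $self->{refs} };
    my %report = (dangling => \@dangling, ref_count => { %{ $self->{refs} } });
    return \%report;
}

package main;

# Helper that builds an attribute hash keyed by "{namespace}name".
sub attrs {
    my %pairs = @_;
    my %out;
    for my $name (keys %pairs) {
        $out{"{}$name"} = { Name => $name, Value => $pairs{$name} };
    }
    return \%out;
}

my $checker = IdChecker->new;
# Note that fig2 is referenced before it is defined.
$checker->start_element({ Name => 'xref',   Attributes => attrs(idref => 'fig2') });
$checker->start_element({ Name => 'figure', Attributes => attrs(id    => 'fig1') });
$checker->start_element({ Name => 'xref',   Attributes => attrs(idref => 'fig1') });
$checker->start_element({ Name => 'figure', Attributes => attrs(id    => 'fig2') });
$checker->start_element({ Name => 'xref',   Attributes => attrs(idref => 'fig9') });
$checker->start_element({ Name => 'xref',   Attributes => attrs(idref => 'fig1') });

my $report = $checker->end_document;
print "dangling: @{ $report->{dangling} }\n";
print "fig1 referenced $report->{ref_count}{fig1} times\n";
```

The same report also gives the number of times each ID was referenced,
at the cost of keeping the two hashes in memory for the whole parse.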


