validate XML file content

 
 
Sara
      01-17-2005
Hi all,

I have just started using the XML::* modules for validating XML files,
and I am trying to work out which kind of module ('tree' or 'stream')
fits my requirement: read data (a specific node list) from a simple
external XML file, chosen by user input, and use it to validate a
source XML file, more specifically the content of elements in that
file.

For instance, the content of the element 'figure' should start with a
string matching the regex /^Fig\. \d+\b/. The external file would hold
the expected format of each element's content as a regex. I plan to
use XML::XPath for reading the external XML file, but I am still
undecided about what to use for validating the source XML file because
of the following points.

Is there a better way to write the following piece of code, in terms
of ease of maintenance and, secondarily, code size?

sub start_element {
    my ($self, $element) = @_;

    if ($element->{Name} eq 'body') {
        ....
    }
    elsif ($element->{Name} eq 'head') {
        ....
    }
}
As written, the code has to change whenever the element 'head' is
removed or renamed. If the handler did not depend on element names,
that change would not be needed.
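
One common way to keep start_element itself free of element names is a dispatch table: a hash that maps each element name to a callback. The following is only a minimal sketch. The package name MyHandler, the %on_start table and the callbacks are made up for illustration, and the events are fed in by hand instead of coming from a real parser.

```perl
use strict;
use warnings;

package MyHandler;

# Hypothetical dispatch table: element name => callback.
# Adding, renaming or removing an element only touches this table.
my %on_start = (
    body => sub { my ($self, $element) = @_; push @{ $self->{seen} }, 'body' },
    head => sub { my ($self, $element) = @_; push @{ $self->{seen} }, 'head' },
);

sub new {
    my ($class) = @_;
    return bless { seen => [] }, $class;
}

# Same signature as the start_element shown above.
sub start_element {
    my ($self, $element) = @_;
    my $callback = $on_start{ $element->{Name} } or return;   # no entry: ignore
    $callback->($self, $element);
    return;
}

package main;

# Hand-built events standing in for what a SAX parser would deliver.
my $handler = MyHandler->new;
$handler->start_element({ Name => $_ }) for qw(html head title body);
print join(',', @{ $handler->{seen} }), "\n";   # prints "head,body"
```

The table could just as well be filled at run time, for example from the external rules file, so that the handler code never names an element.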

Is XML::Checker the only module on CPAN that can
1. check whether the ID of an element was defined
2. get the number of times the ID was referenced?
I would prefer not to write less-optimized code if someone has
already done this in a far better way.

Finally, can someone please help me with, or point me to documentation
on, using namespaces in SAX?

Thanks,
Sara
 
 
Peroli
      01-17-2005
Hi Sara,
Since you are a starter with XML, note that several of the XML::*
modules do much of their work in pure Perl. So if you need
performance, use the XML::LibXML module. It is a binding to the C
library libxml2 and is more robust.
Consider the following XML document (I think this is what you
expect):
<root>
  <image>
    <name>IMG_5000.gif</name>
    <size>5000</size>
  </image>
</root>

use strict;
use warnings;
use XML::LibXML ();

my $xmlfile = "somefile.xml";

# parse_file dies on malformed XML, so trap the error
my $xmlDom = eval { XML::LibXML->new()->parse_file($xmlfile) };
die "can't parse $xmlfile\nError: $@\n" unless $xmlDom;

# Visit every <image> element and test the text of its <name> child
foreach my $image ($xmlDom->documentElement->findnodes('/root/image')) {
    if ($image->findvalue('name') =~ /^IMG_/) {
        # do something
    }
}

Doing this in SAX would require a different strategy. If you are a
newbie, I think you should start with DOM, because it makes the whole
problem a lot easier to visualize.

Peroli Sivaprakasam

 
 
sa_ravenone@yahoo.com
      01-18-2005
Hi,
Thanks for the reply, Peroli. And sorry for not making things clear.
>Peroli wrote:
>Since you are a starter with XML, XML::* modules are pure perl

I am not a starter in XML and not a starter in Perl either, but surely
a newbie in using modules for processing XML.

>foreach ($xmlDom->documentElement->findnodes('/root/image')) {
>if($_->findvalue('name') =~ /^IMG_/) {
>#dosomething
>}
>}

Thanks again for the easy-to-follow example, which almost matched
what I had in mind. But don't we have to repeat the same loop for
every element that needs to be validated?
Is there a shorter way to do this, by associating each XML element
with its corresponding validation subroutine?
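
A table keyed by element name is one way to avoid a separate loop per element. The sketch below is only an illustration: %pattern_for, %rule_for and check_content are hypothetical names, and the pattern strings are hard-coded here where the real ones would be read from the external XML file.

```perl
use strict;
use warnings;

# Hypothetical rule table: element name => pattern its content must match.
# In practice these strings would come from the external rules file.
my %pattern_for = (
    figure => '^Fig\. \d+\b',
    name   => '^IMG_',
);

# Compile each pattern string once, up front.
my %rule_for = map { $_ => qr/$pattern_for{$_}/ } keys %pattern_for;

# Returns an error message, or nothing when the content matches
# or when there is no rule for the element.
sub check_content {
    my ($element_name, $text) = @_;
    my $rule = $rule_for{$element_name} or return;
    return if $text =~ $rule;
    return "<$element_name> content '$text' does not match $pattern_for{$element_name}";
}

# One generic pass over (element name, text) pairs.
my @errors = grep { defined } (
    check_content(figure => 'Fig. 12 Cross-section'),
    check_content(figure => 'Figure 12'),
    check_content(name   => 'IMG_5000.gif'),
    check_content(size   => '5000'),
);
print "$_\n" for @errors;
```

With XML::LibXML, a single loop over $xmlDom->findnodes('//*') could call check_content($_->nodeName, $_->textContent) for every element, so adding a rule means adding one table entry and no new loop.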

>Doing this thing in SAX would require a new strategy. I think if you
>are a newbie start with DOM, because its a lot easy to visualize the
>whole problem.


I also need to check IDs, IDREFs and IDREF content. For that, all IDs
must have been processed before I check the IDREFs. Because an IDREF
can appear in the document before the ID it points to, will this work
with SAX?
Thanks for all the clarifications.
Sara.
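
On the ordering question, a stream handler does not have to resolve a reference at the moment it sees it. One possible approach is to record IDs and references as they arrive and compare them in end_document. The sketch below makes several assumptions: IDs sit in an attribute named 'id' and references in one named 'ref' (both names are made up), attributes arrive as a hash of hashes with Name and Value keys as in the Perl SAX 2 binding used by XML::SAX, and the events are built by hand instead of coming from a parser.

```perl
use strict;
use warnings;

package IdChecker;

sub new {
    my ($class) = @_;
    return bless { ids => {}, refs => {} }, $class;
}

# Record every ID definition and every reference; order does not matter yet.
# Assumption: 'id' defines an ID and 'ref' references one.
sub start_element {
    my ($self, $element) = @_;
    for my $attr (values %{ $element->{Attributes} || {} }) {
        $self->{ids}{ $attr->{Value} }++  if $attr->{Name} eq 'id';
        $self->{refs}{ $attr->{Value} }++ if $attr->{Name} eq 'ref';
    }
    return;
}

# Only at the end of the document are all IDs known, so resolve here.
sub end_document {
    my ($self) = @_;
    my @dangling = grep { !$self->{ids}{$_} } sort keys %{ $self->{refs} };
    return { ref_count => { %{ $self->{refs} } }, dangling => \@dangling };
}

package main;

# Helper that builds a hand-made start_element event (hypothetical).
sub event {
    my ($name, %attrs) = @_;
    my %sax_attrs;
    $sax_attrs{"{}$_"} = { Name => $_, Value => $attrs{$_} } for keys %attrs;
    return { Name => $name, Attributes => \%sax_attrs };
}

my $checker = IdChecker->new;
$checker->start_element(event(xref   => (ref => 'f1')));   # reference before its ID
$checker->start_element(event(figure => (id  => 'f1')));
$checker->start_element(event(xref   => (ref => 'f1')));
$checker->start_element(event(xref   => (ref => 'f9')));   # never defined
my $report = $checker->end_document;

printf "%s referenced %d time(s)\n", $_, $report->{ref_count}{$_}
    for sort keys %{ $report->{ref_count} };
print "undefined ID: $_\n" for @{ $report->{dangling} };
```

The cost is memory proportional to the number of distinct IDs and references, not to the size of the whole document, which is usually the reason for choosing a stream parser in the first place.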

 