What kinds of things can be verified in XML files?

 
 
Cambridge Ray
      08-10-2011

The question is so abstract, I guess I have to illustrate. One of my
XML files contains a set of rectangular coordinates:

<reference>
<line x1="416" y1="6436" x2="416" y2="3924" />
<line x1="420" y1="6436" x2="420" y2="3924" />
<line x1="1500" y1="5388" x2="1500" y2="4452" />
<line x1="1504" y1="4436" x2="1504" y2="3924" />
<line x1="2884" y1="5388" x2="2884" y2="4456" />
<line x1="412" y1="4436" x2="412" y2="3932" />
</reference>

I would like to make sure that every x2 is greater than or equal to
its x1 companion. Same for y2 and y1. Is this something that can be
easily checked at the XML level, or should I perform such a check after
the XML file is read and parsed?

I use Xerces-C++.

TIA,

-Ramon
 
Cambridge Ray
      08-10-2011


Here's another example. What I would like to check is that the
successive coordinates are in ascending order, and that the "skip"
element contains only 0 and 1 values. Can this be (relatively) easily
verified at the XML level, or should I do it after the XML file is
read and parsed?

TIA,

-Ramon

-----------

<rows>
<coord>3449</coord>
<coord>3600</coord>
<coord>3893</coord>
<coord>4196</coord>
<coord>4340</coord>
<coord>4644</coord>
<coord>4941</coord>
<coord>5242</coord>
<coord>5541</coord>
</rows>

<columns>
<coord>278</coord>
<coord>876</coord>
<coord>1174</coord>
<coord>1783</coord>
<coord>2555</coord>
<coord>3154</coord>
<coord>4068</coord>
<coord>4825</coord>
</columns>

<skip>
<coord>0</coord>
<coord>1</coord>
<coord>1</coord>
<coord>0</coord>
<coord>1</coord>
</skip>
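
A minimal sketch of doing both checks after parsing, written in Python with only the standard library rather than Xerces-C++. The <grid> wrapper element and the names coords and check_grid are assumptions made for the example, and "ascending" is read as strictly ascending:

```python
import xml.etree.ElementTree as ET

# The three lists are wrapped in a hypothetical <grid> root element so
# that the fragment forms a single well-formed document.
SAMPLE = """<grid>
<rows>
<coord>3449</coord><coord>3600</coord><coord>3893</coord>
<coord>4196</coord><coord>4340</coord><coord>4644</coord>
<coord>4941</coord><coord>5242</coord><coord>5541</coord>
</rows>
<columns>
<coord>278</coord><coord>876</coord><coord>1174</coord><coord>1783</coord>
<coord>2555</coord><coord>3154</coord><coord>4068</coord><coord>4825</coord>
</columns>
<skip>
<coord>0</coord><coord>1</coord><coord>1</coord><coord>0</coord><coord>1</coord>
</skip>
</grid>"""


def coords(parent):
    """Return the integer values of the <coord> children of an element."""
    return [int(coord.text) for coord in parent.findall("coord")]


def check_grid(xml_text):
    """Return a list of error messages; an empty list means all checks pass."""
    root = ET.fromstring(xml_text)
    errors = []
    # Rows and columns: each value must be larger than the one before it
    # (assumption: equal neighbours count as an error).
    for name in ("rows", "columns"):
        values = coords(root.find(name))
        for previous, current in zip(values, values[1:]):
            if current <= previous:
                errors.append(
                    f"{name}: {current} does not follow {previous} in ascending order"
                )
    # Skip flags: only 0 and 1 are allowed.
    for value in coords(root.find("skip")):
        if value not in (0, 1):
            errors.append(f"skip: {value} is not 0 or 1")
    return errors


if __name__ == "__main__":
    for message in check_grid(SAMPLE):
        print(message)
```

The data as posted passes both checks, so check_grid(SAMPLE) returns an empty list.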
 
Joe Kesselman
      08-10-2011
>> <line x1="412" y1="4436" x2="412" y2="3932" />
>> I would like to make sure that every X2 is greater than or equal to
>> its X1 companion.


The standard XML DTD and Schema languages can't express that kind of
interaction between values; you'd need to implement it at a higher level
of your application. Basically, if something is application semantics, the
application has to deal with it; if it's closer to syntax (type and
range limits, and many but not all kinds of document structure
constraints), a schema can check it.
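
As a rough illustration of handling this at the application level, here is a sketch in Python's standard library rather than Xerces-C++. The function name find_unordered_lines is made up for the example, and it assumes all four attributes are present and hold integers:

```python
import xml.etree.ElementTree as ET

SAMPLE = """<reference>
<line x1="416" y1="6436" x2="416" y2="3924" />
<line x1="420" y1="6436" x2="420" y2="3924" />
<line x1="1500" y1="5388" x2="1500" y2="4452" />
<line x1="1504" y1="4436" x2="1504" y2="3924" />
<line x1="2884" y1="5388" x2="2884" y2="4456" />
<line x1="412" y1="4436" x2="412" y2="3932" />
</reference>"""


def find_unordered_lines(xml_text):
    """Return (line_index, axis) pairs where coordinate 2 is smaller than coordinate 1."""
    root = ET.fromstring(xml_text)
    problems = []
    for index, line in enumerate(root.iter("line")):
        for axis in ("x", "y"):
            first = int(line.get(axis + "1"))
            second = int(line.get(axis + "2"))
            # The rule from the question: x2 >= x1 and y2 >= y1.
            if second < first:
                problems.append((index, axis))
    return problems


if __name__ == "__main__":
    for index, axis in find_unordered_lines(SAMPLE):
        print(f"line {index}: {axis}2 is smaller than {axis}1")
```

Run against the sample as posted, every x pair passes (x2 equals x1 each time), but all six lines are reported on the y axis because each y2 is smaller than its y1. If that orientation is intended, the y comparison would need to be reversed.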

There have been alternatives to the W3C's XML Schema language which can
implement more complicated constraints. The problem is that they aren't
as well standardized or as widely supported, so you really can't count
on anyone else using them. They may still be useful within some
controlled contexts, as an alternative to hand-coding.

--
Joe Kesselman,
http://www.love-song-productions.com...lam/index.html

{} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
/\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
 
 
Tim Arnold
      08-11-2011
On 8/10/2011 7:01 PM, Joe Kesselman wrote:
>>> <line x1="412" y1="4436" x2="412" y2="3932" />
>>> I would like to make sure that every X2 is greater than or equal to
>>> its X1 companion.

>
> The standard XML DTD and Schema languages can't express that kind of
> interaction; you'd need to implement it at a higher level of your
> application. Basically, if something is application semantics the
> application has to deal with it; if it's closer to syntax (type and
> range limits, and many but not all kinds of document structure
> constraint) schema can check it.
>
> There have been alternatives to the W3C's XML Schema language which can
> implement more complicated constraints. The problem is that they aren't
> as well standardized or as widely supported, so you really can't count
> on anyone else using them. They may still be useful within some
> controlled contexts, as an alternative to hand-coding.
>


Hi Joe,
Just curious whether Schematron, with its 'let' and 'value-of' abilities,
could be of help to the OP?
Thanks,
--Tim

 
 
Joe Kesselman
      08-12-2011
On 8/11/2011 12:21 PM, Tim Arnold wrote:
> Just curious if schematron with its 'let' and 'value-of' abilities could
> be of help for the OP?


I believe Schematron can express this kind of constraint... if you are
in an environment where you can guarantee that Schematron will be
available on the machine in question. In other words, it might be
reasonable to apply this on the server end where you own all the code,
but unless you can also guarantee that nobody but you will be writing
clients you may not be able to do much with it on that end -- and if you
ARE writing all the clients, you can usually ensure the data is correct
in the first place rather than spending cycles checking it.



 
Martin Honnen
      08-12-2011
Joe Kesselman wrote:
>>> <line x1="412" y1="4436" x2="412" y2="3932" />
>>> I would like to make sure that every X2 is greater than or equal to
>>> its X1 companion.

>
> The standard XML DTD and Schema languages can't express that kind of
> interaction;


It might be worth noting that version 1.1 of the schema language is
at the "Candidate Recommendation" stage, and with it you can
define assertions (http://www.w3.org/TR/xmlschema11-1/#cAssertions), e.g.
<xs:assert test="@x2 ge @x1"/>
I think there is a version of Xerces Java that already implements
this, and Saxon's commercial schema processor supports it as well.


--

Martin Honnen --- MVP Data Platform Development
http://msmvps.com/blogs/martin_honnen/
 
 
Joe Kesselman
      08-12-2011
> It might be worth noting that the version 1.1 of the schema language is
> in the state "Candidate Recommendation" and with that you are able to
> define assertions http://www.w3.org/TR/xmlschema11-1/#cAssertions e.g.
> <xs:assert test="@x2 ge @x1"/>


Good point. I'd hesitate to _rely_ on Schema 1.1 until it graduates to
Recommendation -- and even then, not all parsers will support it
promptly -- but it's certainly reasonable to start prototyping against
it if you have it available.

 
Peter Flynn
      08-14-2011
On 11/08/11 00:01, Joe Kesselman wrote:
>>> <line x1="412" y1="4436" x2="412" y2="3932" />
>>> I would like to make sure that every X2 is greater than or equal to
>>> its X1 companion.

>
> The standard XML DTD and Schema languages can't express that kind of
> interaction; you'd need to implement it at a higher level of your
> application. Basically, if something is application semantics the
> application has to deal with it; if it's closer to syntax (type and
> range limits, and many but not all kinds of document structure
> constraint) schema can check it.
>
> There have been alternatives to the W3C's XML Schema language which can
> implement more complicated constraints. The problem is that they aren't
> as well standardized or as widely supported, so you really can't count
> on anyone else using them. They may still be useful within some
> controlled contexts, as an alternative to hand-coding.


I think it's also important to establish what the objective is. The
typical sequence of events when an XML instance is processed can be
expressed as

1. syntactic verification (is the document well-formed?)
2. formal validation (well-formed document tested against schema/DTD)
3. processing with whatever language/engine is specified, which may
   involve further error-reporting, but at this stage the document
   itself is presumed valid to its schema/DTD

The expectation is that if steps 1 or 2 fail, no further action takes
place, although a processor can report an error and even try to fix it,
which may involve digging further into the document to see what is going
on; but it cannot continue as if nothing had happened.
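
The sequence above might be sketched as follows, in Python's standard library. The validate function is a hand-written stand-in for a real DTD or schema validator, and the names ValidationError, validate and process are assumptions made for the example:

```python
import xml.etree.ElementTree as ET


class ValidationError(Exception):
    """Raised when a well-formed document breaks a structural rule."""


def validate(root):
    # Stand-in for step 2. A real setup would hand this job to a DTD or
    # schema validator; here a few hand-written rules play that role.
    if root.tag != "reference":
        raise ValidationError(f"unexpected root element <{root.tag}>")
    for child in root:
        if child.tag != "line":
            raise ValidationError(f"unexpected child element <{child.tag}>")
        for name in ("x1", "y1", "x2", "y2"):
            if child.get(name) is None:
                raise ValidationError(f"<line> is missing attribute {name}")


def process(xml_text):
    """Run the three steps in order and stop at the first one that fails."""
    try:
        root = ET.fromstring(xml_text)  # step 1: well-formedness
    except ET.ParseError as error:
        return f"not well-formed: {error}"
    try:
        validate(root)  # step 2: validation
    except ValidationError as error:
        return f"invalid: {error}"
    # Step 3: the document is presumed valid from here on.
    return f"processed {len(root)} line(s)"
```

A failure in step 1 or step 2 is reported and nothing further runs, which is the behaviour described above.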

If you specify a constraint at the level of the Schema or DTD then
presumably you do so because you want to prevent the instance being
processed if it fails a well-formedness or validation test.

In effect, an assertion such as Martin mentions (that one attribute has
to be bigger than another) becomes a breaking-point. So we need to
consider how big a deal this is. The document is well-formed, because
validation will only take place if the document has passed (1) above. Is
the fact that <foo bar="42" blort="43"/> going to kill someone, or cause
the stock market to crash, or create a batch of dud chips, or just order
43 paperclips instead of 42? This level of analysis should indicate
whether such a test should cause the entire factory to come to a stop
and evacuate, or simply email a warning to the appropriate person.

I think what I am saying is, the fact that you *can* specify ever
tighter constraints doesn't necessarily mean that it is the right
business decision to do so, because the effects of premature validation
failure can be just as serious as those of remaining undetected until later.
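
One way to keep that decision out of the schema is to make the severity a parameter of the application-level check. This is only a sketch in Python's standard library; the function name check_order and its fatal flag are invented for the example:

```python
import xml.etree.ElementTree as ET


def check_order(xml_text, fatal=False):
    """Check x2 >= x1 on every <line>.

    With fatal=False, violations are collected and returned as warnings so
    that processing can continue. With fatal=True, the first violation
    raises ValueError and stops everything.
    """
    root = ET.fromstring(xml_text)
    warnings = []
    for index, line in enumerate(root.iter("line")):
        if int(line.get("x2")) < int(line.get("x1")):
            message = f"line {index}: x2 is smaller than x1"
            if fatal:
                raise ValueError(message)
            warnings.append(message)
    return warnings
```

The caller then chooses per deployment whether a bad value stops the run or only produces a message for someone to look at.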

///Peter
--
XML FAQ: http://xml.silmaril.ie/
(and apologies to those trying to access it in the last few days: the
server suffered a CPU flood from a rogue process; and no, before you
ask, it wasn't unvalidated XML data)
 