when will empty tags pass schema validation?

 
 
wolf_y
      05-04-2006
My question is simply: under what conditions will empty tags of the
form <MOM></MOM> pass schema validation? Of course, the mirror
question is: under what conditions will empty tags fail validation?
The former seems to be an easier question to answer.

XML files will arrive from around the world and must be schema
validated before further processing and loading into a database, so I'm
trying to foresee the various layouts that might be submitted. I can
anticipate suppliers starting with a template, filling in needed
elements, and sending the file with empty tags in conditional segments
with mandatory and conditional elements. I understand the role of
restrictions, but there are about a dozen record types, dozens of
segments, and hundreds of elements (some of which are sometimes
mandatory, sometimes conditional, and sometimes
conditionally-mandatory). One schema is 230 pages.

I already created a test file where a conditional segment had empty
tags and validation failed.

Thanks

 
Joe Kesselman
      05-04-2006
wolf_y wrote:
> My question is simply: under what conditions will empty tags of the
> form <MOM></MOM> pass schema validation?


<MOM></MOM> is semantically identical to <MOM/> in XML. Both therefore
pass under the same conditions: when the schema accepts that element
and does not require it to have any content.
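
A quick way to see the equivalence, sketched with Python's standard-library xml.etree.ElementTree (the parser choice is an assumption; the thread does not name one). Both spellings yield an element with no text and no children, so anything downstream of the parser cannot tell them apart, whereas a single space survives as character content.

```python
import xml.etree.ElementTree as ET

def describe(xml_text):
    """Return (tag, text, number of children) for the root element."""
    root = ET.fromstring(xml_text)
    return (root.tag, root.text, len(root))

long_form = describe("<MOM></MOM>")
short_form = describe("<MOM/>")
whitespace = describe("<MOM> </MOM>")

# The two empty spellings are indistinguishable after parsing.
print(long_form == short_form)
# A single space is real character content, not "empty".
print(whitespace[1] == " ")
```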



--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
 
wolf_y
      05-04-2006
Thanks for answering, but maybe I should have led with my disclaimer:
I'm a newbie to XML, primarily program in SAS, and consulted online
documentation.

Some of my confusion stems from the way terms such as empty, missing,
null, and blank are used/handled in different languages. I don't mind
reading docs, but I can't find an answer I understand at
http://www.w3.org/ or url links I've found.

I don't want to create an empty element, but need to know under what
circumstances an empty element will pass schema checks, so that the
backend processing in SAS can react correctly when it's time to load
the data. There are 5 SAS programmers sharing responsibility for
writing the load routines and I was chosen to explain what to expect
after validation. There might be circumstances where an empty element
is allowed and others where we want to reject the file, both based on
the same element, depending upon the XML file provider or segment.

There are 4 levels of schema involved. Here's an example of an element
in the Level 3 schema:

<xs:element name="MOM">
<xs:annotation>
<xs:documentation>Mother</xs:documentation>
</xs:annotation>
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="1"/>
<xs:maxLength value="25"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

I understand that because of minLength this element must have at least
one character. In a simple test, whitespace <MOM> </MOM> passes (is
this a blank in XML?) whereas <MOM></MOM> doesn't (null or empty?). An
element defined with type=xs:integer fails in both circumstances.

Is there any type (or attribute?) where both <MOM></MOM> and <MOM>
</MOM> pass validation? Or must an element be explicitly defined as
permitting empty (nil?) values? Or must I test each unique element?

I hope this makes sense.

 
Joseph Kesselman
      05-04-2006
wolf_y wrote:
> Is there any type (or attribute?) where both <MOM></MOM> and <MOM>
> </MOM> pass validation?


Sure. If minimum length had been zero (or had not been explicitly set)
for the xs:string example, both would pass.

It's really a matter of what that specific schema has said the datatype
is (which controls whether empty is syntactically acceptable) and what
additional constraints it imposes (which control whether empty is
semantically acceptable for validation purposes).
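
The two layers can be modelled roughly in Python. This is a toy model of the rules, not a schema validator: the function names are hypothetical, it covers only xs:string with length facets and xs:integer, and it relies on the XML Schema defaults that xs:string preserves whitespace while xs:integer collapses it before the lexical check.

```python
import re

def valid_string(text, min_length=0, max_length=None):
    """Toy model of xs:string: whitespace is preserved, then length facets apply."""
    if len(text) < min_length:
        return False
    return max_length is None or len(text) <= max_length

def valid_integer(text):
    """Toy model of xs:integer: whitespace is collapsed, then digits are required."""
    collapsed = re.sub(r"[ \t\n\r]+", " ", text).strip(" ")
    return re.fullmatch(r"[+-]?[0-9]+", collapsed) is not None

# Columns: content, string with minLength=1/maxLength=25, unrestricted string, integer
for content in ("", " "):
    print(repr(content),
          valid_string(content, min_length=1, max_length=25),
          valid_string(content),
          valid_integer(content))
```

Under this model the results line up with the tests reported earlier in the thread: with minLength 1 only the whitespace form is accepted, with no minLength both forms are accepted, and the integer type rejects both.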

Nillable is a different concept, having to do with the concept of
"explicitly has no meaningful value" rather than either "value is empty"
or "element was not present". It may make more sense to folks who've
worked with databases that support this idea.
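
For load routines, the three states can be told apart after parsing. A minimal sketch with Python's xml.etree.ElementTree, assuming a hypothetical RECORD parent element and a hypothetical classify helper; xsi:nil is the standard instance attribute, and xsi:nil="true" is only valid when the schema declares the element with nillable="true".

```python
import xml.etree.ElementTree as ET

# ElementTree expands the xsi: prefix to this namespace-qualified key.
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

def classify(record_xml, name="MOM"):
    """Classify a child element as 'absent', 'nil', 'empty', or 'value'."""
    record = ET.fromstring(record_xml)
    child = record.find(name)
    if child is None:
        return "absent"          # element was not present
    if child.get(XSI_NIL) in ("true", "1"):
        return "nil"             # explicitly has no meaningful value
    if not child.text and len(child) == 0:
        return "empty"           # present, but value is empty
    return "value"               # includes whitespace-only content

XSI = 'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
print(classify("<RECORD/>"))
print(classify(f'<RECORD {XSI}><MOM xsi:nil="true"/></RECORD>'))
print(classify("<RECORD><MOM></MOM></RECORD>"))
print(classify("<RECORD><MOM>Ann</MOM></RECORD>"))
```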

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
 
wolf_y
      05-05-2006
> It's really a matter of what that specific schema has said the datatype
> is (which controls whether empty is syntactically acceptable) and what
> additional constraints it imposes (which control whether empty is
> semantically acceptable for validation purposes).


You've helped by confirming my take on what I've read, and I'll
continue to reread W3C docs. Since element properties are derived and
there are so many elements, it looks like my safest strategy is to
generate test files under both scenarios and see what happens.
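
One way to generate such test files, sketched in Python with the standard library only. The sample record, the element names, and the blank_variants function are hypothetical; the resulting documents would still need to be run through whatever schema validator the project uses.

```python
import copy
import xml.etree.ElementTree as ET

def blank_variants(template_xml, element_name):
    """Return {scenario: xml} with every `element_name` emptied or set to one space."""
    template = ET.fromstring(template_xml)
    variants = {}
    for scenario, text in (("empty", None), ("whitespace", " ")):
        root = copy.deepcopy(template)
        # Snapshot the matches first so the tree is not modified mid-iteration.
        for elem in list(root.iter(element_name)):
            for child in list(elem):
                elem.remove(child)
            elem.text = text
        # short_empty_elements=False writes <MOM></MOM> rather than <MOM />.
        variants[scenario] = ET.tostring(
            root, encoding="unicode", short_empty_elements=False
        )
    return variants

sample = "<RECORD><MOM>Ann</MOM><DAD>Bob</DAD></RECORD>"
for scenario, xml_text in blank_variants(sample, "MOM").items():
    print(scenario, xml_text)
```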

 
Peter Flynn
      05-05-2006
wolf_y wrote:
> Thanks for answering, but maybe I should have led with my disclaimer:
> I'm a newbie to XML, primarily program in SAS, and consulted online
> documentation.
>
> Some of my confusion stems from the way terms such as empty, missing,
> null, and blank are used/handled in different languages. I don't mind
> reading docs, but I can't find an answer I understand at
> http://www.w3.org/ or url links I've found.
>
> I don't want to create an empty element, but need to know under what
> circumstances an empty element will pass schema checks,


I think the confusion arises from the two different meanings of the
word "empty".

a) EMPTY (in caps) is a keyword used in a DTD to declare that a certain
element type can *never* have any content (neither character data
content nor other elements); a Schema can impose the same restriction

b) empty (in lowercase) is just an adjective meaning "with no content";
it doesn't specify whether content is permitted or not, it simply
says that there isn't any content at the moment.

An element type declared as EMPTY can be represented as <foo/> or as
<foo></foo>. The first is often recommended because it is unambiguous
and there is no possibility of anyone ever manually inserting any
content and thereby breaking the document model.

An element type declared *with* content *may* be empty on some
occasions (like this <name></name>) but that does not necessarily mean
that it was declared EMPTY: you'd have to consult the Schema or DTD
to find that out.

So an empty element like <name></name> will pass a validation check
either

a) if it was declared EMPTY, or
b) if it was declared with optional content and just doesn't happen to
have any right now.

An element like <foo/> is the same thing to a parser as <foo></foo>, so
it passes under exactly the same two conditions.

(In both cases I am assuming there are no compulsory attributes.)

///Peter
--
XML FAQ: http://xml.silmaril.ie/
 