standalone="yes"

 
 
tah
      10-24-2006
Hey,
Can someone please clarify, confirm, or set me straight on my
understanding of the standalone="yes" attribute in the XML
declaration?
I assume it means that the xml document containing it is
standalone, and does not refer to any external document to define
types. In other words, it doesn't use an external dtd to validate any
types - everything used would be defined within the doc itself. In
other words, you would never see an xml document with standalone="yes"
defined if it had a corresponding and separate dtd file that defined
elements, attributes, etc. Is this correct? If so, then this should
probably almost never be used, since you don't usually define all types
in every doc.
Also, in one particular instance, I'm seeing this error from a
parser validation:

"White space must not occur between elements declared in an
external parsed entity with element content in a standalone document"

I really don't know what this means, or why it has anything to do
with 'standalone', yes or no.

Am I just misunderstanding the whole meaning?


Thanks!!

--Ty

 
Richard Tobin
      10-24-2006
tah wrote:

> I assume it means that the xml document containing it is
>standalone, and does not refer to any external document to define
>types. In other words, it doesn't use an external dtd to validate any
>types - everything used would be defined within the doc itself.


No. It means that if you process the document without the external
subset, you'll get the same results. So there can't, for example, be
declarations of
NMTOKENS attributes in the external subset, because that would cause
attributes to be differently normalized. (Actually, it's OK if there
are such declarations *but there aren't any of those attributes in
the document*.)
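
To make the normalization point concrete, here is a minimal Python sketch of the whitespace part of attribute-value normalization as described in the XML 1.0 spec (section 3.3.3). The function name `normalize_attribute` and the sample value are made up for illustration, and entity and character references are ignored; a real parser does all of this internally.

```python
import re

def normalize_attribute(raw, declared_type="CDATA"):
    """Whitespace part of XML 1.0 attribute-value normalization (simplified)."""
    # Every attribute type: each tab, CR or LF becomes one space.
    value = re.sub(r"[\t\r\n]", " ", raw)
    if declared_type != "CDATA":
        # Tokenized types (NMTOKENS, ID, ...): collapse runs of spaces
        # and drop leading and trailing spaces.
        value = re.sub(r" +", " ", value).strip(" ")
    return value

raw = "  red   green "
seen_without_dtd = normalize_attribute(raw)            # declaration never read
seen_with_dtd = normalize_attribute(raw, "NMTOKENS")   # external subset read
print(repr(seen_without_dtd))
print(repr(seen_with_dtd))
```

The two results differ, which is exactly the situation in which the document cannot honestly claim standalone="yes".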

> "White space must not occur between elements declared in an
>external parsed entity with element content in a standalone document"


If an element is declared as having element-only content, a validating
parser will inform the application that the whitespace between child
elements is "ignorable" (that's not a term the standard uses, but
that's the idea). That is, the whitespace between child elements
is just for formatting, and is not significant. If an element is
declared in the external subset as having element-only content, then
the parser won't be able to report it correctly without reading the
external subset, so the document isn't standalone.
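
As a rough illustration of what a parser sees when it has not read any DTD, the sketch below uses Python's standard `xml.etree.ElementTree`, which does not validate. The `<list>`/`<item>` document is a made-up example. With no declaration available, the indentation is handed over as ordinary character data.

```python
import xml.etree.ElementTree as ET

doc = "<list>\n  <item>a</item>\n  <item>b</item>\n</list>"
root = ET.fromstring(doc)

# No DTD was read, so the parser cannot know that <list> has element-only
# content; the whitespace shows up as text, not as "ignorable" whitespace.
leading = root.text
between = [child.tail for child in root]
print(repr(leading), between)
```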

-- Richard
 
tah
      10-25-2006



Thanks Richard! That does clarify the first question, and your
answer helped me understand some of the other documentation I had read
and didn't quite get. The second point (whitespace) is still pretty
fuzzy, though.
Are you saying that if a parser tries to validate an xml doc
with standalone=yes, and finds whitespace between elements, it then
needs to know whether the element is declared to have element-only
content in order to determine whether the whitespace is ignorable? And
if in fact it is declared in an external dtd to have element-only
content, then it's not standalone? (***this is the important question
that i'd like to be clear on)

This seems pretty chicken-and-egg-ish to me: If there's
whitespace and I'm standalone, I need to know if it's element-only, but
if it's declared as element-only outside the doc, then it's not
standalone (I now know I can ignore the whitespace, but you lied, and
are not standalone, so I'm choking).


I know I must still be missing the main point. What's the point
of standalone, if it's not what I stated in the first place: I don't
need to, and cannot, rely on ANY external subset (dtd)?
What is it useful for? For example, if I say I'm standalone, but
in an external subset I declare an element to have, say a required
attribute, but within the doc I don't have the attribute, am I still
valid? If I'm not valid, which rule did I fail, standalone, or
required-attribute? In other words, why can we have any external subset
at all here? If we have one, it must be used for something, and if it's
used for something, then I can't be standalone.

Anyway, my brain's usually too small to understand the XML
standards, so I apologize if I'm missing some simple point. Thanks for
the help!

 
Joe Kesselman
      10-25-2006
http://www.w3.org/TR/2006/REC-xml-20.../#vc-check-rmd

"The standalone document declaration MUST have the value "no" if any
external markup declarations contain declarations of:
[...]
* element types with element content, if white space occurs directly
within any instance of those types."


--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
 
Joe Kesselman
      10-25-2006
... Note that that's a validity constraint, not a well-formedness
condition. If you don't validate, you may be able to get away with
abusing standalone -- but why would you want to?

 
Richard Tobin
      10-25-2006
tah wrote:

> Are you saying that if a parser tries to validate an xml doc
>with standalone=yes, and finds whitespace between elements, it then
>needs to know whether the element is declared to have element-only
>content in order to determine whether the whitespace is ignorable?


Whether it's really ignorable depends on the application. The parser's
job is to report that it's whitespace-in-element-content so that the
application can make that decision.

>And
>if in fact it is declared in an external dtd to have element-only
>content, then it's not standalone? (***this is the important question
>that i'd like to be clear on)


If an element is declared in the external subset to have element-only
content, AND there is such an element in the document with whitespace
between the children, then it's not standalone.
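
The two-part condition can be sketched as a small check. This is only an illustration under stated assumptions: `whitespace_violations` is a hypothetical helper, and the set of names "declared with element-only content in the external subset" is passed in by hand, because Python's standard `xml.etree.ElementTree` does not read external DTDs.

```python
import xml.etree.ElementTree as ET

def whitespace_violations(root, external_element_content):
    """Return tags of elements that would make standalone="yes" invalid.

    external_element_content: hypothetical set of element names that the
    external subset declares with element-only content.
    """
    offenders = []
    for elem in root.iter():
        if elem.tag not in external_element_content:
            continue
        # Text directly inside elem: its own .text plus each child's .tail.
        chunks = [elem.text] + [child.tail for child in elem]
        if any(chunk and chunk.isspace() for chunk in chunks):
            offenders.append(elem.tag)
    return offenders

compact = ET.fromstring("<list><item/><item/></list>")
indented = ET.fromstring("<list>\n  <item/>\n  <item/>\n</list>")
declared = {"list"}

print(whitespace_violations(compact, declared))   # declared, but no whitespace
print(whitespace_violations(indented, declared))  # declared AND whitespace
```

Only the indented document is flagged: the declaration alone is harmless, and the whitespace alone is harmless, but both together contradict standalone="yes".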

> I know I must still be missing the main point. What's the point
>of standalone, if it's not what I stated in the first place: I don't
>need to, and cannot, rely on ANY external subset (dtd)?
> What is it useful for? For example, if I say I'm standalone, but
>in an external subset I declare an element to have, say a required
>attribute, but within the doc I don't have the attribute, am I still
>valid? If I'm not valid, which rule did I fail, standalone, or
>required-attribute?


Required attribute.

>In other words, why can we have any external subset
>at all here? If we have one, it must be used for something, and if it's
>used for something, then I can't be standalone.


On the one hand, you want to be able to verify that your documents are
correct, so you use a DTD. On the other hand, you want to be able to
use your documents with lightweight processors that won't bother to
validate, and certainly won't fetch an external subset. So you
validate your documents when you create them, and the lightweight
processors just assume that they're correct.

The standalone declaration allows the "offline" validation to warn
you that the lightweight processor is not going to see the right
thing, because (for example) you've defaulted an attribute that
the lightweight processor won't see.
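
The defaulted-attribute case can be tried with a non-validating parser. The sketch below uses Python's standard `xml.etree.ElementTree` as the "lightweight processor"; the `note` element, the `lang` attribute and the file name `note.dtd` are invented for the example, and no such file needs to exist because ElementTree does not fetch the external subset.

```python
import xml.etree.ElementTree as ET

# Default declared in the internal subset: it is part of the document
# itself, so even a non-validating parser applies it.
internal = ET.fromstring(
    '<?xml version="1.0" standalone="yes"?>'
    '<!DOCTYPE note [<!ATTLIST note lang CDATA "en">]>'
    '<note/>'
)

# The same default moved to an external DTD ("note.dtd" is a made-up name).
# ElementTree never reads it, so the attribute does not appear.
external = ET.fromstring(
    '<?xml version="1.0"?>'
    '<!DOCTYPE note SYSTEM "note.dtd">'
    '<note/>'
)

print(internal.attrib, external.attrib)
```

If the second document claimed standalone="yes" while `note.dtd` supplied the default, a validating parser run beforehand would be expected to report the mismatch that the lightweight processor silently suffers from.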

In practice, I don't think that standalone has been very widely used.
Many lightweight applications know enough about the document format to
provide default values, normalise attributes, and treat whitespace
appropriately, without reading the DTD at all.

-- Richard
 
Joe Kesselman
      10-25-2006
Richard Tobin wrote:
> In practice, I don't think that standalone has been very widely used.


The spec also suggests that, if you want to distribute explicitly
"standalone" documents, you can explicitly convert them into that
form... which may be the right answer; don't use it unless you need it,
and when you do need it plug in the appropriate conversion.
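
A conversion of that kind might look roughly like the following sketch. It is not the spec's algorithm, only an illustration of the idea: `make_standalone` is a hypothetical helper, the DTD information (attribute defaults and element-only element names) is supplied by hand, and attribute normalization and entity replacement are left out.

```python
import xml.etree.ElementTree as ET

def make_standalone(root, defaults, element_only):
    """Bake DTD-supplied information into the tree (illustrative only).

    defaults: hypothetical {element name: {attribute: default value}}
    element_only: hypothetical set of names declared with element content
    """
    for elem in root.iter():
        # Write defaulted attributes out explicitly.
        for name, value in defaults.get(elem.tag, {}).items():
            elem.attrib.setdefault(name, value)
        if elem.tag in element_only:
            # Drop whitespace directly inside element-only content.
            if elem.text and elem.text.isspace():
                elem.text = None
            for child in elem:
                if child.tail and child.tail.isspace():
                    child.tail = None
    return root

root = ET.fromstring('<list>\n  <item/>\n  <item kind="b"/>\n</list>')
make_standalone(root, {"item": {"kind": "a"}}, {"list"})
result = ET.tostring(root, encoding="unicode")
print(result)
```

After the conversion nothing in the output depends on the external declarations, so a processor that never reads them sees the same attributes and the same content.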

> Many lightweight applications know enough about the document format to
> provide default values, normalise attributes, and treat whitespace
> appropriately, without reading the DTD at all.


Very true. Also, see past debates here about whether DTDs are becoming
obsolete as schemas take over... and standalone says nothing about
schema validation; it's strictly a DTD-validation directive/assertion.

 
tah
      10-26-2006




All right! That last makes perfect sense, and clears it up. I hadn't
considered the offline (pre) validation vs. online validation. If
that's the reason for it, that makes sense and sounds useful. Thanks
for all the help!

 