How do I detect empty tags?

 
 
vega
      04-14-2005
How do I detect empty tags if I have the DOM document?

For example: <br /> and <br></br>

I tried org.w3c.dom.Node.getFirstChild(); it returns null for both
<br /> and <br></br>.
I also tried getNodeValue(); it returns null for both as well.

I know <br /> and <br></br> are the same according to the XML spec. Is
there any way to tell the two syntaxes apart using a DOM parser?

Thanks,
-John
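
For anyone wanting to reproduce this, here is a minimal Java sketch of the behaviour described above. It assumes the JAXP parser bundled with the JDK; the class name EmptyTagDom and the helper parseRoot are made up for illustration. It parses both forms and shows that the DOM reports them identically.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class EmptyTagDom {
    // Hypothetical helper: parses a small XML string and returns its root element.
    public static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        Element shortForm = parseRoot("<br />");
        Element longForm = parseRoot("<br></br>");

        // Neither element has any child nodes.
        System.out.println(shortForm.getFirstChild());
        System.out.println(longForm.getFirstChild());

        // getNodeValue() is defined to be null for every element node.
        System.out.println(shortForm.getNodeValue());
        System.out.println(longForm.getNodeValue());

        // isEqualNode compares names, attributes and children, not source syntax.
        System.out.println(shortForm.isEqualNode(longForm));
    }
}
```

Nothing in the resulting tree records which syntax was used in the source text.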

 
Andy Dingley
      04-14-2005
On 13 Apr 2005 18:23:59 -0700, "vega" wrote:

>How do I detect empty tags if I have the DOM document?
>
>For example: <br /> and <br></br>


You can't and you don't need to. In XML these are exactly
equivalent(sic).

http://www.w3.org/TR/2004/REC-xml-20...#sec-starttags

"Empty-element tags MAY be used for any element which has no content,
whether or not it is declared using the keyword EMPTY. For
interoperability, the empty-element tag SHOULD be used, and SHOULD
only be used, for elements which are declared EMPTY."


There may be a useful difference you can find in the element's
definition in the DTD or schema, i.e. whether it is declared EMPTY. You
can access this either by parsing the DTD or schema yourself, or (more
easily) by using a parser that understands schemas and offers a more
direct link to the relevant declaration.

This is the definition though, not the instance. It won't tell you if
the empty-element form of the tag in your document was used because
it's an EMPTY element, or just a non-empty element that happens to
have no content in this instance.
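
As a rough sketch of the "parse the DTD" route in Java: SAX2 can report element declarations through org.xml.sax.ext.DeclHandler, where the content model arrives as the string "EMPTY" for elements declared EMPTY. The class name DeclaredEmpty, the helper declaredEmptyElements and the sample document are assumptions for illustration, and this only looks at DTDs, not schemas.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

public class DeclaredEmpty {
    // Hypothetical helper: names of all elements the DTD declares as EMPTY.
    public static Set<String> declaredEmptyElements(String xml) {
        Set<String> empty = new TreeSet<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void elementDecl(String name, String model) {
                // The model is "EMPTY", "ANY", or a parenthesised group.
                if ("EMPTY".equals(model)) {
                    empty.add(name);
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // Standard SAX2 property for receiving DTD declarations.
            parser.setProperty(
                    "http://xml.org/sax/properties/declaration-handler", handler);
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return empty;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE doc ["
                + "<!ELEMENT doc (br|p)*>"
                + "<!ELEMENT br EMPTY>"
                + "<!ELEMENT p (#PCDATA)>"
                + "]>"
                + "<doc><br/><p></p></doc>";
        System.out.println(declaredEmptyElements(xml));
    }
}
```

In the sample, <p></p> has no content in the instance but is not reported, because it is not declared EMPTY; that is the definition-versus-instance distinction described above.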


In general though, the way the document was serialised is not visible
to an XML application and even more importantly there is NO reason why
it needs to be. You just never need it.

If you do think you need it, then the chances are that you're in a
non-XML context, such as XHTML or RSS. Although these are ostensibly
XML protocols, they exist in an environment that's still rooted in the
HTML past. There may be valid reasons for still caring about things
that a purely XML context wouldn't need to.

 
Mukul Gandhi
      04-14-2005
<br/> and <br></br> are the same according to the XML spec. I do not
think any compliant XML parser would treat the two forms differently,
so I think the XML parser cannot report this difference.

I'm also curious: for what purpose is this information useful to you?

Regards,
Mukul

"vega" wrote:
> How do I detect empty tags if I have the DOM document?
>
> For example: <br /> and <br></br>
>
> I tried org.w3c.dom.Node.getFirstChild(), it returns null for both <br
> /> and <br></br>
> I also tried getNodeValue(), they both returns null also.
>
> I know <br /> and <br></br> are the same from the xml spec. Is there
> any way to tell the different syntax using DOM parser?
>
> Thanks,
> -John

 
Richard Tobin
      04-14-2005
Mukul Gandhi wrote:

><br/> and <br></br> are same according to XML spec.. I do not think
>any compliant XML parser would treat these two ways differently. So I
>think the XML parser cannot report this difference..


An XML parser can report what it likes, but it would usually be unwise
to write software that depended on the difference. For one thing,
passing the document through any common XML program might well change
it.

The XML Infoset does not distinguish between the two forms.
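
A small Java sketch of that point, assuming the DOM parser and identity Transformer that ship with the JDK (the class name RoundTrip and its helper are made up for illustration): after a parse-and-serialise round trip, both inputs come out the same, in whichever form the serialiser happens to prefer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class RoundTrip {
    // Hypothetical helper: parse to a DOM tree, then serialise it back to text.
    public static String roundTrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // A Transformer created without a stylesheet copies its input unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed for: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // The original choice of syntax is gone once the text has been parsed,
        // so both lines print the same serialisation.
        System.out.println(roundTrip("<p><br /></p>"));
        System.out.println(roundTrip("<p><br></br></p>"));
    }
}
```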

>Just also curious, for what purpose this information is useful to
>you..


Editor-like applications should preserve the user's preferred
formatting, and ideally so should any application that doesn't
completely alter the structure of the document.

-- Richard
 
Jon Haugsand
      04-14-2005
* Richard Tobin
> >Just also curious, for what purpose this information is useful to
> >you..

>
> Editor-like applications should preserve the user's preferred
> formatting, and ideally so should any application that doesn't
> completely alter the structure of the document.


Would <br><!-- metainformation comment --></br> be illegal according
to the spec?

--
Jon Haugsand
Dept. of Informatics, Univ. of Oslo, Norway
http://www.ifi.uio.no/~jonhaug/, Phone: +47 22 85 24 92
 
Malte
      04-14-2005
Jon Haugsand wrote:
> * Richard Tobin
>
>>>Just also curious, for what purpose this information is useful to
>>>you..

>>
>>Editor-like applications should preserve the user's preferred
>>formatting, and ideally so should any application that doesn't
>>completely alter the structure of the document.

>
>
> Would <br><!-- metainformation comment --></br> be illegal according
> to the spec?
>


I'm interested in this too. I looked here:

http://www.w3.org/TR/xhtml1/#C_2

It says there that <br /> should be preferred (over <br></br>) (XHTML),

and here:

http://www.w3.org/TR/1999/REC-html40...t.html#edef-BR

This says that <br /> is not allowed (HTML 4.01) (Start tag:
required, End tag: forbidden).


 
Andy Dingley
      04-14-2005
On 14 Apr 2005 15:12:57 +0200, Jon Haugsand wrote:

>Would <br><!-- metainformation comment --></br> be illegal according
>to the spec?


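Yes, if <br> is declared EMPTY (according to XML 1.0). The document
stays well-formed, but the element no longer counts as empty.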
http://www.w3.org/TR/2004/REC-xml-20040204/#NT-content
"The representation of an empty element is either a start-tag
immediately followed by an end-tag, or an empty-element tag."

Note "immediately"

<br/> is equivalent to <br />
<br /> is equivalent to <br></br>

<br>[... anything ...]</br> is _not_ equivalent to <br></br>

Even <br> </br> (simple whitespace) is not empty content and thus is
invalid for an element defined as EMPTY
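
To see how a validating parser treats these cases, here is a hedged Java sketch. The DTD, the class name EmptyValidity and the helper validityErrors are invented for the example; it declares br as EMPTY in an internal subset and collects the validity errors reported for each form. With the parser bundled in the JDK, the two empty forms should validate cleanly and the whitespace form should be reported; the exact messages are implementation-specific.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class EmptyValidity {
    // Assumed DTD for illustration: br is declared EMPTY.
    static final String DTD =
            "<!DOCTYPE doc [<!ELEMENT doc (br)*><!ELEMENT br EMPTY>]>";

    // Hypothetical helper: validity errors for <doc>body</doc> against the DTD.
    public static List<String> validityErrors(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }

                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity errors arrive here
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // well-formedness errors abort the parse
                }
            });
            builder.parse(new InputSource(
                    new StringReader(DTD + "<doc>" + body + "</doc>")));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + body, e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String[] forms = {"<br/>", "<br></br>", "<br> </br>", "<br><!-- note --></br>"};
        for (String form : forms) {
            System.out.println(form + " -> " + validityErrors(form));
        }
    }
}
```

All four forms are well-formed, so none of them aborts the parse; any difference shows up only as validity errors against the EMPTY declaration.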


Of course, in most cases this will be treated as valid, because <br />
is presumed to be an XHTML element and most XHTML gets handled by an
HTML parser, not an XML parser.

 
David Carlisle
      04-14-2005


> Of course in most cases this will be treated as valid, because <br />
> is presumed to be an XHTML element and most XHTML gets handled by a
> HTML parser, not an XML parser.


Except that if it gets handled by a real HTML parser, <br/> is valid but
equivalent to <br>>, so it typesets a > at the start of the new line.

See what onsgmls makes of:

<html><head><title>a</title></head>
<body>
<br/><br>>
</body>
</html>


(BODY
AID IMPLIED
ACLASS IMPLIED
ASTYLE IMPLIED
ATITLE IMPLIED
ACLEAR TOKEN NONE
(BR
)BR
->
AID IMPLIED
ACLASS IMPLIED
ASTYLE IMPLIED
ATITLE IMPLIED
ACLEAR TOKEN NONE
(BR
)BR
->
)BODY
)HTML
C


David
 
Andy Dingley
      04-14-2005
On Thu, 14 Apr 2005 14:39:42 GMT, David Carlisle wrote:

>Except that if it gets handled by a real HTML parser


But is HTML SGML ? I accept your point for SGML certainly, but
HTML is a world-of-hacks no matter how you look at it.

 
Alan J. Flavell
      04-14-2005
On Thu, 14 Apr 2005, Andy Dingley wrote:

> But is HTML SGML ?


The W3C say both yes and no. This has been discussed before, of
course: in the body of the HTML specification, they describe HTML as
an application of SGML, but then later on they rule out certain
constructions that SGML doesn't allow to be ruled out. That's the way
I understood the argument, anyway.

> I accept your point for SGML certainly, but HTML is a world-of-hacks
> no matter how you look at it.


Indeed. And XHTML/1.0 Appendix C continued that messy tradition.
Quite why so many newcomers aspire to just that beats me.
 