XML in XMPP

 
 
Ivan Shmakov
 
      07-06-2012
I've found a short discussion of XMPP as an XML application at
[1], which contains some points I cannot agree with. But then, I'm
not really that confident in my knowledge of XMPP particulars,
so I'd appreciate it if someone could comment on my arguments
below.

[1] http://search.cpan.org/~elmex/AnyEve...XMPP/Writer.pm

> The whole "XML" concept of XMPP is fundamentally broken anyway. It's
> supposed to be an subset of XML. But a subset of XML productions is
> not XML.


It's true, but such a subset could satisfy the definition of an
XML application (AIUI), which XMPP is intended to be.

> Strictly speaking you need a special XMPP "XML" parser and writer to
> be 100% conformant.


OTOH, the requirement of a custom XMPP parser certainly doesn't
fit the notion of an XML application.

> On top of that XMPP requires you to parse these partial "XML"
> documents. But a partial XML document is not well-formed, heck, it's
> not even a XML document! And a parser should bail out with an error.
> But XMPP doesn't care, it just relies on implementation dependend
> behaviour of chunked parsing modes for SAX parsing. This
> functionality isn't even specified by the XML recommendation in any
> way. The recommendation even says that it's undefined what happens
> if you process not-well-formed XML documents.


And as long as it's undefined (and not denied outright), the
particular interpretation of XML "fragments" used by XMPP seems
more like a natural extension, than a failure to comply with the
standard.

> But I try to be as XMPP "XML" conformant as possible (it should be
> around 99-100%). But it's hard to say what XML is conformant, as the
> specifications of XMPP "XML" and XML are contradicting. For example
> XMPP also says you only have to generated and accept UTF-8 encodings
> of XML, but the XML recommendation says that each parser has to
> accept UTF-8 and UTF-16.


Once again, this is a specialization, and it's my understanding
that an XML application may choose to explicitly define an
acceptable subset of XML.

Though, of course, this allows for XMPP parsers that aren't XML
parsers at the same time.
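
A minimal Python sketch of the encoding point (Python is chosen here only for illustration, and the helper names `body_text` and `accepts_as_xmpp` are hypothetical). A general XML parser reads the same document in UTF-8 and in UTF-16, while a UTF-8-only consumer, which is all XMPP asks for, may turn the UTF-16 form away:

```python
import xml.etree.ElementTree as ET

DOC = '<message><body>hi \u00e9</body></message>'

utf8_bytes = DOC.encode("utf-8")
utf16_bytes = DOC.encode("utf-16")  # Python prepends a byte order mark


def body_text(data: bytes) -> str:
    """Parse with a general XML parser and return the <body> text."""
    return ET.fromstring(data).findtext("body")


def accepts_as_xmpp(data: bytes) -> bool:
    """Hypothetical UTF-8-only gate, standing in for an XMPP-specific parser."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Both byte strings carry the same document as far as XML is concerned; only the restricted consumer distinguishes them.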

> So, what do you do? Do you use a XML conformant parser or do you
> write your own?


> I'm using XML::Parser::Expat because expat knows how to parse broken
> (aka 'partial') "XML" documents, as XMPP requires. Another argument
> is that if you capture a XMPP conversation to the end, and even if a
> '</stream:stream>' tag was captured, you wont have a valid XML
> document. The problem is that you have to resent a <stream> tag
> after TLS and SASL authentication each! Awww... I'm repeating
> myself.


This one indeed may be a problem, but probably not as much in
practice as in theory.
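
To illustrate the theoretical side, a small Python sketch, assuming a heavily simplified capture (real sessions contain features, negotiation and stanzas between the headers; `is_well_formed` is a hypothetical helper). The stream header is opened, opened again after the restart, and closed only once, so the capture as a whole is not a well-formed document:

```python
import xml.parsers.expat

STREAM_NS = "http://etherx.jabber.org/streams"

# Simplified, hypothetical capture of a session with one stream restart.
HEADER = b'<stream:stream xmlns:stream="' + STREAM_NS.encode("ascii") + b'">'
FOOTER = b"</stream:stream>"

CAPTURE_WITH_RESTART = HEADER + HEADER + FOOTER
CAPTURE_WITHOUT_RESTART = HEADER + FOOTER


def is_well_formed(data: bytes) -> bool:
    """Return True if expat accepts `data` as one complete document."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True
```

A consumer that resets its parser at each restart never sees the combined capture, which may be why this matters less in practice.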

> But well... AnyEvent::XMPP does it's best with expat to cope with
> the fundamental brokeness of "XML" in XMPP.


> Back to the issue with "XML" generation: I've discoverd that many
> XMPP servers (eg. jabberd14 and ejabberd) have problems with XML
> namespaces. Thats the reason why I'm assigning the namespace
> prefixes manually: The servers just don't accept validly namespaced
> XML. The draft 3921bis does even state that a client SHOULD generate
> a 'stream' prefix for the <stream> tag.


Indeed, and such a problem seems to be quite common.
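
A short Python sketch of why the prefix should not matter to a namespace-aware consumer (the three sample documents are made up for illustration; the namespace URI is the one XMPP uses for its stream element):

```python
import xml.etree.ElementTree as ET

NS = "http://etherx.jabber.org/streams"

# Three spellings of the same element: different prefixes, same namespace.
prefixed = f'<stream:stream xmlns:stream="{NS}"/>'
defaulted = f'<stream xmlns="{NS}"/>'
other_prefix = f'<s:stream xmlns:s="{NS}"/>'

# ElementTree reports names in expanded form, "{namespace}local-name".
tags = {ET.fromstring(doc).tag for doc in (prefixed, defaulted, other_prefix)}
```

All three parse to the same expanded name, so a server that accepts only the first spelling is matching on the prefix rather than on the namespace.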

Note that the XHTML 1.1 + MathML 2.0 + SVG 1.1 profile [2]
(as implemented by, e. g., the W3C validator [3]) explicitly
requires that the embedded MathML and SVG documents use the m:
and svg: namespace prefixes, respectively.

My understanding is that this requirement simplifies the task of
DTD-based validation, but DTDs don't seem to be as major a part of
XML as they were of SGML, and I doubt whether it's really necessary
to continue to enforce such restrictions.

[2] http://w3.org/TR/XHTMLplusMathMLplusSVG/
[3] http://validator.w3.org/

--
FSF associate member #7257
 
 
 
 
 
Joe Kesselman
 
      07-08-2012
On 7/6/2012 5:54 AM, Ivan Shmakov wrote:
> > The whole "XML" concept of XMPP is fundamentally broken anyway. It's
> > supposed to be an subset of XML. But a subset of XML productions is
> > not XML.

>
> It's true, but such a subset could satisfy the definition of an
> XML application (AIUI), which XMPP is intended to be.


Not at all familiar with XMPP, but it sounds like it bears the same sort
of relationship to XML that XML did to SGML -- subset, _possibly_
"backward compatible syntax" in that you could run it through tools
intended for the other syntax if you didn't have something XMPP-specific
available, but Not XML and not really interoperable with XML at anything
beyond that most basic syntactic-subset level.

If an application doesn't use all of XML, that's fine. BUT:

> OTOH, the requirement of a custom XMPP parser certainly doesn't
> fit the notion of an XML application.


Yep. If it can't _tolerate_ all of XML, it isn't XML.

> And as long as it's undefined (and not denied outright), the
> particular interpretation of XML "fragments" used by XMPP seems
> more like a natural extension, than a failure to comply with the
> standard.


XML has a clear definition of well-formed document fragment. If XMPP is
complying with that, it may be fine. If not, no.

> Once again, this is a specialization, and it's my understanding
> that an XML application may choose to explicitly define an
> acceptable subset of XML.


Marginally. There are indeed ASCII-only XML-subset parsers. But they
don't claim to satisfy the XML Recommendation.

> > So, what do you do? Do you use a XML conformant parser or do you
> > write your own?


If you *can't* use an XML parser, it isn't XML. If you *choose* not to
use an XML parser, that's a different matter.

If the document isn't a well-formed XML document or XML document
fragment, it isn't XML. Period.


What's the advantage of all this breakage supposed to be? Why didn't
they just use XML properly?

> My understanding is that it simplifies the task of DTD-based
> validation, but DTD doesn't seem such a major part of XML as it
> was of SGML, and I doubt of whether it's really necessary to
> continue to enforce such restrictions.


DTDs should be abandoned. They are simply not compatible with XML
Namespaces, and Namespaces should now be considered an essential part of
serious XML processing.

(Believe me, we *tried* to find a model which could reasonably handle
both. There really isn't a reasonable way to retrofit namespaces into
DTDs. DTDs are too bound to raw syntax to work with something that has
semantic behaviors.)


I don't have time to investigate XMPP, but it sounds like its creator
either was lazy and took unreasonable shortcuts, or diverged
simply to suit their own biases and had no interest in working with the
rest of the XML universe. Unless you like those answers (I don't), I
suggest looking for something else which isn't gratuitously incompatible.

--
Joe Kesselman,
http://www.love-song-productions.com...lam/index.html

{} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
/\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."


 
 
 
 
 
Ivan Shmakov
 
      07-09-2012
>>>>> Joe Kesselman writes:
>>>>> On 7/6/2012 5:54 AM, Ivan Shmakov wrote:


[...]

>> OTOH, the requirement of a custom XMPP parser certainly doesn't fit
>> the notion of an XML application.


> Yep. If it can't _tolerate_ all of XML, it isn't XML..


My guess is that, leaving aside the interpretation of XML
"fragments", the whole recorded XMPP session /should/ comprise a
well-formed XML document.

Yet, once again, an XMPP parser is /not/ required to implement
all of XML (though it may choose to do so).

>> And as long as it's undefined (and not denied outright), the
>> particular interpretation of XML "fragments" used by XMPP seems more
>> like a natural extension, than a failure to comply with the
>> standard.


> XML has a clear definition of well-formed document fragment.


Huh? Where is it?

> If XMPP is complying with that, it may be fine. If not, no.


Unfortunately, I don't know for sure.

>> Once again, this is a specialization, and it's my understanding that
>> an XML application may choose to explicitly define an acceptable
>> subset of XML.


> Marginally. There are indeed ASCII-only XML-subset parsers. But
> they don't claim to satisfy the XML Recommendation.


AIUI, XMPP parsers don't claim to have full XML support. Or, at
least, they're not required to.

[...]

> What's the advantage of all this breakage supposed to be? Why didn't
> they just use XML properly?


The purpose of XMPP is to pass around "messages" (either human-
or machine-readable) in real-time.

Apparently, the idea was that the complete recorded XMPP session
/should/ comprise an XML document. But as the XMPP
implementation is required to take action before the session is
over, it has to interpret the bits of XML it receives as soon as
it has a complete bit (or, in XMPP parlance, a "stanza.")
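
A minimal sketch of that stanza-at-a-time processing in Python, assuming the standard library's `XMLPullParser` as the event source (the generator name `stanzas` and the sample chunks are hypothetical). The stream root is never closed during the run, yet each child element is handed over as soon as its end tag arrives:

```python
import xml.etree.ElementTree as ET


def stanzas(chunks):
    """Yield each direct child of the stream root as soon as it is complete."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    for chunk in chunks:
        parser.feed(chunk)  # chunks may end in the middle of a tag
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:  # just closed a child of the root: a stanza
                    yield elem


# Hypothetical wire data, deliberately split mid-tag; the root stays open.
CHUNKS = ["<stream><message><bo", "dy>hi</body></message><pre", "sence/>"]
received = [stanza.tag for stanza in stanzas(CHUNKS)]
```

The parser is never told that the document has ended, so the question of whether the whole session is well-formed is only answered (if at all) when the stream is closed.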

>> My understanding is that it simplifies the task of DTD-based
>> validation, but DTD doesn't seem such a major part of XML as it was
>> of SGML, and I doubt of whether it's really necessary to continue to
>> enforce such restrictions.


> DTDs should be abandoned. They are simply not compatible with XML
> Namespaces, and Namespaces should now be considered an essential part
> of serious XML processing.


Yes.

> (Believe me, we *tried* to find a model which could reasonably handle
> both. There really isn't a reasonable way to retrofit namespaces
> into DTDs. DTDs are too bound to raw syntax to work with something
> that has semantic behaviors.)


Do I understand it correctly that http://validator.w3.org/ is
based on DTD?

BTW, is there a W3C recommendation that explicitly allows for
inclusion of MathML and SVG within an XHTML document (and is
/not/ based on DTD)?

> I don't have time to investigate XMPP, but it sounds like its creator
> either was lazy and took unreasonable shortcuts, or diverged
> simply to suit their own biases and had no interest in working with
> the rest of the XML universe. Unless you like those answers (I
> don't), I suggest looking for something else which isn't gratuitously
> incompatible.


For instance? I'm interested in a "reasonably well supported"
protocol for passing messages in "real-time" (where messages may
contain some XML.) XMPP is so far the only one I've found.

> {} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
> /\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."


... And what about XHTML mail?

--
FSF associate member #7257
 
 
Alain Ketterlin
 
      07-09-2012
Ivan Shmakov writes:

>>>>>> Joe Kesselman writes:


[...]
> > XML has a clear definition of well-formed document fragment.

>
> Huh? Where is it?


In the XML recommendation: it's called an "external parsed entity".

> Apparently, the idea was that the complete recorded XMPP session
> /should/ comprise an XML document. But as the XMPP
> implementation is required to take action before the session is
> over, it has to interpret the bits of XML it receives as soon as
> it has a complete bit (or, in XMPP parlance, a "stanza.")


SAX should handle the task. The problem is that by the time an error is
detected, some parts of the "document" have already been processed. The
protocol should specify what to do in these cases.
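
A small Python sketch of that effect, using expat's callback interface as a stand-in for SAX (the function name `parse_until_error` and the malformed input are made up for illustration):

```python
import xml.parsers.expat


def parse_until_error(data: bytes):
    """Return the element names reported before parsing stopped, and the error."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(name)
    error = None
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        error = exc
    return seen, error


# The mismatched end tag is only discovered after two start events fired.
seen, error = parse_until_error(b"<stream><message></oops></stream>")
```

Whatever the application did in response to the first two events has already happened when the error surfaces, which is why the protocol has to say how to recover.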

> BTW, is there a W3C recommendation that explicitly allows for
> inclusion of MathML and SVG within an XHTML document (and is
> /not/ based on DTD)?


A working draft (http://www.w3.org/TR/XHTMLplusMathMLplusSVG/)

> For instance? I'm interested in a "reasonably well supported"
> protocol for passing messages in "real-time" (where messages may
> contain some XML.) XMPP is so far the only one I've found.


I guess SOAP is an example.

-- Alain.
 
 
Manuel Collado
 
      07-09-2012
El 09/07/2012 10:38, Alain Ketterlin escribió:
> Ivan Shmakov <(E-Mail Removed)> writes:
> ...
>> BTW, is there a W3C recommendation that explicitly allows for
>> inclusion of MathML and SVG within an XHTML document (and is
>> /not/ based on DTD)?

>
> A working draft (http://www.w3.org/TR/XHTMLplusMathMLplusSVG/)


This profile /is/ /based/ on a DTD. The OP explicitly asked about a
recommendation /not/ /based/ on a DTD.

--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado



 
 
Joe Kesselman
 
      07-10-2012
On 7/9/2012 11:04 AM, Manuel Collado wrote:
>>> BTW, is there a W3C recommendation that explicitly allows for
>>> inclusion of MathML and SVG within an XHTML document (and is
>>> /not/ based on DTD)?


> This profile /is/ /based/ on a DTD. The OP explicitly asked about a
> recommendation /not/ /based/ on a DTD.


XHTML modularization covers the concept -- which basically consists of
"that's exactly what namespaces are for". See
http://www.w3.org/TR/xhtml-modularization/ and related.




--
Joe Kesselman,
http://www.love-song-productions.com...lam/index.html

{} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
/\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."


 
 
Ivan Shmakov
 
      07-11-2012
>>>>> Joe Kesselman writes:
>>>>> On 7/9/2012 11:04 AM, Manuel Collado wrote:


>>>> BTW, is there a W3C recommendation that explicitly allows for
>>>> inclusion of MathML and SVG within an XHTML document (and is /not/
>>>> based on DTD)?


>>> A working draft (http://www.w3.org/TR/XHTMLplusMathMLplusSVG/)


>> This profile /is/ /based/ on a DTD. The OP explicitly asked about a
>> recommendation /not/ /based/ on a DTD.


> XHTML modularization covers the concept -- which basically consists
> of "that's exactly what namespaces are for".


The problem is that the only way http://validator.w3.org/ allows
for XHTML to contain SVG and MathML is via the working draft
cited above, which uses DTD as part of its definition, and thus,
as it was already pointed out, is "poorly compatible" with XML
namespaces.

My guess is that for W3C Validator to be updated to allow for a
fuller understanding of XHTML's "XML nature" there has to be a
W3C recommendation, or a working draft, that explicitly allows
for any XML namespace prefixes in XHTML. AIUI, such a
specification has to be based on something other than DTD.

Hence my question.

Regarding the XHTML modularization, it was my understanding that
its whole idea was to allow for easier creation of XHTML
profiles. Which seems like an independent issue.

> See http://www.w3.org/TR/xhtml-modularization/ and related.


What exactly are the "related" documents?

--
FSF associate member #7257 http://sf-day.org/
 
 
Ivan Shmakov
 
      07-11-2012
>>>>> Alain Ketterlin writes:
>>>>> Ivan Shmakov writes:
>>>>> Joe Kesselman writes:


>>> XML has a clear definition of well-formed document fragment.


>> Huh? Where is it?


> In the XML recommendation: it's called an "external parsed entity".


XMPP stanzas are hardly "external" to XMPP sessions.

>> Apparently, the idea was that the complete recorded XMPP session
>> /should/ comprise an XML document. But as the XMPP implementation
>> is required to take action before the session is over, it has to
>> interpret the bits of XML it receives as soon as it has a complete
>> bit (or, in XMPP parlance, a "stanza.")


> SAX should handle the task.


Yes. Indeed, AnyEvent::XMPP::Parser uses XML::Parser::Expat,
which is event-based.

> The problem is that at the time an error is detected, some part of
> the "document" have already been processed. The protocol should
> specify what to do in these cases.


AIUI, it does.

[...]

>> For instance? I'm interested in a "reasonably well supported"
>> protocol for passing messages in "real-time" (where messages may
>> contain some XML.) XMPP is so far the only one I've found.


> I guess SOAP is an example.


ACK, thanks.

Though it looks like I'd have to stick with XMPP, since I'm looking
for a way to extend XMPP clients anyway.

--
FSF associate member #7257 http://sf-day.org/
 
 
Joe Kesselman
 
      07-12-2012
On 7/10/2012 11:36 PM, Ivan Shmakov wrote:
> What exactly are the "related" documents?


Searching the W3C website for "modularization" finds the ones I know of.


--
Joe Kesselman,
http://www.love-song-productions.com...lam/index.html

{} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
/\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."


 
 
 
 