Processing Instructions

 
 
Dominic Olivastro
 
      04-14-2004
Hi all:

I'm new to this newsgroup, and new to XML.

We receive documents in XML, and I am trying to tear them apart to obtain
information. I decided that, for my purposes, it would be fairly easy to
write a simple XML parser, which it was. But now suddenly I find that some
of the information I need is in the form of a Processing Instruction, and
not tagged in the usual way. So I get information like this:

<description>
<?BRFSUM description="Brief Summary" end="lead"?>
The present text relates to a photodetector and so on in this vein
<?BRFSUM description="Brief Summary" end="tail"?>

Questions:

1. Why is this not placed in the usual tag format?
2. Can I assume that end="lead" will always open the text, and end="tail"
will always close it? Is this usual for Processing Instructions? From what
I've read, there generally isn't any end tag.

Thanks for any help you can give me.

Dom


 
 
 
 
 
Ashmodai
 
      04-14-2004
Dominic Olivastro scribbled something along the lines of:

> Hi all:
>
> I'm new to this newsgroup, and new to XML.
>
> We receive documents in XML, and I am trying to tear them apart to obtain
> information. I decided that, for my purposes, it would be fairly easy to
> write a simple XML parser, which it was. But now suddenly I find that some
> of the information I need is in the form of a Processing Instruction, and
> not tagged in the usual way. So I get information like this:
>
> <description>
> <?BRFSUM description="Brief Summary" end="lead"?>
> The present text relates to a photodetector and so on in this vein
> <?BRFSUM description="Brief Summary" end="tail"?>
>
> Questions:
>
> 1. Why is this not placed in the usual tag format?


Processing instructions aren't elements. The concept is that they tell
the processor something about the document. PHP, a server-side scripting
language, for example, uses PI brackets because it is executed
("processed") server-side.
The most common PI is the xml PI, which tells the version and character
encoding of the document. It doesn't say anything about the actual
content, but it explains how to process it (e.g. what version must be
supported and what the character encoding setting should be).
Stylesheets are also linked in PIs because they tell how to render the
document.

The idea, to my understanding, is that PIs are namespace, vocabulary and
subset independent. <?xml ...?> means the same in an XForms document as
it does in an SVG file.
I suppose something along the lines of <xml:info version=""/> would have
worked as well, but then it'd have to be inside the root element and
that's a bit too late for the processor.
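
To make the "PIs aren't elements" point concrete: an XML parser hands back a PI as just a target name plus one opaque data string, so the description="..." and end="..." parts in the sample are not real attributes. A minimal Python sketch, assuming the data follows a name="value" convention (that convention, the pseudo_attrs helper and the closing </description> tag are assumptions for illustration, not something from the original documents):

```python
import re
from xml.dom import minidom

# Sample from the question, with a closing tag added so it is well-formed.
SAMPLE = (
    "<description>"
    '<?BRFSUM description="Brief Summary" end="lead"?>'
    "The present text relates to a photodetector"
    '<?BRFSUM description="Brief Summary" end="tail"?>'
    "</description>"
)

# Hypothetical convention: name="value" or name='value' pairs in the PI data.
# The XML parser does not enforce or parse this; it is up to the application.
PSEUDO_ATTR = re.compile(r'''([\w.-]+)\s*=\s*(["'])(.*?)\2''')


def pseudo_attrs(data):
    """Split PI data into a dict, assuming name="value" pairs."""
    return {m.group(1): m.group(3) for m in PSEUDO_ATTR.finditer(data)}


def list_pis(xml_text):
    """Return (target, pseudo-attribute dict) for each PI under the root."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, pseudo_attrs(node.data))
        for node in doc.documentElement.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]


for target, attrs in list_pis(SAMPLE):
    print(target, attrs)
```

A hand-written parser would need the same two steps: recognise "<?target data?>", then decide separately how to interpret the data.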


> 2. Can I assume that end="lead" will always open the text, and end="tail"
> will always close it? Is this usual for Processing Instructions? From what
> I've read, there generally isn't any end tag.


PIs don't have ending tags as they aren't normal elements or even tags.
I've never seen a pair (or a larger set) of PIs used to enclose
anything, but then again, I only use XML for the web, so maybe I've
missed something.

I don't think it's much of a flaw if you don't support EVERY PI there
is, but you should try covering the basics (xml and xml-stylesheet most
importantly).

--
Alan Plum, WAD/WD, Mushroom Cloud Productions
http://www.mushroom-cloud.com/
 
 
 
 
 
Richard Tobin
 
      04-14-2004
In article <7a08f$407d7c24$44a5e110$(E-Mail Removed) rs.com>,
Dominic Olivastro <(E-Mail Removed)> wrote:

><description>
><?BRFSUM description="Brief Summary" end="lead"?>
>The present text relates to a photodetector and so on in this vein
><?BRFSUM description="Brief Summary" end="tail"?>


>1. Why is this not placed in the usual tag format?


You'll have to ask the document designer. A couple of possibilities
are:

- The markup is not necessarily nested. You can't do that with start
and end tags, but you can use processing instructions or "point
elements" (i.e. empty elements used to mark the start and end of
something). This doesn't seem very likely given the example.

- The document has to adhere to a fixed DTD that does not provide an
element for "brief summary", and processing instructions are being
used to provide the additional markup.

>2. Can I assume that end="lead" will always open the text, and end="tail"
>will always close it?


Again, you'll have to ask the document designer.
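
Until the designer answers, one cautious option is to check the lead/tail pairing while reading rather than assume it. A rough Python sketch using the standard xml.sax module; the SummaryExtractor class, the substring test on end="lead" / end="tail" and the closing </description> tag are illustrative assumptions, not a description of how the documents are actually specified:

```python
import xml.sax

SAMPLE = (
    "<description>"
    '<?BRFSUM description="Brief Summary" end="lead"?>'
    "The present text relates to a photodetector"
    '<?BRFSUM description="Brief Summary" end="tail"?>'
    "</description>"
)


class SummaryExtractor(xml.sax.ContentHandler):
    """Collect the text found between a lead PI and the next tail PI."""

    def __init__(self):
        super().__init__()
        self.summaries = []   # completed lead..tail spans
        self._buffer = None   # None means "not inside a span"

    def processingInstruction(self, target, data):
        if target != "BRFSUM":
            return
        # Crude substring checks; they assume double-quoted values.
        if 'end="lead"' in data:
            if self._buffer is not None:
                raise ValueError("second lead marker before a tail marker")
            self._buffer = []
        elif 'end="tail"' in data:
            if self._buffer is None:
                raise ValueError("tail marker without a lead marker")
            self.summaries.append("".join(self._buffer).strip())
            self._buffer = None

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endDocument(self):
        if self._buffer is not None:
            raise ValueError("lead marker never closed by a tail marker")


def extract_summaries(xml_text):
    """Return the list of summaries, or raise ValueError on bad pairing."""
    handler = SummaryExtractor()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.summaries


print(extract_summaries(SAMPLE))
```

If a document ever breaks the assumed pairing, this fails loudly instead of silently returning the wrong text.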

-- Richard
 
 
arnold m. slotnik
 
      04-15-2004
Ashmodai <(E-Mail Removed)> wrote in
news:c5k9h9$jt2$06$(E-Mail Removed)-online.com:

[...]

> The most common PI is the xml PI which tells the version and
> character encoding of the document. It doesn't say anything
> about the actual content, but it explains how to process it (e.g.
> what version must be supported and what the character encoding
> setting should be). Stylesheets are also linked in PIs because
> they tell how to render the document.


That's the XML Declaration (or Text Declaration in an external parsed
entity), not an "XML PI".

REC-xml-20001006, section 2.8.
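
Parsers tend to reflect the same distinction. As a small illustration, Python's expat binding reports the XML Declaration and ordinary PIs through two separate callbacks. The classify_prolog function and the style.xsl file name below are made up for the example:

```python
import xml.parsers.expat

# Hypothetical document: an XML Declaration followed by one ordinary PI.
DOC = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<?xml-stylesheet type="text/xsl" href="style.xsl"?>'
    "<doc/>"
)


def classify_prolog(xml_text):
    """Record which expat callback each '<?...?>' construct triggers."""
    events = []
    parser = xml.parsers.expat.ParserCreate()

    def on_xml_decl(version, encoding, standalone):
        # Called for the XML Declaration only.
        events.append(("xml-declaration", version, encoding))

    def on_pi(target, data):
        # Called for ordinary processing instructions.
        events.append(("processing-instruction", target, data))

    parser.XmlDeclHandler = on_xml_decl
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(xml_text.encode("utf-8"), True)
    return events


for event in classify_prolog(DOC):
    print(event)
```

The <?xml ...?> line never reaches the PI callback; only xml-stylesheet does.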

--
a. m. slotnik
 
 
Richard Tobin
 
      04-15-2004
In article <Xns94CBF2FBF46B9slotnikverizon@199.45.49.11>,
arnold m. slotnik <(E-Mail Removed)> wrote:

>That's the XML Declaration (or Text Declaration in an external parsed
>entity), not an "XML PI".


True, but it's not just coincidence that it shares the syntax of
PIs. XML is a subset of SGML, and from the SGML point of view the
XML declaration is a PI.

-- Richard
 
 
arnold m. slotnik
 
      04-15-2004
(E-Mail Removed) (Richard Tobin) wrote in
news:c5lvoe$qsj$(E-Mail Removed):

> True, but it's not just coincidence that it shares the syntax of
> PIs. XML is a subset of SGML, and from the SGML point of view
> the XML declaration is a PI.


From the XSLT point of view, though, there's a big difference between
an XML Declaration and a PI.

Making sure we get the terminology right now can save questions later
on...

--
a. m. slotnik
 
 
Ashmodai
 
      04-15-2004
arnold m. slotnik scribbled something along the lines of:

> (E-Mail Removed) (Richard Tobin) wrote in
> news:c5lvoe$qsj$(E-Mail Removed):
>
>
>>True, but it's not just coincidence that it shares the syntax of
>>PIs. XML is a subset of SGML, and from the SGML point of view
>>the XML declaration is a PI.

>
>
> From the XSLT point of view, though, there's a big difference between
> an XML Declaration and a PI.
>
> Making sure we get the terminology right now can save questions later
> on...


The XML declaration is a mandatory[1] PI in the eyes of the author (and
probably also in the eyes of SGML). What its function is when the
document is parsed is not the author's concern; they just have to know
it's mandatory[1] and maybe also that it's used to determine the version
and character encoding.
Saying it isn't a PI is like saying the root is not an element because
it's the root, which is a special element.

[1] Okay, maybe not mandatory, but strongly recommended.
--
Alan Plum, WAD/WD, Mushroom Cloud Productions
http://www.mushroom-cloud.com/
 
 
arnold m. slotnik
 
      04-15-2004
Ashmodai <(E-Mail Removed)> wrote in
news:c5m64i$atk$00$(E-Mail Removed)-online.com:

> The XML declaration is a mandatory[1] PI in the eyes of the
> author (and probably also in the eyes of SGML). What its
> function is when the document is parsed is not the author's
> concern; they just have to know it's mandatory[1] and maybe also
> that it's used to determine the version and character encoding.
> Saying it isn't a PI is like saying the root is not an element
> because it's the root, which is a special element.
>
> [1] Okay, maybe not mandatory, but strongly recommended.



<rant>
I know what the XML Declaration is--and what it isn't. It isn't a
PI--looks like one, but the editors of the spec were very clear
that it isn't a PI. It's a special construct, recommended
("should") in XML 1.0 and mandatory ("must") in XML 1.1.

XSLT has one mechanism for putting an XML Declaration on the output
(xsl:output and its omit-xml-declaration attribute) and a different
one for creating PIs in the result tree (xsl:processing-instruction).

How many times have we seen in this and other venues, "How do I
write the XML PI on my output?" Ask the right question, it's easy
to find the right answer.

Tool vendors have confused the XML Declaration, the Text
Declaration, and a garden variety PI in their tools. (Anyone
besides me really annoyed by editing packages that put <?xml
version="1.0"?> on *everything*?)

It doesn't belong on a DTD--a DTD is not an XML document.

It doesn't belong on an external parsed entity--they take a Text
Declaration, which must contain the encoding and *may* contain the
version.

It's very specifically the XML Declaration--with a specific set of
related functions and usage--not "the XML PI".
</rant>
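
The same split shows up outside XSLT. As a loose analogy only (this is Python's ElementTree, not XSLT), a PI is a node you place in the tree, while the XML Declaration is an option you give the serializer. The build_output function is an illustrative name, and the xml_declaration argument to tostring needs Python 3.8 or later:

```python
import xml.etree.ElementTree as ET


def build_output():
    """Build a small tree with two PIs and serialize it with a declaration."""
    root = ET.Element("description")

    # A PI is a node that sits in the tree like any other child.
    lead = ET.ProcessingInstruction("BRFSUM", 'end="lead"')
    lead.tail = "Summary text"  # text that follows the lead PI
    root.append(lead)
    root.append(ET.ProcessingInstruction("BRFSUM", 'end="tail"'))

    # The XML Declaration is not a node; it is requested at serialization time.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


print(build_output().decode("utf-8"))
```

Asking "how do I add the XML Declaration?" leads to the serializer option; asking "how do I add a PI?" leads to the node constructor.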

--
a. m. slotnik
 
 
Richard Tobin
 
      04-15-2004
In article <Xns94CC9CEBEF079slotnikverizon@199.45.49.11>,
arnold m. slotnik <(E-Mail Removed)> wrote:
> and mandatory ("must") in XML 1.1.


.... because it's the only way to specify that the document is XML 1.1.

-- Richard
 
 
Ashmodai
 
      04-16-2004
arnold m. slotnik scribbled something along the lines of:

> Ashmodai <(E-Mail Removed)> wrote in
> news:c5m64i$atk$00$(E-Mail Removed)-online.com:
>
>
>>The XML declaration is a mandatory[1] PI in the eyes of the
>>author (and probably also in the eyes of SGML). What its
>>function is when the document is parsed is not the author's
>>concern; they just have to know it's mandatory[1] and maybe also
>>that it's used to determine the version and character encoding.
>>Saying it isn't a PI is like saying the root is not an element
>>because it's the root, which is a special element.
>>
>>[1] Okay, maybe not mandatory, but strongly recommended.

>
>
>
> <rant>
> I know what the XML Declaration is--and what it isn't. It isn't a
> PI--looks like one, but the editors of the spec were very clear
> that it isn't a PI. It's a special construct, recommended
> ("should") in XML 1.0 and mandatory ("must") in XML 1.1.
>
> XSLT has one mechanism for putting an XML Declaration on the
> output (xsl:output and its omit-xml-declaration attribute) and a
> different one for creating PIs in the result tree
> (xsl:processing-instruction).
>
> How many times have we seen in this and other venues, "How do I
> write the XML PI on my output?" Ask the right question, it's easy
> to find the right answer.
>
> Tool vendors have confused the XML Declaration, the Text
> Declaration, and a garden variety PI in their tools. (Anyone
> besides me really annoyed by editing packages that put <?xml
> version="1.0"?> on *everything*?)
>
> It doesn't belong on a DTD--a DTD is not an XML document.
>
> It doesn't belong on an external parsed entity--they take a Text
> Declaration, which must contain the encoding and *may* contain the
> version.
>
> It's very specifically the XML Declaration--with a specific set of
> related functions and usage--not "the XML PI".
> </rant>
>


I feel so loved.


Actually, putting XML PIs on everything is as dumb as putting the, say,
XHTML 1.1 Doctype declaration on everything -- why would anybody be THAT
stupid?

--
Alan Plum, WAD/WD, Mushroom Cloud Productions
http://www.mushroom-cloud.com/
 