XML Oddity

 
 
Mark Johnson
 
      03-30-2005
>>DELURK<<

Over the last few weeks, we've been working on building an online
portfolio using XML to pass content to an HTML page via PHP. In the
process, we've run across a rather inexplicable error which we've been
unable to find any reference to elsewhere. Hopefully, someone who
reads this will know what's going on and be able to provide some
assistance.

Here is our XML:
http://www.uky.edu/AuxServ/creativeg...tfolio_xml.txt

Here is our HTML and PHP:
http://www.uky.edu/AuxServ/creativeg...tfolio_php.txt

And here is the page in action:
http://www.uky.edu/AuxServ/creativeg.../portfolio.php

The problem is this: When a user clicks the third link under the
"Digital" heading, as you can see from the XML, the following text
ought to be displayed:

==begin==
Such has been the patient sufferance of these Colonies; and such is now
the necessity which constrains them to alter their former Systems of
Government. The history of the present King of Great Britain [George
III] is a history of repeated injuries and usurpations, all having in
direct object the establishment of an absolute Tyranny over these
States. To prove this, let Facts be submitted to a candid world. He
has refused his Assent to Laws, the most wholesome and necessary for
the public good. He has forbidden his Governors to pass Laws of
immediate and pressing importance, unless suspended in their operation
till his Assent should be obtained; and when so suspended, he has
utterly neglected to attend to them.
==end==

However, rather than that text being displayed in its entirety, the
following is all that displays:
==begin==
sing importance, unless suspended in their operation till his Assent
should be obtained; and when so suspended, he has utterly neglected to
attend to them.
==end==

Somehow, everything prior to that point has been eaten.

This is what we know: the error occurs on Windows XP, Mac OS X, and
Red Hat Linux. It occurs regardless of whether IE or a Gecko-based
browser is used. It occurs regardless of what type of server the files
are uploaded to. If all elements are edited to contain exactly the same
number of characters, the error seems to disappear, but doing so
renders the code useless for our purposes. No other errors have been
noted. Changing the code so that every element is displayed has no
effect. The question is this: what is causing this error, and how can
it be avoided? Any assistance would be greatly appreciated.

Mark Johnson

 
 
 
 
 
Richard Light
 
      03-31-2005
In reply to Mark Johnson:

Caveat: I know nothing about the PHP XML parser. However, I suspect
that the problem is a failure to separate the physical reading of input
blocks from the logical parsing of the data they contain. My reason for
saying this is that the truncated phrase you quote "sing importance,
unless suspended ..." is at the start of the second 4096-byte block in
the file.

I would guess that the parser handed you the first part of this data
content, which you placed in your array variable, and then it handed you
the second part ... Little suspecting this, you promptly overwrote the
variable with this second chunk. You can easily test this hypothesis by
changing the block size and seeing if the position of the error changes.
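
A rough way to try that test, sketched in Python rather than PHP because the PHP code is not shown in this thread. It assumes Python's `xml.parsers.expat` module as a stand-in parser; `TEXT`, `DOC`, `parse_in_blocks` and the block sizes are made up for the illustration. The handler keeps both the overwritten value (the suspected bug) and the accumulated value, so the two can be compared as the block size changes:

```python
import xml.parsers.expat

# Hypothetical sample data, not the portfolio XML from the original post.
TEXT = ("He has forbidden his Governors to pass Laws of immediate and "
        "pressing importance, unless suspended in their operation")
DOC = ("<item>" + TEXT + "</item>").encode("utf-8")


def parse_in_blocks(xml_bytes, block_size):
    """Feed xml_bytes to the parser block by block.

    Returns (overwritten, accumulated): the text as seen by a handler that
    assigns each chunk, and by one that appends each chunk.
    """
    state = {"overwritten": "", "accumulated": ""}

    def on_text(data):
        state["overwritten"] = data    # suspected bug: last chunk wins
        state["accumulated"] += data   # reference: chunks are joined

    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = on_text
    for start in range(0, len(xml_bytes), block_size):
        parser.Parse(xml_bytes[start:start + block_size], False)
    parser.Parse(b"", True)            # signal end of input
    return state["overwritten"], state["accumulated"]


if __name__ == "__main__":
    for size in (32, 40):
        overwritten, accumulated = parse_in_blocks(DOC, size)
        print(size, repr(overwritten), accumulated == TEXT)
```

If the hypothesis holds, the overwritten value should be a tail of the full text whose starting point moves with the block size, while the accumulated value stays complete.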

If this is the case, you'll have to be a bit smarter about processing
character data. Or get a better parser ...

Richard Light



--
Richard Light
SGML/XML and Museum Information Consultancy


 
 
 
 
 
Malcolm Dew-Jones
 
      03-31-2005
Richard Light wrote:

: Caveat: I know nothing about the PHP XML parser. However, I suspect
: that the problem is a failure to separate the physical reading of input
: blocks from the logical parsing of the data they contain. My reason for
: saying this is that the truncated phrase you quote "sing importance,
: unless suspended ..." is at the start of the second 4096-byte block in
: the file.

: I would guess that the parser handed you the first part of this data
: content, you placed in your array variable, and then it handed you the
: second part ... Little suspecting this, you promptly overwrote the
: variable with this second chunk. You can easily test this hypothesis by
: changing the block size and seeing if the position of the error changes.

: If this is the case, you'll have to be a bit smarter about processing
: character data. Or get a better parser ...
^^^^^^^^^^^^^^^^^^^^^

Sounds like a likely scenario.

However, that doesn't mean there's anything wrong with the parser. A SAX
parser is not required to deliver a run of contiguous character data in
a single call, and in fact a parser that did so could be considered a
problem.

Imagine if I had an XML document that had a gigabyte of contiguous
character data. One of the points of a SAX parser is that it can feed
that data to the handler in smaller, more memory-efficient chunks, and not
have to load the entire string into memory.
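
In handler terms, that means the character-data callback appends and the end-element callback decides when the text is complete. A minimal sketch of the pattern, assuming Python's `xml.sax` in place of the PHP parser discussed above; `TextCollector`, `collect_text` and the element name passed in are hypothetical names for the example:

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Collect the full text of each element with the given name."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.parts = None       # None means "not inside a wanted element"
        self.results = []

    def startElement(self, name, attrs):
        if name == self.wanted:
            self.parts = []     # start a fresh buffer

    def characters(self, content):
        # May be called several times for one run of text: append, never assign.
        if self.parts is not None:
            self.parts.append(content)

    def endElement(self, name):
        if name == self.wanted and self.parts is not None:
            self.results.append("".join(self.parts))
            self.parts = None


def collect_text(xml_bytes, wanted, block_size=4096):
    """Parse xml_bytes in blocks and return the text of each wanted element."""
    handler = TextCollector(wanted)
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for start in range(0, len(xml_bytes), block_size):
        parser.feed(xml_bytes[start:start + block_size])
    parser.close()
    return handler.results
```

Because the text is only treated as complete in `endElement`, the result should not depend on where the block boundaries fall. This sketch still joins each element's text in memory, which is fine for portfolio-sized content; a handler for very large text would process each chunk as it arrives instead.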




--

This space not for rent.
 
 
Richard Light
 
      03-31-2005
Malcolm Dew-Jones writes:

>However, that doesn't mean there's anything wrong with the parser. A SAX
>parser is not required to deliver a run of contiguous character data in
>a single call, and in fact a parser that did so could be considered a
>problem.
>
>Imagine if I had an XML document that had a gigabyte of contiguous
>character data. One of the points of a SAX parser is that it can feed
>that data to the handler in smaller, more memory-efficient chunks, and not
>have to load the entire string into memory.


I would agree with that principle entirely. However, from a software
engineering point of view, I would expect as the user of such a parser
to be able to control the "text chunk" size, and not have character data
cut into arbitrary chunks based on where the block boundaries in the
input stream happen to fall.

Richard
--
Richard Light
SGML/XML and Museum Information Consultancy


 