DOM parsing - Document root element is missing.

 
 
Rico
 
      10-17-2004
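The following piece of code:

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(filename);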


ends in "Document root element is missing" for the following XML:

<?xml version="1.0" encoding="utf-8"?>
<EmailSender>
<db_name>master</db_name>
<document_type>document_New</document_type>
<emailID />
<document_ID>23983</document_ID>
</EmailSender>


I don't really know how the XML is being produced, but adding a space
between the last double quote and the final '?' of the declaration seems
to solve the problem. So does changing the double quotes to single quotes.

Is something wrong with the XML document, or am I missing something
about how to use the API?

Thanks. Regards,
Rico.
 
 
xarax
 
      10-17-2004
"Rico" wrote in message:
> The following piece of code :
>
> DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
> .newInstance();
> DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
> Document doc = docBuilder.parse(filename);
>
>
> ends in "Document root element is missing" for the following XML:
>
> <?xml version="1.0" encoding="utf-8"?>
> <EmailSender>
> <db_name>master</db_name>
> <document_type>document_New</document_type>
> <emailID />
> <document_ID>23983</document_ID>
> </EmailSender>
>
>
> I don't really know how the XML is being produced but a space between the
> last double-quote and the last '?' seems to solve the problem.
> So does changing double-quotes to single-quotes.
>
> Is it something wrong with the XML document or am I missing something
> about the usage of the API ?


The first line of the XML file is not valid XML syntax,
according to the rules of XML.

<?xml version='1.0' encoding='UTF-8' ?>

The line above is an example of a correct XML header.
The header is *not* ordinary XML, because its keywords
must be specified in a fixed order. (Attribute
keywords that appear within the XML body can be
specified in any order.) I use single quotes in
preference to double quotes, but the space appearing
before the final ? is required.



 
 
Rico
 
      10-18-2004
On Sun, 17 Oct 2004 15:39:17 +0000, xarax wrote:
> "Rico" wrote in message:
>> I don't really know how the XML is being produced but a space between the
>> last double-quote and the last '?' seems to solve the problem.
>> So does changing double-quotes to single-quotes.


> The first line of the XML file is not XML syntax.
> That's according to the rules of XML.
>
> <?xml version='1.0' encoding='UTF-8' ?>
>
> The first line above is an example of a correct
> XML header. It is *not* XML, because the keywords
> must be specified in the correct order. (Attribute
> keywords that appear within the XML body can be
> specified in any order.) I use single quotes in
> preference to double quotes, but the space appearing
> before the final ? is required.


Thanks for the input, xarax. However, after checking, I don't think the
space is required. The file is produced by a program written in VB.NET
and I am reading it using the Java DOM package.

For some reason, if I modify and save the file in any way before getting
my program to read it, the parsing goes fine: no missing root element or
anything. That's what was happening when I added the space, to match what
worked when I had been testing my program using my own files.
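
For what it's worth, the XML 1.0 grammar makes the whitespace before the
closing ?> optional and accepts either quote style, and that is easy to
check by parsing both forms from memory. A minimal sketch, assuming the
JAXP parser that ships with the JDK; the class name DeclarationSpaceCheck
and the helper rootName are made up for this illustration, and the XML is
a shortened copy of the document posted earlier:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeclarationSpaceCheck {

    // Hypothetical helper: parses the XML text from UTF-8 bytes and
    // returns the name of the root element.
    static String rootName(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        String body = "<EmailSender><db_name>master</db_name><emailID /></EmailSender>";

        // Double quotes, no space before the closing "?>".
        String noSpace = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" + body;
        // Single quotes, with a space before the closing "?>".
        String withSpace = "<?xml version='1.0' encoding='UTF-8' ?>\n" + body;

        System.out.println(rootName(noSpace));   // prints "EmailSender"
        System.out.println(rootName(withSpace)); // prints "EmailSender"
    }
}
```

Both declarations parse here, which suggests the failure comes from the
bytes in the file itself rather than from the text of the declaration.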

Any further pointers would be very much appreciated. Thanks.

Rico.
 
 
Sudsy
 
      10-18-2004
Rico wrote:
<snip>
> Any further pointers would be very much appreciated. Thanks.
>
> Rico.


So as soon as you touch the file with an editor it parses
correctly? So now you have enough information to start on
the process of discovery!
Edit the file, making no changes, save, and exit.
Next use a binary comparator to check for differences.
Perhaps it's as simple as the ^Z used to mark end-of-file
in the M$ world.
Possibly the line termination characters: \r\n in the M$
world, \n in *NIX. Could be the cause, as the problem is
manifesting itself in the first line of the file, no?
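
If no binary comparator is at hand, dumping the first few bytes of the
file in hex shows the same thing. A rough sketch only: the class name
HeadBytes and the helper hexHead are invented for this example, and the
file path is whatever you pass on the command line.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HeadBytes {

    // Hypothetical helper: returns up to `count` leading bytes of the
    // file as space-separated, two-digit uppercase hex values.
    static String hexHead(Path file, int count) throws IOException {
        byte[] buf = new byte[count];
        int n = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int r;
            // Keep reading until the buffer is full or the file ends.
            while (n < count && (r = in.read(buf, n, count - n)) != -1) {
                n += r;
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", buf[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Usage: java HeadBytes <path-to-xml-file>
        System.out.println(hexHead(Paths.get(args[0]), 16));
    }
}
```

A file that starts directly with the declaration begins 3C 3F 78 6D 6C
(the bytes of "<?xml"). Anything in front of that is a candidate for
what the parser is choking on.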

--
Java/J2EE/JSP/Struts/Tiles/C/UNIX consulting and remote development.

 
 
Rico
 
      10-18-2004
On Sun, 17 Oct 2004 23:38:15 -0400, Sudsy wrote:
> Rico wrote:
>> Any further pointers would be very much appreciated. Thanks.


> So as soon as you touch the file with an editor it parses
> correctly? So now you have enough information to start on
> the process of discovery!
> Edit the file, making no changes, save, and exit.
> Next use a binary comparator to check for differences.
> Perhaps it's as simple as the ^Z used to mark end-of-file
> in the M$ world.


Thanks Sudsy. This sounds like a good line of reasoning. Both my Java
program and the VB.NET program are running on Win2K Pro.
Vim on Cygwin reports that I've got an "incomplete last line", so the
above guess could be in the right direction...

Appending "\n" to the file stopped Vim from complaining, but there are
some rubbish characters before the header, which even Textpad displays in
binary mode for the unmodified file coming from the VB.NET program.

It turns out the machine on which the VB.NET program was compiled is
running with some Unicode settings that produce what shows up as garbage
on my PC. Textpad gets rid of those characters upon saving, and that's
why I could parse the file afterwards.

Rico.
 
 
Sudsy
 
      10-18-2004
Rico wrote:
<snip>
> Appending "\n" to the file had Vim not complaining anymore but there's
> some rubbish characters before the header, which even Textpad displays in
> binary mode for the unmodified file coming from the VB.NET program.


If you check the archives you'll find mention of a BOM, or Byte Order Mark.
It sounds like you'll have to perform some pre-processing of this file
before trying to parse it. But I think you already know this by now...
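
One possible shape for that pre-processing, as a sketch rather than a
definitive fix: wrap the input stream, drop a leading UTF-8 BOM (bytes
EF BB BF) if there is one, and hand the rest to the parser. The thread
does not say which encoding the VB.NET side actually writes, so UTF-8 is
an assumption here; a UTF-16 file would need different handling. The
names BomStripper, skipUtf8Bom and parse are made up for this example.
Newer JAXP parsers may cope with a BOM on their own, so check before
adding this.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class BomStripper {

    // Hypothetical helper: returns a stream positioned after a leading
    // UTF-8 BOM if one is present, otherwise at the very first byte.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        int r;
        while (n < 3 && (r = pin.read(head, n, 3 - n)) != -1) {
            n += r;
        }
        boolean bom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!bom && n > 0) {
            pin.unread(head, 0, n); // not a BOM: put the bytes back
        }
        return pin;
    }

    // Parses the stream after removing a UTF-8 BOM, if any.
    static Document parse(InputStream in) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(skipUtf8Bom(in));
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><EmailSender/>"
                .getBytes("UTF-8");
        // Simulate a file written with a UTF-8 BOM in front.
        byte[] withBom = new byte[xml.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(xml, 0, withBom, 3, xml.length);

        Document doc = parse(new ByteArrayInputStream(withBom));
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "EmailSender"
    }
}
```

For a file on disk, the same parse method can be given a FileInputStream
instead of passing the filename straight to DocumentBuilder.parse.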

--
Java/J2EE/JSP/Struts/Tiles/C/UNIX consulting and remote development.

 