XML - newbie question on UTFDataFormatException

 
Old 06-24-2003, 05:58 PM   #1
Default newbie question on UTFDataFormatException


I receive a UTFDataFormatException while parsing a huge XML file. What
is the meaning of this exception, and what are the possible causes?
Could this be a problem with the application, or is it purely a data
problem?
I am using the xerces-c SAX parser.
Thanks for your help.


Sriv Chakravarthy
Old 06-30-2003, 02:53 PM   #2
Sriv Chakravarthy
 
Posts: n/a
Default Re: newbie question on UTFDataFormatException

Thanks for your response.
What is the difference between UTF-8 and Latin-1? If my XML document
contains only ASCII characters, should the encoding be Latin-1?
And how do I set the encoding: as a parameter to the handler object,
or in the first line of the XML file itself (the <?xml ...> line)?




(Richard Tobin) wrote in message news:<bda09m$2gm5$>...
> In article <>,
> Sriv Chakravarthy <> wrote:
>
> >I receive a UTFDataFormatException while parsing a huge xml file. What
> >is the meaning of this Exception and what are the possible causes ?

>
> It means that it is reading the file as UTF-8 and there is a sequence
> of bytes in the file that is not legal UTF-8. Possibly your file is
> corrupted, but more likely it's actually in some other encoding such
> as Latin-1, and just needs a declaration specifying this (UTF-8 is
> the default).
>
> -- Richard
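
One way to check the diagnosis quoted above is to decode the raw bytes as UTF-8 and see where decoding first fails. The following is a minimal sketch in Python, not xerces-c; the function name `first_invalid_utf8` and the sample text are made up for illustration.

```python
from typing import Optional


def first_invalid_utf8(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that is not legal UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first offending byte.
        return exc.start
    return None


# The same text stored in two encodings (hypothetical sample data).
latin1_bytes = "café au lait".encode("latin-1")
utf8_bytes = "café au lait".encode("utf-8")

print(first_invalid_utf8(latin1_bytes))  # offset of the Latin-1 'é' byte
print(first_invalid_utf8(utf8_bytes))    # None: the data is valid UTF-8
```

For a real file, read it in binary mode and pass the bytes to the function; a reported offset that lands on an accented character suggests the file is in a single-byte encoding such as Latin-1 rather than corrupted.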

Old 07-02-2003, 11:49 AM   #3
Bob Foster
 
Posts: n/a
Default Re: newbie question on UTFDataFormatException

"Sriv Chakravarthy" <> wrote in message
news: om...
> thanks for your response.
> What is the difference between UTF-8 and LATIN1 ? If my xml document
> contains all ascii chars then should the encoding be LATIN1 ?
> And how will I set the encoding - as a parameter to the handler object
> or in the first line of the xml file itself ( in <?xml ...> line )


If it's all ASCII, the encoding can be declared as US-ASCII. UTF-8 or
Latin-1 (ISO-8859-1) are OK too, since ASCII is a subset of both.
The best choice depends on what application is going to read the documents.
Every parser is required to support UTF-8; most support the others, too.

Bob Foster
http://www.xmlbuddy.com/
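
The point about ASCII-only documents can be checked directly: ASCII text produces the same bytes in all three encodings, and the encodings only diverge once a non-ASCII character appears. A small sketch in Python, with sample strings chosen purely for illustration:

```python
# ASCII-only text encodes to identical bytes in all three encodings.
ascii_text = "plain ASCII text"
ascii_forms = {ascii_text.encode(enc) for enc in ("ascii", "latin-1", "utf-8")}
print(len(ascii_forms))  # a single distinct byte string

# A non-ASCII character is where Latin-1 and UTF-8 part ways.
accented = "é"
latin1 = accented.encode("latin-1")  # one byte
utf8 = accented.encode("utf-8")      # two bytes
print(latin1, utf8)
```

This is why a file that parses fine while it is pure ASCII can start failing as soon as one accented character is saved in Latin-1 without a matching declaration.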


Old 07-03-2003, 06:19 PM   #4
Sriv Chakravarthy
 
Posts: n/a
Default Re: newbie question on UTFDataFormatException

In the xerces-c SAX parser, how do you set the encoding? Is it set in
the first line (<?xml ...>) of the XML document, or is it set via a
member function?


"Bob Foster" <> wrote in message news:<FMyMa.17056$926.572@sccrnsc03>...
> "Sriv Chakravarthy" <> wrote in message
> news: om...
> > thanks for your response.
> > What is the difference between UTF-8 and LATIN1 ? If my xml document
> > contains all ascii chars then should the encoding be LATIN1 ?
> > And how will I set the encoding - as a parameter to the handler object
> > or in the first line of the xml file itself ( in <?xml ...> line )

>
> If it's all ascii, the encoding can be ASCII. UTF-8 or LATIN1 are ok, too.
> The best choice depends on what application is going to read the documents.
> Every parser is required to support UTF-8; most support the others, too.
>
> Bob Foster
> http://www.xmlbuddy.com/
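
The thread does not answer this question. The declaration route is part of XML itself and does not depend on the parser, so it can be illustrated without xerces-c. The sketch below uses Python's standard-library parser as a stand-in; the helper name `try_parse` and the sample document are hypothetical. Whether a given xerces-c version also offers a member function to override the encoding should be checked in its API documentation.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# A document body saved as Latin-1 (hypothetical sample data).
body = "<greeting>café</greeting>".encode("latin-1")
declaration = b'<?xml version="1.0" encoding="ISO-8859-1"?>'


def try_parse(document: bytes) -> Optional[str]:
    """Return the root element's text, or None if the parser rejects the bytes."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


# Without a declaration the parser assumes UTF-8 and rejects the Latin-1 byte.
print(try_parse(body))

# With the declaration on the first line, the same bytes parse cleanly.
print(try_parse(declaration + body))
```

The declaration has to match how the file was actually saved; declaring ISO-8859-1 on a file that really is UTF-8 will parse without error but yield garbled characters.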
