XML - newbie question on UTFDataFormatException |
#1 |
|
I receive a UTFDataFormatException while parsing a huge XML file. What is the meaning of this exception, and what are the possible causes? Could this be a problem with the application, or is it purely a data problem? I am using the xerces-c SAX parser.

Thanks for your help.
Sriv Chakravarthy
|
|
|
|
#2 |
Thanks for your response.
What is the difference between UTF-8 and LATIN1? If my XML document contains only ASCII characters, should the encoding be LATIN1? And how do I set the encoding: as a parameter to the handler object, or in the first line of the XML file itself (the <?xml ...> line)?

(Richard Tobin) wrote in message news:<bda09m$2gm5$>...
> In article <>,
> Sriv Chakravarthy <> wrote:
>
> >I recieve a UTFDataFormatException while parsing a huge xml file. What
> >is the meaning of this Exception and what are the possible causes ?
>
> It means that it is reading the file as UTF-8 and there is a sequence
> of bytes in the file that is not legal UTF-8. Possibly your file is
> corrupted, but more likely it's actually in some other encoding such
> as Latin-1, and just needs a declaration specifying this (UTF-8 is
> the default).
>
> -- Richard
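
One way to test Richard's diagnosis is to scan the raw bytes outside the parser and see where strict UTF-8 decoding first fails. The following is only a sketch in Python, not xerces-c code; the function name `first_invalid_utf8_offset` and the sample strings are made up for illustration.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the offset of the first byte sequence that is not legal UTF-8,
    or None if the whole buffer decodes cleanly."""
    try:
        data.decode("utf-8")  # strict decoding raises at the first bad sequence
    except UnicodeDecodeError as err:
        return err.start
    return None


# Hypothetical sample: the same word saved in two different encodings.
latin1_bytes = "café".encode("latin-1")  # 'é' is the single byte 0xE9
utf8_bytes = "café".encode("utf-8")      # 'é' is the two bytes 0xC3 0xA9

print(first_invalid_utf8_offset(latin1_bytes))
print(first_invalid_utf8_offset(utf8_bytes))
```

For a real file the bytes would come from `open(path, "rb").read()`. A reported offset that points at an accented character or a symbol suggests the file is in another encoding such as Latin-1 rather than corrupted.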
|
|
|
#3 |
"Sriv Chakravarthy" <> wrote in message
news: om...
> thanks for your response.
> What is the difference between UTF-8 and LATIN1 ? If my xml document
> contains all ascii chars then should the encoding be LATIN1 ?
> And how will I set the encoding - as a parameter to the handler object
> or in the first line of the xml file itself ( in <?xml ...> line )

If it's all ascii, the encoding can be ASCII. UTF-8 or LATIN1 are ok, too.
The best choice depends on what application is going to read the documents.
Every parser is required to support UTF-8; most support the others, too.

Bob Foster
http://www.xmlbuddy.com/
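
Bob's point that all-ASCII text is valid under any of the three encodings can be checked by comparing the bytes each one produces. This is an illustrative Python sketch; the helper name `encoded_forms` is an assumption, not part of any library.

```python
def encoded_forms(text: str) -> dict:
    """Encode text as ASCII, Latin-1 and UTF-8.
    An entry is None when that encoding cannot represent the text."""
    forms = {}
    for name in ("ascii", "latin-1", "utf-8"):
        try:
            forms[name] = text.encode(name)
        except UnicodeEncodeError:
            forms[name] = None
    return forms


print(encoded_forms("plain"))  # pure ASCII input: the three byte strings are identical
print(encoded_forms("é"))      # non-ASCII input: the encodings diverge
```

The encodings only differ for characters outside ASCII. Latin-1 stores "é" as the single byte 0xE9, UTF-8 stores it as the two bytes 0xC3 0xA9, and ASCII cannot represent it at all.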
|
|
|
#4 |
In the xerces-c SAX parser, how do you set the encoding? Is it set in the first line (<?xml ...>) of the XML document, or via a member function?

"Bob Foster" <> wrote in message news:<FMyMa.17056$926.572@sccrnsc03>...
> "Sriv Chakravarthy" <> wrote in message
> news: om...
> > thanks for your response.
> > What is the difference between UTF-8 and LATIN1 ? If my xml document
> > contains all ascii chars then should the encoding be LATIN1 ?
> > And how will I set the encoding - as a parameter to the handler object
> > or in the first line of the xml file itself ( in <?xml ...> line )
>
> If it's all ascii, the encoding can be ASCII. UTF-8 or LATIN1 are ok, too.
> The best choice depends on what application is going to read the documents.
> Every parser is required to support UTF-8; most support the others, too.
>
> Bob Foster
> http://www.xmlbuddy.com/
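
The declaration Richard mentioned earlier in the thread goes in the document's first line and is part of XML itself, so its effect can be shown with any conforming parser. The sketch below uses Python's standard-library parser rather than xerces-c. The element name, the sample text, and the helper `parses` are made up for illustration, and it does not cover any parser-specific way of overriding the encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical document body saved as Latin-1: 'é' is the single byte 0xE9.
body = "<name>café</name>".encode("latin-1")

# The same bytes, preceded by a declaration that names the encoding.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body


def parses(document: bytes) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


print(parses(body))      # no declaration, so the bytes are read as UTF-8
print(parses(declared))  # the declaration tells the parser to read Latin-1
```

Without the declaration the parser falls back to UTF-8 and rejects the 0xE9 byte. With it, the same bytes are accepted and the element text comes back as "café".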
|