"Ghislain Benrais" <> wrote in message
news:e7osgo$sge$...
> Hello,
> I am new to java and I run a short program processing xml files.
> Everything ran very well until I received xml files with the character
> itself instead of its numerical reference (for instance 'é' instead of
> '&#233;'). I thought java would handle it but unexpectedly, it handles it
> under DOS but doesn't handle it under Linux !
> Do you have any explanations ?
> Input file :
> =======
> <?xml version="1.0" encoding="ISO-8859-1" ?>
[most of the code snipped]
> input = new InputSource(new FileReader("file.xml"));
From
http://java.sun.com/j2se/1.5.0/docs/...leReader.html:
<quote>
The constructors of this class assume that the default character encoding
and the default byte-buffer size are appropriate. To specify these values
yourself, construct an InputStreamReader on a FileInputStream.
</quote>
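In other words, you're not specifying the encoding in the reader, so it
falls back to the platform's default encoding. That default differs from
one system to another, which is probably why DOS and Linux behave
differently, and on Linux it evidently doesn't match the ISO-8859-1
encoding declared in your XML file.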
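To illustrate what the quoted documentation suggests, here is a minimal sketch of a reader with an explicit encoding. The class and method names (ExplicitEncodingReader, open, readAll) are made up for this example, and it decodes an in-memory byte instead of your file.xml so that it runs anywhere; with a real file you would pass a FileInputStream to open().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

public class ExplicitEncodingReader {
    // Hypothetical helper: wrap a byte stream in a Reader that decodes
    // with the named charset instead of the platform default.
    static Reader open(InputStream in, String charsetName) throws IOException {
        return new InputStreamReader(in, charsetName);
    }

    // Hypothetical helper: read every character from the reader into a String.
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = r.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // 0xE9 is 'é' in ISO-8859-1, but a lone 0xE9 is not valid UTF-8.
        byte[] latin1 = { (byte) 0xE9 };

        String asLatin1 = readAll(open(new ByteArrayInputStream(latin1), "ISO-8859-1"));
        String asUtf8 = readAll(open(new ByteArrayInputStream(latin1), "UTF-8"));

        System.out.println(asLatin1.equals("\u00e9")); // decoded with the matching charset
        System.out.println(asUtf8.equals("\u00e9"));   // decoded with a mismatched charset
    }
}
```

The catch with this approach is that the encoding is hard-coded in the program, so the encoding declaration in the XML file is ignored. If you later receive a file in another encoding, you have the same problem again.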
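Did you try using the constructor of InputSource which takes a byte
stream instead of a character stream? Given raw bytes, the parser can
read the encoding declaration in the file and decode the content itself.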
http://java.sun.com/j2se/1.5.0/docs/...io.InputStream)
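Roughly, the byte-stream version could look like the sketch below. Since most of your code was snipped, I'm assuming a DOM parse via DocumentBuilder; the class name ByteStreamParse and the helper rootText are made up for the example, and it uses an in-memory document in place of file.xml so it can be run as-is.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ByteStreamParse {
    // Hypothetical helper: parse XML from raw bytes and return the text
    // content of the root element. No Reader is involved, so the parser
    // decodes the bytes itself.
    static String rootText(InputStream in) throws Exception {
        InputSource input = new InputSource(in); // byte-stream constructor
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(input);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for file.xml: an ISO-8859-1 encoded document containing 'é'.
        // With a real file you would pass new FileInputStream("file.xml").
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?><a>\u00e9</a>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        String text = rootText(new ByteArrayInputStream(bytes));
        System.out.println(text.equals("\u00e9"));
    }
}
```

This way nothing about the encoding is hard-coded in the program, so the result should no longer depend on the platform default.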
- Oliver