Velocity Reviews - Computer Hardware Reviews

Transformer encoding not working for ISO-8859-1 only for UTF-8

 
 
janib
      08-07-2006
I have a problem when transforming text containing the Swedish letters
"å", "ä" and "ö". If I do

Transformer t = TransformerFactory.newInstance().newTransformer();
t.setOutputProperty(OutputKeys.METHOD, "xml");
t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
t.setOutputProperty(OutputKeys.INDENT, "yes");
t.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1"); // <------- *
t.transform(new DOMSource(document), new StreamResult(output));
return output.toString();

I get an XML file with broken characters (shown as "?") in place of the
Swedish letters:

<?xml version="1.0" encoding="ISO-8859-1"?>
....
<channelinfo confirmed="true" validate="false" name="Internet">
<publishdate>1154940455898</publishdate>
<unpublishdate>1154940455898</unpublishdate>
<attribute name="rooms"/>
<attribute name="year"/>
<attribute name="title">K?pes</attribute> <------------- *
<attribute name="price">20000</attribute>
<attribute name="area"/>
<attribute name="body">Vill k?pa en truck</attribute> <-------------- *
</channelinfo>

but if I change the encoding to UTF-8:

t.setOutputProperty(OutputKeys.ENCODING, "UTF-8"); // <------- *

the letters are alright:

<?xml version="1.0" encoding="UTF-8"?>
....
<channelinfo confirmed="true" validate="false" name="Internet">
<publishdate>1154940455898</publishdate>
<unpublishdate>1154940455898</unpublishdate>
<attribute name="rooms"/>
<attribute name="year"/>
<attribute name="title">Köpes</attribute> <------------- *
<attribute name="price">20000</attribute>
<attribute name="area"/>
<attribute name="body">Vill köpa en truck</attribute> <-------------- *
</channelinfo>

But the XML has to be encoded in ISO-8859-1, so I need to get it
working with that encoding.

Does anyone know why it behaves this way, or where I can change this
behavior?

 
Jono
      08-07-2006
Hi Janib,
Your code works fine for me (as expected, because "å", "ä" and "ö"
are part of the ISO-8859-1 character set), so I think the problem might
lie with one of the objects you're creating outside the code snippet.
Your "output" object might have a side effect if it's doing some
character encoding of its own. I tried with a StringWriter and also
with a FileOutputStream, and both worked correctly (using Java 1.5).
Cheers,
Jono
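
For reference, the StringWriter variant mentioned above can be sketched roughly like this. It is a minimal illustration, not the exact code that was tested: the class name, the helper methods and the one-element sample document are made up here. Because a Writer receives characters rather than bytes, no byte encoding happens at this step; as far as I know, the ENCODING property then only affects the XML declaration and which characters get written as character references.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name; only the javax.xml / org.w3c.dom calls are real API.
final class WriterSerializeSketch {

    // Builds a tiny stand-in document: <attribute name="title">Köpes</attribute>
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element attribute = doc.createElement("attribute");
        attribute.setAttribute("name", "title");
        attribute.setTextContent("K\u00f6pes"); // \u00f6 is "ö"
        doc.appendChild(attribute);
        return doc;
    }

    // Serializes to characters; the Writer never turns them into bytes.
    static String serialize(Document document) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
        StringWriter output = new StringWriter();
        t.transform(new DOMSource(document), new StreamResult(output));
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```

If the resulting string is later written to a file or socket, it still has to be encoded as ISO-8859-1 at that point so the bytes match the declaration.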


janib
      08-07-2006
The output object is just a ByteArrayOutputStream:

ByteArrayOutputStream output = new ByteArrayOutputStream( );
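
One possible explanation, which is not confirmed anywhere in this thread: ByteArrayOutputStream.toString() with no argument decodes the bytes using the platform default charset. If that default happens to be UTF-8, ISO-8859-1 bytes for "ö" are not valid UTF-8 and come back as a replacement character, while UTF-8 output decodes cleanly. The sketch below assumes that is what is going on and decodes with the same charset that was given to the Transformer. The class name, helper methods and sample document are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name; only the JDK calls themselves are real API.
final class Latin1RoundTripSketch {

    // Builds a tiny stand-in document: <attribute name="title">Köpes</attribute>
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element attribute = doc.createElement("attribute");
        attribute.setAttribute("name", "title");
        attribute.setTextContent("K\u00f6pes"); // \u00f6 is "ö"
        doc.appendChild(attribute);
        return doc;
    }

    // Serializes the document into bytes in the requested encoding.
    static ByteArrayOutputStream transform(Document document, String encoding)
            throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.ENCODING, encoding);
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        t.transform(new DOMSource(document), new StreamResult(output));
        return output;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream output = transform(sampleDocument(), "ISO-8859-1");

        // Mismatched charset: the single byte for "ö" is malformed UTF-8.
        String wrong = new String(output.toByteArray(), StandardCharsets.UTF_8);
        System.out.println(wrong);

        // Matching charset: the bytes are decoded the way they were written.
        String right = output.toString("ISO-8859-1");
        System.out.println(right);
    }
}
```

If the XML is only going to be written to a file or sent over the network, it may be simpler to skip the String entirely and pass output.toByteArray() along as-is, since those bytes already match the encoding in the XML declaration.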

 
Roland de Ruiter
      08-07-2006
On 7-8-2006 13:35, janib wrote:
> The output object is just a ByteArrayOutputStream:
>
> ByteArrayOutputStream output = new ByteArrayOutputStream( );
>
>

See my reply in comp.lang.java.help
--
Regards,

Roland
 