Need help with String encoding issue
I'm writing a servlet filter that manipulates the HTTP response body
(injecting HTML). It works fine with pages using the English charset,
but when processing a page with double-byte chars, some of the
characters come out as junk.

When processing the OutputStream, I create a ByteArrayOutputStream:

baStream = new ByteArrayOutputStream();

then I create a string (forcing it to UTF-8) with that stream:

String str = new String(baStream.toByteArray(), "UTF-8");

I then manipulate that string using standard regex, then output it back
to the browser:

outStream.write(str.getBytes());

The problem is I don't know a lot about how charsets work in Java. I
do know that Java's native string charset is UTF-16, but beyond that,
I'm not sure how to make sure that what comes into my servlet filter is
what goes out.

Thanks in advance!

Lothar Kimmeringer
09-23-2006, (E-Mail Removed) wrote:

> outStream.write(str.getBytes());

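Here you should use str.getBytes("UTF-8"). Without an argument,
getBytes() encodes with the platform's default charset, which need
not be UTF-8.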
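
To illustrate the point, here is a minimal sketch of a decode/edit/encode round trip that names the charset on both sides. The class name, the injectMarker helper and the sample markup are made up for the example and are not from the original filter. It uses StandardCharsets.UTF_8 (Java 7 and later), which should behave the same as passing "UTF-8" by name.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;

// Hypothetical helper class, not part of the original filter.
class ExplicitCharsetRoundTrip {

    // Decodes the body as UTF-8, inserts a marker before </body>,
    // and encodes the result as UTF-8 again.
    static byte[] injectMarker(byte[] utf8Body, String marker) {
        String body = new String(utf8Body, StandardCharsets.UTF_8);
        // quoteReplacement keeps '$' and '\' in the marker literal.
        String edited = body.replaceFirst(
                "</body>", Matcher.quoteReplacement(marker) + "</body>");
        // Same charset on the way out as on the way in.
        return edited.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample page body containing three CJK characters.
        byte[] in = "<body>\u65e5\u672c\u8a9e</body>".getBytes(StandardCharsets.UTF_8);
        byte[] out = injectMarker(in, "<!-- injected -->");
        System.out.println(new String(out, StandardCharsets.UTF_8));
    }
}
```

If the page is not actually UTF-8, the same charset the response declares would have to be used in both places instead.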

Alternatively, use a Writer instead of an OutputStream, which
you can get from the servlet response as well. Then you can write
Strings directly without having to deal with the encoding yourself.

Or you wrap an OutputStreamWriter around your OutputStream,
specifying the encoding you want to use in the
constructor.
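
A rough sketch of the OutputStreamWriter variant, assuming UTF-8 is the encoding you want. The class and method names are invented for the example, and a ByteArrayOutputStream stands in for the real response stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
class Utf8WriterSketch {

    // Writes the text to the stream as UTF-8 through an OutputStreamWriter.
    static void writeUtf8(OutputStream out, String text) {
        try {
            // The charset is fixed once, when the writer is constructed.
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            writer.write(text);
            // Flush so buffered characters reach the underlying stream.
            writer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the servlet response's OutputStream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeUtf8(sink, "\u65e5\u672c\u8a9e");
        System.out.println(sink.size() + " bytes written");
    }
}
```

The writer is flushed but not closed here, on the assumption that the container owns the response stream.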
Regards, Lothar