Java - Redirecting System.out and exotic characters

 
Old 11-04-2009, 09:08 AM   #1
Default Redirecting System.out and exotic characters


I redirect System.out to a JTextArea with the following class:

private class TextAreaOutputStream extends OutputStream {
    JTextArea textArea;

    TextAreaOutputStream(JTextArea textArea) {
        this.textArea = textArea;
    }

    public void flush() {
        textArea.repaint();
    }

    public void write(int b) {
        // Decodes each byte on its own with the platform default charset.
        textArea.append(new String(new byte[] {(byte) b}));
    }
}

and I use the class with
JTextArea msg = new JTextArea();
System.setOut(new PrintStream(new TextAreaOutputStream(msg), true));

This works well except when I have a character like Č (latin capital
letter C with caron, '\u010C') in a string, which is displayed as ? in
the text area whereas
msg.append(string); would be ok.
How could I correct the code above so that such a letter is
displayed correctly?

Thanks
François


François R
Old 11-04-2009, 12:25 PM   #2
Mayeul
 
Default Re: Redirecting System.out and exotic characters
François R wrote:
> This works well except when I have a character like Č (latin capital
> letter C with caron, '\u010C') in a string, which is displayed as ? in
> the text area whereas
> msg.append(string); would be ok.
> How could I correct the code above so that such a letter is
> displayed correctly?


You have a character encoding problem.

Both the constructors PrintStream(OutputStream,boolean) and
String(byte[]) assume you're using your platform's default character
encoding to translate chars to bytes and vice-versa.

I expect your platform's default character encoding does _not_ handle
characters such as U+010C, hence their being replaced with question marks.
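
To make the two failure modes concrete, here is a small sketch. It assumes Java 7 or later (for StandardCharsets) and uses ISO-8859-1 purely as a stand-in for a default encoding that lacks U+010C; the original poster's actual default is not stated. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QuestionMarkDemo {
    // Encode and decode with the same charset, as PrintStream and
    // new String(byte[]) both do when no encoding is given.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    // Decode UTF-8 bytes one at a time, the way the original write(int) did.
    static String byteByByte(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(new String(new byte[] {b}, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String caron = "\u010C"; // latin capital letter C with caron
        // ISO-8859-1 has no mapping for U+010C, so the encoder substitutes '?'.
        System.out.println(roundTrip(caron, StandardCharsets.ISO_8859_1).equals("?"));
        // UTF-8 round-trips the character unchanged.
        System.out.println(roundTrip(caron, StandardCharsets.UTF_8).equals(caron));
        // Splitting the two UTF-8 bytes yields two replacement characters.
        System.out.println(byteByByte(caron).equals("\uFFFD\uFFFD"));
    }
}
```

The third check is why changing the encoding alone is not enough: a multi-byte character has to be decoded from all of its bytes together.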

The fix is to specify a character encoding to use, a Unicode one, for
instance UTF-8.


You can do that by constructing your PrintStream this way:

new PrintStream(new TextAreaOutputStream(msg), true, "utf-8")

And implement your TextAreaOutputStream differently: it should store
the bytes in a buffer and wait until the OutputStream is flushed, at which
point it is probably aligned after a character's final byte, then transform
the bytes received into a String and update the text area with it.

This could be done by writing the bytes you receive to a
ByteArrayOutputStream, and whenever it is flushed, fetch the byte[] and
build a String with it as such:

new String(bytes, "utf-8")
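
A minimal sketch of that idea follows. To keep it runnable without a GUI it hands the decoded text to a Consumer<String> rather than a JTextArea; that sink, the class name, and the demo method are assumptions for illustration (Java 8 or later), and in the real program the sink would be something like textArea::append.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class BufferedTextOutputStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final Consumer<String> sink; // hypothetical target, e.g. textArea::append

    BufferedTextOutputStream(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public synchronized void write(int b) {
        buffer.write(b); // only collect bytes here, do not decode yet
    }

    @Override
    public synchronized void flush() {
        if (buffer.size() == 0) {
            return; // nothing new since the last flush
        }
        // Decode everything collected so far in one go, as UTF-8.
        String text = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        buffer.reset();
        sink.accept(text);
    }

    // Pushes a name through a UTF-8 PrintStream and returns what the sink received.
    static String demo() {
        StringBuilder shown = new StringBuilder();
        try {
            PrintStream out = new PrintStream(
                    new BufferedTextOutputStream(shown::append), true, "UTF-8");
            out.print("\u010C\u00ED\u017Eek");
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
        return shown.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo().equals("\u010C\u00ED\u017Eek"));
    }
}
```

Note that this still depends on flush() being called only between whole characters, which is the "probably" in the description above.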


Note: one might think that using UTF-16 instead of UTF-8 would guarantee
that every character is 2 bytes, making the solution easier to implement.
However, *really* special characters (those above U+FFFF) still take
4 bytes instead of 2 in UTF-16.
UCS-4 may work better if it is well supported; I'm not sure.
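
The byte counts can be checked with a short sketch like the one below. It uses UTF-16BE rather than plain UTF-16 so that no byte-order mark is added to the count, and it treats UTF-32BE (the usual Java name for a UCS-4 style encoding) as optional, since it is not among the charsets every JVM is required to support. The class name and the choice of U+1D11E as the sample character are just for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CodeUnitWidths {
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String caron = "\u010C";                             // inside the BMP
        String clef = new String(Character.toChars(0x1D11E)); // above U+FFFF

        // UTF-16: 2 bytes for U+010C, but 4 bytes (a surrogate pair) for U+1D11E.
        System.out.println(byteLength(caron, StandardCharsets.UTF_16BE));
        System.out.println(byteLength(clef, StandardCharsets.UTF_16BE));

        // UTF-8 is variable-width as well.
        System.out.println(byteLength(caron, StandardCharsets.UTF_8));
        System.out.println(byteLength(clef, StandardCharsets.UTF_8));

        // A fixed-width encoding, if this JVM provides it.
        if (Charset.isSupported("UTF-32BE")) {
            Charset utf32 = Charset.forName("UTF-32BE");
            System.out.println(byteLength(caron, utf32));
            System.out.println(byteLength(clef, utf32));
        }
    }
}
```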

--
Mayeul


Old 11-05-2009, 10:53 AM   #3
Roedy Green
 
Default Re: Redirecting System.out and exotic characters
On Wed, 4 Nov 2009 01:08:55 -0800 (PST), François R wrote:

> This works well except when I have a character like Č (latin capital
> letter C with caron, '\u010C') in a string, which is displayed as ? in
> the text area whereas
> msg.append(string); would be ok.
> How could I


The way I would do it is direct the output to a file using UTF-8
encoding, or at least an encoding that supports the letters you need.
Then view it in some sort of viewer/editor that understands encodings.

See http://mindprod.com/applet/fileio.html
for the code to set up a PrintWriter to a file.
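
For readers who do not want to follow the link, here is one possible way to set that up. It is not taken from the linked page; the class name, method names, and the use of a temporary file are assumptions for the sake of a self-contained example (Java 8 or later).

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8FileLog {
    // Writes one line to the file, encoding the characters as UTF-8.
    static void writeLine(Path file, String line) throws IOException {
        try (PrintWriter out = new PrintWriter(new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8))) {
            out.println(line);
        }
    }

    // Writes a line to a temporary file and reads it back as UTF-8.
    static String roundTrip(String line) {
        try {
            Path file = Files.createTempFile("utf8-log", ".txt");
            try {
                writeLine(file, line);
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = "\u010C\u00ED\u017Eek";
        System.out.println(roundTrip(name).startsWith(name));
    }
}
```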


--
Roedy Green Canadian Mind Products
http://mindprod.com

An example (complete and annotated) is worth 1000 lines of BNF.


Old 11-05-2009, 03:28 PM   #4
François R
 
Default Re: Redirecting System.out and exotic characters
On Nov 4, 1:25 pm, Mayeul <mayeul.marg...@free.fr> wrote:
> The fix is to specify a character encoding to use, a Unicode one, for
> instance UTF-8.
> [...]


Thanks a lot for the suggestion!
I tried this:

try {
    System.setOut(new PrintStream(new TextAreaOutputStream(msg), true, "utf-8"));
} catch ....

and

private class TextAreaOutputStream extends OutputStream {
    JTextArea textArea;
    // Collects raw bytes until the next flush().
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    TextAreaOutputStream(JTextArea textArea) {
        this.textArea = textArea;
    }

    public void flush() {
        try {
            // Decode everything buffered so far as UTF-8 and show it.
            textArea.append(buffer.toString("utf-8"));
            buffer.reset();
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }

    public void write(int b) {
        buffer.write(b);
    }
}

And it seems to work well, with names like Cížek or Čížek properly
displayed.
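
The class above relies on flush() arriving only between whole characters. If that ever turns out not to hold, one alternative (not something proposed in this thread) is to let a CharsetDecoder hold back the bytes of an unfinished character. The sketch below assumes Java 8 or later and, as a stand-in for the JTextArea, a hypothetical Consumer<String> sink such as textArea::append.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class DecodingOutputStream extends OutputStream {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
    private final ByteBuffer pending = ByteBuffer.allocate(16);
    private final CharBuffer decoded = CharBuffer.allocate(16);
    private final Consumer<String> sink; // hypothetical target, e.g. textArea::append

    DecodingOutputStream(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public synchronized void write(int b) {
        pending.put((byte) b);
        pending.flip();                          // read what has been collected
        decoder.decode(pending, decoded, false); // false: more input may follow
        pending.compact();                       // keep bytes of an unfinished character
        decoded.flip();
        if (decoded.hasRemaining()) {
            sink.accept(decoded.toString());     // only complete characters get here
        }
        decoded.clear();
    }

    // Feeds the UTF-8 bytes of a name one at a time and returns what the sink saw.
    static String demo() {
        StringBuilder shown = new StringBuilder();
        DecodingOutputStream out = new DecodingOutputStream(shown::append);
        for (byte b : "\u010C\u00ED\u017Eek".getBytes(StandardCharsets.UTF_8)) {
            out.write(b);
        }
        return shown.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo().equals("\u010C\u00ED\u017Eek"));
    }
}
```

This hands text to the sink one character at a time, which is simple but chatty; batching the output until flush() would be a reasonable refinement.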

François

