Convert UTF-8 encoded data from file into unicode escape

 
 
Fritz Bayer
      10-23-2004
Hi,

I'm looking for a little program that reads UTF-8 data from a file
and writes it as Unicode escapes into another text file.

Why am I looking for something like this? Well, I have a file that
contains UTF-8 encoded data.

I would like to build that data statically into my program, so I would
like to copy and paste it into a String constant (String
="\uxxxx\uxxxx...").

However, since I can't just open up a viewer and copy and paste the
contents (of course), I would have to convert it into Unicode escapes.

Then I could copy and paste those escape codes into my program. I
thought that there must be some source code / program around that
does that?

Fritz
 
Alex Kizub
      10-23-2004
Fritz:
Converting UTF-8 to Unicode is a job for Java's own localization support.
So let Java do what it is supposed to do.

public class a {
    public static void main(String[] a) throws Exception {
        java.text.DecimalFormat f = new java.text.DecimalFormat();
        f.applyPattern("\\u0000");

        java.io.FileReader fr = new java.io.FileReader("a.java");
        while (fr.ready()) {
            System.out.println(f.format(fr.read()));
        }
        fr.close();
    }
}

Alex Kizub.

 
Fritz Bayer
      10-24-2004
Thank you, Alex. I'm experiencing a small problem, though. Some of the
escapes look like:

\u65533

i.e. they are too long. I also noticed that none of the escapes contain
hexadecimal digits, which seems wrong, since Unicode escapes are written
in hex.
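
A short sketch of what is probably going on here: the "\\u0000" pattern in DecimalFormat pads with decimal digits, and 65533 in decimal is 0xFFFD in hex. U+FFFD is the Unicode replacement character, which decoders emit for bytes they cannot map, so its appearance may mean the file was not decoded as UTF-8 (that cause is an assumption, not something the thread confirms). The class and method names below are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class EscapeFormatDemo {
    // Formats one UTF-16 code unit as backslash, 'u', and exactly four hex digits.
    static String toEscape(int codeUnit) {
        return String.format("\\u%04x", codeUnit);
    }

    public static void main(String[] args) {
        int codeUnit = 65533; // the value seen in the thread, written in decimal

        // Decimal digits after the prefix: five digits, so not a valid escape.
        System.out.println("\\u" + codeUnit);

        // The same value in hex: 65533 == 0xFFFD.
        System.out.println(toEscape(codeUnit));

        // Illustration only: the byte 0xFF never occurs in valid UTF-8,
        // so the decoder substitutes the replacement character U+FFFD.
        String decoded = new String(new byte[] {(byte) 0xFF}, StandardCharsets.UTF_8);
        System.out.println(toEscape(decoded.charAt(0)));
    }
}
```

Run as written, the first line printed is the too-long form \u65533 and the next two are both \ufffd.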
 
Alex Kizub
      10-24-2004
My apologies. Of course it should be hex numbers.
It's hard to think in the middle of the night.
But it's obvious, and you could have done the hex conversion yourself.

Here is one solution.
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;

public class a {
    public static void main(String[] args) throws Exception {
        // Decode explicitly as UTF-8; FileReader would use the platform default charset.
        Reader in = new InputStreamReader(new FileInputStream("a.java"), "UTF-8");
        try {
            int c;
            // read() returns one UTF-16 code unit (0 to 0xFFFF), or -1 at end of input.
            while ((c = in.read()) != -1) {
                String hex = Integer.toHexString(c);
                // Left-pad with zeros to the four hex digits an escape requires.
                System.out.print("\\u" + "0000".substring(hex.length()) + hex);
            }
            System.out.println();
        } finally {
            in.close();
        }
    }
}

Alex Kizub.
 
Michael Borgwardt
      10-24-2004
Fritz Bayer wrote:
> I'm looking for a little program, which reads utf-8 data from a file
> and writes it in the form of unicode escape into another text file.


Sun distributes that program with its JDK/SDK. It's called
"native2ascii".
 
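For reference, the tool is normally invoked as "native2ascii -encoding UTF-8 input.txt output.txt". The Java below is only a rough sketch of the same idea, not the tool's actual source: it assumes the input really is UTF-8, passes 7-bit ASCII through unchanged, and writes every other UTF-16 code unit as a four-digit hex escape. The class name, method name, and command-line arguments are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class Utf8ToEscapes {
    // Keeps 7-bit ASCII as-is and writes every other UTF-16 code unit
    // as backslash, 'u', and four hex digits.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Utf8ToEscapes <utf8-input> <ascii-output>");
            return;
        }
        // Decode the whole input file as UTF-8 (assumed encoding).
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        // The escaped result contains only ASCII, so US-ASCII is a safe output encoding.
        Files.write(Paths.get(args[1]), escapeNonAscii(text).getBytes(StandardCharsets.US_ASCII));
    }
}
```

Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), which is also how they have to be written in Java source.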
 
Fritz Bayer
      10-25-2004
Thanks for the tip. I looked at it, and there are just two issues with
the program. The first is that it does not encode all ASCII characters
as Unicode escapes.

So if non-printable characters occur, I will not be able to copy and
paste them into my source code. That's why I'm looking for something
that converts everything.

The second issue I ran into is that even if I copy and paste only Unicode
escapes into a string, I still have to escape some of the characters,
for example the ". This seems very cumbersome. I guess I would also
have to escape newlines, tabs and so on to be sure that everything
gets imported correctly. Oh my...
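
One detail makes "convert everything" slightly trickier than it sounds: the Java compiler translates Unicode escapes before it splits the source into tokens, so writing a double quote, a backslash, a line feed, or a carriage return as a Unicode escape inside a literal behaves as if the raw character had been typed there. Those four need ordinary backslash escapes instead. A minimal sketch under that assumption follows; the class and method names are invented for illustration, and printable ASCII is left readable rather than escaped.

```java
final class JavaLiteralEscaper {
    // Turns arbitrary text into the body of a Java string literal.
    static String toLiteralBody(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                // These must use backslash escapes, not Unicode escapes.
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c >= 0x20 && c < 0x7f) {
                        // Printable ASCII stays as-is.
                        out.append(c);
                    } else {
                        // Other control characters and all non-ASCII code units.
                        out.append(String.format("\\u%04x", (int) c));
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String sample = "say \"hi\"\n\tbye";
        System.out.println("String s = \"" + toLiteralBody(sample) + "\";");
    }
}
```

The main method prints a line that can be pasted straight into source: String s = "say \"hi\"\n\tbye";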
 