Need help reading UTF-16 files ...

 
 
nnimod@gmail.com
      01-13-2006
Hi. I'm having trouble reading some Unicode files. Basically, I have to
parse certain files. Some of those files are being input in Japanese,
Chinese etc. The easiest way, I figured, to distinguish between plain
ASCII files I receive and the Unicode ones would be to check if the
first two bytes read 0xFFFE.

But nothing I do seems to be able to do that.

I tried reading it in binary mode and reading two characters in:

FILE *fin; char ch [2];
fin.open (filename, "rb");
if (fin) { fopen (ch, sizeof (char), 2, fin); ......

I tried reading it in binary mode and read a wchar_t in:

FILE *fin; wchar_t wch;
fin.open (filename, "rb");
if (fin) { fopen (&wch, sizeof (wchar_t), 1, fin); ....

I tried using ifstream for two characters/wifstream for wchar_t but to
no avail.

All of them seem to skip the so-called byte-order mark. I am quite
lost for ideas. I saw a few examples using the MFC class CStdioFile etc.,
but I don't want to use those. I'm sure there's a perfectly simple
method to do this.

Sorry about the long message for such a simple problem, but it is getting
quite frustrating. Any help would be very much appreciated.

Cheers,
Nemo.

PS. I know the mark is there. I viewed the files using a hex editor.

 
P.J. Plauger
      01-13-2006
<(E-Mail Removed)> wrote in message
news:(E-Mail Removed) oups.com...

> Hi. I'm having trouble reading some Unicode files. Basically, I have to
> parse certain files. Some of those files are being input in Japanese,
> Chinese etc. The easiest way, I figured, to distinguish between plain
> ASCII files I receive and the Unicode ones would be to check if the
> first two bytes read 0xFFFE.
>
> But nothing I do seems to be able to do that.
>
> I tried reading it in binary mode and reading two characters in:
>
> FILE *fin; char ch [2];
> fin.open (filename, "rb");
> if (fin) { fopen (ch, sizeof (char), 2, fin); ......
>
> I tried reading it in binary mode and read a wchar_t in:
>
> FILE *fin; wchar_t wch;
> fin.open (filename, "rb");
> if (fin) { fopen (&wch, sizeof (wchar_t), 1, fin); ....
>
> I tried using ifstream for two characters/wifstream for wchar_t but to
> no avail.
>
> All of them seem to skip the so-called byte-order mark. I am quite
> lost for ideas. I saw a few examples using MFC Class CStdioFile etc.
> but I don't want to use those. I'm sure there's a perfectly simple
> method to do this.


See our CoreX library, at our web site. It has exactly what you need.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


 
Richard Herring
      01-17-2006
In message <(E-Mail Removed) .com>,
(E-Mail Removed) writes
>Hi. I'm having trouble reading some Unicode files. Basically, I have to
>parse certain files. Some of those files are being input in Japanese,
>Chinese etc. The easiest way, I figured, to distinguish between plain
>ASCII files I receive and the Unicode ones would be to check if the
>first two bytes read 0xFFFE.
>
>But nothing I do seems to be able to do that.
>
>I tried reading it in binary mode and reading two characters in:
>
>FILE *fin; char ch [2];
>fin.open (filename, "rb");
>if (fin) { fopen (ch, sizeof (char), 2, fin); ......


Try posting the *actual* code that causes the problem. The above is
clearly not it.

--
Richard Herring
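
For anyone landing on this thread with the same problem: the quoted snippet calls fopen where fread was presumably intended, and it reads into plain char, which may be signed, so a comparison such as ch[0] == 0xFF can never be true on those platforms. Below is a minimal sketch of a byte-order mark check, assuming the file is opened in binary mode and the bytes are read as unsigned char. The names Bom, detect_bom and detect_bom_in_file are hypothetical and do not come from the thread. On disk a UTF-16 little-endian file starts with the bytes FF FE and a big-endian one with FE FF.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical classification of the first bytes of a file.
enum class Bom { None, Utf16LE, Utf16BE };

// Inspect up to the first two bytes of a buffer.
// unsigned char is used so that 0xFF and 0xFE compare as expected;
// with a signed char the value 0xFF would be read back as -1.
Bom detect_bom(const unsigned char* buf, std::size_t n) {
    if (n >= 2) {
        if (buf[0] == 0xFF && buf[1] == 0xFE) return Bom::Utf16LE;
        if (buf[0] == 0xFE && buf[1] == 0xFF) return Bom::Utf16BE;
    }
    return Bom::None;  // no UTF-16 BOM found (e.g. plain ASCII)
}

// Open a file in binary mode and classify its first two bytes.
// Returns Bom::None if the file cannot be opened or is too short.
Bom detect_bom_in_file(const char* filename) {
    std::FILE* fin = std::fopen(filename, "rb");
    if (!fin) return Bom::None;

    unsigned char buf[2] = {0, 0};
    std::size_t n = std::fread(buf, 1, sizeof buf, fin);
    std::fclose(fin);

    return detect_bom(buf, n);
}
```

Calling detect_bom_in_file(filename) and branching on the result would separate the UTF-16 inputs from the rest. Note that a file without a BOM is not necessarily ASCII, and that a UTF-32 little-endian BOM also begins with FF FE, so this check alone is only a first filter.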
 