ANSI/UTF-8 File when save string to it

 
 
DDD
      02-14-2011
Hi,
I have a question about character encoding and file storage formats.

// "xaM=" is the base64 encoding of the Chinese character '牛'.
// The following code produces a UTF-8 text file on XP,
// and the file shows a ţ.
char *decodedText = PL_Base64Decode("xaM=", 4, nsnull);

FILE *fp1;
fp1=fopen("test.txt", "ab");
fwrite(decodedText, sizeof(char), strlen(decodedText), fp1);
fputc('\n', fp1);
fclose(fp1);

// "uaTX98Wj" is the base64 encoding of the Chinese characters "工作牛".
// The following code produces an ANSI text file on XP,
// and the file shows "工作牛".
char *decodedText1 = PL_Base64Decode("uaTX98Wj", 8, nsnull);

FILE *fp11;
fp11=fopen("test1.txt", "ab");
fwrite(decodedText1, sizeof(char), strlen(decodedText1), fp11);
fputc('\n', fp11);
fclose(fp11);

So, what causes the fwrite function to choose a different file storage
format, such as UTF-8 or ANSI, on Windows?

Thanks in advance.
 
 
 
 
 
Jens Thoms Toerring
      02-14-2011
DDD <(E-Mail Removed)> wrote:
> Hi,
> I have a question about character encoding and file storage formats.

> // "xaM=" is the base64 encoding of the Chinese character '牛'.
> // The following code produces a UTF-8 text file on XP,
> // and the file shows a ţ.
> char *decodedText = PL_Base64Decode("xaM=", 4, nsnull);

> FILE *fp1;
> fp1=fopen("test.txt", "ab");
> fwrite(decodedText, sizeof(char), strlen(decodedText), fp1);
> fputc('\n', fp1);
> fclose(fp1);

> // "uaTX98Wj" is the base64 encoding of the Chinese characters "工作牛".
> // The following code produces an ANSI text file on XP,
> // and the file shows "工作牛".
> char *decodedText1 = PL_Base64Decode("uaTX98Wj", 8, nsnull);

> FILE *fp11;
> fp11=fopen("test1.txt", "ab");
> fwrite(decodedText1, sizeof(char), strlen(decodedText1), fp11);
> fputc('\n', fp11);
> fclose(fp11);

> So, what causes the fwrite function to choose a different file storage
> format, such as UTF-8 or ANSI, on Windows?


Nothing at all (and that holds for Windows and any other operating
system). fwrite() faithfully writes the content of memory into a
file and doesn't care a bit what those data are. If you want some
external tool (e.g. the one you use to view the file) to recognize
its content as UTF-8, then you must make sure that the data you pass
to fwrite() have the correct form; fwrite() won't change them in any
way. Same for ASCII.
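
A minimal sketch of that point, assuming nothing beyond standard C: the hypothetical helper roundtrip_is_identical() (not from the thread) writes a buffer with fwrite() in binary mode, reads it back, and compares the two byte for byte. No encoding is chosen anywhere along the way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write len bytes to path in binary mode, read them
 * back, and report whether the file holds exactly the same bytes.
 * Returns 1 if identical, 0 if different, -1 on I/O error. */
int roundtrip_is_identical(const char *path, const unsigned char *data, size_t len)
{
    unsigned char back[64];   /* large enough for this small demonstration */
    FILE *fp;
    size_t got;

    assert(path != NULL && data != NULL && len <= sizeof back);

    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    if (fwrite(data, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    got = fread(back, 1, sizeof back, fp);
    fclose(fp);

    /* Same length and same content: fwrite() stored the bytes unchanged. */
    return got == len && memcmp(back, data, len) == 0;
}
```

Calling it with the two bytes 0xC5 0xA3 (what "xaM=" decodes to) should report an identical round trip; whatever a viewer later displays is the viewer's interpretation of those bytes, not something fwrite() decided.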

Since you seem to set up the memory you write out with fwrite()
using some function named PL_Base64Decode() it boils down to
what this function is doing and what data you pass to it. But
this isn't a standard C function but probably from a third-party
library, so you will rather likely get better answers to that
question in a support forum for that library.

On the other hand you write: "xaM= is base64 codes of Chinese
character '牛'". But it's only a representation of that character
in a certain encoding system. Since it gets interpreted, after
having been "decoded" and written out to a file, as UTF-8, it rather
likely is the UTF-8 representation of that character. Now I'm not
an expert on Chinese at all (those characters do not even show up
with my newsreader), but if I remember correctly there are several
encodings for Chinese characters in use. Perhaps the 'uaTX98Wj' you
give for the other character is the base64 code in some other
encoding system than UTF-8 that the tool you use to view the file
doesn't know about. And it may tell you that it's an ASCII text file
due to some faulty heuristics it applies to determine the file
content type (it can be very difficult to get it right with only a
few bytes in a file).
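
For reference, the two base64 strings decode to the bytes C5 A3 and B9 A4 D7 F7 C5 A3. The first pair happens to be a well-formed UTF-8 sequence (it encodes U+0163, ţ, which matches what DDD saw), while the second sequence is not well-formed UTF-8, since B9 cannot start a character. Both would be consistent with a GB2312/GBK encoding of the Chinese text, and with a viewer that guesses UTF-8 only when the bytes allow it; how the viewer actually decides is an assumption here. The hypothetical function is_valid_utf8() below is a sketch of such a well-formedness check, not the code of any particular tool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if the len bytes at s are well-formed
 * UTF-8, otherwise 0. */
int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char c = s[i];
        size_t n;           /* continuation bytes expected after c */
        unsigned long cp;   /* code point being decoded */
        size_t k;

        if (c < 0x80)                    { n = 0; cp = c; }
        else if (c >= 0xC2 && c <= 0xDF) { n = 1; cp = c & 0x1F; }
        else if (c >= 0xE0 && c <= 0xEF) { n = 2; cp = c & 0x0F; }
        else if (c >= 0xF0 && c <= 0xF4) { n = 3; cp = c & 0x07; }
        else return 0;      /* stray continuation byte or invalid lead byte */

        if (len - i - 1 < n)
            return 0;       /* sequence is cut off at the end of the buffer */

        for (k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;   /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        if (n == 2 && cp < 0x800)
            return 0;       /* overlong three-byte form */
        if (n == 3 && (cp < 0x10000UL || cp > 0x10FFFFUL))
            return 0;       /* overlong or out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;       /* surrogates are not valid in UTF-8 */

        i += n + 1;
    }
    return 1;
}
```

With the bytes from the thread, is_valid_utf8() accepts C5 A3 and rejects B9 A4 D7 F7 C5 A3, so a byte-based guess can label the two files differently even though the same code wrote both.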
Regards, Jens
--
\ Jens Thoms Toerring ___ (E-Mail Removed)
\__________________________ http://toerring.de
 
 
 
 
 
Francois Grieu
      02-14-2011
I've found this useful and readable:
The Absolute Minimum Every Software Developer Absolutely,
Positively Must Know About Unicode and Character Sets
(No Excuses!) by Joel Spolsky
<http://www.joelonsoftware.com/articles/Unicode.html>

Francois Grieu
 
 
Keith Thompson
      02-14-2011
(E-Mail Removed) (Jens Thoms Toerring) writes:
> DDD <(E-Mail Removed)> wrote:

[...]
>> So, what will cause fwrite function to chose different file store
>> format, such as UTF-8 or ANSI in windows?
>
> Nothing at all (and that holds for Windows and any other operating
> system). fwrite() faithfully writes the content of memory into a
> file and doesn't care a bit what those data are.


If the file is opened in text mode, it will perform whatever
binary-to-text translations are appropriate. For Unix-like systems,
typically this does nothing; for Windows-like systems, it typically just
translates '\n' characters to CRLF pairs.
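
A small sketch of the difference, using only standard C: the hypothetical helper stored_size_of_newline() writes a single '\n' with the given fopen() mode and then counts the bytes that actually ended up in the file. Note that the code in the original post opens its files with "ab", i.e. binary mode, so no such translation applies there.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write one '\n' to path using the given fopen()
 * mode ("w" for text, "wb" for binary), then reopen the file in binary
 * mode and count the bytes stored.
 * Returns the byte count, or -1 on I/O error. */
long stored_size_of_newline(const char *path, const char *mode)
{
    FILE *fp;
    long count = 0;

    assert(path != NULL && mode != NULL);

    fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    if (fputc('\n', fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    /* Read back in binary mode so that nothing is translated on input. */
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        count++;
    fclose(fp);

    return count;
}
```

In binary mode the result is 1 everywhere. In text mode it is typically 1 on Unix-like systems and 2 on Windows, where the '\n' is stored as a CR LF pair.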

[...]

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
Jens Thoms Toerring
      02-15-2011
Keith Thompson <(E-Mail Removed)> wrote:
> (E-Mail Removed) (Jens Thoms Toerring) writes:
> > DDD <(E-Mail Removed)> wrote:

> [...]
> >> So, what will cause fwrite function to chose different file store
> >> format, such as UTF-8 or ANSI in windows?
> >
> > Nothing at all (and that holds for Windows and any other operating
> > system). fwrite() faithfully writes the content of memory into a
> > file and doesn't care a bit what those data are.


> If the file is opened in text mode, it will perform whatever
> binary-to-text translations are appropriate. For Unix-like systems,
> typically this does nothing; for Windows-like systems, it typically just
> translates '\n' characters to CRLF pairs.


Thanks, I forgot about that (I probably have to do some serious
Windows programming and get bitten by it to make it stick).

Regards, Jens
--
\ Jens Thoms Toerring ___ (E-Mail Removed)
\__________________________ http://toerring.de
 