ansi c compiler character encoding

 
 
Andreas Lundgren
      08-18-2008
Hi!

Is it specified by the C standard that a compiler always encodes characters
with the same character encoding? If, for example, the functions Foo and
Bar are compiled by different compilers, is it unambiguous how to
interpret the character string in Bar?

Does string.h expect a specific string format?

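void Bar(char *inp); /* declared before its first use in Foo */

void Foo(void)
{
    /* 10 characters plus the terminating null: fits only if each
       character is encoded as a single byte */
    char myTextString[11] = "stuvxyzåäö";
    Bar(myTextString);
}

void Bar(char *inp)
{
    /* What character set to expect? */
}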
 
James Kuyper
      08-18-2008
Andreas Lundgren wrote:
> Hi!
>
> Is it specified by the C standard that a compiler always encodes characters
> with the same character encoding?


No.
 
Andreas Lundgren
      08-18-2008
On 18 Aug, 14:07, James Kuyper <jameskuy...@verizon.net> wrote:
> Andreas Lundgren wrote:
> > Hi!

>
> > Is it specified by the C standard that a compiler always encodes characters
> > with the same character encoding?

>
> No.


****.

/Andreas
 
Keith Thompson
      08-18-2008
Andreas Lundgren <> writes:
> Is it specified by the C standard that a compiler always encodes characters
> with the same character encoding? If, for example, the functions Foo and
> Bar are compiled by different compilers, is it unambiguous how to
> interpret the character string in Bar?
>
> Does string.h expect a specific string format?
>
> void Foo(void)
> {
> char myTextString[11] = "stuvxyzåäö";
> Bar(myTextString);
> }
>
> void Bar(char* inp)
> {
> What character set to expect?
> }


No.

But if the two compilers are being used on the same system, it's very
likely that they'll use the same encoding. Since you're calling one
function from the other, presumably you're using the compilers on the
same system and linking the resulting code into a single executable or
equivalent.

Typically a given operating system will impose representations for
certain things. Though this is outside the scope of the C standard,
it's in the best interest of compiler writers to make their generated
code work and play well with that of other compilers. (For example, a
C compiler for Linux that generates code that's incompatible with code
generated by gcc wouldn't be very useful.)

This goes far beyond character set issues and includes things like
integer and floating-point type representations and function calling
conventions.
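
As a rough illustration of the point that such representations come from the platform rather than from the C standard, the following sketch reports a few of them for whatever compiler builds it. The function names exec_charset_looks_ascii and report_platform_choices are made up for this example, and the handful of characters tested is only a spot check, not proof that the execution character set is ASCII-compatible.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if a few basic characters have their
   ASCII values in this implementation's execution character set. */
int exec_charset_looks_ascii(void)
{
    return 'A' == 0x41 && 'a' == 0x61 && '0' == 0x30 && ' ' == 0x20;
}

/* Hypothetical helper: prints a few properties that the platform and
   compiler choose, and that the C standard leaves open. */
void report_platform_choices(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "basic characters have ASCII values: %s\n",
            exec_charset_looks_ascii() ? "yes" : "no");
    fprintf(out, "sizeof(int)=%zu sizeof(long)=%zu sizeof(void *)=%zu\n",
            sizeof(int), sizeof(long), sizeof(void *));
}
```

Calling report_platform_choices(stdout) from programs built with two different compilers on the same system is one quick way to spot a disagreement, though matching output does not by itself show that the two are link-compatible.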

Your later followup suggests that you're concerned about some
real-world situation, presumably on some specific system. You should
ask in a newsgroup that deals with that system.

--
Keith Thompson (The_Other_Keith) kst- <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Jens Thoms Toerring
      08-18-2008
Keith Thompson <kst-> wrote:
> Andreas Lundgren <> writes:
> > Is it specified by the C standard that a compiler always encodes characters
> > with the same character encoding? If, for example, the functions Foo and
> > Bar are compiled by different compilers, is it unambiguous how to
> > interpret the character string in Bar?
> >
> > Does string.h expect a specific string format?
> >
> > void Foo(void)
> > {
> > char myTextString[11] = "stuvxyzåäö";
> > Bar(myTextString);
> > }
> >
> > void Bar(char* inp)
> > {
> > What character set to expect?
> > }


> No.


> But if the two compilers are being used on the same system, it's very
> likely that they'll use the same encoding. Since you're calling one
> function from the other, presumably you're using the compilers on the
> same system and linking the resulting code into a single executable or
> equivalent.


Is it actually a question about the compiler at all? As far as
I can see the compiler will happily create a string literal with
whatever there is in the string, not caring a bit about the
encoding of the string. I guess the problem is much more one of
how the source files are generated and the expectations of the
output medium.

Consider the case of using one editor for the first file, set
to output files in e.g. one of the different (and incompatible)
Russian extended ASCII code pages, and the second file generated
with another editor, set to output in a different encoding.
Even if you use the same compiler this should lead to trouble.
And if then the terminal that receives the output of the program
is set to a third encoding it becomes a complete mess.
Regards, Jens
--
\ Jens Thoms Toerring ___
\__________________________ http://toerring.de
 
Daniel Molina Wegener
      08-18-2008
On Aug 18, 7:48 am, Andreas Lundgren <d99...@efd.lth.se> wrote:
> Hi!
>
> Is it specified by the C standard that a compiler always encodes characters
> with the same character encoding? If, for example, the functions Foo and
> Bar are compiled by different compilers, is it unambiguous how to
> interpret the character string in Bar?


No, it does not depend on the compiler...

>
> Does string.h expect a specific string format?
>
> void Foo(void)
> {
> char myTextString[11] = "stuvxyzåäö";


Here, instead of char, try wchar_t and
related functions if you are using Unicode
for your messages and your .c files.

> Bar(myTextString);
>
> }
>
> void Bar(char* inp)
> {
> What character set to expect?


That depends on the user environment, but if the
user environment is using Unicode, you can expect no
more than an array of bytes; the other case is with
wchar_t and related functions...

>
> }


Regards,
DMW
 
jameskuyper@verizon.net
      08-18-2008
Daniel Molina Wegener wrote:
> On Aug 18, 7:48 am, Andreas Lundgren <d99...@efd.lth.se> wrote:
> > Hi!
> >
> > Is it specified by the C standard that a compiler always encodes characters
> > with the same character encoding? If, for example, the functions Foo and
> > Bar are compiled by different compilers, is it unambiguous how to
> > interpret the character string in Bar?

>
> No, it does not depend on the compiler...
>
> >
> > Does string.h expect a specific string format?
> >
> > void Foo(void)
> > {
> > char myTextString[11] = "stuvxyzåäö";

>
> Here, instead of char, try wchar_t and
> related functions if you are using Unicode
> for your messages and your .c files.


Whether or not wchar_t has anything to do with Unicode depends upon
the compiler; the standard makes no such requirement. When it does,
the way in which you can take advantage of that fact depends upon the
compiler as well.
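
One way a program can ask the implementation about this is the macro __STDC_ISO_10646__, which C99 and later define only when wchar_t values are ISO/IEC 10646 (Unicode) code points. The sketch below assumes a C99 or later compiler; wchar_is_unicode and wide_len are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Returns 1 if the implementation promises that wchar_t values are
   ISO/IEC 10646 code points, 0 if it makes no such promise. */
int wchar_is_unicode(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Length of a wide string in wchar_t elements, not in bytes. */
size_t wide_len(const wchar_t *ws)
{
    assert(ws != NULL);
    return wcslen(ws);
}
```

A result of 0 only means the implementation makes no promise; some compilers use a Unicode encoding for wchar_t without defining the macro. Note also that sizeof(L"abc") is 4 * sizeof(wchar_t), so the size of a wide string in bytes varies between implementations even when wide_len gives the same answer.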
 
Flash Gordon
      08-18-2008
Daniel Molina Wegener wrote, On 18/08/08 18:29:
> On Aug 18, 7:48 am, Andreas Lundgren <d99...@efd.lth.se> wrote:
>> Hi!
>>
>> Is it specified by the C standard that a compiler always encodes characters
>> with the same character encoding? If, for example, the functions Foo and
>> Bar are compiled by different compilers, is it unambiguous how to
>> interpret the character string in Bar?

>
> No, it does not depend on the compiler...


You are wrong. See the replies others posted before you for details.

>> Does string.h expect a specific string format?
>>
>> void Foo(void)
>> {
>> char myTextString[11] = "stuvxyzåäö";

>
> Here, instead of char, try wchar_t and
> related functions if you are using Unicode
> for your messages and your .c files.
>
>> Bar(myTextString);
>>
>> }
>>
>> void Bar(char* inp)
>> {
>> What character set to expect?

>
> That depends on the user environment,


Wrong. It depends on what the function is written to expect and
(assuming the function expects a simple C string, which is likely) on
the encoding the implementation expects.

Actually, the expected encodings for standard C library functions which
handle strings and characters can be changed at run-time using the
setlocale() function, so it could also depend on what the program has
done before calling this function.

> but if the
> user environment is using Unicode, you can expect no
> more than an array of bytes;


Not necessarily.

> the other case is with
> wchar_t and related functions...


For a start, an array of wchar_t is not simply an array of bytes.

>> }

--
Flash Gordon
 
Dik T. Winter
      08-21-2008
In article <d8838200-350c-4f92-b9d2-> Andreas Lundgren <> writes:
> A simple example may be the letter Ö that in ASCII is represented by
> the number 153, but in ISO-8859-1 and Unicode is represented by the
> number 214.


That letter is not represented in ASCII. ASCII contains the code points
0 to 127, no more.
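
The 0 to 127 limit is easy to check in code. The sketch below uses names invented for this example; the byte values are Ö as encoded in IBM code page 437 (where it is 153, which is presumably where the number in the quoted message comes from), in ISO-8859-1, and in UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every one of the n bytes at p is in the ASCII range
   0..127, and 0 otherwise. */
int is_all_ascii(const unsigned char *p, size_t n)
{
    size_t i;
    assert(p != NULL || n == 0);
    for (i = 0; i < n; ++i)
        if (p[i] > 127)
            return 0;
    return 1;
}

/* The letter Ö (U+00D6) in three encodings; none of them is ASCII. */
const unsigned char o_umlaut_cp437[]  = { 0x99 };       /* 153 */
const unsigned char o_umlaut_latin1[] = { 0xD6 };       /* 214 */
const unsigned char o_umlaut_utf8[]   = { 0xC3, 0x96 }; /* two bytes */
```

is_all_ascii returns 0 for all three arrays and 1 for the bytes of "stuvxyz", which is why only the unaccented part of the original string is unambiguous across these encodings.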
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
 