Multi-character constants

 
 
Mirco Wahab
      07-09-2008
After reading through some (open) Intel (CPU detection)
C++ source (http://www.intel.com/cd/ids/develope...eng/276611.htm)
I stumbled upon a sketchy use of multibyte characters:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

unsigned int VendorID[3] = {0, 0, 0};
try // If CPUID instruction is supported
{
    ...
}
catch (...)
{
    ...
}
return (
    (VendorID[0] == 'uneG') &&
    (VendorID[1] == 'Ieni') &&
    (VendorID[2] == 'letn')
);

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

This seems to work; gcc 4.2 emits a warning:

"warning: multi-character character constant"

and Visual C++ 9 says nothing at all.

What's the matter with multibyte characters now?
I haven't used them, and would be glad to learn
whether they are widely implemented and part of
the standard, now or soon.

gcc tells us: (http://gcc.gnu.org/onlinedocs/gcc/Ch...mentation.html)
...
[Characters]
...
The value of a wide character constant containing more than
one multibyte character, or containing a multibyte character
or escape sequence not represented in the extended execution
character set (C90 6.1.3.4, C99 6.4.4.4).
...



Regards, and thanks for clearing this up

M.
 
 
 
 
 
James Kanze
 
      07-10-2008
On Jul 9, 4:29 pm, Mirco Wahab wrote:
> After reading through some (open) Intel (CPU detection)
> C++ source (http://www.intel.com/cd/ids/develope...eng/276611.htm)
> I stumbled upon a sketchy use of multibyte characters


> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> unsigned int VendorID[3] = {0, 0, 0};
> try // If CPUID instruction is supported
> {
>     ...
> }
> catch (...)
> {
>     ...
> }
> return (
>     (VendorID[0] == 'uneG') &&
>     (VendorID[1] == 'Ieni') &&
>     (VendorID[2] == 'letn')
> );
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


> This seems to work, gcc 4.2 emits a warning:


> "warning: multi-character character constant"


> and Visual C++ 9 says nothing at all.


> What's the matter with multibyte characters now?


First, do you mean multi-byte characters (e.g. UTF-8), or
multicharacter literals? Your example doesn't contain any
multi-byte characters, only multicharacter literals.

> I haven't used them, and would be glad to learn whether they
> are widely implemented and part of the standard, now or soon.


Multicharacter literals are a holdover from the original C. As
far as I can tell, they have no use, and are of no interest
whatsoever. And what they mean is implementation defined. All
of which is probably why g++ warns about them.

Multi-byte characters are becoming more and more frequent as
applications shift to UTF-8, for reasons of
internationalization. True support is still spotty, but getting
there; the next version of the standard will require it (to some
degree---there still won't be functions like isdigit which work
on them).
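
To make the distinction concrete, here is a minimal sketch of what "multi-byte" means in practice: a single character encoded in UTF-8 can occupy several char elements. The helper name utf8_code_points is made up for this illustration, and it assumes the input is already valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a standard function): counts UTF-8 code points
// by skipping continuation bytes, i.e. bytes of the form 10xxxxxx.
// Assumption: the input is valid, NUL-terminated UTF-8.
inline std::size_t utf8_code_points(const char* s)
{
    std::size_t count = 0;
    for (; *s != '\0'; ++s) {
        if ((static_cast<unsigned char>(*s) & 0xC0u) != 0x80u) {
            ++count;  // lead byte or plain ASCII byte starts a new character
        }
    }
    return count;
}

// Usage check: "été" written with explicit UTF-8 bytes for the two 'é'.
inline void check_utf8_code_points()
{
    const char* word = "\xC3\xA9" "t" "\xC3\xA9";
    assert(std::strlen(word) == 5);        // five bytes...
    assert(utf8_code_points(word) == 3);   // ...but only three characters
}
```

Byte-oriented functions such as strlen or isdigit see the five bytes, not the three characters, which is the gap in support mentioned above.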

> gcc tells us: (http://gcc.gnu.org/onlinedocs/gcc/Ch...mentation.html)
> ...
> [Characters]
> ...
> The value of a wide character constant containing more than
> one multibyte character, or containing a multibyte character
> or escape sequence not represented in the extended execution
> character set (C90 6.1.3.4, C99 6.4.4.4).
> ...


Implementation defined behavior is required to be documented by
the implementation. In this case, you've cut the only
significant bit, a link to the implementation defined behavior,
where you'll find:

The compiler values a multi-character character constant
a character at a time, shifting the previous value left
by the number of bits per target character, and then
or-ing in the bit-pattern of the new character truncated
to the width of a target character. The final
bit-pattern is given type int, and is therefore signed,
regardless of whether single characters are signed or
not (a slight change from versions 3.1 and earlier of
GCC). If there are more characters in the constant than
would fit in the target int the compiler issues a
warning, and the excess leading characters are ignored.

For example, 'ab' for a target with an 8-bit char would
be interpreted as `(int) ((unsigned char) 'a' * 256 +
(unsigned char) 'b')', and '\234a' as `(int) ((unsigned
char) '\234' * 256 + (unsigned char) 'a')'.

(Technically, this documentation only applies to C, I think.
But I would be very surprised if C++ did differently.)

But since this is implementation defined, the above is only
valid for gcc (although it does seem to be a frequent behavior).
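
The rule quoted from the gcc documentation can be sketched as an ordinary function, which makes the resulting values easy to check without using a multicharacter literal at all. This is only an illustration of the documented rule, assuming an 8-bit char, a 32-bit int and an ASCII execution character set; gcc_style_value is a name invented here, not anything gcc provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: folds characters the way the gcc manual describes,
// assuming 8-bit characters and a 32-bit target int.
inline std::uint32_t gcc_style_value(const char* chars, std::size_t count)
{
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Shift the previous value left by one character width, then OR in
        // the new character truncated to 8 bits. With more than four
        // characters, the leading ones are shifted out and lost.
        value = (value << 8) | static_cast<unsigned char>(chars[i]);
    }
    return value;
}

// Usage check; the expected values assume ASCII.
inline void check_gcc_style_value()
{
    assert(gcc_style_value("ab", 2) == 0x6162u);          // 'a' * 256 + 'b'
    assert(gcc_style_value("\234a", 2) == 0x9C61u);       // '\234' * 256 + 'a'
    assert(gcc_style_value("uneG", 4) == 0x756E6547u);    // first word of the vendor test
}
```

Under this rule 'uneG' is 0x756E6547, whose bytes on a little-endian machine are laid out in memory as "Genu", which is presumably why the literals in the Intel code look reversed.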

--
James Kanze (GABI Software)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
 
 
 
 
 
James Kanze
 
      07-10-2008
On Jul 9, 4:39 pm, Victor Bazarov wrote:
> Mirco Wahab wrote:


[...]
> > gcc tells us:
> > (http://gcc.gnu.org/onlinedocs/gcc/Ch...mentation.html)
> > ...
> > [Characters]
> > ...
> > The value of a wide character constant containing more than
> > one multibyte character, or containing a multibyte character
> > or escape sequence not represented in the extended execution
> > character set (C90 6.1.3.4, C99 6.4.4.4).
> > ...


> They are part of C++ since before the first Standard, IIRC.
> The problem with them, however, is that the order of the bytes
> in memory depends on the endianness of the system (or other
> factors). Also, they don't have the type 'char', they have
> the type 'int' and their representation is
> implementation-defined (see [lex.ccon]/1).


They were part of K&R C, where a character literal always had
type int. Even in C, however, the only place I've seen them
used was for generating the "magic" for certain types of files
in very early Unix. (Presumably, the author of the code "knew"
what his compiler did.) They're one of those misfeatures which
we can't get rid of for reasons of backwards compatibility.
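
For a check like the vendor test quoted at the start of the thread, one way to avoid relying on what a particular compiler does with multicharacter literals is to unpack the three words byte by byte and compare against an ordinary string. What follows is only a sketch: is_genuine_intel is a made-up name, not part of the Intel source, and it assumes the three words hold EBX, EDX and ECX in that order, with the first character of each group in the low byte.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: compares three vendor-ID words against the string
// "GenuineIntel". Assumption: the words are EBX, EDX, ECX in that order.
inline bool is_genuine_intel(const std::uint32_t vendor_id[3])
{
    unsigned char text[12];
    for (int word = 0; word < 3; ++word) {
        for (int byte = 0; byte < 4; ++byte) {
            // Extract the bytes lowest first using shifts, so the result
            // does not depend on the endianness of the host.
            text[word * 4 + byte] = static_cast<unsigned char>(
                (vendor_id[word] >> (8 * byte)) & 0xFFu);
        }
    }
    return std::memcmp(text, "GenuineIntel", 12) == 0;
}

// Usage check with the values 'uneG', 'Ieni', 'letn' have under the gcc
// rule (ASCII assumed), and with the words for "AuthenticAMD".
inline void check_is_genuine_intel()
{
    const std::uint32_t intel[3] = {0x756E6547u, 0x49656E69u, 0x6C65746Eu};
    const std::uint32_t amd[3]   = {0x68747541u, 0x69746E65u, 0x444D4163u};
    assert(is_genuine_intel(intel));
    assert(!is_genuine_intel(amd));
}
```

The comparison is done on bytes extracted by shifting rather than by copying the words' memory, so neither the compiler's treatment of multicharacter literals nor the byte order of the machine running the check matters.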

 
 
 
 