On Aug 22, 6:40 pm, Öö Tiib <oot...@hot.ee> wrote:
> On Wednesday, August 22, 2012 7:35:57 PM UTC+3, Richard Smith wrote:
> > I recently encountered some C++ code that made use of multicharacter
> > literals -- that is, something that looks like a character literal,
> > but contains more than one character:
>
> >   int i = 'foo';
>
> > I must admit, I hadn't realised that C++ still allowed these and had
> > assumed they went the way of implicit int and K&R-style function
> > declarations. The standard tells me that, unsurprisingly, their
> > representation is implementation-defined (and so does the C standard), so
> > my questions here are not about what the standard requires (nor
> > whether I should be using them), but rather what implementations
> > commonly do and why.
>
> ...
>
> That you further found out. There are really no other reasons to use it (and have never been) but to confuse the heck out of a novice maintainer.

Well, clearly that's not true. Compiler writers don't decide to add
functionality simply "to confuse the heck out of a novice
maintainer."

Multicharacter literals go back to the late 1960s in the B language; C
inherited them from B, and C++ from C. It's easy to see why they
existed in B. For one thing, there was no char type: everything was
an int, even if you only cared about the lowest 8 bits. Optimising
for code size was also far more important than today, and if you were
used to writing in assembler, you'd be used to putting small strings
as immediates. If you look in the B manual, you'll see examples of
multicharacter literals used in this way: effectively, optimised very
short strings.

However, GCC's (perfectly legal) implementation choices don't allow
that usage. As you point out, compiler writers don't break
compatibility with old code for no reason, yet here, somewhere along
the line, a compiler vendor evidently decided to implement
multicharacter literals in a way that broke their use as small
strings. It would have been trivial to have implemented them on a
little-endian machine so that they worked as short strings. So I can
only assume there was some other use of multicharacter literals that
was more important to keep working. I am curious as to what that
other, more important use was.
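
To make the difference concrete, here is a minimal sketch of the two packing orders. It assumes 8-bit chars and a 32-bit integer, and it follows GCC's documented rule for multicharacter constants (shift the previous value left by one character width, then OR in the next character). The helper names pack_gcc_style, pack_first_char_low, bytes_as_string and is_little_endian are my own, made up for illustration; nothing here is taken from a compiler's sources.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// GCC-style packing: the FIRST character ends up in the most significant
// occupied byte, so "foo" becomes ('f' << 16) | ('o' << 8) | 'o'.
std::uint32_t pack_gcc_style(const char* s) {
    std::uint32_t v = 0;
    for (; *s; ++s)
        v = (v << 8) | static_cast<unsigned char>(*s);
    return v;
}

// Alternative packing: the first character goes in the LOWEST byte, which
// is the first byte in memory on a little-endian machine.
std::uint32_t pack_first_char_low(const char* s) {
    std::uint32_t v = 0;
    for (int shift = 0; *s && shift < 32; ++s, shift += 8)
        v |= static_cast<std::uint32_t>(static_cast<unsigned char>(*s)) << shift;
    return v;
}

// Reinterpret the in-memory bytes of v as a NUL-terminated string.
std::string bytes_as_string(std::uint32_t v) {
    char buf[sizeof v + 1] = {};        // extra byte guarantees termination
    std::memcpy(buf, &v, sizeof v);
    return std::string(buf);
}

// True when the lowest-order byte of an integer is stored first in memory.
bool is_little_endian() {
    std::uint32_t one = 1;
    unsigned char first;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// The numeric results are independent of byte order ('f' is 0x66 and
// 'o' is 0x6F in ASCII); only the in-memory view depends on it.
void demo_checks() {
    assert(pack_gcc_style("foo") == 0x666F6Fu);
    assert(pack_first_char_low("foo") == 0x6F6F66u);
}
```

On a little-endian machine, bytes_as_string(pack_first_char_low("foo")) reads back as "foo", whereas the GCC-style value reads back as "oof". That reversal is what I mean by the current scheme not working for short strings.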
Richard