# integral types

James Kanze

 03-29-2007
On Mar 29, 2:51 am, SasQ wrote:
> On Sun, 25 Mar 2007 01:57:59 +0100, SasQ wrote:

> Thanks to everyone for the explanations. They have cleared up
> a couple of things for me. Now I'll try to summarize:

> Machine word:  sizeof(long int):
> 8-bit          4                  Emulated as four 8-bit registers.
> 16-bit         4                  Emulated as two 16-bit registers.
> 32-bit         4                  Doesn't have to be emulated.
> 64-bit         4 or 8?            Doesn't have to be emulated.

You're still limiting yourself too much. C++ has nothing to say
about the size of a machine word. All it guarantees is that
char has at least 8 bits (but there have been implementations
with 9, 10 and 32 bit chars, at least), that int is at least 16
bits (I've seen 16, 32, 36 and 48---and 24 wouldn't surprise me
for some machines I've heard of), that long is at least 32 bits,
and the next standard will also require a long long of at least
64 bits.

In addition, you are guaranteed that the size of any type is a
positive integral value, and that:

sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

Note that if char is 32 bits, all of the integral types can have
a size of 1.
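
The guarantees listed above can be written down as compile-time checks. What follows is only a minimal sketch, assuming a C++11 compiler (static_assert, constexpr and long long are not part of the current standard); the helper names storage_bits, value_bits and check_integral_guarantees are invented for this illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <limits>

// Bits of storage occupied by T. This counts padding bits too, if any.
template <typename T>
constexpr std::size_t storage_bits()
{
    return sizeof(T) * CHAR_BIT;
}

// Bits that take part in the value of T, counting the sign bit if present.
template <typename T>
constexpr int value_bits()
{
    return std::numeric_limits<T>::digits
        + (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// Minimum widths.
static_assert(CHAR_BIT >= 8, "char has at least 8 bits");
static_assert(value_bits<short>() >= 16, "short has at least 16 bits");
static_assert(value_bits<int>() >= 16, "int has at least 16 bits");
static_assert(value_bits<long>() >= 32, "long has at least 32 bits");
static_assert(value_bits<long long>() >= 64, "long long has at least 64 bits");

// Sizes are positive integers, measured in units of char, and ordered.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(char) <= sizeof(short)
                  && sizeof(short) <= sizeof(int)
                  && sizeof(int) <= sizeof(long),
              "each type is at least as large as the previous one");

// A run-time spot check of the same properties, convenient to call from a test.
void check_integral_guarantees()
{
    assert(storage_bits<char>() == static_cast<std::size_t>(CHAR_BIT));
    assert(storage_bits<long>() >= 32);
}
```

If the translation unit compiles, the implementation meets every guarantee listed. Nothing in it assumes a particular word size, so it should also hold on a machine where char has 32 bits and every sizeof is 1.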

> Now I have doubts only about 64-bit machines, where 'long int'
> may be that 'at least 32 bits', but it can also be more.
> So if it can, should it or not?

All of the 64 bit machines I use have a 64 bit long. It seems
the most natural.

> Next, we have 'short int', which the Standard requires to be
> at least 16 bits. So, let's look:

> Machine word:  sizeof(short int):
> 8-bit          2                   Emulated as two 8-bit registers.
> 16-bit         2                   Doesn't have to be emulated.
> 32-bit         2                   Doesn't have to be emulated.
> 64-bit         2                   Doesn't have to be emulated.

> Here, I have some doubts on 64-bit platform. On 32-bit
> 'short' is mostly 16-bit and no more, because it should be
> able to be shorter than plain 'int'.

That's not required. On word addressed machines (e.g. Unisys
2200), it's probably the same as int. In fact, on a Unisys
2200, I would expect short, int and long all to have 36 bits.

> On 64-bit platforms
> it could be more, if plain 'int' were 64-bit. But I don't
> know how it is there, and I've seen only one particular
> case, where plain 'int' has still 32 bits, so the 'short
> int' has to be 16-bit.

There's no "has to" about it. In practice, on a byte addressed
64 bit machine, the vendor will probably want to offer access to
all natively supported lengths. Since there are four of those
(8, 16, 32 and 64 bits) and only four integral types (char, short,
int and long), there is only one solution.

> And now we come to the plain 'int' type.
> The Standard requires it to be at least as large as 'short int',
> and defines it as the type most convenient for integer arithmetic
> on the particular platform. So, if we apply the same rules as
> for 'long int' [with the emulation], we would get:

> Machine word:  sizeof(int):
> 8-bit          2??           Emulated as two 8-bit registers??
> 16-bit         2             Doesn't have to be emulated.
> 32-bit         4             Doesn't have to be emulated.
> 64-bit         4 or 8?       Doesn't have to be emulated.

> I don't think emulating 'int' as two 8-bit registers to be
> the most convenient for the 8-bit platform to compute on
> integers

It's more convenient than using even more bytes, and it is the
least the standard allows.

> Even if 16-bit platforms could emulate the C++
> Standard's rules and feel good about it, for 8-bit machines
> there is something wrong, I think. Something that was
> missed by the creators of the Standard, or [more probably]
> by me :/ So what is the thing I am missing here?

That people wanted C to be useful, and so defined a generally
useful set of rules for the period. (One could easily argue
today that an int should be required to be at least 32 bits.)
C++ just took over these rules.

> I think I know the theory [C++ Standard] but I don't know
> how to apply it in practice.

You apply it in practice by first deciding what your goals are.
If your code targets desktop computers or larger, for example,
it's perfectly reasonable to assume that an int is at least 32
bits. If your code makes extensive use of the Windows API, for
its GUI, you might as well assume that int is 32 bits, and 2's
complement to boot. If you think that your code might have to
run on embedded systems, or on mainframes, or legacy systems,
then you'll have to be a lot more careful. Still, for most
code, all you have to worry about is the maximum and minimum
values you need to handle.
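
Worrying about the maximum and minimum values can be made explicit with std::numeric_limits. This is a rough sketch only, again assuming a C++11 compiler for long long and static_assert; fits_in and checked_cast are hypothetical helper names, and the range check is limited to integral types whose whole range fits in a long long.

```cpp
#include <cassert>
#include <limits>

// True if 'value' can be represented in the integral type T.
// Restricted to types with no more value bits than long long, so that
// T's minimum and maximum can themselves be converted to long long safely.
template <typename T>
bool fits_in(long long value)
{
    static_assert(std::numeric_limits<T>::is_integer
                      && std::numeric_limits<T>::digits
                             <= std::numeric_limits<long long>::digits,
                  "T must be an integral type no wider than long long");
    return value >= static_cast<long long>(std::numeric_limits<T>::min())
        && value <= static_cast<long long>(std::numeric_limits<T>::max());
}

// Converts to T, asserting (in debug builds) that nothing is lost.
template <typename T>
T checked_cast(long long value)
{
    assert(fits_in<T>(value));
    return static_cast<T>(value);
}
```

For example, fits_in<int>(32767) is true on every conforming implementation, whereas whether fits_in<int>(100000) is true depends on whether int has more than 16 bits; that is exactly the question to settle when deciding which platforms the code has to support.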

--
James Kanze (GABI Software)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

SasQ

 03-30-2007
On Thu, 29 Mar 2007 02:44:27 -0700, James Kanze wrote:

> C++ has nothing to say about the size of a machine word.
> All it guarantees is that char has at least 8 bits
> (but there have been implementations with 9, 10 and 32 bit
> chars, at least), that int is at least 16 bits
> (I've seen 16, 32, 36 and 48---and 24 wouldn't surprise me
> for some machines I've heard of), that long is at least 32
> bits, and the next standard will also require a long long of
> at least 64 bits.

Is it sure for now?
Would there be 'long long long int' in a future then?

And where do you know that from? I had trouble obtaining
a copy of the C++ Standard document [it isn't freely available on
the Internet the way W3C standards are, so I had to borrow a copy
from a friend], and even more trouble finding any information about
the plans for the new C++0x Standard.

> In addition, you are guaranteed that the size of any type is a
> positive integral value

Good to know that it can't be negative or a fraction.
I haven't seen a negative size in my life yet.

> and that:
>
> sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

Yeah, that much I already knew.

> Note that if char is 32 bits, all of the integral types can have
> a size of 1.

Interesting note.

>> Now I have doubts only about 64-bit machines, where 'long int'
>> may be that 'at least 32 bits', but it can also be more.
>> So if it can, should it or not?

>
> All of the 64 bit machines I use have a 64 bit long.
> It seems the most natural.

For me too. But apparently not for Microsoft: their compiler uses
32 bits for 'long int' on 64-bit architectures and the nonstandard
[as of the current C++ Standard] 'long long int' for 64-bit integers.
I know the Standard doesn't require 'long int' to be the
largest size on a particular platform, so a 32-bit 'long int'
is acceptable [though stupid, in my view]. But using 'long long int'
for 64 bits and giving the finger to the Standard when 'long int'
could be used for that is a mess, IMHO.

>> Here, I have some doubts on 64-bit platform. On 32-bit
>> 'short' is mostly 16-bit and no more, because it should be
>> able to be shorter than plain 'int'.

>
> That's not required. On word addressed machines (e.g. Unisys
> 2200), it's probably the same as int. In fact, on a Unisys
> 2200, I would expect short, int and long all to have 36 bits.

Yes, I thought '<=' but wrote '<' ;P My bad.

>> I don't think emulating 'int' as two 8-bit registers to be
>> the most convenient for the 8-bit platform to compute on
>> integers

>
> It's more convenient than using even more bytes, and it is the
> least the standard allows.

OK, I've found the following example:
http://www.z88dk.org/old/zcc.html#compdatat
and it explained a lot to me.
It seems that "emulation" is the way to implement the
rules of the Standard in practice on non-32-bit platforms.

--
SasQ

Erik Wikström

 03-30-2007
On 30 Mar, 02:31, SasQ wrote:
> On Thu, 29 Mar 2007 02:44:27 -0700, James Kanze wrote:
>
> > C++ has nothing to say about the size of a machine word.
> > All it guarantees is that char has at least 8 bits
> > (but there have been implementations with 9, 10 and 32 bit
> > chars, at least), that int is at least 16 bits
> > (I've seen 16, 32, 36 and 48---and 24 wouldn't surprise me
> > for some machines I've heard of), that long is at least 32
> > bits, and the next standard will also require a long long of
> > at least 64 bits.

>
> Is it sure for now?
> Would there be 'long long long int' in a future then?

You can get a copy of the working document for the next standard from
the C++ Standard Committee's page [1]. There you can also read
the proposals for new features and other information. If you do not
have the standard available, you can always try to get a copy of the
draft closest to it.

Anyway, on this neat little list [2] you can see that the long long
type is part of the current working document and will thus be in the
next standard.

1. http://www.open-std.org/jtc1/sc22/wg21/
2. http://www.open-std.org/jtc1/sc22/wg...2006/n2122.htm

--
Erik Wikström

James Kanze

 03-30-2007
On Mar 30, 2:31 am, SasQ wrote:
> On Thu, 29 Mar 2007 02:44:27 -0700, James Kanze wrote:

> > C++ has nothing to say about the size of a machine word.
> > All it guarantees is that char has at least 8 bits
> > (but there have been implementations with 9, 10 and 32 bit
> > chars, at least), that int is at least 16 bits
> > (I've seen 16, 32, 36 and 48---and 24 wouldn't surprise me
> > for some machines I've heard of), that long is at least 32
> > bits, and the next standard will also require a long long of
> > at least 64 bits.

> Is it sure for now?

About as sure as anything can be. C99 has it. C++ adopted it,
partially at least on the grounds of C compatibility. Formally,
the next version of the standard hasn't been adopted, and the
committee could vote to remove it. Practically, I'd say that
the probability of that happening is about as close to 0 as you
can get.

> Would there be 'long long long int' in a future then?

I doubt it. The C committee adopted long long, but recognized that
it was not the right solution, because it doesn't scale, so
they also provided a more generic solution for future expansion
(which has also been adopted by the C++ committee).
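
Presumably the generic solution meant here is the family of typedefs in C99's <stdint.h> (in C++, <cstdint>), where the required width is part of the name instead of being expressed by piling up "long"s. A small sketch, assuming a compiler that already ships <cstdint> and static_assert; width_of and add_samples are names made up for the example.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Storage width of T in bits.
template <typename T>
constexpr int width_of()
{
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// int_leastN_t: the smallest type with at least N bits. Always available.
static_assert(width_of<std::int_least16_t>() >= 16, "at least 16 bits");
static_assert(width_of<std::int_least32_t>() >= 32, "at least 32 bits");
static_assert(width_of<std::int_least64_t>() >= 64, "at least 64 bits");

// int_fastN_t: the "fastest" type with at least N bits. Always available.
static_assert(width_of<std::int_fast32_t>() >= 32, "at least 32 bits");

// intmax_t: can hold any value of any other signed integer type.
static_assert(INTMAX_MAX >= LLONG_MAX, "intmax_t covers long long");

// Hypothetical use: a running total that needs at least 32 bits,
// fed with samples that need at least 16.
std::int_least32_t add_samples(std::int_least32_t total,
                               std::int_least16_t sample)
{
    // Refuse to overflow the total (checked in debug builds only).
    assert(sample >= 0 ? total <= INT_LEAST32_MAX - sample
                       : total >= INT_LEAST32_MIN - sample);
    return static_cast<std::int_least32_t>(total + sample);
}
```

The exact-width types such as int32_t are optional: an implementation that has no type of exactly that width does not define them, which is why the sketch sticks to the "least" and "fast" variants.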

> And where do you know that from?

I'm a technical expert with the French national body (AFNOR),
and participate in the standardization effort.

[...]
> >> Now I have doubts only about 64-bit machines, where 'long int'
> >> may be that 'at least 32 bits', but it can also be more.
> >> So if it can, should it or not?

> > All of the 64 bit machines I use have a 64 bit long.
> > It seems the most natural.

> For me too. But apparently not for Microsoft: their compiler uses
> 32 bits for 'long int' on 64-bit architectures and the nonstandard
> [as of the current C++ Standard] 'long long int' for 64-bit integers.

Every C++ compiler I know accepts long long today. It is part
of the C standard, after all.

> I know the Standard doesn't require 'long int' to be the
> largest size on a particular platform,

It requires long to be the largest integral size.

This led to extensive debate in the C committee. There is a
lot of C code out there doing things like:

    size_t s ;
    printf( "%lu", (unsigned long)s ) ;

unsigned long is used, of course, because it was guaranteed that
no larger integral type existed.

The adoption of long long broke such code. The C committee
was not happy about that, but felt, in the end, that they didn't
have much choice.

In the next version of the C++ standard, there will be no
requirement that long (or even long long) be the largest
integral type available. A system with 128 bit words could
still define long long to be a 64 bit type, and define an
int128_t as well. In the future, if you want the largest
integral type available, you will have to use intmax_t (defined
in <stdint.h>, and in C++ <cstdint>). To write something like
the above, you would have to include <stdint.h> and <inttypes.h>,
and write:

    size_t s ;
    printf( "%ju", (uintmax_t)s ) ;

or:

    size_t s ;
    printf( "%zu", s ) ;
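
Both forms can be tried out side by side. The sketch below assumes a library that implements the C99 length modifiers 'j' and 'z' (not every current C++ implementation does); the function names format_size_zu and format_size_ju are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a size_t directly with the 'z' length modifier.
std::string format_size_zu(std::size_t s)
{
    char buf[64];
    const int n = std::snprintf(buf, sizeof buf, "%zu", s);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf);
}

// Formats a size_t by converting it to the widest unsigned type first
// and using the 'j' length modifier.
std::string format_size_ju(std::size_t s)
{
    char buf[64];
    const int n = std::snprintf(buf, sizeof buf, "%ju",
                                static_cast<std::uintmax_t>(s));
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf);
}
```

Neither function mentions unsigned long, so neither depends on long being the largest integral type.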

> so a 32-bit 'long int'
> is acceptable [though stupid, in my view]. But using 'long long int'
> for 64 bits and giving the finger to the Standard when 'long int'
> could be used for that is a mess, IMHO.

I don't think it a particularly wise decision either. But
vendors don't like to break user code, even when it is already
"broken"; presumably, they fear that there is some user code
which depends on long being 32 bits.
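
As a purely hypothetical illustration of such a dependency (not taken from any real code base): arithmetic that is meant to wrap at 2^32 silently stops wrapping when unsigned long grows to 64 bits, unless the wrap is spelled out with a mask.

```cpp
#include <cassert>

// Depends on the width of long: wraps at 2^32 only if unsigned long
// happens to have exactly 32 bits.
unsigned long add_wrapping(unsigned long a, unsigned long b)
{
    return a + b;
}

// Independent of the width of long: the mask makes the intended
// modulo 2^32 arithmetic explicit.
unsigned long add_mod_2_32(unsigned long a, unsigned long b)
{
    // Inputs are expected to be 32-bit quantities already.
    assert(a <= 0xFFFFFFFFUL && b <= 0xFFFFFFFFUL);
    return (a + b) & 0xFFFFFFFFUL;
}
```

With a 32 bit long the two functions agree for all 32-bit inputs; with a 64 bit long, add_wrapping(0xFFFFFFFFUL, 1UL) yields 4294967296 while add_mod_2_32 still yields 0.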
