On what does size of data types depend?

 
 
Skarmander
10-05-2005
Alexei A. Frounze wrote:
<snip>
>>(I remember seeing a bug while a program was being ported: A function
>>took an argument of type long, and the value passed was "- sizeof
>>(short)". The function received a value of 65534 which was a bit
>>unexpected. And yes, the compiler was right. )

>
>
> C is wonderful in this respect. Perhaps because of this Java AFAIK has no
> unsigned types.
>

Which, incidentally, is a spectacular misfeature when they then call the
8-bit signed type "byte". Either you keep promoting things back and
forth to integers, or you use two's complement (which Java conveniently
guarantees). Either way it's annoying. Consistency isn't everything.

S.
 
 
 
 
 
Alexei A. Frounze
10-06-2005
"Skarmander" <(E-Mail Removed)> wrote in message
news:434455c6$0$11078$(E-Mail Removed)4all.nl...
> Alexei A. Frounze wrote:
> >>(I remember seeing a bug while a program was being ported: A function
> >>took an argument of type long, and the value passed was "- sizeof
> >>(short)". The function received a value of 65534 which was a bit
> >>unexpected. And yes, the compiler was right. )

> >
> > C is wonderful in this respect. Perhaps because of this Java AFAIK has no
> > unsigned types.
> >

> Which, incidentally, is a spectacular misfeature when they then call the
> 8-bit signed type "byte". Either you keep promoting things back and
> forth to integers, or you use two's complement (which Java conveniently
> guarantees). Either way it's annoying. Consistency isn't everything.


But you know, there are different kinds and levels of consistency. In
certain places I'd like C to behave more like math (e.g. signed vs unsigned,
promotions and related things), or to be more humane and straightforward
(e.g. the way the type of a variable is specified in a declaration/definition),
etc. I'm not saying Java or C is definitely better, no. Each has its good
sides and bad sides, and there's always room for improvement, not
necessarily big or very important, but good enough to be considered and
desired...

Alex


 
 
 
 
 
Eric Sosman
10-06-2005


Alexei A. Frounze wrote on 10/05/05 18:30:
> "Christian Bau" wrote:
> ...
>
>>printf("short: %lu\n", (unsigned long) sizeof(short));
>>
>>will work as long as a short is fewer than four billion bytes

>
>
> Correct
>
>
>>(I remember seeing a bug while a program was being ported: A function
>>took an argument of type long, and the value passed was "- sizeof
>>(short)". The function received a value of 65534 which was a bit
>>unexpected. And yes, the compiler was right. )

>
>
> C is wonderful in this respect. Perhaps because of this Java AFAIK has no
> unsigned types.


Java has two unsigned types (one of which might be
better termed "signless"). IMHO, it would be better if
it had three.
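
A minimal sketch tying the two quoted points together: the cast needed to
print a sizeof result, and why negating one gave 65534 (the take_long
function and the 16-bit size_t platform are assumptions for illustration,
matching the behaviour described above):

#include <stdio.h>

static void take_long(long v)          /* hypothetical function taking long */
{
    printf("received: %ld\n", v);
}

int main(void)
{
    /* sizeof yields an unsigned size_t, so cast before printing with %lu
       (C99 code could use %zu instead). */
    printf("short: %lu\n", (unsigned long) sizeof(short));

    /* Negating an unsigned value wraps instead of going negative: on a
       platform with 16-bit size_t and a 2-byte short, -sizeof(short) is
       65534, and that value then converts to long unchanged. */
    take_long(-sizeof(short));

    return 0;
}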


 
 
Jack Klein
10-07-2005
On 05 Oct 2005 14:14:03 +0100, John Devereux wrote in comp.lang.c:

> "Sunil" <(E-Mail Removed)> writes:
>
> > Hi all,
> >
> > I am using gcc compiler in linux.I compiled a small program
> > int main()
> > {
> > printf("char : %d\n",sizeof(char));
> > printf("unsigned char : %d\n",sizeof(unsigned char));
> > printf("short : %d\n",sizeof(short));

>
> <SNIP>
>
> This brings to mind something that I have wondered about.
>
> I often see advice elsewhere, and in other peoples programs,
> suggesting hiding all C "fundamental" types behind typedefs such as
>
> typedef char CHAR;
> typedef int INT32;
> typedef unsigned int UINT32;


The first one is useless. The second two are worse than useless; they are
dangerous, because on another machine int might have only 16 bits and INT32
might need to be a signed long.

> typedef char* PCHAR;


This one is more dangerous still: never typedef a pointer this way, at
least not if the pointer will ever be dereferenced using that alias.
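
One well-known pitfall with such pointer typedefs (not necessarily the one
meant here) is how const interacts with the alias; a small sketch:

#include <stdio.h>

typedef char *PCHAR;

int main(void)
{
    char buf[] = "hello";

    /* With the typedef, const binds to the pointer, not the chars:
       'const PCHAR' means 'char *const', not 'const char *', so the
       alias gives no way to say "pointer to read-only char". */
    const PCHAR p = buf;   /* the pointer itself is const...      */
    p[0] = 'H';            /* ...but the chars are still writable */

    puts(buf);             /* prints "Hello" */
    return 0;
}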

> The theory is that application code which always uses these typedefs
> will be more likely to run on multiple systems (provided the typedefs
> are changed of course).


More than theory, very real fact.

> I used to do this. Then I found out that C99 defined things like
> "uint32_t", so I started using these versions instead. But after
> following this group for a while I now find even these ugly and don't
> use them unless unavoidable.


Nobody says you have to care about portability if you don't want to.
That's between you, your bosses, and your users. If you are writing a
program for your own use, the only one you ever have to answer to is
yourself.

On the other hand, both UNIXy and Windows platforms are having the
same problems with the transition from 32 to 64 bits that they had
moving from 16 to 32 bits, if perhaps not quite so extreme.

For more than a decade, the natural integer type and native machine
word on Windows has been called a DWORD, and on 64 bit Windows the
native machine word is going to be a QWORD.

> What do people here think is best?


On one embedded project we had CAN communications between the main
processor, a 32-bit ARM, and slave processors that were 16/32 bit
DSPs.

The only types that were identical between the two were signed and
unsigned short, and signed and unsigned long. In fact, here are the
different integer types for the two platforms:

Type            32-bit ARM        16/32-bit DSP
'plain' char    unsigned 8-bit    signed 16-bit
signed char     signed 8-bit      signed 16-bit
unsigned char   unsigned 8-bit    unsigned 16-bit
signed short    signed 16-bit     signed 16-bit
unsigned short  unsigned 16-bit   unsigned 16-bit
signed int      signed 32-bit     signed 16-bit
unsigned int    unsigned 32-bit   unsigned 16-bit
signed long     signed 32-bit     signed 32-bit
unsigned long   unsigned 32-bit   unsigned 32-bit

Both processors had hardware alignment requirements. The 32-bit
processor can only access 16-bit data at an even address and 32-bit
data on an address divisible by four. The penalty for misaligned
access is a hardware trap. The DSP only addresses memory in 16-bit
words, so there is no misalignment possible for anything but long, and
they had to be aligned on an even address (32-bit alignment). The
penalty for misaligned access is just wrong data (read), or
overwriting the wrong addresses (write).

Now the drivers for the CAN controller hardware are completely
off-topic here, but the end result on both systems is two 32-bit words
in memory containing the 0 to 8 octets (0 to 64 bits) of packet data.
These octets can represent any quantity of 8-bit, signed or unsigned
16-bit, or 32-bit data values that can fit in 64 bits, and have any
alignment.

So your mission, Mr. Phelps, if you decide to accept it, is to write
code that will run on both processors despite their different
character sizes and alignment requirements, that can use a format
specifier to parse 1 to 8 octets into the proper types with the proper
values.

The code I wrote runs on both processors with no modifications. And I
couldn't even use 'uint8_t', since the DSP doesn't have an 8-bit type.
I used 'uint_least8_t' instead.
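
Not the actual project code, but a rough sketch of that kind of unpacking,
using uint_least8_t so it compiles even where CHAR_BIT is 16; the buffer
layout and function names are assumptions for illustration:

#include <stdint.h>
#include <stdio.h>

/* Extract octet i (0..7) from two 32-bit words holding up to 8 octets of
   CAN data, low octet of word[0] first.  uint_least8_t is used because a
   DSP with 16-bit chars has no uint8_t; masking to 8 bits keeps the
   result identical on both platforms. */
static uint_least8_t get_octet(const uint_least32_t word[2], unsigned i)
{
    return (uint_least8_t)((word[i / 4] >> (8 * (i % 4))) & 0xFFu);
}

/* Reassemble a little-endian unsigned 16-bit value from octets n and n+1. */
static uint_least16_t get_u16(const uint_least32_t word[2], unsigned n)
{
    return (uint_least16_t)(get_octet(word, n) |
                            ((unsigned)get_octet(word, n + 1) << 8));
}

int main(void)
{
    uint_least32_t packet[2] = { 0x04030201UL, 0x08070605UL };

    printf("octet 0 : %u\n", (unsigned) get_octet(packet, 0)); /* 1 */
    printf("u16 at 2: %u\n", (unsigned) get_u16(packet, 2));   /* 0x0403 = 1027 */
    return 0;
}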

As for the C99 choice of type definitions like 'uint8_t' and so on,
they are not the best I have ever seen, but they are also far from the
worst. And they have the advantage of being in a C standard, so with
a little luck they will eventually edge out all the others.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
 
 
John Devereux
10-07-2005
Jack Klein writes:

> On 05 Oct 2005 14:14:03 +0100, John Devereux wrote in comp.lang.c:
>
> > This brings to mind something that I have wondered about.
> >
> > I often see advice elsewhere, and in other peoples programs,
> > suggesting hiding all C "fundamental" types behind typedefs such as
> >
> > typedef char CHAR;
> > typedef int INT32;
> > typedef unsigned int UINT32;

>
> The first one is useless, the second two are worse than useless, they
> are dangerous, because on another machine int might have only 16 bits
> and INT32 might need to be a signed long.


Perhaps I was not clear; the typedefs go in a single "portability"
header file and are specific to the machine. E.g.

#ifdef __X86
typedef short int INT16;
....
#endif
#ifdef __AVR
typedef int INT16;
....
#endif

(made up examples)

It should be understood that this file will need to be changed for
each new machine, but that hopefully nothing else will. By using
UINT32 etc throughout, nothing needs to change except this one file.

> typedef char* PCHAR;
>
> This is more dangerous yes, never typedef a pointer this way. At
> least not if the pointer will ever be dereferenced using that alias.
>
> > The theory is that application code which always uses these typedefs
> > will be more likely to run on multiple systems (provided the typedefs
> > are changed of course).

>
> More than theory, very real fact.


So that would make them a good thing? Sorry if I miss the point; you
seem to be saying they are "worse than useless" but do improve
portability?

> > I used to do this. Then I found out that C99 defined things like
> > "uint32_t", so I started using these versions instead. But after
> > following this group for a while I now find even these ugly and don't
> > use them unless unavoidable.

>
> Nobody says you have to care about portability if you don't want to.
> That's between you, your bosses, and your users. If you are writing a
> program for your own use, the only one you ever have to answer to is
> yourself.


I don't really care about portability to the extent sometimes apparent
on CLC. For example, I am quite happy to restrict myself to two's
complement machines. However the idea of writing code the right way
once, rather than the wrong way lots of times, does appeal! I am
starting to see real productivity benefits from my attempts to do this
in my work.

> On the other hand, both UNIXy and Windows platforms are having the
> same problems with the transition from 32 to 64 bits that they had
> moving from 16 to 32 bits, if perhaps not quite so extreme.
>
> For more than a decade, the natural integer type and native machine
> word on Windows has been called a DWORD, and on 64 bit Windows the
> native machine word is going to be a QWORD.


I had to write a fairly simple Windows program last week, and it was
horrible. All those WORDs, DWORDs, LPCSTRs, HPARAMs, LPARAMs etc. I
think that experience was what prompted my post.

> > What do people here think is best?

>
> On one embedded project we had CAN communications between the main
> processor, a 32-bit ARM, and slave processors that were 16/32 bit
> DSPs.
>
> The only types that were identical between the two were signed and
> unsigned short, and signed and unsigned long. In fact, here are the
> different integer types for the two platforms:
>
> Type            32-bit ARM        16/32-bit DSP
> 'plain' char    unsigned 8-bit    signed 16-bit
> signed char     signed 8-bit      signed 16-bit
> unsigned char   unsigned 8-bit    unsigned 16-bit
> signed short    signed 16-bit     signed 16-bit
> unsigned short  unsigned 16-bit   unsigned 16-bit
> signed int      signed 32-bit     signed 16-bit
> unsigned int    unsigned 32-bit   unsigned 16-bit
> signed long     signed 32-bit     signed 32-bit
> unsigned long   unsigned 32-bit   unsigned 32-bit
>
> Both processors had hardware alignment requirements. The 32-bit
> processor can only access 16-bit data at an even address and 32-bit
> data on an address divisible by four. The penalty for misaligned
> access is a hardware trap. The DSP only addresses memory in 16-bit
> words, so there is no misalignment possible for anything but long, and
> they had to be aligned on an even address (32-bit alignment). The
> penalty for misaligned access is just wrong data (read), or
> overwriting the wrong addresses (write).
>
> Now the drivers for the CAN controller hardware are completely
> off-topic here, but the end result on both systems is two 32-bit words
> in memory containing the 0 to 8 octets (0 to 64 bits) of packet data.
> These octets can represent any quantity of 8-bit, signed or unsigned
> 16-bit, or 32-bit data values that can fit in 64 bits, and have any
> alignment.
>
> So your mission, Mr. Phelps, if you decide to accept it, is to write
> code that will run on both processors despite their different
> character sizes and alignment requirements, that can use a format
> specifier to parse 1 to 8 octets into the proper types with the proper
> values.
>
> The code I wrote runs on both processors with no modifications. And I
> couldn't even use 'uint8_t', since the DSP doesn't have an 8-bit type.
> I used 'uint_least8_t' instead.
>
> As for the C99 choice of type definitions like 'uint8_t' and so on,
> they are not the best I have ever seen, but they are also far from the
> worst. And they have the advantage of being in a C standard, so with
> a little luck they will eventually edge out all the others.


Thanks for the detailed discussion. I have been working on somewhat
similar programming tasks recently, implementing Modbus on a PC and two
embedded systems. I must be getting better; the generic Modbus code I
wrote for the (8-bit) AVR system did compile and run fine on the 32-bit
ARM system.

--

John Devereux
 
 
Eric Sosman
10-07-2005


John Devereux wrote on 10/07/05 05:40:
>
> Perhaps I was not clear; the typedefs go in a single "portability"
> header file and are specific to the machine. E.g.
>
> #ifdef __X86
> typedef short int INT16;
> ...
> #endif
> #ifdef __AVR
> typedef int INT16;
> ...
> #endif
>
> (made up examples)
>
> It should be understood that this file will need to be changed for
> each new machine, but that hopefully nothing else will. By using
> UINT32 etc thoughout, nothing needs to change except this one file.


IMHO it's preferable to base such tests on the actual
characteristics of the implementation and not on the name
of one of its constituent parts:

#include <limits.h>
#if INT_MAX == 32767
typedef int INT16;
#elif SHRT_MAX == 32767
typedef short INT16;
#else
#error "DeathStation 2000 not supported"
#endif

This inflicts <limits.h> on every module that includes
the portability header, but that seems a benign side-
effect.
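
The same pattern extends to wider types; for instance, a sketch of an
"at least 32 bits" unsigned alias (not from the post, and the #else can
never fire since unsigned long is guaranteed at least 32 bits):

#include <limits.h>
#if UINT_MAX >= 0xFFFFFFFFUL
typedef unsigned int UINT32;
#elif ULONG_MAX >= 0xFFFFFFFFUL
typedef unsigned long UINT32;
#else
#error "no unsigned type of at least 32 bits"
#endif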


 
 
John Devereux
10-07-2005
Eric Sosman writes:

> John Devereux wrote On 10/07/05 05:40,:
> >
> > Perhaps I was not clear; the typedefs go in a single "portability"
> > header file and are specific to the machine. E.g.
> >
> > #ifdef __X86
> > typedef short int INT16;
> > ...
> > #endif
> > #ifdef __AVR
> > typedef int INT16;
> > ...
> > #endif
> >
> > (made up examples)
> >
> > It should be understood that this file will need to be changed for
> > each new machine, but that hopefully nothing else will. By using
> > UINT32 etc thoughout, nothing needs to change except this one file.

>
> IMHO it's preferable to base such tests on the actual
> characteristics of the implementation and not on the name
> of one of its constituent parts:
>
> #include <limits.h>
> #if INT_MAX == 32767
> typedef int INT16;
> #elif SHRT_MAX == 32767
> typedef short INT16;
> #else
> #error "DeathStation 2000 not supported"
> #endif
>
> This inflicts <limits.h> on every module that includes
> the portability header, but that seems a benign side-
> effect.


That does seem much better. Why did I not think of that?

--

John Devereux
 
 
Walter Roberson
10-07-2005
Eric Sosman wrote:
> IMHO it's preferable to base such tests on the actual
>characteristics of the implementation and not on the name
>of one of its constituent parts:


> #include <limits.h>
> #if INT_MAX == 32767
> typedef int INT16;
> #elif SHRT_MAX == 32767
> typedef short INT16;
> #else
> #error "DeathStation 2000 not supported"
> #endif


An implementation is not required to use the entire arithmetic space
possible with its hardware. In theory, INT_MAX == 32767 could
happen on (say) an 18-bit machine.
--
Watch for our new, improved .signatures -- Wittier! Profounder! and
with less than 2 grams of Trite!
 
 
Skarmander
10-07-2005
John Devereux wrote:
> Eric Sosman writes:
>
>
>>John Devereux wrote On 10/07/05 05:40,:
>>
>>>Perhaps I was not clear; the typedefs go in a single "portability"
>>>header file and are specific to the machine. E.g.
>>>
>>>#ifdef __X86
>>>typedef short int INT16;
>>>...
>>>#endif
>>>#ifdef __AVR
>>>typedef int INT16;
>>>...
>>>#endif
>>>
>>>(made up examples)
>>>
>>>It should be understood that this file will need to be changed for
>>>each new machine, but that hopefully nothing else will. By using
>>>UINT32 etc thoughout, nothing needs to change except this one file.

>>
>> IMHO it's preferable to base such tests on the actual
>>characteristics of the implementation and not on the name
>>of one of its constituent parts:
>>
>> #include <limits.h>
>> #if INT_MAX == 32767
>> typedef int INT16;
>> #elif SHRT_MAX == 32767
>> typedef short INT16;
>> #else
>> #error "DeathStation 2000 not supported"
>> #endif
>>
>>This inflicts <limits.h> on every module that includes
>>the portability header, but that seems a benign side-
>>effect.

>
>
> That does seem much better. Why did I not think of that?
>

Possibly because when you've got system dependencies, there tend to be
more of them than the size of the data types. So it's very common to get
stuff like

everything.h:
#ifdef __FOONLY
typedef short INT16;
#define HAVE_ALLOCA 1
#define HCF __asm__("hcf")
#define TTY_SUPPORTS_CALLIGRAPHY 1
#include <foonlib.h>
...etc...

In fact, the ever-popular GNU autoconf does this, except that it takes
care of all the tests and writes just one header with the appropriate
defines.

S.
 
 
Eric Sosman
10-07-2005


Walter Roberson wrote on 10/07/05 11:16:
> Eric Sosman wrote:
>
>> IMHO it's preferable to base such tests on the actual
>>characteristics of the implementation and not on the name
>>of one of its constituent parts:

>
>
>> #include <limits.h>
>> #if INT_MAX == 32767
>> typedef int INT16;
>> #elif SHRT_MAX == 32767
>> typedef short INT16;
>> #else
>> #error "DeathStation 2000 not supported"
>> #endif

>
>
> An implementation is not required to use the entire arithmetic space
> possible with its hardware. In theory, INT_MAX == 32767 could
> happen on (say) an 18 bit machine.


Adjust the tests appropriately for the semantics
you desire for "INT16". As shown they're appropriate
for an "exact" type (which is a pretty silly thing to
ask for in a signed integer; sorry for the bad example).
If you want "fastest," change == to >=. If you want
"at least," change == to >= and test short before int.
If you want some other semantics, test accordingly.
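
For a 32-bit request, where the choice actually matters, the two variants
might look like this (a sketch; FAST32 and LEAST32 are made-up names):

#include <limits.h>

/* "fastest": prefer the natural int if it is wide enough */
#if INT_MAX >= 2147483647
typedef int FAST32;
#else
typedef long FAST32;
#endif

/* "at least": prefer the smallest type that is wide enough */
#if SHRT_MAX >= 2147483647
typedef short LEAST32;
#elif INT_MAX >= 2147483647
typedef int LEAST32;
#else
typedef long LEAST32;
#endif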

It is not possible to test in this way for every
possible characteristic somebody might want to ask
about -- there's no Standard macro or other indicator
to say what happens on integer overflow, for example.
Still, I believe tests that *can* be made portably
*should* be made portably, and as broadly as possible.
Testing the name of the compiler or of the host machine
is not broad; it's the opposite. Test them if you must,
but test more portably if you can.


 
 
 
 