# integral promotion, sign extension

mohangupta13
Guest
Posts: n/a

 06-01-2010
Well, I had a tough time extracting information from previous threads
on the above three topics ... it's so much information and it creates
more confusion ...
Kindly guide me to some well-written articles on these topics...
1. Integral promotion:

What I understand: "If two types are intermixed, then the type with
smaller capacity is promoted to the type with the larger capacity" ... I
am sure I am wrong here.

2. Sign extension:

What I understand: whenever, say, a char is promoted to int, the msb of
the char is copied into the extra high-order bits of the int, i.e. char =
10010111 ---> 1111111110010111 (in int)...

3. Unsigned or signed character: what I understand is that if you
declare char a=10; then 'a' is supposed to be a signed character by
default, as no explicit unsigned qualifier was attached ... but some
threads say it's implementation-defined ... confusion here???

Mohan Gupta

kathir
Guest
Posts: n/a

 06-01-2010
On Jun 1, 9:57 am, mohangupta13 <(E-Mail Removed)> wrote:
> Well i had a though time extracting information from previous threads
> on the above three topics ....its soo much information and it creates
> more confusion ...
> kindly guide me to some well written articles on these topics...
> 1. Integral promotion :
>
> What i understand: " If two types are intermixed then the type with
> smaller capacity is promoted to the type with the larger capacity" ..i
> am sure i am wrong here .
>
> 2. Sign extension:
>
> what i understand: whenever say a char is promoted to int the msb of
> char is copied in the extra bits in the msb of the int ...i.e char=
> 10010111---> 1111111110010111 (in int)...
>
> 3. Unsigned or signed character : what i understand is that if you
> declare char a=10; then 'a' is supposed to be signed character by
> default as no explicit unsigned qualifier was attached ..but some
> threads say its implementation define ...confusion here???
>
> Mohan Gupta

Regarding Unsigned or signed character:
Project Configuration -> C/C++ -> Language tab, you can set the value
for "Default Char Unsigned". When you create a project, it would be
set to No by default. You can change it to Yes, if you want. The
compiler switch would be /J.

Thanks and Regards,
Kathir
http://www.softwareandfinance.com/

mohangupta13
Guest
Posts: n/a

 06-01-2010
On Jun 1, 10:17 pm, kathir <(E-Mail Removed)> wrote:
> On Jun 1, 9:57 am, mohangupta13 <(E-Mail Removed)> wrote:
>
> > Well i had a though time extracting information from previous threads
> > on the above three topics ....its soo much information and it creates
> > more confusion ...
> > kindly guide me to some well written articles on these topics...
> > 1. Integral promotion :
> >
> > What i understand: " If two types are intermixed then the type with
> > smaller capacity is promoted to the type with the larger capacity" ..i
> > am sure i am wrong here .
> >
> > 2. Sign extension:
> >
> > what i understand: whenever say a char is promoted to int the msb of
> > char is copied in the extra bits in the msb of the int ...i.e char=
> > 10010111---> 1111111110010111 (in int)...
> >
> > 3. Unsigned or signed character : what i understand is that if you
> > declare char a=10; then 'a' is supposed to be signed character by
> > default as no explicit unsigned qualifier was attached ..but some
> > threads say its implementation define ...confusion here???
> >
> > Thanks in advance ...
> > Mohan Gupta
>
> Regarding Unsigned or signed character:
> Project Configuration -> C/C++ -> Language tab, you can set the value
> for "Default Char Unsigned". When you create a project, it would be
> set to No by default. You can change it to Yes, if you want. The
> compiler switch would be /J.

What I am asking is: why doesn't a declaration of "char a" simply mean
signed char, the way "int a" means 'a' is a "signed int"? Why is it
implementation-defined (if it really is)?

dbtid
Guest
Posts: n/a

 06-01-2010
kathir wrote:

> Regarding Unsigned or signed character:
> Project Configuration -> C/C++ -> Language tab, you can set the value
> for "Default Char Unsigned". When you create a project, it would be
> set to No by default. You can change it to Yes, if you want. The
> compiler switch would be /J.

Gee, I can't find anything like "Project Configuration" ANYWHERE on my
computer!!! What do I do? Does this mean I can't program in C?

You might consider it wiser to refrain from posting such details from a
specific tool when answering questions in c.l.c.

dbtid
Guest
Posts: n/a

 06-01-2010
mohangupta13 wrote:
> Well i had a though time extracting information from previous threads
> on the above three topics ....its soo much information and it creates
> more confusion ...
> kindly guide me to some well written articles on these topics...
> 1. Integral promotion :
>
> What i understand: " If two types are intermixed then the type with
> smaller capacity is promoted to the type with the larger capacity" ..i
> am sure i am wrong here .
>
> 2. Sign extension:
>
> what i understand: whenever say a char is promoted to int the msb of
> char is copied in the extra bits in the msb of the int ...i.e char=
> 10010111---> 1111111110010111 (in int)...
>
> 3. Unsigned or signed character : what i understand is that if you
> declare char a=10; then 'a' is supposed to be signed character by
> default as no explicit unsigned qualifier was attached ..but some
> threads say its implementation define ...confusion here???
>
> Mohan Gupta

Do you have a copy of the standard, or perhaps a draft copy?
They both do a pretty good job of explaining how things are supposed to
be converted.

For 1) your concept of going from less capacity to more capacity is
intuitively correct. When mixing expressions with, say, doubles and
ints, ints are converted to doubles to maintain precision.

For 2) I think it depends on whether you're dealing with a signed or
unsigned character type. For example:

char a = -2;
int x = a;
printf ("x = %d\n", x);

can print out 'x = -2';

Then

unsigned char a = -2;
int x = a;
printf ("x = %d\n", x);

can print out 'x = 254'

So it depends on whether char is treated as signed or unsigned.

For 3) See ISO/IEC 9899:1999, Section J.3 "Implementation-defined
behavior", J.3.4 "Characters":

Which of signed char or unsigned char has the same range,
representation, and behavior as ‘‘plain’’ char (6.2.5, 6.3.1.1).

What that boils down to is that it's up to the implementation on how to
treat 'char' variables. Implementors are free to choose signed or
unsigned behavior when dealing with the 'char' type. Every
implementation I'm aware of gives a way to tell the compiler what
behavior YOU want it to use.
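
One way to see which choice a given compiler made is to look at CHAR_MIN from <limits.h>. What follows is only a minimal sketch; the function names plain_char_is_signed and widen_plain_char are made up for illustration and are not part of any library.

```c
#include <limits.h>

/* Hypothetical helper, not a standard function.
   CHAR_MIN is 0 where plain char behaves like unsigned char and
   negative (equal to SCHAR_MIN) where it behaves like signed char. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: the same conversion as `int x = a;` in the
   snippets above. The value of the char is preserved. */
int widen_plain_char(char a)
{
    return a;
}
```

Printing plain_char_is_signed() with %d gives 1 or 0 depending on the compiler and its options. widen_plain_char((char)-2) gives -2 in the first case and CHAR_MAX - 1 (254 with 8-bit chars) in the second.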

Best wishes to you.

dbtid

Richard Bos
Guest
Posts: n/a

 06-01-2010
kathir <(E-Mail Removed)> wrote:

> On Jun 1, 9:57 am, mohangupta13 <(E-Mail Removed)> wrote:

> > kindly guide me to some well written articles on these topics...

K&R, K&R, K&R. Always start at K&R. If they don't solve your problem,
either you need more detail from the Standard, or you don't actually
have the problem you think you have.

> > 1. Integral promotion :
> >
> > What i understand: " If two types are intermixed then the type with
> > smaller capacity is promoted to the type with the larger capacity" ..i
> > am sure i am wrong here .

Well... in essence, you're right. The devil is in the details. The
details can be found in the Standard.

> > 2. Sign extension:
> >
> > what i understand: whenever say a char is promoted to int the msb of
> > char is copied in the extra bits in the msb of the int ...i.e char=
> > 10010111---> 1111111110010111 (in int)...

No. For the time being, forget about the representation of integers in
bits, signed or unsigned. All integer conversions are done _by value_,
not by representation. And converting char to int is (unless you have a
_very_ unusual implementation) very simple: the value of the resulting
int is the same as the value the char had.
How that value is represented in bits is not really relevant to this.
The only things you need to remember are: if the value fits, it fits; if
the value doesn't fit in an unsigned integer type (_not_ just unsigned
int, any unsigned integer type), it is made to fit by reducing it modulo
one more than the largest value that type can hold; if it doesn't fit in
a signed integer type, the result is implementation-defined or an
implementation-defined signal is raised (which may have UB, so don't do
it).
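
To make the "by value" rules concrete, here is a small sketch. The helper names are hypothetical and exist only for this illustration; it deliberately leaves out the out-of-range signed case, since that result is implementation-defined.

```c
#include <limits.h>

/* Hypothetical helpers for illustration only. */

/* Widening a signed char to int: every signed char value fits in an
   int, so the value is simply preserved. */
int widen_signed_char(signed char c)
{
    return c;
}

/* Converting to an unsigned type is fully defined by the language:
   an out-of-range value is reduced modulo UCHAR_MAX + 1. */
unsigned char narrow_to_uchar(int value)
{
    return (unsigned char)value;
}

/* Same rule for unsigned int: -1 becomes UINT_MAX. */
unsigned int long_to_uint(long value)
{
    return (unsigned int)value;
}
```

With these, widen_signed_char(-2) is -2, narrow_to_uchar(-2) is UCHAR_MAX - 1 (254 with 8-bit chars), and long_to_uint(-1L) is UINT_MAX, whatever bit patterns the machine uses underneath.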

> > 3. Unsigned or signed character : what i understand is that if you
> > declare char a=10; then 'a' is supposed to be signed character by
> > default as no explicit unsigned qualifier was attached ..but some
> > threads say its implementation define ...confusion here???

> Regarding Unsigned or signed character:
> Project Configuration -> C/C++ -> Language tab,

Funny, I don't have a project called Configuration. And the only
Languages tab I can find is in Open Office, where it selects the
language the spellcheck uses. Presumably if you have Chinese installed
that uses characters, but not in English or Dutch.

Richard

Eric Sosman
Guest
Posts: n/a

 06-01-2010
On 6/1/2010 12:57 PM, mohangupta13 wrote:
> Well i had a though time extracting information from previous threads
> on the above three topics ....its soo much information and it creates
> more confusion ...
> kindly guide me to some well written articles on these topics...
> 1. Integral promotion :
>
> What i understand: " If two types are intermixed then the type with
> smaller capacity is promoted to the type with the larger capacity" ..i
> am sure i am wrong here .

What you've described is, loosely, the "usual arithmetic
conversions." Most arithmetic operators in C require that their
operands be of the same type, because most CPU's on which C code
runs have a similar requirement. Few CPU's can add an int to a
double, or compare a long to a short; they can add two doubles or
compare two longs, but can't work directly with mixed types.

When you as a programmer need to add an int to a double, what
must you do? You could convert the double to an int and add the
two ints, or you could convert the int to a double and add the two
doubles, or you could convert both operands to long double and add
those, or ... The usual arithmetic conversions tell you what C
will do when presented with mixed operands (I'm considering only
the arithmetic operands here, not pointers):

- If either is a long double, the other converts to long double.

- Otherwise, if either is a double, the other converts to double.

- Otherwise, if either is a float, the other converts to float.

- Otherwise, both operands are integers of some kind, and the
story continues below ...

There's also something called the "integer promotions," which
exist because many CPU's can perform arithmetic only on integers of
a few "widths." For example, a CPU that does all integer arithmetic
in registers may have no instructions for doing arithmetic on char
values: You fetch the char into a register (widening it as you go),
do the arithmetic in the wide register, and store (part of) the
result back in the char variable again. The integer promotions are
C's description of this kind of conversion (which happens almost
every time a "narrow" integer is used, not just with mixed types):

- If the integer is "narrower" than int *and* if the range of
int includes the entire range of the original type, the
integer converts to (promotes to) int.

- If the integer is "narrower" than int *and* if some values
of the original type are out of range for int, the integer
converts to unsigned int.

- All other integer types (including int and unsigned int
themselves) are left unpromoted.
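
A rough way to watch the integer promotions happen is to ask sizeof about the result of an arithmetic expression. This is only a sketch with made-up helper names, and it assumes a typical system where int is wider than char.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only. */

/* Both operands are promoted before the addition, so a + b has the
   promoted type (int on typical systems), not char. */
size_t size_of_char_sum(void)
{
    char a = 1, b = 2;
    return sizeof(a + b);
}

/* long is not "narrower" than int, so it is left unpromoted. */
size_t size_of_long_sum(void)
{
    long a = 1, b = 2;
    return sizeof(a + b);
}

/* The addition is carried out in the promoted type, so 200 + 100 is
   300 rather than wrapping around within unsigned char (assuming int
   is wider than char). */
int sum_of_uchars(unsigned char a, unsigned char b)
{
    return a + b;
}
```

size_of_char_sum() comes out equal to sizeof(int), size_of_long_sum() equal to sizeof(long), and sum_of_uchars(200, 100) is 300.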

We return now to the usual arithmetic conversions, where we've
already covered the floating-point cases and are left with operands
of integer types. C applies the integer promotions to both operands,
which may resolve the type mismatch right there: a char and a short
might both promote to int, for example, and the types are no longer
mixed (it needn't always happen this way; see below). But if a
mismatch still exists:

- If both operand types have the same "signedness" -- both signed
or both unsigned -- the "narrower" operand converts to the type
of the "wider."

- Otherwise, we have one signed and one unsigned type. If the
signed type is "narrower" than the unsigned type, the signed
operand converts to the unsigned type. (This will change the
numeric value if the signed operand was negative.)

- Otherwise, if the range of the signed type includes all possible
values of the unsigned type, the unsigned type converts to the
signed type.

- Otherwise, we've got a signed type that is "wider" than the
unsigned type but can't represent all values of the unsigned
type. (This sounds like a contradiction, but it's not C's
fault: I've been using terms like "wide" and "narrow," and
they're just loose terms. The formal definition of C uses a
scheme of "integer conversion ranks" to describe this stuff
precisely, but I thought that dragging that intricate business
in would be more confusing than enlightening. Stick with "wide"
and "narrow" for now, and promise yourself that you'll look up
"integer conversion rank" later, when you're more secure.)
Anyhow, if we get to this seemingly contradictory situation,
both operands convert to the unsigned type of the same width
as the signed type (e.g., long + uint64_t might promote both
operands to unsigned long).

... and *that* covers both the usual arithmetic conversions and
the integer promotions. I've simplified a bit by using "wide" and
"narrow" and waving my hands a little, and I've also ignored complex
arithmetic, but I hope it's enough to get you started.
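
A few of the cases above, sketched as tiny functions. The names are hypothetical and chosen only for this illustration; the last comparison is one whose answer genuinely differs between systems.

```c
#include <limits.h>

/* Hypothetical helpers for illustration only. */

/* int vs. double: the int operand converts to double, so this is a
   floating-point division. */
double int_divided_by_double(int n, double d)
{
    return n / d;
}

/* int vs. unsigned int: the int operand converts to unsigned int, so
   -1 becomes UINT_MAX and the comparison -1 < 1u is false. */
int int_less_than_uint(int s, unsigned int u)
{
    return s < u;
}

/* Nonzero if long can represent every unsigned int value. */
int long_can_hold_all_uint(void)
{
    return LONG_MAX > UINT_MAX;
}

/* long vs. unsigned int: if long can hold every unsigned int value,
   the unsigned operand converts to long; otherwise both operands
   convert to unsigned long. The result for (-1, 1u) varies by system. */
int long_less_than_uint(long s, unsigned int u)
{
    return s < u;
}
```

int_divided_by_double(1, 2.0) is 0.5 and int_less_than_uint(-1, 1u) is 0. long_less_than_uint(-1L, 1u) is 1 where long is wider than unsigned int and 0 where the two have the same width, which matches what long_can_hold_all_uint() reports.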

> 2. Sign extension:
>
> what i understand: whenever say a char is promoted to int the msb of
> char is copied in the extra bits in the msb of the int ...i.e char=
> 10010111---> 1111111110010111 (in int)...

Only on some machines. There are three things going on here:

First, each system makes its own decision as to whether plain char
is a signed type or an unsigned type. Since promoting an unsigned type
to a wider type (signed or unsigned) preserves the value, the sign
extension you describe won't happen on a system that chooses to treat
char as unsigned.

Second, each system makes its own decision about how to represent
negative integers. By far the most common is the two's complement
scheme you illustrate, but C also allows two other schemes, ones'
complement and signed magnitude. In a signed magnitude scheme, the
sign bit simply "relocates" during widening; it doesn't "propagate."

Third, each system makes its own decision about how wide the
various integer types are. There are some limits on what can be
chosen, but it is possible for char and int to have the same width
(this is said to be common practice in CPU's that specialize in digital
signal processing). On such a system, if char is taken to be unsigned,
a review of the integer promotions above will show that char promotes
to unsigned int, not to int. So you've got an unsigned char that
promotes to an unsigned int -- unsigned all the way, so there's no
"sign" to be manipulated in the first place.

> 3. Unsigned or signed character : what i understand is that if you
> declare char a=10; then 'a' is supposed to be signed character by
> default as no explicit unsigned qualifier was attached ..but some
> threads say its implementation define ...confusion here???

The signedness of plain char is implementation-defined, as
mentioned above. This is C's recognition of the fact that some
CPU's like to treat characters as small signed integers, while
others treat them as little bunches of bits, nominally unsigned.
If you're using char to store numeric data (as opposed to things
that are notionally "just character codes,") you should probably
specify signed char or unsigned char, since unadorned char will
behave differently on different systems.
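
A small sketch of how unadorned char can bite when it holds numeric data. The function names are made up for illustration, and the comments assume 8-bit chars.

```c
/* Hypothetical helpers for illustration only. */

/* With plain char the answer depends on the implementation: where
   char is signed (and 8 bits wide) no char value exceeds 127, so this
   can never return 1 there. */
int is_high_byte_plain(char c)
{
    return c > 127;
}

/* With unsigned char the range is 0..UCHAR_MAX everywhere, so the
   test behaves the same on every system. */
int is_high_byte(unsigned char c)
{
    return c > 127;
}
```

is_high_byte(0xE9) is 1 on every system, while is_high_byte_plain((char)0xE9) is 1 only where plain char is unsigned.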

--
Eric Sosman

lawrence.jones@siemens.com
Guest
Posts: n/a

 06-01-2010
mohangupta13 <(E-Mail Removed)> wrote:
>
> What i am asking is why not like int a declaration of "char a" just
> not mean an signed char ..why is it implementation defined (if its
> really so )??..as "int a " mean 'a' is "signed int" .

Historical reasons. "Plain" char is supposed to be suitable for
representing ordinary characters, which are supposed to have
non-negative values. On ASCII systems where the ordinary characters
have the top bit zero, it's common to make char signed (like plain int
is signed) because it is (or was) more efficient in most cases. But on
systems that use other character sets (like EBCDIC), some of the
ordinary characters have the top bit set and would be negative if char
were signed, so it's not signed on those systems.
--
Larry Jones

Oh, what the heck. I'll do it. -- Calvin

Eric Sosman
Guest
Posts: n/a

 06-01-2010
On 6/1/2010 4:09 PM, (E-Mail Removed) wrote:
> mohangupta13<(E-Mail Removed)> wrote:
>>
>> What i am asking is why not like int a declaration of "char a" just
>> not mean an signed char ..why is it implementation defined (if its
>> really so )??..as "int a " mean 'a' is "signed int" .
>
> Historical reasons. "Plain" char is supposed to be suitable for
> representing ordinary characters, which are supposed to have
> non-negative values. On ASCII systems where the ordinary characters
> have the top bit zero, it's common to make char signed (like plain int
> is signed) because it is (or was) more efficient in most cases. But on
> systems that use other character sets (like EBCDIC), some of the
> ordinary characters have the top bit set and would be negative if char
> were signed, so it's not signed on those systems.

One would have to check with DMR and friends, but I'd imagine
efficiency had something to do with the matter. On a PDP-11, say,
the MOVB instruction that loads a byte into a register treats it
as a small integer and extends the sign. But on S/360 the IC
instruction replaces a register's low-order bits with a byte from
memory, doing no sign extension (in fact, leaving the high-order
24 bits undisturbed). So:

- If char were required to be signed, PDP-11 would love it
but S/360 would need extra instructions to propagate the
sign (maybe two shifts: 24 to the left and 24 to the right).

- If char were required to be unsigned, S/360 would be happy
but PDP-11 would be penalized with extra code (perhaps an
AND after each fetch).

Since C programs did a lot of character manipulation (and still
do), the penalty of extra code on every character access might well
have been too much to swallow. A C that dictated the signedness of
char might have wound up as "a language for DEC machines that's real
slow on IBM" or vice versa. Similar considerations very likely
applied to other machines that were attractive targets for C. Had
C insisted on one signedness or the other, it might have been a good
deal less successful than it in fact became.
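
To give a feel for the extra work, here is a sketch in C of what those extra instructions would have to compute. The helper names are hypothetical, 8-bit bytes are assumed, and sign_extend_byte uses an xor-and-subtract trick rather than the two shifts mentioned above, because shifting into or out of the sign bit is not portable in C.

```c
/* Hypothetical helpers for illustration only; they mimic in C what the
   extra machine instructions would have to do. */

/* Keep only the low 8 bits: the "AND after each fetch" needed to get
   an unsigned result from a sign-extending load. */
int zero_extend_byte(int fetched)
{
    return (int)((unsigned int)fetched & 0xFFu);
}

/* Propagate bit 7 of the low byte into all the higher bits, which is
   what a sign-extending load does for free. */
int sign_extend_byte(int fetched)
{
    int low8 = zero_extend_byte(fetched);
    return (low8 ^ 0x80) - 0x80;
}
```

zero_extend_byte(-105) is 151 and sign_extend_byte(151) is -105: one extra operation or two per character fetched, on whichever machine got the "wrong" signedness.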

--
Eric Sosman

Barry Schwarz
Guest
Posts: n/a

 06-02-2010
On Tue, 1 Jun 2010 09:57:01 -0700 (PDT), mohangupta13
<(E-Mail Removed)> wrote:

>Well i had a though time extracting information from previous threads
>on the above three topics ....its soo much information and it creates
>more confusion ...
> kindly guide me to some well written articles on these topics...
>1. Integral promotion :
>
>What i understand: " If two types are intermixed then the type with
>smaller capacity is promoted to the type with the larger capacity" ..i
>am sure i am wrong here .

This is true if the two types have the same signedness. If one is
unsigned and the other signed, it becomes a little more complicated.

The standard describes the "usual arithmetic conversions" and the
"integer promotions" pretty clearly.

>
>2. Sign extension:
>
>what i understand: whenever say a char is promoted to int the msb of
>char is copied in the extra bits in the msb of the int ...i.e char=
>10010111---> 1111111110010111 (in int)...

You are discussing how a value is represented. That is not the issue.
When a char is promoted to int, the value is unchanged. If the
hardware uses a signed magnitude representation, then the sign bit is
not extended.
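
If you do want to look at the bits, here is one hedged way to do it without depending on the hardware's representation. The helper names are made up for illustration, and 8-bit chars are assumed. Converting an int to unsigned int is defined arithmetically (modulo UINT_MAX + 1), so it yields the two's complement pattern of the value on any machine.

```c
/* Hypothetical helpers for illustration only. */

/* Promotion preserves the value, whatever the bit layout. */
int widen_signed(signed char c)
{
    return c;
}

int widen_unsigned(unsigned char c)
{
    return c;
}

/* The int -> unsigned int conversion is defined by value (modulo
   UINT_MAX + 1), which gives the two's complement bit pattern of the
   value on any machine. Handy for printing with %x. */
unsigned int twos_complement_bits(int value)
{
    return (unsigned int)value;
}
```

The pattern 10010111 from your example is 151 as an unsigned char and -105 as a two's complement signed char. twos_complement_bits(widen_signed(-105)) has 0x97 in the low byte with every higher bit set (0xffffff97 with a 32-bit int), while twos_complement_bits(widen_unsigned(151)) is plain 0x97 with the higher bits clear. In both cases the value itself went through unchanged.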

You really shouldn't care unless you are writing code that depends on
a particular representation. If you are, you should be asking in a
newsgroup devoted to your system since it is not really a language
issue.

>
>3. Unsigned or signed character : what i understand is that if you
>declare char a=10; then 'a' is supposed to be signed character by
>default as no explicit unsigned qualifier was attached ..but some
>threads say its implementation define ...confusion here???

There are three distinct types, char, signed char, and unsigned char.
char is guaranteed to be equivalent to one of the other two but it is
up to the implementation developer to choose which one.

The decision is usually based on how characters are represented.
Windows systems have char equivalent to signed char. IBM mainframes,
where letters and numbers have the high order bit set (A is
represented by 11000001) have char equivalent to unsigned char so that
A will compare less than B.

--
Remove del for email