Velocity Reviews - Computer Hardware Reviews

Assigning values to char arrays

 
 
Ark Khasin
 
      11-03-2007
santosh wrote:
> Ark Khasin wrote:
>
>> Richard wrote:

>
> <snip>
>
>> I am not to argue who of us two is more of a newbie, but your post
>> sheds no light on the question asked. Ego bubbling?

>
> This is the most hilarious sentence I've read in c.l.c. this year.
>

Ty. But Richard offered a satisfactory explanation.
--
Ark
 
 
 
 
 
santosh
 
      11-03-2007
Ark Khasin wrote:

> santosh wrote:
>> Ark Khasin wrote:
>>> Ben Bacarisse wrote:

>>
>> <snip>
>>
>>>> No. unsigned char may not have padding bits. All the bits must be
>>>> value bits.

>>
>>> Why?
>>> 6.2.6.2 says "For unsigned integer types other than unsigned char,
>>> the bits of the object representation shall be divided into two
>>> groups: value bits and padding bits (there need not be any of the
>>> latter)." But I couldn't find anything saying that unsigned char *may
>>> not* have padding bits.

>>
>> Well, the above quote says that unsigned char may not have _both_
>> padding and value bits. Obviously the bit type left out has to be
>> padding bits - otherwise one would not be able to portably use
>> unsigned char objects.
>>

> Is this "just a theory"? IMHO, 6.2.6.2 says *exactly nothing* about
> unsigned char.


<quote n1256.pdf>

6.2.6.2 Integer types

1 For unsigned integer types other than unsigned char, the bits of the
object representation shall be divided into two groups: value bits and
padding bits (there need not be any of the latter).

<endquote>

Note closely the text within the parentheses. To me it _strongly_
implies, to say the least, that value bits are mandatory for objects of
all unsigned integer types. Since unsigned char is disallowed from
having padding bits, it must be composed only of value bits.

<quote n1256.pdf>

If there are N value bits, each bit shall represent a different power of
2 between 1 and 2^(N-1), so that objects of that type shall be capable of
representing values from 0 to 2^N - 1 using a pure binary representation;
this shall be known as the value representation. The values of any
padding bits are unspecified.

6.2.6.1

3 Values stored in unsigned bit-fields and objects of type unsigned char
shall be represented using a pure binary notation.

<endquote>

Again 6.2.6.1(3) in conjunction with 6.2.6.2(1) reinforces the
requirement that unsigned char may not have padding bits.

<quote n1256.pdf>

4 Values stored in non-bit-field objects of any other object type
consist of n × CHAR_BIT bits, where n is the size of an object of that
type, in bytes. The value may be copied into an object of type unsigned
char [n] (e.g., by memcpy); the resulting set of bytes is
called the object representation of the value. Values stored in
bit-fields consist of m bits, where m is the size specified for the
bit-field. The object representation is the set of m bits the bit-field
comprises in the addressable storage unit holding it. Two values (other
than NaNs) with the same object representation compare equal, but values
that compare equal may have different object representations.

<endquote>

This answers the other issue that you raised concerning null pointers
not being all bits zero.

 
 
 
 
 
pete
 
      11-03-2007
Ark Khasin wrote:
>
> santosh wrote:
> > Ark Khasin wrote:
> >> Ben Bacarisse wrote:

> >
> > <snip>
> >
> >>> No. unsigned char may not have padding bits.


> Is this "just a theory"?


No.

N869
5.2.4.2.1 Sizes of integer types <limits.h>

[#2] The value UCHAR_MAX+1
shall equal 2 raised to the power CHAR_BIT.

--
pete
 
 
Ark Khasin
 
      11-03-2007
pete wrote:
> Ark Khasin wrote:
>> santosh wrote:
>>> Ark Khasin wrote:
>>>> Ben Bacarisse wrote:
>>> <snip>
>>>
>>>>> No. unsigned char may not have padding bits.

>
>> Is this "just a theory"?

>
> No.
>
> N869
> 5.2.4.2.1 Sizes of integer types <limits.h>
>
> [#2] The value UCHAR_MAX+1
> shall equal 2 raised to the power CHAR_BIT.
>

Thanks for adding to my confusion.
So I have 11-bit machine bytes and UCHAR_MAX==8 and 3 padding most
significant bits. Anything wrong?
BTW, if I am not mistaken, in other integer types padding bits don't
have to be contiguous.
--
Ark
 
 
CBFalconer
 
      11-03-2007
Ark Khasin wrote:
>

.... snip ...
>
> As a practitioner, I didn't think twice to clear all bits with
> memset clones including the likes of the code above. But now this
> post scared me: if unsigned char has padding bits in its
> representation (which I guess is allowed) then what do I get?
> unsigned a;
> memset_as_above(&a, 0, sizeof(a));
> Will a necessarily compare equal to 0?


Can't happen. In C, unsigned char is expressly forbidden to have
padding bits.

--
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
Try the download section.




 
 
santosh
 
      11-03-2007
Ark Khasin wrote:

> pete wrote:
>> Ark Khasin wrote:
>>> santosh wrote:
>>>> Ark Khasin wrote:
>>>>> Ben Bacarisse wrote:
>>>> <snip>
>>>>
>>>>>> No. unsigned char may not have padding bits.

>>
>>> Is this "just a theory"?

>>
>> No.
>>
>> N869
>> 5.2.4.2.1 Sizes of integer types <limits.h>
>>
>> [#2] The value UCHAR_MAX+1
>> shall equal 2 raised to the power CHAR_BIT.
>>

> Thanks for adding to my confusion.
> So I have 11-bit machine bytes and UCHAR_MAX==8 and 3 padding most
> significant bits. Anything wrong?


What do you mean by UCHAR_MAX==8? Do you mean CHAR_BIT==8?

As far as the Standard is concerned, a char, i.e., a byte (as defined by
C), contains CHAR_BIT bits. Additionally, unsigned char may not contain
padding bits.

I don't know what you mean by "machine bytes" above. Are they supposed
to be different from C bytes?

> BTW, if I am not mistaken, in other integer types padding bits don't
> have to be contiguous.


Yes. Padding bits need not be contiguous.

 
 
Chris Torek
 
      11-03-2007
>pete wrote:
>> [#2] The value UCHAR_MAX+1
>> shall equal 2 raised to the power CHAR_BIT.


In article <4w4Xi.5793$kH.3510@trndny04>
Ark Khasin <(E-Mail Removed)> wrote:
>Thanks for adding to my confusion.
>So I have 11-bit machine bytes and UCHAR_MAX==8 and 3 padding most
>significant bits. Anything wrong?


Two.

First, I assume you meant for CHAR_BIT to be 8 here (so that
UCHAR_MAX is 255).

Second, if you have 11-bit machine bytes, CHAR_BIT must be 11
(or, alternatively, the implementation can make CHAR_BIT be 8
and completely hide the existence of the other 3 bits, by
emulating an 8-bit machine; but in this case, you do not have
11-bit machine bytes, you have 8-bit machine bytes in the
emulated machine on which the C system runs).

In other words, "unsigned char" has no padding bits.

>BTW, if I am not mistaken, in other integer types padding bits don't
>have to be contiguous.


Right. In practice, they tend to be clumped at one end (e.g., a
la Burroughs A-series machine where "integer" just means "floating
point value with carefully controlled exponent"). The most likely
candidate for "internal" padding bits would be a ones' complement
machine with no native at-least-32 and/or at-least-64 bit types,
where "long" and/or "long long" are made up of several machine
words glued together, with the sign bit unused (and always-0) in
the lower order words. One might do this when implementing C99 on
a Univac 11xx (Unisys? what *are* they called these days?) series
machine, for instance.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
 
 
Chris Torek
 
      11-03-2007
(Given:

T **mem;
size_t size = n * sizeof *mem;
mem = malloc(size);
... check for success ...
memset(mem, 0, size);

)

In article <(E-Mail Removed) .com>
somenath <(E-Mail Removed)> wrote:
>... my understanding was that in pointer context 0 and NULL are
>converted to a null pointer.


Assuming you mean the same thing I do by "in pointer context",
yes.

>And converting to null pointer is compiler responsibility.


Yes -- but providing "pointer context" is the programmer's.

>So I thought 0 in memset will be converted to null
>pointer (which is system specific).


This is where you go astray: the call to memset() loses the "pointer
context" in question.

Remember that we *can* do this:

unsigned char table[100];
...
/* now zero out all 100 bytes in table[] */
memset(table, 0, 100);

as well as the example at the top of this article. How will memset()
"know" whether we passed the address of the first byte of 100
"unsigned char"s (i.e., "table"), or the first byte of n "pointer
to ..."s (i.e., "mem")?

The answer is that it does *not* know. Instead, it just *assumes*
that its first argument is a pointer to "ordinary integer" bytes,
i.e., in the style of memset(table). It thus sets all the bytes
to "integer zeros", not "pointer nulls".

To put it another way, "pointer context" survives only across "very
short distances" in C, specifically, certain operators.

Given an operator that takes two operands -- such as the ordinary
assignment operator "=", or the comparison operators "==" and "!="
-- a C compiler will detect that one operand has some pointer type
while the other is the integer constant zero, and in those specific
cases, will convert the "integer constant zero" to "null pointer of
appropriate type". Thus, in:

T **mem;
mem = NULL;
mem = 0;

the two assignments have the exact same effect, because the compiler
can see that "mem" has a pointer type (specifically "pointer to
pointer to T", whatever type T may be). Or, after calling malloc()
successfully to set mem to something non-NULL:

mem[i] = 0;

again supplies a pointer type on the left (because mem[i] has type
"pointer to T") and an integer-constant-zero on the right, and the
compiler can -- indeed, must -- convert that zero to an appropriate
null pointer value.

These are examples of "pointer context". Arguments to prototyped
functions also provide a short-term "pointer context", because
parameter passing (when prototypes are used) is defined in terms
of ordinary assignment:

void zorg(double *evil);
...
zorg(0);

is very much like writing "evil = 0" (except that the assignment
is actually to whatever parameter-name zorg() uses, which may
differ:

void zorg(double *trouble) { ... }

The name in the prototype is optional and need not match the
actual formal parameter name; the "assignment" happens during
the subroutine call).

In the case of memset(), however, the one pointer parameter --
which I call "base" here -- has type "void *":

void *memset(void *base, int c, size_t n);

and "void *" points to no type at all. We cannot tell from the
call alone whether memset() will use "base" as a pointer to pointers,
or to integers, or to floating-point values, or indeed anything.
The only information we have is in whatever documentation we have
(in this case, the C standard itself!) describing the function.
It tells us that, internally, the mem*() functions convert their
pointer parameters to "unsigned char *", and treat the memory region
as an array of bytes ("unsigned char"s, an integral type).

(Note that in the absence of a prototype, or if the prototype ends
in ", ..." and we are in the "..." part of the call, parameters
are *not* passed as if by ordinary assignment, but rather with the
"default argument promotions". For most C programmers in most
situations, this little wrinkle can be ignored, since most of us
will use prototypes always. The exception occurs in calls to
variadic functions like printf(), where we have to be careful with
our parameters, especially with pointers and the "%p" directive.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
 
 
Flash Gordon
 
      11-03-2007
Ark Khasin wrote, On 03/11/07 20:05:
> pete wrote:
>> Ark Khasin wrote:
>>> santosh wrote:
>>>> Ark Khasin wrote:
>>>>> Ben Bacarisse wrote:
>>>> <snip>
>>>>
>>>>>> No. unsigned char may not have padding bits.

>>
>>> Is this "just a theory"?

>>
>> No.
>>
>> N869
>> 5.2.4.2.1 Sizes of integer types <limits.h>
>>
>> [#2] The value UCHAR_MAX+1 shall equal 2 raised to the
>> power CHAR_BIT.
>>

> Thanks for adding to my confusion.
> So I have 11-bit machine bytes and UCHAR_MAX==8 and 3 padding most
> significant bits. Anything wrong?


CHAR_BIT is the number of bits in a signed, unsigned and plain char.
Note, the number of bits, NOT the number of value bits. Therefore, as
UCHAR_MAX+1 is 2 raised to the power of CHAR_BIT, all of the bits must
be value bits.

> BTW, if I am not mistaken, in other integer types padding bits don't
> have to be contiguous.


The padding bits can be anywhere, but short of using an unsigned char
pointer to look at the representation they are hard to get at since the
bitwise operations are defined as operating on values.
--
Flash Gordon
 
 
James Kuyper
 
      11-03-2007
Ark Khasin wrote:
> Ben Bacarisse wrote:
>> Ark Khasin <(E-Mail Removed)> writes:
>>

[>> Ben Bacarisse wrote:]
....
>> RH's point was something else altogether -- that all bits zero is not
>> guaranteed to produce a null pointer (to be scrupulously correct, it
>> is not guaranteed to produce a value that compares equal to a null
>> pointer constant).


The parenthesized comment was not actually needed to make the statement
"scrupulously correct"; it would have been just as correct, and less
confusing, without it.

> That's where I am lost and reading the standard doesn't help:
> What's the difference between a value of an object and how it compares
> equal? I mean, if a==b, whatever their representations, in what
> context(s) does it make sense to say they may have different values?


There is no difference. Don't let the unnecessary "clarification"
confuse you. The issue isn't having different values with the same
representation in a single type - that can't happen. The issue is that
there can be multiple different representations of the same value in a
given type. However, the values of objects of that type containing those
different representations must compare equal.

You're tripping over a minor issue; the fact that there can be multiple
representations of a null pointer. However, you've lost track of the key
issue: that a pointer object with all of its bits set to 0 doesn't have
to be one of those representations. In fact, it doesn't have to
represent a valid pointer value of any kind.

> [NEGATIVE_ZERO comes to mind - and goes away. BTW, is it fair to say
> that bitwise logic is a magic performed on representations, and not on
> values?]


No. In general, the bitwise operations are defined in terms of their
actions on the values, not the representations. For instance, E>>1 is
defined as dividing the value of E by 2. The complicated exceptions all
involve sign bits, and most result in undefined behavior, which is why
it's strongly recommended that bitwise operations be restricted to
unsigned types, or at least restricted to values which are guaranteed to
be positive both before and after the operation.

>> void *a;
>> memset_as_above(&a, 0, sizeof a);


There is, at this point, no guarantee that 'a' contains a valid pointer
representation. Therefore, the next line renders the behavior of your
entire program undefined:

>> if (a == 0) {
>> /* not guaranteed */

> //Which is correct but implies
> {
> void **pNULL = 0;
> if(a==*pNULL) {
> /* not guaranteed */


I'm not sure what your point was; but you've just attempted to
dereference a null pointer, again making the behavior undefined.
 