NULL with representation other than all bits 0

 
 
Keith Thompson
      01-29-2006
Joe Wright <(E-Mail Removed)> writes:
[...]
> I am not a guru. The only pointer value defined by the C standard is
> NULL.


I'm not sure what you mean by that. NULL is a macro that expands to a
null pointer constant; a null pointer constant yields a null pointer
value when converted to a pointer type. So NULL is two steps removed
from an actual pointer value. (A null pointer constant is a syntactic
entity that occurs only in source code; a null pointer value occurs
only during the execution of a program.)

Yes, I'm being picky; it's not entirely unreasonable to use NULL as a
shorthand for a null pointer value. But the address of any object or
function is a pointer value.

> It is defined variously as 0 or (void*)0.


Basically yes. I'm going to go into pedantic mode; feel free to
ignore the next few paragraphs.

(void*)0 is not a valid definition for NULL because of C99 7.1.2p5:

Any definition of an object-like macro described in this clause
shall expand to code that is fully protected by parentheses where
necessary, so that it groups in an arbitrary expression as if it
were a single identifier.

If you have
#define NULL (void*)0
then the expression
sizeof NULL
becomes a syntax error.
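
To make the grouping problem concrete, here is a minimal sketch. The macro names MY_NULL_BARE and MY_NULL_PAREN are made up for illustration; they only stand in for the two candidate definitions of NULL discussed here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two candidate definitions of NULL. */
#define MY_NULL_BARE  (void*)0      /* not protected by parentheses */
#define MY_NULL_PAREN ((void*)0)    /* fully parenthesized */

/* sizeof MY_NULL_PAREN parses as sizeof applied to the expression
   ((void*)0), so it yields the size of a void pointer. */
size_t size_of_paren_null(void)
{
    return sizeof MY_NULL_PAREN;
}

/* sizeof MY_NULL_BARE would expand to "sizeof (void*)0": the operand is
   taken to be the type name (void*), and the stray 0 is a syntax error.
   The line is left commented out so the file still compiles.

   size_t bad = sizeof MY_NULL_BARE;
*/
```

Uncommenting the last line should make the compiler reject the file, which is exactly the failure that 7.1.2p5 is meant to rule out.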

On the other hand, it's not clear that ((void*)0) is a valid
definition for NULL either. NULL is required to be a null pointer
constant. The standard's definition of a null pointer constant is:

An integer constant expression with the value 0, or such an
expression cast to type void *

6.5.1p5 says that:

A parenthesized expression is a primary expression. Its type and
value are identical to those of the unparenthesized expression. It
is an lvalue, a function designator, or a void expression if the
unparenthesized expression is, respectively, an lvalue, a function
designator, or a void expression.

We cannot directly conclude from this that a parenthesized null
pointer constant is a null pointer constant.

However, just as a matter of common sense, it seems obvious that
((void*)0) *should* be a null pointer constant, and therefore a valid
definition of NULL. Some implementations do define NULL this way.
The wording of the standard should be corrected.

End of pedantry (for now).

> The zero value is
> chosen specifically because it is within the range of all possible
> pointer values.


I'm not sure what this means. Pointers are not numbers; they don't
have ranges.

> No pointer value other than NULL can be tested for
> validity.


Again, the address of any object or function is a pointer value. What
do you mean by "can be tested for validity"?

> A C program can safely assume NULL as zero. If it is really not it is
> the implementation's job to take care of it and lie to us.


A null pointer value is a particular value of a pointer type, just as
0 is a particular value of an integer type and 0.0 is a particular
value of a floating-point type. It just happens that the language
uses a very strange way to represent a null pointer literal.

It's best to think of a null pointer value as a null pointer value,
not as "zero". The fact that 0 can be used *in source* to represent a
run-time null pointer value is just an oddity that's hidden behind the
NULL macro.

> The conditional (NULL == 0) will yield 1 everywhere. Or not?


Yes, because both will be converted to a common type. If NULL is 0,
it's just (0 == 0), which is an integer comparison. If NULL is
((void*)0), it's a pointer comparison.
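
A small sketch of that point, for anyone who wants to try it. The function name is invented for the example; only NULL and <stddef.h> come from the standard.

```c
#include <stddef.h>

/* Returns 1 on any conforming implementation, whichever way <stddef.h>
   defines NULL and whatever bit pattern a null pointer has. */
int null_equals_zero(void)
{
    return NULL == 0;
}
```

The result says nothing about how a null pointer is stored in memory; the 0 here is source-level notation only.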

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
 
 
Jordan Abel
      01-29-2006
On 2006-01-29, Joe Wright <(E-Mail Removed)> wrote:
> Keith Thompson wrote:
>> (E-Mail Removed) (Gordon Burditt) writes:
>>
>>>(E-Mail Removed) writes:

>>
>> [...]
>>
>>>>- AFAIK I can't `#define NULL 0x10000' since `void* p=0;' should work
>>>>just like `void* p=NULL'. Is this correct?
>>>
>>>You, as programmer, are not allowed to do this.
>>>You, as compiler implementor, are allowed to do this.

>>
>>
>> The NULL macro must expand to a null pointer constant. 0x10000 is not
>> a null pointer constant as the term is defined by the standard, so
>> it's not immediately obvious that
>>
>> #define NULL 0x10000
>>
>> is legal for an implementation, even if converting that value to a
>> pointer always yields a null pointer value.
>>
>> On the other hand, C99 6.6p10 says:
>>
>> An implementation may accept other forms of constant expressions.
>>
>> It's not clear whether that means an implementation may accept other
>> forms of null pointer constant (0x10000 is already a constant
>> expression). And C99 4p6 says:
>>
>> A conforming implementation may have extensions (including
>> additional library functions), provided they do not alter the
>> behavior of any strictly conforming program.
>>
>> so treating 0x10000 as a null pointer constant is (I think) allowed on
>> that basis, as long as the implementation documents it.
>>
>> On the other other hand, since 0 *must* be a null pointer constant,
>> there's little point in defining NULL as anything other than 0 or
>> ((void*)0). Giving null pointers a representation other than
>> all-bits-zero doesn't require any extensions to make everything work
>> properly. And accepting 0x10000 as a null pointer constant would
>> encourage programmers to write non-portable code that depends on this
>> assumption.
>>
>> Even if all-bits-zero is a valid address, you might still consider
>> representing null pointers as all-bits-zero. Whatever is at that
>> location, no portable program can access it anyway. If it's something
>> important, there's some risk of buggy programs clobbering it by
>> writing through a null pointer -- but then there's going to be some
>> risk of buggy programs clobbering it by writing through address zero.
>>
>> BTW, Gordon, please don't snip attribution lines.
>>

>
> I am not a guru. The only pointer value defined by the C standard is
> NULL. It is defined variously as 0 or (void*)0. The zero value is chosen
> specifically because it is within the range of all possible pointer
> values. No pointer value other than NULL can be tested for validity.
>
> A C program can safely assume NULL as zero. If it is really not it is
> the implementation's job to take care of it and lie to us.
>
> The conditional (NULL == 0) will yield 1 everywhere. Or not?
>


#define NULL ((void *)0xFFFFFFFF), assuming that that is in fact a null
pointer, will guarantee that.
 
 
Chris Torek
      01-29-2006
In article <(E-Mail Removed) .com>
<(E-Mail Removed)> wrote:
>Hi!
>
>There is a system where 0x0 is a valid address, but 0xffffffff isn't.
>How can null pointers be treated by a compiler (besides the typical
>"solution" of still using 0x0 for "null")?


Ignoring all the debate that has been triggered by your list
(which I have snipped), here is the answer I think you may be
looking for.

Suppose you are the compiler-writer for this machine. Suppose
further that you have decided to use 0xffffffff (all-one-bits)
as your internal representation for the null pointer, so that:

char *p = 0;
use(*p);

will trap at runtime, even though 0 is a valid address. How will
you, as the compiler-writer, achieve this?

The answer lies in your code generator. At any point in dealing
with the conversion of C source code to machine-level instructions,
you *always* know the type(s) of all the operand(s) of every
operator. This is of course absolutely necessary on most machines.
Consider, for instance, something like:

sum = a + b;

If a and b are ordinary "int"s, you probably need to generate an
integer-add instruction with integer operands:

ld r1, ... # load variable "a" into integer register
ld r2, ... # load variable "b" into integer register
add r0,r1,r2 # compute integer sum, reg+reg -> reg
st r0, ... # store sum back to memory

while if "a" and "b" are ordinary "double"s, you probably need to
generate a double-add instruction with double operands:

ldd f2, ... # load double "a" into f2/f3 register pair
ldd f4, ... # load double "b" into f4/f5 register pair
addd f0,f2,f4 # compute double-precision sum
std f0, ... # store sum back into memory

If one operand is an "int" and one is a "double", you have to
convert the int to a double and do the addition as two doubles,
and so on. The only way to know which instructions to generate is
to keep track of the types of all the operands.

So, now you have a chunk of C source level code that includes
the line:

p = 0;

where "p" has type "char *", i.e., a pointer type. The operand on
the right-hand-side of the assignment is an integer *and* is a
constant (you must keep track of this, too, but of course you will,
in order to do constant-folding). So you have an assignment that
has "integer constant zero" as the value to be assigned. Inside
the compiler, you check, and you GENERATE DIFFERENT CODE!

if (is_integer_constant(rhs) && value(rhs) == 0)
    generate_store(lhs, 0xffffffff);
else
    ...

and thus, what comes out in the machine code is:

mov r0, -1 # set r0 to 0xffffffff
st r0, ... # store to pointer "p"

Likewise, in places where you have a comparison or test that
might be comparing a pointer to integer-constant-zero, you check
for this in the compiler, and generate the appropriate code:

is_null = false;
if (is_pointer(lhs) && is_integer_constant(rhs) && value(rhs) == 0) {
    is_null = true;
    ptr_operand = lhs;
} else if (is_pointer(rhs) &&
           is_integer_constant(lhs) && value(lhs) == 0) {
    is_null = true;
    ptr_operand = rhs;
}
if (is_null)
    generate_compare(ptr_operand, 0xffffffff);
else
    ...
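
The two checks can be modeled in a few lines of ordinary C. This is only a toy sketch, not real compiler code: struct operand, NULL_PTR_BITS and the function names are all invented for illustration, and a real code generator tracks far more than three fields per operand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical null pointer representation chosen by the compiler-writer. */
#define NULL_PTR_BITS 0xffffffffu

/* Toy stand-in for a compiler's view of an operand. */
struct operand {
    bool is_pointer;           /* operand has pointer type */
    bool is_integer_constant;  /* operand is an integer constant */
    uint32_t value;            /* constant value, if is_integer_constant */
};

/* True when the operand is an integer constant with value 0, i.e. a
   candidate null pointer constant. */
static bool is_null_constant(const struct operand *op)
{
    return op->is_integer_constant && op->value == 0;
}

/* Bits to store when rhs is assigned to an object of pointer type.
   Only constant zero is rewritten; other constants pass through
   unchanged in this toy model. */
uint32_t pointer_store_bits(const struct operand *rhs)
{
    return is_null_constant(rhs) ? NULL_PTR_BITS : rhs->value;
}

/* True when a comparison of lhs and rhs is a pointer-versus-null test
   that must compare against NULL_PTR_BITS rather than zero. */
bool is_null_comparison(const struct operand *lhs, const struct operand *rhs)
{
    return (lhs->is_pointer && is_null_constant(rhs)) ||
           (rhs->is_pointer && is_null_constant(lhs));
}
```

The point of the model is that the decision depends only on type and constant-ness, both of which the compiler already knows at that stage.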

There is only one place this goes wrong, and that is:

extern void somefunc(int firstparam, ...);
...
somefunc(3, ptr1, 0, ptr3); /* where "0" is meant to be a null pointer */

Here, inside your compiler, you see that the third argument is
an integer constant zero, so you check the function prototype to
see if you need a pointer in this position. All you have is the
literal "..." part, so you must assume that this is really an
integer here, not a pointer. You pass zero (0x00000000) instead
of 0xffffffff. But this source code call is wrong! The programmer
*should* have used a cast:

somefunc(3, ptr1, (char *)0, ptr3);

In this version, you have a cast of an integer constant zero to a
pointer type, which produces 0xffffffff as appropriate, and only
then do you look at the prototype. As before, there is no extra information
given by the prototype, but now you pass 0xffffffff as desired.
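
For a complete example of the same rule from the programmer's side, here is a sketch of a variadic function that takes a list of strings ended by a null pointer. The name count_strings is made up; the <stdarg.h> macros are standard.

```c
#include <stdarg.h>
#include <stddef.h>

/* Counts the strings passed before the terminating null pointer.
   The terminator must really have type char *, because va_arg reads
   it back as one. */
size_t count_strings(const char *first, ...)
{
    va_list ap;
    size_t n = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != NULL) {
        n++;
        s = va_arg(ap, char *);
    }
    va_end(ap);
    return n;
}
```

A correct call looks like count_strings("a", "b", (char *)0). Writing a bare 0 as the last argument passes an int, which is the mistake described above.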

Now, given that I believe you have indeed correctly identified how
to do this inside the compiler:

>- AFAIK I can identify contexts where `0' is used as a pointer and use
>the numeric value 0xffffffff rather then 0x0. Is this correct?


Yes.

>In particular, should `void* p;' initialize p to "null pointer" rather
>then "zero" (so it has to be placed in ".data" rather then ".bss" in
>terms of typical implementations if "null pointer" is not represented
>as all bits 0)?


If "p" has static duration, yes. If "p" has automatic duration it
does not have to have any useful value upon creation. Note that
this also applies to structures containing pointers:

struct S { int a; long *b; double c; void *d; };
static struct S x;

will have to put x in a data segment in order to set x.b and x.d
to 0xffffffff internally. Unions may also contain pointers, but
in C89 only the first element is initialized, so only if the first
element is a pointer will you have to do this. (C99 offers designated
initializers, but those just fall out naturally.)

(You may, depending on implementation, want to have an "all one
bits" segment that you place either before or after your "bss"
segment. This will handle pointers that are not members of
structures.)
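
From the C programmer's side the guarantee looks like this. The sketch reuses struct S from above; the checking function is an invented name.

```c
#include <stddef.h>

struct S { int a; long *b; double c; void *d; };

/* No initializer: static storage duration means every member is
   initialized as if assigned 0, so the pointer members are null
   pointers whatever bit pattern the implementation uses for them. */
static struct S x;

/* Returns 1 if both pointer members of x compare equal to a null pointer. */
int static_pointers_are_null(void)
{
    return x.b == NULL && x.d == NULL;
}
```

How the implementation gets there (a data segment, an all-one-bits segment, or startup code) is invisible to the program.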

>Worse, should `memset(&p, 0, sizeof(void*))' set p to the
>"null pointer" rather then "zero"?


No.
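
A program can find out what its own implementation does, although it cannot rely on the answer being the same elsewhere. A minimal sketch, with invented function names:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if this implementation's null pointer to void happens to be
   represented as all-bits-zero, 0 otherwise. The result is a property
   of the implementation, not something the language guarantees. */
int null_is_all_bits_zero(void)
{
    void *null_ptr = NULL;          /* a genuine null pointer */
    unsigned char zeros[sizeof null_ptr];

    memset(zeros, 0, sizeof zeros); /* all-bits-zero pattern */
    return memcmp(&null_ptr, zeros, sizeof null_ptr) == 0;
}

/* Portable way to reset a pointer: assign, do not memset. */
void reset_pointer(void **pp)
{
    *pp = NULL;
}
```

On the machine described in the question the first function would presumably return 0, and memset would leave p holding hardware address zero rather than a null pointer.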

>Should casts from int to void* convert (int)0 (bits: 0x0) to (void*)0
>(bits: 0xffffffff)?


If the (int)0 is an integer *constant*, yes (because semantically,
a cast is just an assignment to an unnamed temporary, except that
an actual temporary would be an lvalue and a cast produces an rvalue).

If the int that happens to contain zero is *not* a constant, this
is up to you -- but I would not. This allows programmers to write:

int zero = 0;
char *p = (char *)zero;
... now use *p to access hardware location 0 ...

>I know that this topic has been discussed a lot. That's even one of the
>reasons I'm not sure what the real answers are - I remember too many of
>them and can't tell the right ones from the wrong ones...


The usual answer is to skip all of this and simply make sure that
hardware-location-zero is occupied, so that no *C* object or function
actually has address zero. Of course, this does not trap erroneous
null-pointer dereferences, but C implementations are rarely kind
to programmers that way. We seem to prefer to drive our race cars
without seatbelts.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
 
 
Alex Fraser
      01-29-2006
"Keith Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On the other hand, it's not clear that ((void*)0) is a valid
> definition for NULL either. NULL is required to be a null pointer
> constant. The standard's definition of a null pointer constant is:
>
> An integer constant expression with the value 0, or such an
> expression cast to type void *
>
> 6.5.1p5 says that:
>
> A parenthesized expression is a primary expression. Its type and
> value are identical to those of the unparenthesized expression. It
> is an lvalue, a function designator, or a void expression if the
> unparenthesized expression is, respectively, an lvalue, a function
> designator, or a void expression.
>
> We cannot directly conclude from this that a parenthesized null
> pointer constant is a null pointer constant.


(In N869,) 6.6 says that a constant expression is (grammatically) a
conditional expression - with some constraints, of course.

Grammatically, a primary expression is a conditional expression.

Alex


 
 
Christian Bau
      01-29-2006
In article <(E-Mail Removed) .com>,
(E-Mail Removed) wrote:

> Hi!
>
> There is a system where 0x0 is a valid address, but 0xffffffff isn't.
> How can null pointers be treated by a compiler (besides the typical
> "solution" of still using 0x0 for "null")?
>
> - AFAIK C allows "null pointers" to be represented differently then
> "all bits 0". Is this correct?
> - AFAIK I can't `#define NULL 0x10000' since `void* p=0;' should work
> just like `void* p=NULL'. Is this correct?
> - AFAIK I can identify contexts where `0' is used as a pointer and use
> the numeric value 0xffffffff rather then 0x0. Is this correct? In
> particular, should `void* p;' initialize p to "null pointer" rather
> then "zero" (so it has to be placed in ".data" rather then ".bss" in
> terms of typical implementations if "null pointer" is not represented
> as all bits 0)? Worse, should `memset(&p, 0, sizeof(void*))' set p to
> the "null pointer" rather then "zero"? Should casts from int to void*
> convert (int)0 (bits: 0x0) to (void*)0 (bits: 0xffffffff)?
>
> I know that this topic has been discussed a lot. That's even one of the
> reasons I'm not sure what the real answers are - I remember too many of
> them and can't tell the right ones from the wrong ones...


All your compiler has to do is to make sure that a cast from an integer
zero to a pointer type produces a null pointer, and a cast from a null
pointer to an integer type produces an integer zero.

If for example sizeof (int) == sizeof (void *), and a null pointer of
type (void *) has exactly the same representation as an int with a value
of 0x10000, then the cast from int to void* might translate to an "add"
instruction which adds 0x10000, and a cast from void* to int might
translate to a "subtract" instruction which subtracts 0x10000, or both
might translate to an "exclusive or" instruction which does an
exclusive-or with a value of 0x10000.
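
The exclusive-or variant can be simulated with plain integers. This is a sketch of a hypothetical 32-bit machine, not of any real compiler; NULL_BITS and both function names are invented, and the uint32_t values stand for raw bit patterns.

```c
#include <stdint.h>

/* Hypothetical machine: the null pointer's bit pattern equals the
   integer 0x10000, and pointers are 32 bits wide. */
#define NULL_BITS UINT32_C(0x10000)

/* Bit pattern produced by a cast from integer i to a pointer type. */
uint32_t int_to_pointer_bits(uint32_t i)
{
    return i ^ NULL_BITS;
}

/* Integer produced by a cast from a pointer with bit pattern p. */
uint32_t pointer_bits_to_int(uint32_t p)
{
    return p ^ NULL_BITS;
}
```

Integer 0 maps to the null pattern 0x10000 and back again, and the two conversions undo each other for every value. As a side effect, integer 0x10000 maps to bit pattern 0, so hardware address zero stays reachable.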

One other bit where the compiler must be careful: All static and extern
pointer variables without an explicit initialisation must be initialised
to a null pointer. Some compilers just produce code that sets everything
to zeroes and then fills in the bits that are explicitly initialised; that
will not be enough if null pointers or floating-point zeroes are not
all-bits-zero.
 
 
Joe Wright
      01-29-2006
Keith Thompson wrote:
> Joe Wright <(E-Mail Removed)> writes:
> [...]
>
>>I am not a guru. The only pointer value defined by the C standard is
>>NULL.

>
>
> I'm not sure what you mean by that. NULL is a macro that expands to a
> null pointer constant; a null pointer constant yields a null pointer
> value when converted to a pointer type. So NULL is two steps removed
> from an actual pointer value. (A null pointer constant is a syntactic
> entity that occurs only in source code; a null pointer value occurs
> only during the execution of a program.)
>
> Yes, I'm being picky; it's not entirely unreasonable to use NULL as a
> shorthand for a null pointer value. But the address of any object or
> function is a pointer value.
>
>
>> It is defined variously as 0 or (void*)0.

>
>
> Basically yes. I'm going to go into pedantic mode; feel free to
> ignore the next few paragraphs.
>
> (void*)0 is not a valid definition for NULL because of C99 7.1.2p5:
>
> Any definition of an object-like macro described in this clause
> shall expand to code that is fully protected by parentheses where
> necessary, so that it groups in an arbitrary expression as if it
> were a single identifier.
>
> If you have
> #define NULL (void*)0
> then the expression
> sizeof NULL
> becomes a syntax error.
>
> On the other hand, it's not clear that ((void*)0) is a valid
> definition for NULL either. NULL is required to be a null pointer
> constant. The standard's definition of a null pointer constant is:
>
> An integer constant expression with the value 0, or such an
> expression cast to type void *
>
> 6.5.1p5 says that:
>
> A parenthesized expression is a primary expression. Its type and
> value are identical to those of the unparenthesized expression. It
> is an lvalue, a function designator, or a void expression if the
> unparenthesized expression is, respectively, an lvalue, a function
> designator, or a void expression.
>
> We cannot directly conclude from this that a parenthesized null
> pointer constant is a null pointer constant.
>
> However, just as a matter of common sense, it seems obvious that
> ((void*)0) *should* be a null pointer constant, and therefore a valid
> definition of NULL. Some implementations do define NULL this way.
> The wording of the standard should be corrected.
>
> End of pedantry (for now).
>
>
>> The zero value is
>>chosen specifically because it is within the range of all possible
>>pointer values.

>
>
> I'm not sure what this means. Pointers are not numbers; they don't
> have ranges.
>

Pointer values share some characteristics of numbers. You can add to
them, subtract from them and subtract one from another. Pointers have a
range from 0 to the maximum allowed memory address.

As the C programmer doesn't know the memory model of the target, the
natural choice for a 'pointer to nothing' would be 0 or (void*)0.

>
>> No pointer value other than NULL can be tested for
>>validity.

>
>
> Again, the address of any object or function is a pointer value. What
> do you mean by "can be tested for validity"?
>

Consider..
int *ptr;
ptr = malloc(100 * sizeof *ptr);
if (ptr == NULL) {/* do something about the failure */}
.. use ptr with careless abandon ..
free(ptr);
.. use ptr at your peril ..
The value of ptr probably hasn't changed but the call to free(ptr) has
made it indeterminate. You can't examine ptr to determine its validity.
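
One common convention for coping with this (a habit, not something the language does for you) is to overwrite the variable with a null pointer as soon as the memory is freed. A minimal sketch, where release is an invented helper name:

```c
#include <stdlib.h>

/* Frees *pp and overwrites it with a null pointer, so later code can
   test the variable. Other copies of the old pointer value are not
   affected and remain indeterminate. */
void release(int **pp)
{
    free(*pp);
    *pp = NULL;
}
```

After release(&ptr), the test ptr == NULL is meaningful again, and a second release(&ptr) is harmless because free accepts a null pointer.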

>
>>A C program can safely assume NULL as zero. If it is really not it is
>>the implementation's job to take care of it and lie to us.

>
>
> A null pointer value is a particular value of a pointer type, just as
> 0 is a particular value of an integer type and 0.0 is a particular
> value of a floating-point type. It just happens that the language
> uses a very strange way to represent a null pointer literal.
>
> It's best to think of a null pointer value as a null pointer value,
> not as "zero". The fact that 0 can be used *in source* to represent a
> run-time null pointer value is just an oddity that's hidden behind the
> NULL macro.
>
>
>>The conditional (NULL == 0) will yield 1 everywhere. Or not?

>
>
> Yes, because both will be converted to a common type. If NULL is 0,
> it's just (0 == 0), which is an integer comparison. If NULL is
> ((void*)0), it's a pointer comparison.
>


--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
 
 
Christian Bau
      01-29-2006
In article <(E-Mail Removed)>,
Keith Thompson <(E-Mail Removed)> wrote:

> (E-Mail Removed) (Gordon Burditt) writes:
> > (E-Mail Removed) writes:

> [...]
> >>- AFAIK I can't `#define NULL 0x10000' since `void* p=0;' should work
> >>just like `void* p=NULL'. Is this correct?

> >
> > You, as programmer, are not allowed to do this.
> > You, as compiler implementor, are allowed to do this.

>
> The NULL macro must expand to a null pointer constant. 0x10000 is not
> a null pointer constant as the term is defined by the standard, so
> it's not immediately obvious that
>
> #define NULL 0x10000
>
> is legal for an implementation, even if converting that value to a
> pointer always yields a null pointer value.
>
> On the other hand, C99 6.6p10 says:
>
> An implementation may accept other forms of constant expressions.


I think the C Standard defines "constant expressions" a bit before null
pointer constants. A null pointer constant is then defined as a
"constant expression" which has some additional properties, for example
either being an integer expression of value 0, or such an expression
cast to void*. 0x10000 cannot be a null pointer constant, because it
doesn't have a value of zero.

I think an implementation might for example define strlen ("") as a
constant which would have a value of zero and might therefore become a
null pointer constant (but I think there will be other restrictions in the
definition of "integer constant expression" and "null pointer constant"
that prevent it from being a null pointer constant).
 
 
Christian Bau
      01-29-2006
In article <(E-Mail Removed)>,
Joe Wright <(E-Mail Removed)> wrote:


> A C program can safely assume NULL as zero. If it is really not it is
> the implementation's job to take care of it and lie to us.


No. Saying "NULL is zero" is nonsense. NULL can either be an integer
constant with a value of 0, or such a constant cast to void*. In the
latter case it is a pointer. Saying that a pointer is zero is pure nonsense.
A pointer can point to an object, or it can point past the last byte of
an object, or it can be a null pointer which points to no object at all,
or it can be some indeterminate value, but it cannot be zero. It cannot
be pi, or e, or sqrt (2), or one, or zero, or any other number. It
cannot be green, yellow, red or blue either. These are all things that
don't make any sense for pointers.

In a comparison (p == 0), where p is a pointer, the integer constant 0
is converted to a null pointer because there is a special rule in the C
language that in this kind of situation, integer constants of value 0
are automatically converted to pointers, while any other integer
constants, for example those with a value of 1, are not converted. The
pointer p is _never_ compared with a zero. It is always compared with
another pointer value.
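
In code, the usual spellings of the test all come down to that same pointer comparison. A short sketch with invented function names:

```c
#include <stddef.h>

/* Three spellings of the same test. In each one the pointer is compared
   with a null pointer of its own type, never with an integer. */
int is_null_a(const void *p) { return p == 0; }
int is_null_b(const void *p) { return p == NULL; }
int is_null_c(const void *p) { return !p; }
```

All three give the same answer for any pointer value, including on an implementation where a null pointer is not all-bits-zero.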
 
 
Christian Bau
      01-29-2006
In article <(E-Mail Removed)>,
Jordan Abel <(E-Mail Removed)> wrote:

> #define NULL ((void *)0xFFFFFFFF), assuming that that is in fact a null
> pointer, will guarantee that.


But it is not a null pointer constant, because 0xFFFFFFFF doesn't have a
value of zero.
 
 
Vladimir S. Oka
      01-29-2006
Joe Wright wrote:

> Keith Thompson wrote:
>> Joe Wright <(E-Mail Removed)> writes:


<snipped quite a lot, hopefully not too much>

>> I'm not sure what this means. Pointers are not numbers; they don't
>> have ranges.
>>

> Pointer values share some characteristics of numbers. You can add to
> them, subtract from them and subtract one from another. Pointers have
> a range from 0 to the maximum allowed memory address.


I think associating pointers with numbers, despite the `similarities`
quoted above is not a good idea. The `addition` and `subtraction` work
in (not so) subtly different ways than expected of `ordinary` numbers
(think pointers to a structure with size of 17 bytes). Also, ranges are
not necessarily contiguous in the sense the ranges of real world
numbers are (an architecture may have no memory mapped in the byte
address range 0x1000 to 0x2000, as it's reserved for memory-mapped
I/O).
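
A quick sketch of the first point. The struct name and both functions are invented for the example, and the 17-byte size assumes the implementation adds no padding, which is why the code compares against sizeof rather than a literal 17.

```c
#include <stddef.h>

/* A structure whose size is 17 bytes on typical implementations. */
struct rec17 { char bytes[17]; };

/* Distance in bytes between p and p + 1: the size of the pointed-to
   type, not 1. */
ptrdiff_t step_in_bytes(struct rec17 *p)
{
    return (char *)(p + 1) - (char *)p;
}

/* Distance in elements between p and p + 1: always 1. */
ptrdiff_t step_in_elements(struct rec17 *p)
{
    return (p + 1) - p;
}
```

Adding 1 to the pointer moves it by sizeof(struct rec17) bytes, while subtracting the two pointers gives 1, so neither operation behaves like arithmetic on a plain address.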

>
> As the C programmer doesn't know the memory model of the target, the
> natural choice for a 'pointer to nothing' would be 0 or (void*)0.


This may be the `natural` assumption, but it suffers the same problems
as outlined above.

IMHO, it might have been better if C had gone the Pascal way and had just
NULL, and didn't allow numbers to be mixed with pointers, except as a
non-standard extension.

My tuppence, anyway...

Cheers

Vladimir


--
Heavy, adj.:
Seduced by the chocolate side of the force.

 