Re: Who owns the variable in my header file ?

 
 
Keith Thompson
Guest
Posts: n/a
 
      10-07-2012
"BartC" <(E-Mail Removed)> writes:
> "Richard Damon" <(E-Mail Removed)> wrote in message
> news:k4qm0b$jr0$(E-Mail Removed)...
>> On 10/6/12 5:30 AM, Nick Keighley wrote:
>>> As someone remarked this business with "undefined behaviour" is true
>>> of pretty much all programming languages (I'm not convinced Godel has
>>> anything to contribute to this). To some extent C stresses it more,
>>> this is partly because C runs nearly everywhere and has huge numbers
>>> of implementations.

>
>> If we removed pointers into arrays (and passing
>> arrays with unspecified bounds), then the compiler could easily add code
>> to check the subscripts to the array and trap on error conditions. If we
>> want to support pointers into arrays, then these pointers could also be
>> made "fatter" to include the bounds of the object they point to (and for
>> multidimensional arrays, the bounds for each of the larger arrays the
>> array is part of).

>
> Arrays can have any numbers of dimensions, so would be highly impractical
> for any of a thousand possible pointers into an array for each to duplicate
> its half-dozen or dozen dimensions. You would likely also need different
> pointers for each of the sub-dimensions.


C's multidimensional arrays are nothing more or less than arrays of
arrays. Whatever mechanism existed for 1D arrays would automatically
apply to all higher dimensions.

> And for an array whose dimensions are not realised until runtime, or for
> 'ragged' arrays where the bounds vary through the array, how would
> such a pointer be initialised? Other languages would tend to build the
> bounds into the arrays themselves.


The language has no built-in "ragged" arrays; they're built in user code
by allocating the rows. Whatever method allocates the rows (say,
malloc()) would have to deal with any bounds tracking; for example,
malloc() would have to return a fat pointer.
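As a rough illustration of bounds being recorded at allocation time, here is a minimal sketch in today's C of a ragged array whose rows carry their own lengths. The names fat_row, ragged_make, ragged_at and ragged_free are hypothetical, invented for this example; a real fat-pointer implementation would do this inside the compiler and malloc() rather than in user code. The sketch assumes every requested row length is nonzero.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical "fat" row: the data pointer travels with its own bound. */
typedef struct {
    int *data;
    size_t len;
} fat_row;

/* Release the first nrows rows and the row table itself. */
static void ragged_free(fat_row *rows, size_t nrows)
{
    if (!rows)
        return;
    for (size_t i = 0; i < nrows; ++i)
        free(rows[i].data);
    free(rows);
}

/* Allocate nrows rows; row i gets lens[i] zero-initialized elements.
   Each lens[i] is assumed nonzero. The bound is recorded at allocation
   time, which is the one place where it is certainly known. */
static fat_row *ragged_make(const size_t *lens, size_t nrows)
{
    assert(lens != NULL && nrows > 0);
    fat_row *rows = malloc(nrows * sizeof *rows);
    if (!rows)
        return NULL;
    for (size_t i = 0; i < nrows; ++i) {
        rows[i].data = calloc(lens[i], sizeof *rows[i].data);
        rows[i].len = lens[i];
        if (!rows[i].data) {        /* undo the partial allocation */
            ragged_free(rows, i);
            return NULL;
        }
    }
    return rows;
}

/* Checked access: NULL if either index is out of range. */
static int *ragged_at(fat_row *rows, size_t nrows, size_t i, size_t j)
{
    if (i >= nrows || j >= rows[i].len)
        return NULL;
    return &rows[i].data[j];
}
```

With row lengths {2, 5, 1}, ragged_at(rows, 3, 1, 4) yields a usable pointer while ragged_at(rows, 3, 0, 2) yields NULL, because the bound of row 0 is 2.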

> In any case, C allows pointers into all sorts of objects, including
> non-arrays, or a single element of that multi-dimensional array, or to cast
> one type of pointer into another; you wouldn't then be able to step or do
> arithmetic on such a pointer, without by-passing the bounds checking.


If C had fat pointers, all the operations you're describing would have
to maintain the "fatness".

[...]

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
 
 
 
BartC
Guest
Posts: n/a
 
      10-07-2012


"James Kuyper" <(E-Mail Removed)> wrote in message
news:k4s571$l59$(E-Mail Removed)...
> On 10/07/2012 06:40 AM, BartC wrote:
>>
>>
>> "Richard Damon" <(E-Mail Removed)> wrote in message
>> news:k4qm0b$jr0$(E-Mail Removed)...

> ...
>>> If we removed pointers into arrays (and passing
>>> arrays with unspecified bounds), then the compiler could easily add code
>>> to check the subscripts to the array and trap on error conditions. If we
>>> want to support pointers into arrays, then these pointers could also be
>>> made "fatter" to include the bounds of the object they point to (and for
>>> multidimensional arrays, the bounds for each of the larger arrays the
>>> array is part of).

>>
>> Arrays can have any numbers of dimensions, so would be highly impractical
>> for any of a thousand possible pointers into an array for each to
>> duplicate
>> its half-dozen or dozen dimensions. You would likely also need different
>> pointers for each of the sub-dimensions.

>
> None of that matters; only one range is needed at any given time - it
> can be modified whenever changing levels in the multidimensional array.
> Whenever an lvalue of array type gets converted to a pointer to its
> element type, that pointer can be given a range corresponding to the
> beginning and end of the array. It doesn't matter whether the element
> type is itself an array type - that can only come into play upon
> conversion of an lvalue of the element type to a pointer to its first
> element, at which point the same rule applies, giving the pointer a
> different range.


Do you have any syntax examples of how it might work?

>> And for an array whose dimensions are not realised until runtime, or for
>> 'ragged' arrays where the bounds vary through the array, how would
>> such a pointer be initialised?

>
> In C, ragged arrays can only be implemented by allocating each row from
> a larger memory space. If the allocation is handled by malloc(), then
> the bounds can be inserted at the time malloc() is called.


OK, the array is defined by a single fat pointer. But what would that look
like in actual code? And how do you set up a pointer to a row, or element,
in a way that includes the bounds? And what would a fat pointer actually
contain? Examples I've seen discussed here seem to be very complicated.

>> In any case, C allows pointers into all sorts of objects, including
>> non-arrays,

>
> That poses no problems - the C standard specifies that a pointer to a
> non-array object can be treated as a pointer to the first and only
> element of a 1-element array of the object's type.
>
>> ... or a single element of that multi-dimensional array,

>
> That poses no problem, either; the bounds for the pointer to the single
> element are the bounds for the array from which it was selected.


You want a pointer to a single element which rattles around the larger
array? That's consistent with C programmers demanding to do arithmetic on
their pointers!

I favour using a simple pointer to individual elements (on which you can
choose to do unchecked arithmetic), and slices to ranges of elements. Or a
slice can point to a single element too. The difference with a slice,
compared to a fat pointer that includes the array limits into which it points,
is you're only allowed to access what's represented in the slice.

In the array:

int A[10];

I can pass the slice (A+5,3) to a function, it will see a 3-element array
indexed 0..2, corresponding to A[5..7]. It's not interested in the other
elements! That's more awkward and less efficient to do with a proper 'fat'
pointer; while it can point to element [5] as (A,5,10) (ptr, offset,
length), the range A[5..7] would need to be (A+5,0,3).
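A minimal sketch of the slice idea in plain C follows. The type int_slice and the helpers slice_of, slice_at and slice_sum are hypothetical names made up for this illustration, not part of any existing library. The slice records only a start address and a length, so the callee cannot see or reach the rest of A.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice of ints: a start address and an element count.
   It records nothing about the array it was cut from. */
typedef struct {
    int *ptr;
    size_t len;
} int_slice;

/* Make a slice of len elements starting at base + offset.
   The caller is trusted to keep offset + len inside the real array. */
static int_slice slice_of(int *base, size_t offset, size_t len)
{
    assert(base != NULL);
    int_slice s = { base + offset, len };
    return s;
}

/* Checked element access: returns NULL when i is outside 0..len-1. */
static int *slice_at(int_slice s, size_t i)
{
    return i < s.len ? &s.ptr[i] : NULL;
}

/* Example consumer: sums whatever the slice exposes, nothing more. */
static int slice_sum(int_slice s)
{
    int total = 0;
    for (size_t i = 0; i < s.len; ++i)
        total += s.ptr[i];
    return total;
}
```

Passing slice_of(A, 5, 3) to slice_sum() visits exactly A[5], A[6] and A[7], indexed 0..2 inside the callee.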

--
bartc

 
 
 
 
 
BartC
Guest
Posts: n/a
 
      10-07-2012
"Keith Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "BartC" <(E-Mail Removed)> writes:
>> "Richard Damon" <(E-Mail Removed)> wrote in message
>> news:k4qm0b$jr0$(E-Mail Removed)...
>>> On 10/6/12 5:30 AM, Nick Keighley wrote:
>>>> As someone remarked this business with "undefined behaviour" is true
>>>> of pretty much all programming languages (I'm not convinced Godel has
>>>> anything to contribute to this). To some extent C stresses it more,
>>>> this is partly because C runs nearly everywhere and has huge numbers
>>>> of implementations.

>>
>>> If we removed pointers into arrays (and passing
>>> arrays with unspecified bounds), then the compiler could easily add code
>>> to check the subscripts to the array and trap on error conditions. If we
>>> want to support pointers into arrays, then these pointers could also be
>>> made "fatter" to include the bounds of the object they point to (and for
>>> multidimensional arrays, the bounds for each of the larger arrays the
>>> array is part of).

>>
>> Arrays can have any numbers of dimensions, so would be highly impractical
>> for any of a thousand possible pointers into an array for each to
>> duplicate
>> its half-dozen or dozen dimensions. You would likely also need different
>> pointers for each of the sub-dimensions.

>
> C's multidimensional arrays are nothing more or less than arrays of
> arrays. Whatever mechanism existed for 1D arrays would automatically
> apply to all higher dimensions.


It's more complicated than that. You have simple arrays like this:

int A[5][4][3];

and more dynamic ones like this:

int ***B,***C;

Which might be set up to have dimensions [7][2][4], and [6][6][6].

I can appreciate that it might not be practical to pass a pointer to any of
A, B or C to a function which expects a 3D array of ints; A is not
compatible with B and C.

Setting up a pointer into A would mean a fat pointer with details of 3
dimensions; other static arrays might have N dimensions, which is the
difficulty I was thinking of (a fat pointer would itself be a linear
array!).

With B and C, the same difficulty exists, *unless* the pointers comprising
the arrays are themselves fat pointers, each containing the dimension of
that row. (But RD suggested all dimensions were included in each pointer,
not just the current level.)

--
Bartc

 
 
James Kuyper
Guest
Posts: n/a
 
      10-07-2012
On 10/07/2012 04:26 PM, BartC wrote:
>
>
> "James Kuyper" <(E-Mail Removed)> wrote in message
> news:k4s571$l59$(E-Mail Removed)...
>> On 10/07/2012 06:40 AM, BartC wrote:
>>>
>>>
>>> "Richard Damon" <(E-Mail Removed)> wrote in message
>>> news:k4qm0b$jr0$(E-Mail Removed)...

>> ...
>>>> If we removed pointers into arrays (and passing
>>>> arrays with unspecified bounds), then the compiler could easily add code
>>>> to check the subscripts to the array and trap on error conditions. If we
>>>> want to support pointers into arrays, then these pointers could also be
>>>> made "fatter" to include the bounds of the object they point to (and for
>>>> multidimensional arrays, the bounds for each of the larger arrays the
>>>> array is part of).
>>>
>>> Arrays can have any numbers of dimensions, so would be highly impractical
>>> for any of a thousand possible pointers into an array for each to
>>> duplicate
>>> its half-dozen or dozen dimensions. You would likely also need different
>>> pointers for each of the sub-dimensions.

>>
>> None of that matters; only one range is needed at any given time - it
>> can be modified whenever changing levels in the multidimensional array.
>> Whenever an lvalue of array type gets converted to a pointer to its
>> element type, that pointer can be given a range corresponding to the
>> beginning and end of the array. It doesn't matter whether the element
>> type is itself an array type - that can only come into play upon
>> conversion of an lvalue of the element type to a pointer to its first
>> element, at which point the same rule applies, giving the pointer a
>> different range.

>
> Do you have any syntax examples of how it might work?



double multi_array[3][4][5];

double (*two_array)[4][5] = multi_array+1;
// the bounds for two_array are multi_array and multi_array+3

double (*array)[5] = multi_array[2]+2;
// The bounds for array are multi_array[2] and multi_array[3].

double *single = multi_array[0][3] + 1;
// The bounds for single are multi_array[0][3] and multi_array[0][4]


Here (and elsewhere in this message, for that matter) there may be
off-by-one errors, or pointer-type mismatches. Sorry - I can only double
check what I wrote so many times without getting blurry eyes, and this
is inherently tricky to write.

>>> And for an array whose dimensions are not realised until runtime, or for
>>> 'ragged' arrays where the bounds vary through the array, how would
>>> such a pointer be initialised?

>>
>> In C, ragged arrays can only be implemented by allocating each row from
>> a larger memory space. If the allocation is handled by malloc(), then
>> the bounds can be inserted at the time malloc() is called.

>
> OK, the array is defined by a single fat pointer. But what would that look
> like in actual code? ...


void *p = malloc(100*sizeof(int));
// the bounds of p are the current value of p, and (char*)p + 100*sizeof(int).

int *int_array = (int*)p;
// bounds of int_array are the same as for p.

int (*int_twod)[5] = (int(*)[5])p;
// the bounds for int_twod are the same as for p

int_array = int_twod[2];
// The new bounds for int_array are int_twod[2] and int_twod[3]

> ... And how do you set up a pointer to a row, or element,
> in a way that includes the bounds? ...


The whole point of the fat pointer thing is that it occurs automatically
- it is NOT under your control. However, I did suggest a possible syntax
for imposing a stricter limit:

int_array = *(int (*)[50])int_array;

This could be defined as setting the bounds for int_array to the current
value of int_array, and the current value of int_array + 50.

>... And what would a fat pointer actually
> contain? Examples I've seen discussed here seem to be very complicated.


The fat pointer would have to contain three pieces of information: the
location it currently points at, the lowest location in memory that can
be reached by pointer subtraction with defined behavior, and the highest
location in memory that can be reached by pointer addition with defined
behavior. If the maximum size of any object is much smaller than the
number of distinct memory locations that can be pointed at, it may make
sense to describe two of those memory locations as offsets from the
third, rather than as absolute locations.
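To make the three-field layout concrete, here is a minimal sketch written as an ordinary C struct. The names fat_int_ptr, fat_make, fat_add and fat_load are hypothetical, and in the scheme under discussion the compiler would generate the equivalent checks itself; what should happen on a failed check is left open here, so the helpers simply return false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer to int: the current position plus the lowest
   and highest positions reachable with defined behavior. hi is one past
   the last element. */
typedef struct {
    int *cur;
    int *lo;
    int *hi;
} fat_int_ptr;

/* Build a fat pointer covering n elements starting at base. */
static fat_int_ptr fat_make(int *base, size_t n)
{
    assert(base != NULL);
    fat_int_ptr p = { base, base, base + n };
    return p;
}

/* Checked pointer addition: succeeds only if the result stays in [lo, hi]. */
static bool fat_add(fat_int_ptr *p, ptrdiff_t delta)
{
    ptrdiff_t below = p->cur - p->lo;   /* elements available below cur */
    ptrdiff_t above = p->hi - p->cur;   /* elements available above cur */
    if (delta < -below || delta > above)
        return false;                   /* would leave the object: refuse */
    p->cur += delta;
    return true;
}

/* Checked read: the one-past-the-end position may be held, but not
   dereferenced. */
static bool fat_load(const fat_int_ptr *p, int *out)
{
    if (p->cur < p->lo || p->cur >= p->hi)
        return false;
    *out = *p->cur;
    return true;
}
```

For a 4-element array, stepping by 3 and reading succeeds, stepping once more reaches the one-past-the-end position where fat_load() refuses, and any further step is rejected by fat_add().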

>>> In any case, C allows pointers into all sorts of objects, including
>>> non-arrays,

>>
>> That poses no problems - the C standard specifies that a pointer to a
>> non-array object can be treated as a pointer to the first and only
>> element of a 1-element array of the object's type.
>>
>>> ... or a single element of that multi-dimensional array,

>>
>> That poses no problem, either; the bounds for the pointer to the single
>> element are the bounds for the array from which it was selected.

>
> You want a pointer to a single element which rattles around the larger
> array? That's consistent with C programmers demanding to do arithmetic on
> their pointers!


The whole point of the proposal, as I understand it, is to make
violation of the current rules for pointer addition detectable by the
compiler, thereby allowing what might otherwise be dangerous
consequences to be replaced with standard-defined behavior (what that
behavior would be has not yet been specified - perhaps the raising of a
signal?).

However, you could achieve the effect you're talking about by using
*(int (*)[1])(int_array+3). That would create a pointer which can only
point at int_array[3] or int_array[4], and can only be dereferenced when
pointing at int_array[3]. It's a bit clumsy, but I'm not sure it would
be a common enough need to require a more elegant syntax.
--
James Kuyper
 
 
James Kuyper
Guest
Posts: n/a
 
      10-08-2012
On 10/07/2012 04:53 PM, BartC wrote:
> "Keith Thompson" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>> "BartC" <(E-Mail Removed)> writes:

....
>>> Arrays can have any numbers of dimensions, so would be highly impractical
>>> for any of a thousand possible pointers into an array for each to
>>> duplicate
>>> its half-dozen or dozen dimensions. You would likely also need different
>>> pointers for each of the sub-dimensions.

>>
>> C's multidimensional arrays are nothing more or less than arrays of
>> arrays. Whatever mechanism existed for 1D arrays would automatically
>> apply to all higher dimensions.

>
> It's more complicated than that. You have simple arrays like this:
>
> int A[5][4][3];
>
> and more dynamic ones like this:
>
> int ***B,***C;


As far as C is concerned, those are pointers; they may end up pointing
at arrays, but the relevant boundaries are determined by the arrays that
they point at, not by these pointers themselves.

> Which might be set up to have dimensions [7][2][4], and [6][6][6].


As pointers, not arrays, they can't have any of those dimensions.

> I can appreciate that it might not be practical to pass a pointer to any of
> A, B or C to a function which expects a 3D array of ints; A is not
> compatible with B and C.
>
> Setting up a pointer into A would mean a fat pointer with details of 3
> dimensions; other static arrays might have N dimensions, which is the
> difficulty I was thinking of (a fat pointer would itself be a linear
> array!).
> With B and C, the same difficulty exists, *unless* the pointers comprising
> the arrays are themselves fat pointers, each containing the dimension of
> that row. (But RD suggested all dimensions were included in each pointer,
> not just the current level.)


I hadn't noticed that. That's unnecessary, and a mistake on his part, I
think.
--
James Kuyper
 
 
BartC
Guest
Posts: n/a
 
      10-08-2012
"James Kuyper" <(E-Mail Removed)> wrote in message
news:k4t5ab$tle$(E-Mail Removed)...
> On 10/07/2012 04:53 PM, BartC wrote:


>> It's more complicated than that. You have simple arrays like this:
>>
>> int A[5][4][3];
>>
>> and more dynamic ones like this:
>>
>> int ***B,***C;

>
> As far as C is concerned, those are pointers; they may end up pointing
> at arrays, but the relevant boundaries are determined by the arrays that
> they point at, not by these pointers themselves.
>
>> Which might be set up to have dimensions [7][2][4], and [6][6][6].

>
> As pointers, not arrays, they can't have any of those dimensions.


How else would you create a 3D array from dimensions known at runtime?

Using a hierarchy of pointers, with a *** one at the top, is the only way I
know in C, even if fiddly (see below). The dimensions (a 3-element array)
are not associated with the pointers, true, but that is what we're talking
about.

/* mallocs not fully checked in this code... */
#include <stdio.h>
#include <stdlib.h>

typedef int T;

/* One row of dim elements. */
T* make1darrayT(int dim) {
    return malloc(dim*sizeof(T));
}

/* dims[0] row pointers, each pointing at dims[1] elements. */
T** make2darrayT(int *dims) {
    T** p;
    int i;
    p=malloc(dims[0]*sizeof(T*));
    if (p)
        for (i=0; i<dims[0]; ++i)
            p[i]=make1darrayT(dims[1]);
    return p;
}

/* dims[0] plane pointers, each a 2D array of dims[1] x dims[2]. */
T*** make3darrayT(int *dims) {
    T*** p;
    int i;
    p=malloc(dims[0]*sizeof(T**));
    if (p)
        for (i=0; i<dims[0]; ++i)
            p[i]=make2darrayT(dims+1);
    return p;
}

void printarrayT(T ***A,int *dims) {
    int i,j,k;
    for (i=0; i<dims[0]; ++i)
        for (j=0; j<dims[1]; ++j)
            for (k=0; k<dims[2]; ++k)
                printf("A[%d][%d][%d] = %d\n",i,j,k,A[i][j][k]);
}

int main (void){
    int adims[]={7,2,4};
    T ***A;
    int i,j,k;

    A=make3darrayT(adims);
    if (!A) exit(EXIT_FAILURE);

    for (i=0; i<adims[0]; ++i)
        for (j=0; j<adims[1]; ++j)
            for (k=0; k<adims[2]; ++k)
                A[i][j][k]=i*10000+j*100+k;

    printarrayT(A,adims);
    return 0;
}

--
Bartc

 
 
Richard Damon
Guest
Posts: n/a
 
      10-08-2012
On 10/7/12 6:40 AM, BartC wrote:
>
>
> "Richard Damon" <(E-Mail Removed)> wrote in message
> news:k4qm0b$jr0$(E-Mail Removed)...
>> On 10/6/12 5:30 AM, Nick Keighley wrote:
>>
>>> As someone remarked this business with "undefined behaviour" is true
>>> of pretty much all programming languages (I'm not convinced Godel has
>>> anything to contribute to this). To some extent C stresses it more,
>>> this is partly because C runs nearly everywhere and has huge numbers
>>> of implementations.

>
>> If we removed pointers into arrays (and passing
>> arrays with unspecified bounds), then the compiler could easily add code
>> to check the subscripts to the array and trap on error conditions. If we
>> want to support pointers into arrays, then these pointers could also be
>> made "fatter" to include the bounds of the object they point to (and for
>> multidimensional arrays, the bounds for each of the larger arrays the
>> array is part of).

>
> Arrays can have any numbers of dimensions, so would be highly impractical
> for any of a thousand possible pointers into an array for each to duplicate
> its half-dozen or dozen dimensions. You would likely also need
> different pointers for each of the sub-dimensions.
>
> And for an array whose dimensions are not realised until runtime, or for
> 'ragged' arrays where the bounds vary through the array, how would
> such a pointer be initialised? Other languages would tend to build the
> bounds into the arrays themselves.
>
> In any case, C allows pointers into all sorts of objects, including
> non-arrays, or a single element of that multi-dimensional array, or to
> cast one type of pointer into another; you wouldn't then be able to step
> or do arithmetic on such a pointer, without by-passing the bounds checking.
>
> So 'undefined behaviour', if it's as simple as having the wrong value in a
> pointer, is built-in to the language!
>
> (For single-dimensional arrays, a 'fat' pointer containing exactly one
> bound, could work, provided they are a new explicit type in addition to
> regular pointers. Then an array allocator could return such a pointer,
> which
> can be passed to functions and would carry its length for use by programs,
> and could optionally be used for bounds checking by internal code. But for
> multi-dimensions, it gets complicated...)
>
>> This adds significant overhead to the pointer and the
>> operations.

>
> Not if the alternative is to have to always pass the length of the array
> together with a pointer to the array. Having bounds-checking code inserted
> would be an extra overhead, but that can be optional.
>


If we have an array: int foo[5][6];
there are 3 types that might be used with this.

int (*p1)[5][6] = &foo;
which is a pointer to the full array. Such a pointer could also be used
to point into int bar[4][5][6];

There is also int (*p2)[6] = &foo[n]; which points to one row of the
array, and

int *p3 = &foo[n][m]; which points to an element of foo;

Note that for p1 set equal to &foo, p1-- and p1[1] are invalid
operations, and p1++ or p1+1 can only produce a one-past-the-end pointer
that must not be dereferenced, as p1 doesn't point into an array, but a
single object (that happens to be an array).

We have the ability to set p2 = &(*p1)[1] to move down the dimensions,
and if the pointers are fat in that they store the base and limits, all
that information can be derived from the value in p1. What is
problematic is converting a p2 to a p1. This can only be done via a cast
in C (or via void*'s implicit conversions), and this type of pointer
cast is inherently very prone to "undefined behavior", so a language
trying to be C-like while avoiding or limiting undefined behavior might
just prohibit this action, or warn on the construct and allow undefined
behavior thereafter. Note that even in C this sort of upcasting of an
element to the array is fairly rare.

void* pointers need to be limited if not eliminated. The void pointer
needs to either be a source of undefined behavior, or it needs to
remember the bounds of the object it pointed into (and maybe of the full
object it is pointing into), making its type punning ability much less
useful (as it becomes disallowed).

It is possible to add checking for this sort of thing, if you generate
for every array (single or multidimensional) a table describing it, and
include in the pointer information on where in the array you are
pointing. Then the cast operation could look into this data and see if
the upcast is valid and, if so, determine the new bounds.

For run time generated arrays, malloc and family would need to be
smarter, and somehow be passed the type of the array rather than just a
byte count (it would become a bit more like the C++ new operator), so
that it could return the appropriate fat pointer, since that is the only
kind of pointer that can point into an array.

For a pointer to a single object as opposed to into an array, one
solution would be to consider such an object as part of a 1 dimensional
array. If you are using the option of pointer up casting and needing
description blocks for arrays, you can either create a block of all such
objects that have their address taken, or define a special case that a
value of 0 for the pointer to the array block is a signal that the
pointer isn't into an array (and thus upcasting, or manipulating the
address are illegal).

The comment on overhead is comparing to C. A C pointer is typically 1
register big (or at least 1 address register big), and tends to be able
to be manipulated with simple direct machine instructions. The fat
pointer needed to avoid undefined behavior needs to be bigger, as it
must store extra information. The operations on it will typically not be
a simple direct machine instruction (unless the machine is one designed
for this sort of purpose, where address manipulations include bounds
testing) so is slower.

C code attempting to duplicate this bounds checking would be slower than
C code without all this checking, but possibly faster than the language
using fat pointers. One reason is that the programmer can possibly know
more about what pointers point into, and perhaps come up with better
tests to "prove" that the accesses are valid. In normal C code, there
are a lot of accesses that the programmer can "prove" to himself are
safe from other guarantees, which the compiler might not be able to
prove for itself (like knowing that all strings do have a null
terminator, and thus that a char scanning loop is safe). Of course the
danger is that the programmer also might make a mistake.




 
 
James Kuyper
Guest
Posts: n/a
 
      10-08-2012
On 10/07/2012 09:14 PM, BartC wrote:
> "James Kuyper" <(E-Mail Removed)> wrote in message
> news:k4t5ab$tle$(E-Mail Removed)...
>> On 10/07/2012 04:53 PM, BartC wrote:

>
>>> It's more complicated than that. You have simple arrays like this:
>>>
>>> int A[5][4][3];
>>>
>>> and more dynamic ones like this:
>>>
>>> int ***B,***C;

>>
>> As far as C is concerned, those are pointers; they may end up pointing
>> at arrays, but the relevant boundaries are determined by the arrays that
>> they point at, not by these pointers themselves.
>>
>>> Which might be set up to have dimensions [7][2][4], and [6][6][6].

>>
>> As pointers, not arrays, they can't have any of those dimensions.

>
> How else would you create a 3D array from dimensions known at runtime?


Using malloc(), calloc(), or VLAs. Pointers returned by malloc() or
calloc(), or derived from a VLA, would have a range set based upon the
size passed to malloc(), the sizes passed to calloc(), or the dimensions
of the VLA, respectively. For this purpose, realloc() is treated the
same as malloc().
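For reference, here is a minimal sketch of the malloc()-plus-VLA-type route in standard C, which gives a contiguous 3D array with runtime dimensions and no hierarchy of pointers. It assumes a compiler that supports variably modified types (C99, optional in C11). The function names alloc3d and fill3d are hypothetical, and overflow of the size computation is not checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a d0 x d1 x d2 array of int as one contiguous block.
   Returns NULL on failure; release with free(). The multiplication is
   not checked for overflow in this sketch. */
static void *alloc3d(size_t d0, size_t d1, size_t d2)
{
    assert(d0 > 0 && d1 > 0 && d2 > 0);
    return malloc(d0 * sizeof(int[d1][d2]));
}

/* Fill a[i][j][k] = i*10000 + j*100 + k. The parameter a is adjusted to
   int (*)[d1][d2], so ordinary triple subscripting works. */
static void fill3d(size_t d0, size_t d1, size_t d2, int a[d0][d1][d2])
{
    for (size_t i = 0; i < d0; ++i)
        for (size_t j = 0; j < d1; ++j)
            for (size_t k = 0; k < d2; ++k)
                a[i][j][k] = (int)(i * 10000 + j * 100 + k);
}
```

A caller would write int (*a)[d1][d2] = alloc3d(d0, d1, d2); and then index a[i][j][k] directly. Under the fat-pointer proposal, the range attached to a would presumably come from the single size passed to malloc().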
--
James Kuyper
 
 
Keith Thompson
Guest
Posts: n/a
 
      10-08-2012
"BartC" <(E-Mail Removed)> writes:
> "Keith Thompson" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>> "BartC" <(E-Mail Removed)> writes:
>>> "Richard Damon" <(E-Mail Removed)> wrote in message
>>> news:k4qm0b$jr0$(E-Mail Removed)...
>>>> On 10/6/12 5:30 AM, Nick Keighley wrote:
>>>>> As someone remarked this business with "undefined behaviour" is true
>>>>> of pretty much all programming languages (I'm not convinced Godel has
>>>>> anything to contribute to this). To some extent C stresses it more,
>>>>> this is partly because C runs nearly everywhere and has huge numbers
>>>>> of implementations.
>>>
>>>> If we removed pointers into arrays (and passing
>>>> arrays with unspecified bounds), then the compiler could easily add code
>>>> to check the subscripts to the array and trap on error conditions. If we
>>>> want to support pointers into arrays, then these pointers could also be
>>>> made "fatter" to include the bounds of the object they point to (and for
>>>> multidimensional arrays, the bounds for each of the larger arrays the
>>>> array is part of).
>>>
>>> Arrays can have any numbers of dimensions, so would be highly impractical
>>> for any of a thousand possible pointers into an array for each to
>>> duplicate
>>> its half-dozen or dozen dimensions. You would likely also need different
>>> pointers for each of the sub-dimensions.

>>
>> C's multidimensional arrays are nothing more or less than arrays of
>> arrays. Whatever mechanism existed for 1D arrays would automatically
>> apply to all higher dimensions.

>
> It's more complicated than that.


No, on the C language level it really is exactly that simple.

> You have simple arrays like this:
>
> int A[5][4][3];


Right.

> and more dynamic ones like this:
>
> int ***B,***C;


Those aren't arrays, they're pointers.

> Which might be set up to have dimensions [7][2][4], and [6][6][6].


You can use multi-level pointers like that to create data structures
that behave like dynamic multi-dimensional arrays. To do so, you
have to explicitly allocate memory for each row, and for each row
of pointers to rows, and so on. Each allocation (presumably a call
to malloc()) would have to create a properly initialized fat pointer.

[snip]

The current rules of the language permit an implementation to
make all pointers fat, with information propagating through object
definitions, allocations, assignments, and so forth, so that bounds
checks can be made to fail in circumstances where the behavior
is undefined.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
BartC
Guest
Posts: n/a
 
      10-08-2012
"Keith Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "BartC" <(E-Mail Removed)> writes:


>> and more dynamic ones like this:
>>
>> int ***B,***C;

>
> Those aren't arrays, they're pointers.
>
>> Which might be set up to have dimensions [7][2][4], and [6][6][6].

>
> You can use multi-level pointers like that to create data structures
> that behave like dynamic multi-dimensional arrays. To do so, you
> have to explicitly allocate memory for each row, and for each row
> of pointers to rows, and so on. Each allocation (presumably a call
> to malloc()) would have to create a properly initialized fat pointer.


Which according to:

"James Kuyper" <(E-Mail Removed)> wrote in message
news:k4t503$s2b$(E-Mail Removed)...

> The fat pointer would have to contain three pieces of information: the
> location it currently points at, the lowest location in memory that can
> be reached by pointer subtraction with defined behavior, and the highest
> location in memory that can reached by pointer addition with defined
> behavior.


might look like this:

int *p=malloc(100*sizeof(int));

p might contain (A, A, A+400), since malloc() knows nothing about what kind
of objects are to be stored in the block. (A is the address of some heap
memory, and A+400 is the highest value p can have, but cannot be
dereferenced). (All offsets in fat pointers are char offsets!)

While (p+5) might be (A+20,A,A+400). And in:

int B[10];
int (*q)[10]=&B;

q might contain (B,B,B+40). A lone 'B' term would decay to (B,B,B+40) as
well, as would a '&B' term (but with different types).

In the case of my int[7][2][4] dynamic array, the top level pointer could
be (C,C,C+84), and each of the next tier could be (D,D,D+24), located at
(C+i*12,C,C+84) (with pointers being 12 bytes). A pointer to any actual
element would have to be (X+offset,X,X+16).

A pointer to any location in my int[5][4][3] static array would be
(E+offset,E,E+240), slightly different (the pointer could be stepped
anywhere in the entire array, not just in that row).

And a pointer to an isolated int value would be just (F,F,F+4).
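The arithmetic behind these triples can be sketched as a small helper. The names byte_triple and triple_for are hypothetical, and the triple is expressed as byte offsets from the start of the object rather than as absolute addresses. The figures in the text assume 4-byte ints and 12-byte fat pointers, so the element size is passed in explicitly instead of being taken from sizeof.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical (cur, lo, hi) triple, as byte offsets from the start of
   the object, following the "all offsets are char offsets" convention. */
typedef struct {
    size_t cur;
    size_t lo;
    size_t hi;   /* one past the last byte: may be held, not dereferenced */
} byte_triple;

/* Triple for a pointer to element index of an array of count elements,
   each elem_size bytes. index == count gives the one-past-the-end pointer. */
static byte_triple triple_for(size_t elem_size, size_t count, size_t index)
{
    assert(index <= count);
    byte_triple t = { index * elem_size, 0, count * elem_size };
    return t;
}
```

With these assumptions, triple_for(4, 100, 5) gives offsets (20, 0, 400), matching (A+20,A,A+400), and triple_for(12, 7, 0) gives an upper bound of 84 for the top-level block of seven fat pointers.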

> The current rules of the language permit an implementation to
> make all pointers fat, with information propagating through object
> definitions, allocations, assignments, and so forth, so that bounds
> checks can be made to fail in circumstances where the behavior
> is undefined.


OK, so it's just about workable. But I can see some issues:

o It seems it needs to be all or nothing; *all* pointers in an
implementation must be fat, including those you have no intention of doing
any arithmetic on.

o It doesn't really address the issue of array bounds: it only stops a
pointer from wandering outside an allocated block, but which contains
multiple arrays or array rows. Not even in the static [5][4][3] array. Only
in static 1D arrays are the bounds protected. So, under the scheme I
outlined above, they can't be used for subscript checking as proposed by
Richard Damon.

o In fact they don't seem designed for programmer use at all, only for
internal protection, which means:

o Can't be used to extract the length of an array (pointer) passed to a
function
o Can't be used to construct a slice or sub-range of a larger array or
block

o Doubtless there are miscellaneous language issues to be sorted (converting
to and from an int for example; bounds info will be lost)

--
Bartc

 