Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Limitations and workarounds to expressions defining size of an array

Reply
Thread Tools

Limitations and workarounds to expressions defining size of an array

 
 
Francois Grieu
Guest
Posts: n/a
 
      09-05-2012
On 04/09/2012 20:08, James Kuyper wrote:
> On 09/04/2012 01:14 PM, Francois Grieu wrote:
>> I'm trying to figure out the exact limitations that (possibly:
>> various versions of ISO/IEC 9899) C puts to expressions defining
>> the size of a plain array at compile time, and how to workaround
>> that.

>
>
> In C99:
> 6.7.5.2p1: "... the expression shall have an integer type. If the
> expression is a constant expression, it shall have a value greater than
> zero."
> 6.7.5.2p2: "... If an identifier is declared to be an object with static
> storage duration, it shall not have a variable length array type."
> 6.7.5.2p4: "... If the size is an integer constant expression and the
> element type has a known constant size, the array type is not a variable
> length array type; otherwise, the array type is a variable length array
> type."
> 6.7.5.2p5: "If the size is an expression that is not an integer constant
> expression expression: if it occurs in a declaration at function
> prototype scope, it is treated as if it were replaced by *; otherwise,
> each time it is evaluated it shall have a value greater than zero."


Yes; also relevant is 6.6p6
An integer constant expression (.. used to specify .. the size of an
array ..) shall have integer type and shall only have operands
that are integer constants, enumeration constants, character
constants, sizeof expressions whose results are integer constants,
and floating constants that are the immediate operands of casts.
Cast operators in an integer constant expression shall only
convert arithmetic types to integer types, except as part of an
operand to the sizeof operator.

I read that as C99 NOT allowing (except in automatic context, as a
VLA) even basic use of floating point as in

struct foo * bar[(unsigned)(1000*2.71];


>> I hoped at least to use intermediate values (other than
>> macro identifiers) to clarify things and remain below
>> the "4095 characters in a logical source line" limit


> Whether or not that's permitted depends upon precisely what you mean
> by that phrase - it's not clear from context.


I was hoping for (not tested)

// select temporary type
#include <limits.h>
#ifdef ULLONG_MAX
#define UTMP static const unsigned long long
#else
#define UTMP static const unsigned long
#endif

#define EXPECTED 15000
UTMP kExpected = EXPECTED;
UTMP kRoot2 = 12*12*Expected;
// we want ceil(sqrt(12*12*Expected)), use Newton/Rapson
UTMP kExTmp0 = (uwl)3<<(kRoot2>>30 > 0 ? 15 :
kRoot2>>24 > 0 ? 12 :
kRoot2>>18 > 0 ? 9 :
kRoot2>>12 > 0 ? 6 : 3);
UTMP kExTmp1 = (kExTmp0+kRoot2/kExTmp0)/2;
UTMP kExTmp2 = (kExTmp1+kRoot2/kExTmp1)/2;
UTMP kExTmp3 = (kExTmp2+kRoot2/kExTmp2)/2;
UTMP kExTmp4 = (kExTmp3+kRoot2/kExTmp3)/2;
UTMP kExTmp5 = (kExTmp4+kRoot2/kExTmp4)/2;
UTMP kExTmp6 = (kExTmp5+kRoot2/kExTmp5)/2;
UTMP kExTmp7 = (kExTmp6+kRoot2/kExTmp6)/2;
UTMP kExTmp8 = (kExTmp7+kRoot2/kExTmp7)/2;
UTMP kExTmp9 = (kExTmp8+kRoot2/kExTmp/2;
UTMP kExTmpA = (kExTmp9+kRoot2/kExTmp9)/2;
UTMP kExTmpB = (kExTmpA+kRoot2/kExTmpA)/2;
UTMP kExTmpC = (kExTmpB+kRoot2/kExTmpB)/2;
UTMP kRoot = kExTmpC+(kExTmpB>kExTmpC?1:0);

struct foo * bar[kExpected+kRoot];

Is that valid in C11?

Francois Grieu
 
Reply With Quote
 
 
 
 
Francois Grieu
Guest
Posts: n/a
 
      09-05-2012
On 04/09/2012 20:08, James Kuyper wrote:
> On 09/04/2012 01:14 PM, Francois Grieu wrote:
>> I'm trying to figure out the exact limitations that (possibly:
>> various versions of ISO/IEC 9899) C puts to expressions defining
>> the size of a plain array at compile time, and how to workaround
>> that.

>
>
> In C99:
> 6.7.5.2p1: "... the expression shall have an integer type. If the
> expression is a constant expression, it shall have a value greater than
> zero."
> 6.7.5.2p2: "... If an identifier is declared to be an object with static
> storage duration, it shall not have a variable length array type."
> 6.7.5.2p4: "... If the size is an integer constant expression and the
> element type has a known constant size, the array type is not a variable
> length array type; otherwise, the array type is a variable length array
> type."
> 6.7.5.2p5: "If the size is an expression that is not an integer constant
> expression expression: if it occurs in a declaration at function
> prototype scope, it is treated as if it were replaced by *; otherwise,
> each time it is evaluated it shall have a value greater than zero."


Yes; also relevant is 6.6p6
An integer constant expression (.. used to specify .. the size of an
array ..) shall have integer type and shall only have operands
that are integer constants, enumeration constants, character
constants, sizeof expressions whose results are integer constants,
and floating constants that are the immediate operands of casts.
Cast operators in an integer constant expression shall only
convert arithmetic types to integer types, except as part of an
operand to the sizeof operator.

I read that as C99 NOT allowing (except in automatic context, as a
VLA) even basic use of floating point as in

struct foo * bar[(unsigned)(1000*2.71];


>> I hoped at least to use intermediate values (other than
>> macro identifiers) to clarify things and remain below
>> the "4095 characters in a logical source line" limit


> Whether or not that's permitted depends upon precisely what you mean
> by that phrase - it's not clear from context.


I was hoping for
// select temporary type
#include <limits.h>
#ifdef ULLONG_MAX
typedef unsigned long long uw;
#else
typedef unsigned long uw;
#endif
#define UTMP static const uw

#define EXPECTED 15000
UTMP kExpected = EXPECTED;
UTMP kRoot2 = 12*12*kExpected;
// we want ceil(sqrt(12*12*kExpected)), use Newton/Rapson
UTMP kExTmp0 = (uw)3<<(kRoot2>>30 > 0 ? 15 :
kRoot2>>24 > 0 ? 12 :
kRoot2>>18 > 0 ? 9 :
kRoot2>>12 > 0 ? 6 : 3);
UTMP kExTmp1 = (kExTmp0+kRoot2/kExTmp0)/2;
UTMP kExTmp2 = (kExTmp1+kRoot2/kExTmp1)/2;
UTMP kExTmp3 = (kExTmp2+kRoot2/kExTmp2)/2;
UTMP kExTmp4 = (kExTmp3+kRoot2/kExTmp3)/2;
UTMP kExTmp5 = (kExTmp4+kRoot2/kExTmp4)/2;
UTMP kExTmp6 = (kExTmp5+kRoot2/kExTmp5)/2;
UTMP kExTmp7 = (kExTmp6+kRoot2/kExTmp6)/2;
UTMP kExTmp8 = (kExTmp7+kRoot2/kExTmp7)/2;
UTMP kExTmp9 = (kExTmp8+kRoot2/kExTmp/2;
UTMP kExTmpA = (kExTmp9+kRoot2/kExTmp9)/2;
UTMP kExTmpB = (kExTmpA+kRoot2/kExTmpA)/2;
UTMP kExTmpC = (kExTmpB+kRoot2/kExTmpB)/2;
UTMP kRoot = kExTmpC+(kExTmpB>kExTmpC?1:0);

struct foo * bar[kExpected+kRoot];


This is not valid C99 (the definition of kRoot2 is invalid);
is that valid in C11? <ot>Note: it seem valid in C++</ot>.

Francois Grieu
 
Reply With Quote
 
 
 
 
Francois Grieu
Guest
Posts: n/a
 
      09-05-2012
On 05/09/2012 02:58, Eric Sosman wrote:
> The Standard draws a line and says "All C compilers accept
> everything on this side of the line as constants." But the line
> is permeable: "An implementation may accept other forms of constant
> expressions" (6.6p10). The upshot:
>
> 1) A C compiler must accept `static int x[2+1];', because
> `2+1' is on the hither side of the line,
>
> 2) A C compiler may reject `static int x[strlen("abc")];'
> because `strlen("abc")' is on the thither side, but
>
> 3) A C compiler may accept `static int x[strlen("abc")];'
> under the "other forms" allowance, and need not even
> issue a diagnostic.
>
> Cases (2) and (3) illustrate why "It works with my compiler"
> does not define the language.


Well said.

Unless I err, we have a portable idiom equivalent to that:

static int x[sizeof("abc")-1];


Francois Grieu
 
Reply With Quote
 
James Kuyper
Guest
Posts: n/a
 
      09-05-2012
On 09/05/2012 03:09 AM, Francois Grieu wrote:
> On 04/09/2012 20:08, James Kuyper wrote:
>> On 09/04/2012 01:14 PM, Francois Grieu wrote:

....
> Yes; also relevant is 6.6p6
> An integer constant expression (.. used to specify .. the size of an
> array ..) shall have integer type and shall only have operands
> that are integer constants, enumeration constants, character
> constants, sizeof expressions whose results are integer constants,
> and floating constants that are the immediate operands of casts.
> Cast operators in an integer constant expression shall only
> convert arithmetic types to integer types, except as part of an
> operand to the sizeof operator.
>
> I read that as C99 NOT allowing (except in automatic context, as a
> VLA) even basic use of floating point as in
>
> struct foo * bar[(unsigned)(1000*2.71];


That is correct. Which is why, if you want to do this, it will have to
be an array with automatic storage duration. Does it need to have static
storage duration?

....
> I was hoping for (not tested)
>
> // select temporary type
> #include <limits.h>
> #ifdef ULLONG_MAX
> #define UTMP static const unsigned long long
> #else
> #define UTMP static const unsigned long
> #endif


I'd recommend using a typedef rather than #define for that purpose. If
you do use a typedef, you should also change the name, since
fully-capitalized names are, by convention, used mainly for macros.

> #define EXPECTED 15000
> UTMP kExpected = EXPECTED;
> UTMP kRoot2 = 12*12*Expected;
> // we want ceil(sqrt(12*12*Expected)), use Newton/Rapson
> UTMP kExTmp0 = (uwl)3<<(kRoot2>>30 > 0 ? 15 :
> kRoot2>>24 > 0 ? 12 :
> kRoot2>>18 > 0 ? 9 :
> kRoot2>>12 > 0 ? 6 : 3);
> UTMP kExTmp1 = (kExTmp0+kRoot2/kExTmp0)/2;
> UTMP kExTmp2 = (kExTmp1+kRoot2/kExTmp1)/2;
> UTMP kExTmp3 = (kExTmp2+kRoot2/kExTmp2)/2;
> UTMP kExTmp4 = (kExTmp3+kRoot2/kExTmp3)/2;
> UTMP kExTmp5 = (kExTmp4+kRoot2/kExTmp4)/2;
> UTMP kExTmp6 = (kExTmp5+kRoot2/kExTmp5)/2;
> UTMP kExTmp7 = (kExTmp6+kRoot2/kExTmp6)/2;
> UTMP kExTmp8 = (kExTmp7+kRoot2/kExTmp7)/2;
> UTMP kExTmp9 = (kExTmp8+kRoot2/kExTmp/2;
> UTMP kExTmpA = (kExTmp9+kRoot2/kExTmp9)/2;
> UTMP kExTmpB = (kExTmpA+kRoot2/kExTmpA)/2;
> UTMP kExTmpC = (kExTmpB+kRoot2/kExTmpB)/2;
> UTMP kRoot = kExTmpC+(kExTmpB>kExTmpC?1:0);
>
> struct foo * bar[kExpected+kRoot];
>
> Is that valid in C11?


I don't see any problems. Did you have any particular reason for
doubting that it was valid? If so, you might be right - I'm far less
reliable when I say "I see no problems" than when I say "I see a problem".
--
James Kuyper
 
Reply With Quote
 
James Kuyper
Guest
Posts: n/a
 
      09-05-2012
On 09/05/2012 03:56 AM, Francois Grieu wrote:
....
> I was hoping for
> // select temporary type
> #include <limits.h>
> #ifdef ULLONG_MAX
> typedef unsigned long long uw;
> #else
> typedef unsigned long uw;
> #endif
> #define UTMP static const uw
>
> #define EXPECTED 15000
> UTMP kExpected = EXPECTED;
> UTMP kRoot2 = 12*12*kExpected;
> // we want ceil(sqrt(12*12*kExpected)), use Newton/Rapson
> UTMP kExTmp0 = (uw)3<<(kRoot2>>30 > 0 ? 15 :
> kRoot2>>24 > 0 ? 12 :
> kRoot2>>18 > 0 ? 9 :
> kRoot2>>12 > 0 ? 6 : 3);
> UTMP kExTmp1 = (kExTmp0+kRoot2/kExTmp0)/2;
> UTMP kExTmp2 = (kExTmp1+kRoot2/kExTmp1)/2;
> UTMP kExTmp3 = (kExTmp2+kRoot2/kExTmp2)/2;
> UTMP kExTmp4 = (kExTmp3+kRoot2/kExTmp3)/2;
> UTMP kExTmp5 = (kExTmp4+kRoot2/kExTmp4)/2;
> UTMP kExTmp6 = (kExTmp5+kRoot2/kExTmp5)/2;
> UTMP kExTmp7 = (kExTmp6+kRoot2/kExTmp6)/2;
> UTMP kExTmp8 = (kExTmp7+kRoot2/kExTmp7)/2;
> UTMP kExTmp9 = (kExTmp8+kRoot2/kExTmp/2;
> UTMP kExTmpA = (kExTmp9+kRoot2/kExTmp9)/2;
> UTMP kExTmpB = (kExTmpA+kRoot2/kExTmpA)/2;
> UTMP kExTmpC = (kExTmpB+kRoot2/kExTmpB)/2;
> UTMP kRoot = kExTmpC+(kExTmpB>kExTmpC?1:0);
>
> struct foo * bar[kExpected+kRoot];
>
>
> This is not valid C99 (the definition of kRoot2 is invalid);
> is that valid in C11? <ot>Note: it seem valid in C++</ot>.


Ah - I see the point you're raising now. Sorry for missing it in your
previous message - I didn't notice your use of 'static' in the
#definition of UTMP. Initializers for an object of static storage
duration must be constant expressions. In C++, a const variable
initialized with a constant expression can itself be used as a constant
expression, but in C it is not allowed. You'd have to re-write this as a
single constant expression, and you'll have to be careful to avoid
running into line length limitations if you do so. I haven't checked to
be sure, but you might have settle for fewer iterations.

If you remove the 'static' and define these inside a function, there
should be no problem except with the definition of bar itself, which
requires VLA support. Of course, if you can use a VLA, you wouldn't need
to use such a round-about method to do this, you could simply calculate
the sqrt() directly.
--
James Kuyper
 
Reply With Quote
 
army1987
Guest
Posts: n/a
 
      09-05-2012
On Tue, 04 Sep 2012 17:41:25 -0400, James Kuyper wrote:

> On 09/04/2012 05:31 PM, army1987 wrote:
>> On Tue, 04 Sep 2012 21:26:29 +0000, army1987 wrote:
>>
>>> On Tue, 04 Sep 2012 17:07:39 -0400, Eric Sosman wrote:
>>>
>>>> Don't confuse definition with optimization. Also, try your
>>>> compilers on
>>>>
>>>> strcspn("BartC", "abcdefghijklmnopqrsBCuvwxyz")
>>>>
>>>> ... and let us know if they replace it with `3'.
>>>
>>> You meant strspn?

>>
>> BTW, I'm shocked by the result. It optimizes away the call to strspn
>> but it actually calls printf with arguments "%d\n" and 3? WTF?

>
> The only printf() calls explicitly mentioned in this thread so far were
> in my example program, which used "%zu", not "%d", so you're presumably
> referring to some other program. What does the source code look like?


#include <stdio.h>
int main(void)
{
printf("%d\n", (int)strspn("BartC", "abcdefghijklmnopqrsBCuvwxyz"));
return 0;
}



--
[ T H I S S P A C E I S F O R R E N T ]
Troppo poca cultura ci rende ignoranti, troppa ci rende folli.
-- fathermckenzie di it.cultura.linguistica.italiano
<http://xkcd.com/397/>
 
Reply With Quote
 
James Kuyper
Guest
Posts: n/a
 
      09-05-2012
On 09/05/2012 04:45 PM, army1987 wrote:
> On Tue, 04 Sep 2012 17:41:25 -0400, James Kuyper wrote:
>
>> On 09/04/2012 05:31 PM, army1987 wrote:

....
>>> BTW, I'm shocked by the result. It optimizes away the call to strspn
>>> but it actually calls printf with arguments "%d\n" and 3? WTF?

>>
>> The only printf() calls explicitly mentioned in this thread so far were
>> in my example program, which used "%zu", not "%d", so you're presumably
>> referring to some other program. What does the source code look like?

>
> #include <stdio.h>
> int main(void)
> {
> printf("%d\n", (int)strspn("BartC", "abcdefghijklmnopqrsBCuvwxyz"));
> return 0;
> }


OK, that's roughly what I expected, but I wanted to be sure. In
particular, your "WTF?" left me wondering whether you thought the
optimization was invalid. I'm mildly impressed by that optimization, but
not nearly as much as you were, apparently.
 
Reply With Quote
 
army1987
Guest
Posts: n/a
 
      09-05-2012
On Wed, 05 Sep 2012 17:23:04 -0400, James Kuyper wrote:

> In
> particular, your "WTF?" left me wondering whether you thought the
> optimization was invalid.


I don't. But 1) I was surprised that strspn() was optimized away, and 2)
I seemed to remember seeing a compiler optimizing away printf() some time
ago, so I expected that if a compiler would do the former it would do the
latter too.



--
[ T H I S S P A C E I S F O R R E N T ]
Troppo poca cultura ci rende ignoranti, troppa ci rende folli.
-- fathermckenzie di it.cultura.linguistica.italiano
<http://xkcd.com/397/>
 
Reply With Quote
 
Francois Grieu
Guest
Posts: n/a
 
      09-05-2012
On 05/09/2012 13:33, James Kuyper wrote:
> (..) Initializers for an object of static storage
> duration must be constant expressions. In C++, a const variable
> initialized with a constant expression can itself be used as a constant
> expression, but in C it is not allowed. You'd have to re-write this as a
> single constant expression, and you'll have to be careful to avoid
> running into line length limitations if you do so. I haven't checked to
> be sure, but you might have settle for fewer iterations.


I think that I found a conforming workaround: use enums instead
of constants. There is the limitation that enums can only be
trusted to hold 15 bits, but I can workaround that too using
the following ZS and ZG macros.


// select temporary type uw
#include <limits.h>
#ifdef ULLONG_MAX
typedef unsigned long long uw;
#else
typedef unsigned long uw;
#endif

// Utility macro to set symbol s to value v (60-bit capacity)
#define ZS(s,v) enum{e0##s=(v)&32767,e1##s=(v)>>15&32767,\
e2##s=(v)>>15>>15&32767,e3##s=(v)>>15>>15>>15&3276 7}

// Utility macro to get value of symbol s (60-bit capacity)
#define ZG(s) (((((uw)e3##s<<15|e2##s)<<15|e1##s))<<15|e0##s)

// Set of macros to compute floor(sqrt(x)) for x>0.
// Find a raw approximation of sqrt(x)
#define SQRT0(x) ((uw)8<<(3*((x)<512?0x)>>15<1?1x)>>15<64?2:\
(x)>>15>>9<8?3x)>>15>>15<8?4x)>>15>>15>>9<1?5: \
(x)>>15>>15>>15<1?6x)>>15>>15>>15<64?7x)>>15>> 15>>15>>9<8?8:9)))
// One Newton-Raphson refinement step
#define SQRT1(x,a) (((x)/(a)+(a))/2)
// Four Newton-Raphson refinement steps
#define SQRT4(x,a) (SQRT1(x,SQRT1(x,SQRT1(x,SQRT1(x,a)))))
// Final tweak to get exact value
#define SQRTF(x,a) (SQRT1(x,a)-(SQRT1(x,a)>(a)?1:0))

#define EXPECTED 400000000

// Now we want:
// struct foo * bar[EXPECTED+ceil(12*sqrt(EXPECTED)];
// that is
// struct foo * bar[EXPECTED+1+floor(sqrt(12*12*EXPECTED-1)];
#define KVALUE (12*12*(uw)(EXPECTED)-1)
ZS(kExTmp0, SQRT0(KVALUE)); // first approximation
ZS(kExTmp1, SQRT4(KVALUE,ZG(kExTmp0))); // improve..
ZS(kExTmp2, SQRT4(KVALUE,ZG(kExTmp1))); // improve..
ZS(kExTmp3, SQRT4(KVALUE,ZG(kExTmp2))); // improve..
#define KROOT SQRTF(KVALUE,ZG(kExTmp3)) // finalize
struct foo * bar[EXPECTED+1+KROOT];


This works and (as far as I can tell) is conformant, but many
will understandably frown on it as hairy. Especially if we push
it further and implement floating-point-as-enum.


Questions: In C11, what liberties do we have?
In particular,
- can we define the size of a static array (non-VLA)
using a static const long ?
- can we use a static const long freely in an
expression that defines another static const long ?
- can we use floating point arithmetic in expressions defining
a static const long ?
- can we use exp() and log() in such expressions ?

Francois Grieu
 
Reply With Quote
 
Ben Bacarisse
Guest
Posts: n/a
 
      09-05-2012
army1987 <(E-Mail Removed)> writes:

> On Wed, 05 Sep 2012 17:23:04 -0400, James Kuyper wrote:
>
>> In
>> particular, your "WTF?" left me wondering whether you thought the
>> optimization was invalid.

>
> I don't. But 1) I was surprised that strspn() was optimized away, and 2)
> I seemed to remember seeing a compiler optimizing away printf() some time
> ago, so I expected that if a compiler would do the former it would do the
> latter too.


gcc will replace some printf calls that use %s with calls to simpler
functions like puts, but replacing a call that uses a %d format is at
least two steps: convert the constant integer to a string and then
replace the printf call. I suspect the conversion puts too great a
distance between what's written and the improved result.

An implementation could choose to replace printf calls that is %d with
other functions designed to printf numbers directly, but that would mean
introducing special functions for this relatively rare optimisation.
Had C already had print_int and so on I imagine it might have been
considered (is that hedged enough!?).

--
Ben.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
total array size limitations? glennklockwood C Programming 10 03-16-2009 08:49 AM
service-policy output on a 3560 doesn't work, any workarounds ? hm klette Cisco 2 10-21-2005 09:36 AM
ASP.NET restricts server forms to one per page... workarounds... ??? nzanella@gmail.com ASP .Net 3 01-14-2005 04:54 PM
defining or not defining destructors johny smith C++ 8 07-02-2004 08:51 AM
g++: class size or file size limitations: NONTRIVIAL BUG Neil Zanella C++ 4 10-09-2003 01:49 PM



Advertisments