Velocity Reviews > Array addressing styles: arr[i].d[j] versus pArr->d[j], etc.

# Array addressing styles: arr[i].d[j] versus pArr->d[j], etc.

John Reye
Guest
Posts: n/a

 05-27-2012
Hi!

In writing code I often find numerous ways of expressing the same
thing.
I have some questions of "style" and "optimization" that I'd like to

Example1
========

Variant 1a)

arr[i].a = f1();
arr[i].b = f2();
arr[i].c = f3();

OR

Variant 1b)

struct tmp * const pArr = &arr[i]; // arr + i
pArr->a = f1();
pArr->b = f2();
pArr->c = f3();

Here I probably prefer the 1b). But not on reasons of style, but on
the suspicion (*) that some compilers will produce more efficient code
with the 1b).

(*) - what do you think?

Example2
========

Variant 2a)

arr[i].a = f1();
arr[i].b = f2();
arr[i].c = f3();
arr[i].d[j++] = f4();
arr[i].d[j] = f5();

OR

Variant 2b)

struct tmp * const pArr = &arr[i]; // arr + i
pArr->a = f1();
pArr->b = f2();
pArr->c = f3();
pArr->d[j++] = f4();
pArr->d[j] = f5();

OR

Variant 2c)

struct tmp * const pArr = &arr[i]; // arr + i
pArr->a = f1();
pArr->b = f2();
pArr->c = f3();
int * pd = &(pArr->d[j++]);
*pd++ = f4();
*pd = f5();

Here I'd definatley NOT use 2a) because I suspect that double array-
indexing via 2 `[]'-operators is not efficient.
I'd use either 2b) or 2c).

What are you're opinions on this? Which style do you use?
Any tips are highly appreciated. Thanks.

John Reye
Guest
Posts: n/a

 05-27-2012
For a concrete example, consider the following:
Do you prefer fill_array1() or fill_array2()

#define ARR_ELEMS 4

struct st1
{
char * name;
int i;
int ringbuf[ARR_ELEMS];
unsigned ringbuf_current_index;
};

#define ST1_ELEMS 10

struct st1 st[ST1_ELEMS];
unsigned statistics[ST1_ELEMS];
/* | */
/* common index */

char *gen_name()
{
static char a[] = "asdf";
return a;
}

int gen_value()
{
static int a = 0;
return a++;
}

unsigned inc_ringbuf_index(unsigned i)
{
if (++i == ARR_ELEMS)
return 0U;
return i;
}

void fill_array1(unsigned i)
{
st[i].i = i;
st[i].name = gen_name(i);
st[i].ringbuf[ st[i].ringbuf_current_index ] = gen_value();
st[i].ringbuf_current_index =
inc_ringbuf_index( st[i].ringbuf_current_index );
statistics[i]++;
}

void fill_array2(unsigned i)
{
struct st1 * const pst = &st[i]; // st + i
unsigned idx = pst->ringbuf_current_index;

pst->i = i;
pst->name = gen_name(i);
pst->ringbuf[ idx ] = gen_value();
pst->ringbuf_current_index = inc_ringbuf_index( idx );
statistics[i]++;
}

Eric Sosman
Guest
Posts: n/a

 05-27-2012
On 5/27/2012 1:30 PM, John Reye wrote:
> Hi!
>
> In writing code I often find numerous ways of expressing the same
> thing.
> I have some questions of "style" and "optimization" that I'd like to
> [...]
> (*) - what do you think?

I think this is Question 20.14 on the comp.lang.c Frequently

--
Eric Sosman
http://www.velocityreviews.com/forums/(E-Mail Removed)d

Thomas Richter
Guest
Posts: n/a

 05-27-2012
On 27.05.2012 19:30, John Reye wrote:
> Hi!
>
> In writing code I often find numerous ways of expressing the same
> thing.
> I have some questions of "style" and "optimization" that I'd like to
>
> Example1
> ========
>
> Variant 1a)
>
> arr[i].a = f1();
> arr[i].b = f2();
> arr[i].c = f3();
>
> OR
>
> Variant 1b)
>
> struct tmp * const pArr =&arr[i]; // arr + i
> pArr->a = f1();
> pArr->b = f2();
> pArr->c = f3();
>
>
> Here I probably prefer the 1b). But not on reasons of style, but on
> the suspicion (*) that some compilers will produce more efficient code
> with the 1b).
>
> (*) - what do you think?

"Premature optimization is the root of all evil.". Your coding style
shouldn't be based on suspicions on what the compiler might or might not
be able to do. Go for clarity. For a simple program as given, I would
pick a) because it is more readable. And should profiling *really*
identify this as a bottleneck, then, and only then, consider alternatives.

Frankly, I believe any decent compiler will likely generate identical
code for both cases.

Johannes Bauer
Guest
Posts: n/a

 05-28-2012
On 27.05.2012 22:03, Thomas Richter wrote:

>> Example1
>> ========
>>
>> Variant 1a)
>>
>> arr[i].a = f1();
>> arr[i].b = f2();
>> arr[i].c = f3();
>>
>> OR
>>
>> Variant 1b)
>>
>> struct tmp * const pArr =&arr[i]; // arr + i
>> pArr->a = f1();
>> pArr->b = f2();
>> pArr->c = f3();

>
> Frankly, I believe any decent compiler will likely generate identical
> code for both cases.

Just for fun I tried this out on x86-64 Linux with gcc 4.6.2 (knowing
full well this is a *language* group, but oh well...).

Fun fact: 1a) for my constellation produces *faster* code than 1b)! Very
cool. If you look at the assembly of 1a)

400512: 48 81 ec 08 10 00 00 sub \$0x1008,%rsp
400519: 48 8d 5c 24 04 lea 0x4(%rsp),%rbx
40051e: 48 8d ac 24 04 10 00 lea 0x1004(%rsp),%rbp
400525: 00

Loop:
400526: e8 2f 01 00 00 callq 40065a <f1>
40052b: 89 03 mov %eax,(%rbx)
40052d: e8 2e 01 00 00 callq 400660 <f2>
400532: 89 43 04 mov %eax,0x4(%rbx)
400535: e8 2c 01 00 00 callq 400666 <f3>
40053a: 89 43 08 mov %eax,0x8(%rbx)
40053d: 48 83 c3 20 add \$0x20,%rbx
400541: 48 39 eb cmp %rbp,%rbx
400544: 75 e0 jne 400526 <main+0x16>

and compare against 1b)

400512: 48 81 ec 08 10 00 00 sub \$0x1008,%rsp
400519: 31 db xor %ebx,%ebx

Loop:
40051b: 48 89 dd mov %rbx,%rbp
40051e: 48 c1 e5 05 shl \$0x5,%rbp
400522: 48 8d 04 24 lea (%rsp),%rax
400526: 48 01 c5 add %rax,%rbp
400529: e8 34 01 00 00 callq 400662 <f1>
40052e: 89 45 04 mov %eax,0x4(%rbp)
400531: e8 32 01 00 00 callq 400668 <f2>
400536: 89 45 08 mov %eax,0x8(%rbp)
400539: e8 30 01 00 00 callq 40066e <f3>
40053e: 89 45 0c mov %eax,0xc(%rbp)
400541: 48 83 c3 01 add \$0x1,%rbx
400545: 48 81 fb 80 00 00 00 cmp \$0x80,%rbx
40054c: 75 cd jne 40051b <main+0xb>

I.e. variant 1a) does a load effective address once and increments by
the size of the struct each time, using the dispacement to assign the
values to the struct members.

Variant 2a) counts through 0...(n-1) and does a LEA *each* iteration.

Very cool result IMO (especially since it nicely reinforces the point
that "optimizing" code thinking it will run faster is a stupid idea
without knowing *exactly* what happens).

Best regards,
Joe

--
>> Wo hattest Du das Beben nochmal GENAU vorhergesagt?

> Zumindest nicht öffentlich!

Ah, der neueste und bis heute genialste Streich unsere großen
Kosmologen: Die Geheim-Vorhersage.
- Karl Kaos über Rüdiger Thomas in dsa <hidbv3\$om2\$(E-Mail Removed)>

Tim Rentsch
Guest
Posts: n/a

 05-31-2012
John Reye <(E-Mail Removed)> writes:

> Hi!
>
> In writing code I often find numerous ways of expressing the same
> thing.
> I have some questions of "style" and "optimization" that I'd like to
>
> Example1
> ========
>
> Variant 1a)
>
> arr[i].a = f1();
> arr[i].b = f2();
> arr[i].c = f3();
>
> OR
>
> Variant 1b)
>
> struct tmp * const pArr = &arr[i]; // arr + i
> pArr->a = f1();
> pArr->b = f2();
> pArr->c = f3();
>
>
> Here I probably prefer the 1b). But not on reasons of style, but on
> the suspicion (*) that some compilers will produce more efficient code
> with the 1b).
>
> (*) - what do you think?
>
>
>
> Example2
> ========
>
> Variant 2a)
>
> arr[i].a = f1();
> arr[i].b = f2();
> arr[i].c = f3();
> arr[i].d[j++] = f4();
> arr[i].d[j] = f5();
>
> OR
>
> Variant 2b)
>
> struct tmp * const pArr = &arr[i]; // arr + i
> pArr->a = f1();
> pArr->b = f2();
> pArr->c = f3();
> pArr->d[j++] = f4();
> pArr->d[j] = f5();
>
> OR
>
> Variant 2c)
>
> struct tmp * const pArr = &arr[i]; // arr + i
> pArr->a = f1();
> pArr->b = f2();
> pArr->c = f3();
> int * pd = &(pArr->d[j++]);
> *pd++ = f4();
> *pd = f5();
>
> Here I'd definatley NOT use 2a) because I suspect that double array-
> indexing via 2 `[]'-operators is not efficient.
> I'd use either 2b) or 2c).
>
>
> What are you're opinions on this? Which style do you use?
> Any tips are highly appreciated. Thanks.

IMO any appeal to efficiency here is misguided. Common subexpression
optimization, which is all that's needed here, has been around more
than 40 years. The pointer variants are unlikely to run any faster
than the indexing-based versions, and actually may run slower (as some
have noted). As is the case with almost all micro-optimizations, the
best choice is the one that expresses what the developer means to do
most simply and most directly, and let the compiler take care of
generating good code, because in most such cases it will do a better
job.