# Anything in the language to support better recognition of vector operations?

David Mathog

 11-08-2010
Do any of the current or upcoming C language standards provide support
to help a compiler recognize vectors and then generate SIMD
operations? For instance, consider this code snippet:

#define ALEN 128
unsigned char a[ALEN];
unsigned char b[ALEN];
int i;
/* arrays are initialized (not shown) */
for (i=0;i<ALEN;i++){
  b[i] += a[i];
}

One would hope that a compiler could recognize that as a vector
operation and generate code to take advantage of it on a given target
that supports SIMD operations. This gets trickier when converted to a
function:

void addABvector(unsigned char *a, unsigned char *b, int len){
  int i;
  for (i=0;i<len;i++){
    b[i] += a[i];
  }
}

Here while a and b may be "vectors" there is nothing to show that they
are aligned optimally, or that len will be large, so the compiler
cannot easily know which way it should optimize this (for long aligned
vectors, or short unaligned data). If we try to tell the compiler to
make a distinction by length, I am guessing that most compilers would
just optimize the test out of existence, as in this case:

void addABvector(unsigned char *a, unsigned char *b, int len){
  int i;
  if(len<ISMINLENGTHVECTOR){
    for (i=0;i<len;i++){ b[i] += a[i]; }
  }
  else {
    for (i=0;i<len;i++){ b[i] += a[i]; }
  }
}

Finally, what if saturating math is required? Then the loop becomes
something like this:

for (i=0;i<len;i++){
  b[i] += a[i];
  if(b[i] < a[i])b[i]=UCHAR_MAX; /* or any other test for saturation,
                                    all of which have a conditional */
}

I believe some processors have saturating math operations in their
"normal" instruction set, and there are definitely saturating add
operations in the SIMD instruction sets. But in order to use these
the compiler has to deduce from the logic of the loop that a
saturating add is the desired result.
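
One way to write the saturating loop so that it mirrors what a saturating add instruction does is sketched below: compute the sum in a wider type, then clamp. The function name add_sat_u8 is made up for this illustration. Whether a particular compiler turns this into a SIMD saturating add is not guaranteed by any C standard; the only way to know is to inspect the generated assembler.

```c
#include <limits.h>
#include <stddef.h>

/* Saturating add of a[] into b[] (hypothetical helper, not from the thread).
   The sum is formed in unsigned int, so it cannot wrap; it is then clamped
   to UCHAR_MAX before being stored back. */
void add_sat_u8(const unsigned char *a, unsigned char *b, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        unsigned int sum = (unsigned int)a[i] + b[i];
        b[i] = (unsigned char)(sum > UCHAR_MAX ? UCHAR_MAX : sum);
    }
}
```

The clamp is still a conditional in the source, but it is a select on a value rather than a conditional store, which tends to be easier for an optimizer to match against a single instruction.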

There are certainly other facets, but at a minimum the compiler would
seem to need the following clues to tell it when it should use vector
instructions:

1. a method to indicate when math operations are saturating.
2. a method to indicate when memory structures must be aligned on
some 2^N byte boundary.
3. a method to indicate the allowed size of an array.

Is any of this present in a standard C language variant, or is the
only way to achieve this to use compiler specific pragmas and
intrinsics?
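
For items 2 and 3, standard C does offer partial hints, sketched below. C11 (the "upcoming" standard at the time of this post) has an `_Alignas` specifier for objects, and C99 allows `static` inside an array parameter declarator to promise a minimum element count. The names VEC_ALIGN, vec_a, vec_b, add_fixed and is_vec_aligned are made up for this sketch, and the 16-byte figure is only an assumed SIMD width. Note that the alignment belongs to the objects, not to a pointer parameter, so a compiler is not obliged to use either hint for vectorization.

```c
#include <stddef.h>
#include <stdint.h>

#define ALEN 128
#define VEC_ALIGN 16  /* assumed SIMD register width in bytes */

/* C11: request VEC_ALIGN-byte alignment for the storage itself. */
_Alignas(VEC_ALIGN) unsigned char vec_a[ALEN];
_Alignas(VEC_ALIGN) unsigned char vec_b[ALEN];

/* C99: [static ALEN] promises that callers pass at least ALEN elements. */
void add_fixed(const unsigned char a[static ALEN],
               unsigned char b[static ALEN])
{
    int i;
    for (i = 0; i < ALEN; i++) {
        b[i] += a[i];
    }
}

/* Returns 1 if p is aligned to VEC_ALIGN bytes, 0 otherwise. */
int is_vec_aligned(const void *p)
{
    return ((uintptr_t)p % VEC_ALIGN) == 0;
}
```

Nothing comparable exists in standard C for item 1; saturation still has to be expressed in the loop body.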

Thanks,

David Mathog

Ian Collins

 11-08-2010
On 11/8/10 08:01 PM, David Mathog wrote:
> Do any of the current or upcoming C language standards provide support
> to help a compiler recognize vectors and then generate SIMD
> operations? For instance, consider this code snippet:
>
> #define ALEN 128
> unsigned char a[ALEN];
> unsigned char b[ALEN];
> int i;
> /* arrays are initialized (not shown) */
> for (i=0;i<ALEN;i++){
> b[i] += a[i];
> }
>
> One would hope that a compiler could recognize that as a vector
> operation and generate code to take advantage of it on a given target
> that supports SIMD operations. This gets trickier when converted to a
> function:

While it isn't a direct solution, look up OpenMP and see what support
your compiler of choice offers. This will give you some idea what is
required to identify and specify vector operations.

Something so processor specific can't really be standardised, so this
type of optimisation is more of a compiler quality-of-implementation
(QoI) issue.

--
Ian Collins

David Mathog

 11-08-2010
On Nov 8, 12:15 am, Ian Collins wrote:
> While it isn't a direct solution, look up OpenMP and see what support
> your compiler of choice offers. This will give you some idea what is
> required to identify and specify vector operations.

Had a look but that seems to be all about programming for multiple
cores, not so much (at all?) for SIMD optimization. So it looks like
OpenMP, on an N core machine with code that can be split perfectly
will "automatically" run N times faster, but not MN times faster,
where M is the extra speed factor that would result from using the
SIMD operations on each core. I never thought about it before, but
for just the right sort of code there could be a synergistic speedup
from using both methods - if M is 2 and N is 8, the code could run 16x
faster instead of 8x (multicore only) or 2x (SIMD only, single core)
vs 1x (no SIMD, single core).

Anyway, sounds like using compiler specific methods is currently the
only sure way to use SIMD.
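
OpenMP releases from 4.0 onward, which postdate this thread, added a `simd` construct that can be combined with `parallel for`, which is the multicore-times-SIMD combination described above. A minimal sketch follows, assuming a compiler with OpenMP 4.0 support enabled. The function name add_omp is made up for this example. Without OpenMP enabled the pragma is ignored and the loop runs as ordinary serial code.

```c
#include <stddef.h>

/* Adds a[] into b[]. With OpenMP 4.0 or later enabled, the pragma asks for
   the iterations to be split across threads and vectorized within each
   thread. The caller must ensure a and b do not overlap, because the
   pragma asserts that the iterations are independent. */
void add_omp(const unsigned char *a, unsigned char *b, int len)
{
    int i;
    #pragma omp parallel for simd
    for (i = 0; i < len; i++) {
        b[i] += a[i];
    }
}
```

The pragma is a request, not a guarantee, so the actual speedup still depends on the compiler and the target.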

Thanks,

David Mathog

Rui Maciel

 11-08-2010
David Mathog wrote:

> Had a look but that seems to be all about programming for multiple
> cores, not so much (at all?) for SIMD optimization. So it looks like
> OpenMP, on an N core machine with code that can be split perfectly
> will "automatically" run N times faster, but not MN times faster,
> where M is the extra speed factor that would result from using the
> SIMD operations on each core. I never thought about it before, but
> for just the right sort of code there could be a synergistic speedup
> from using both methods - if M is 2 and N is 8, the code could run 16x
> faster instead of 8x (multicore only) or 2x (SIMD only, single core)
> vs 1x (no SIMD, single core).
>
> Anyway, sounds like using compiler specific methods is currently the
> only sure way to use SIMD.

It would be a nice feature, particularly for those of us who deal with numerical analysis.
Nonetheless, asking for this feature as a compiler-level optimization may be an unattainable
goal, mainly due to its complexity and the lack of demand for it. It is far better (and also
desirable) to expect this sort of thing from other projects such as OpenCL.

Rui Maciel

Michael Angelo Ravera

 11-08-2010
On Nov 7, 11:01 pm, David Mathog wrote:
> Do any of the current or upcoming C language standards provide support
> to help a compiler recognize vectors and then generate SIMD
> operations? For instance, consider this code snippet:
>
> #define ALEN 128
> unsigned char a[ALEN];
> unsigned char b[ALEN];
> int i;
> /* arrays are initialized (not shown) */
> for (i=0;i<ALEN;i++){
>   b[i] += a[i];
> }
>
> One would hope that a compiler could recognize that as a vector
> operation and generate code to take advantage of it on a given target
> that supports SIMD operations. This gets trickier when converted to a
> function:
>
> addABvector(unsigned char *a, unsigned char *b, int len){
> int i;
>   for (i=0;i<len;i++){
>     b[i] += a[i];
>   }
> }
>
> Here while a and b may be "vectors" there is nothing to show that they
> are aligned optimally, or that len will be large, so the compiler
> cannot easily know which way it should optimize this (for long aligned
> vectors, or short unaligned data). If we try to tell the compiler to
> make a distinction by length, for instance, I am guessing that most
> compilers would just optimize the test out of existence, for instance,
> in this case:
>
> addABvector(unsigned char *a, unsigned char *b, int len){
> int i;
>   if(len<ISMINLENGTHVECTOR){
>     for (i=0;i<len;i++){ b[i] += a[i]; }
>   }
>   else {
>     for (i=0;i<len;i++){ b[i] += a[i]; }
>   }
> }
>
> Finally, what if saturating math is required? Then the loop becomes
> something like this:
>
> for (i=0;i<len;i++){
>   b[i] += a[i];
>   if(b[i] < a[i])b[i]=UCHAR_MAX; /* or any other test for
> saturation, all of which have a conditional */
> }
>
> I believe some processors have saturating math operations in their
> "normal" instruction set, and there are definitely saturating add
> operations in the SIMD instruction sets. But in order to use these
> the compiler has to deduce from the logic of the loop that a
> saturating add is the desired result.
>
> There are certainly other facets, but clues the compiler would need to
> tell it when it should use vector instructions would seem to be
> minimally:
>
> 1. a method to indicate when math operations are saturating.
> 2. a method to indicate when memory structures must be aligned on
> some 2^N byte boundary.
> 3. a method to indicate the allowed size of an array.
>
> Is any of this present in a standard C language variant, or is the
> only way to achieve this to use compiler specific pragmas and
> intrinsics?
>
> Thanks,
>
> David Mathog

I've seen the adjective "plural" used in some C compilers when an
array was designed to be split across multiple processors, but I am
not aware of any standard for this.

David Mathog

 11-11-2010
On Nov 9, 11:06 am, "christian.bau" wrote:
> You completely missed the case of overlapping arrays - if I call
>
> addABvector (&array [0], &array [1], 100);
>
> then vector operations would give a completely wrong result.

Well sure, but there are lots of functions where one cannot pass
pointers to overlapping memory areas and get the right results.
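
Standard C does have a way to put that no-overlap contract into the prototype: the C99 `restrict` qualifier, the same one the standard library uses in the declaration of memcpy. A minimal sketch follows. The function name add_restrict is made up for this example. `restrict` is a promise made by the caller, not something the compiler checks, so calling this with overlapping arrays is undefined behavior.

```c
/* C99: restrict promises that, inside this function, the bytes accessed
   through a and the bytes accessed through b do not overlap. That removes
   the aliasing concern that can otherwise block vectorization. */
void add_restrict(const unsigned char *restrict a,
                  unsigned char *restrict b,
                  int len)
{
    int i;
    for (i = 0; i < len; i++) {
        b[i] += a[i];
    }
}
```

With this prototype the call addABvector(&array[0], &array[1], 100) from the quoted example would simply be a contract violation, just as it would be for memcpy.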

> For saturating maths; this is easily recognised as long as you write
> the code in a form that is actually equivalent to the saturated ops of
> the processor.

I'll believe it when you post the source file and resulting assembler
where C code was compiled to use the native saturating operator. Even
if this works sometimes, I was trying to show that it is hard to code
so that it would work every time.

Regards,

David Mathog
