Velocity Reviews

Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   C Programming (http://www.velocityreviews.com/forums/f42-c-programming.html)
-   -   The container abstraction and parallel programming (http://www.velocityreviews.com/forums/t807756-the-container-abstraction-and-parallel-programming.html)

jacob navia 01-06-2012 11:02 PM

The container abstraction and parallel programming
 
Today, more than 10 years tafter Intel introduced the first SIMD
instructions it is still not possible to use those instructions to do
SIMD programming when you use C and two mainstream compilers like gcc or
MSVC.

From the C side there is absolutely no support for this type of
programming, even though plain arrays could do the trick obviously.


The container abstraction, specifically the valarray container, could be
used to *mean* SIMD using operator overloading.

The use case could be

ValArray operator+(ValArray left,ValArray right);

This would allow an operation like adding two arrays element-wise
to be performed using SIMD instructions.


Obviously, the usual "optimizations" can be added to those
basic operations so that

ValArray a,b,c,d;

a = b+c+d;

doesn't have to produce an intermediate array but can be "compiled"
by an overloaded assignment operator to generate

a[i] = b[i]+c[i]+d[i];

as is done in C++. Each overloaded operator returns a token operator
containing information about its arguments, and the overloaded
assignment operator compiles correct element-wise code by reparsing
the generated tree.

To prepare for this, the C Containers library proposes the ValArray
container, for each of the basic types (char, short, int, long long and
size_t).

In the sample implementation there is a generic module that
allows to generate code for ANY type, also those being left out like
long (since it is always either equal to int or equal to long long in
all implementations I know of). The type size_t was added to be able
to correctly index any sequantial container with a ValArray of size_t's.

The code proposed NOW is plain serial code, but can (and should) be
replaced in more sophisticated implementations than the sample
implementation to do SIMD programming, one of the most basic forms
of parallelism, already available in plain hardware since already 10
years...

Obviously the four basic operations should be there, but other kinds of
array operations could be interesting like

o REDUCE Apply binary operation to each element and the running total:
REDUCE + 1 2 3 4 5 --> 1+2+3+4+5 = 15

You apply operator + to the identity element for "+" and to the
first element: 0+1 --> 1
You apply the operator + to the second element and the result
of the last operation: 1+2-->3
ETC until the last element is reached.

o EXPAND Apply binary operation producing a ValArray of same length
containing the intermediate results:
EXPAND + 1 2 3 4 5 --> 1 3 6 10 15


o SELECT Apply boolean vector to ValArray selecting all elements that
correspond to 1 and eliminate those with zeros.

SELECT 1 1 0 1 0 0 / 10 20 30 40 50 60 --> 10 20 40

o EXTEND Apply boolean vector to ValArray leaving all elements with 1
and introducing the zero or blank (for character arrays) at the
zero positions:

EXTEND 1 1 0 1 0 0 / 10 20 40 --> 10 20 0 40 0 0

The idea is to select a set of operations that makes it easy to work
with arrays as objects of operations instead of working with scalar data.

What new operations should be in the set?
How could those operations be incoporated into mainstream C?


jacob

Ben Pfaff 01-06-2012 11:08 PM

Re: The container abstraction and parallel programming
 
jacob navia <jacob@spamsink.net> writes:

> Today, more than 10 years tafter Intel introduced the first SIMD
> instructions it is still not possible to use those instructions to do
> SIMD programming when you use C and two mainstream compilers like gcc
> or MSVC.
>
> From the C side there is absolutely no support for this type of
> programming, even though plain arrays could do the trick obviously.


"absolutely no support"? You exaggerate:
http://gcc.gnu.org/onlinedocs/gcc-4....2din-Functions
--
"For those who want to translate C to Pascal, it may be that a lobotomy
serves your needs better." --M. Ambuhl

"Here are the steps to create a C-to-Turbo-Pascal translator..." --H. Schildt

jacob navia 01-06-2012 11:27 PM

Re: The container abstraction and parallel programming
 
Le 07/01/12 00:08, Ben Pfaff a écrit :
> jacob navia<jacob@spamsink.net> writes:
>
>> Today, more than 10 years tafter Intel introduced the first SIMD
>> instructions it is still not possible to use those instructions to do
>> SIMD programming when you use C and two mainstream compilers like gcc
>> or MSVC.
>>
>> From the C side there is absolutely no support for this type of
>> programming, even though plain arrays could do the trick obviously.

>
> "absolutely no support"? You exaggerate:
> http://gcc.gnu.org/onlinedocs/gcc-4....2din-Functions


Hi Ben

Look, those functions exist of curse but that is not standard C, those
functions have no links to the rest of the language.

Besides, they are tied to the Intel architecture (what
is not wrong of course) but the idea behind my post is to try to
abstract a bit from a single architecture to a more *general* SIMD
programming. Note that the notions of container and operations
are NOT tied to Intel, IBM or Motorola vector proposals but could
be used with all three.

Even me with my limited ressources have some intrinsics for some
operations but that is really not *language* support Ben.


Ben Pfaff 01-06-2012 11:30 PM

Re: The container abstraction and parallel programming
 
jacob navia <jacob@spamsink.net> writes:

> Le 07/01/12 00:08, Ben Pfaff a écrit :
>> jacob navia<jacob@spamsink.net> writes:
>>
>>> Today, more than 10 years tafter Intel introduced the first SIMD
>>> instructions it is still not possible to use those instructions to do
>>> SIMD programming when you use C and two mainstream compilers like gcc
>>> or MSVC.
>>>
>>> From the C side there is absolutely no support for this type of
>>> programming, even though plain arrays could do the trick obviously.

>>
>> "absolutely no support"? You exaggerate:
>> http://gcc.gnu.org/onlinedocs/gcc-4....2din-Functions

>
> Look, those functions exist of curse but that is not standard C, those
> functions have no links to the rest of the language.


If that's your purpose then I don't know why you started out by
saying that you can't do it with "mainstream compilers like GCC".
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv wxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}

jacob navia 01-06-2012 11:41 PM

Re: The container abstraction and parallel programming
 
Le 07/01/12 00:30, Ben Pfaff a écrit :
> jacob navia<jacob@spamsink.net> writes:
>
>> Le 07/01/12 00:08, Ben Pfaff a écrit :
>>> jacob navia<jacob@spamsink.net> writes:
>>>
>>>> Today, more than 10 years tafter Intel introduced the first SIMD
>>>> instructions it is still not possible to use those instructions to do
>>>> SIMD programming when you use C and two mainstream compilers like gcc
>>>> or MSVC.
>>>>
>>>> From the C side there is absolutely no support for this type of
>>>> programming, even though plain arrays could do the trick obviously.
>>>
>>> "absolutely no support"? You exaggerate:
>>> http://gcc.gnu.org/onlinedocs/gcc-4....2din-Functions

>>
>> Look, those functions exist of curse but that is not standard C, those
>> functions have no links to the rest of the language.

>
> If that's your purpose then I don't know why you started out by
> saying that you can't do it with "mainstream compilers like GCC".


.... and MSVC, and ANY compiler now. It wasn't directed against gcc
specifically.


Because all those compilers propose
the same thing as gcc/Intel compilers do: a series of machine dependent
intrinsics that make your code not portable at all, and maybe because of
that are not accepted by a large user base.

What I wanted to propose is a more abstract and portable way of doing
SIMD than just using pseudo functions that map to one instructions of
the Intel instruction set.

Obviously the problem is there since all that those intrinsics propose
is very low level operations (what amounts to more or less returning to
Ã*ssembly language in C since all the intrinsics map to specific
Intel/AMD instructions).

What my aim is is to start a discussion about what kind of constructs
could be used to abstract from those operations into more general concepts.

Kaz Kylheku 01-07-2012 12:04 AM

Re: The container abstraction and parallel programming
 
On 2012-01-06, jacob navia <jacob@spamsink.net> wrote:
> Today, more than 10 years tafter Intel introduced the first SIMD
> instructions it is still not possible to use those instructions to do
> SIMD programming when you use C and two mainstream compilers like gcc or
> MSVC.


If anything, that may be an indicator that maybe nobody cares that much about
Intel's SIMD instructions. It's not hard to see why.

SIMD instructions do not translate to a tangible benefit for the average
consumer of Intel chips. (Marketing 101: features need to connect with
benefits!)

There was a time when games and video playback stretched the ability of the CPU
to the core, and hardware for accelerating them was next to nonexistent. The
processing-intensive tasks in those kinds of applications are handled by
dedicated hardware these days.

SIMD will not fix the major, user-visible performance issues in a Wintel PC,
like long boot times, or browsers becoming unresponsive for seconds at a time.

Ian Collins 01-07-2012 12:50 AM

Re: The container abstraction and parallel programming
 
On 01/ 7/12 12:02 PM, jacob navia wrote:
> Today, more than 10 years tafter Intel introduced the first SIMD
> instructions it is still not possible to use those instructions to do
> SIMD programming when you use C and two mainstream compilers like gcc or
> MSVC.
>
> From the C side there is absolutely no support for this type of
> programming, even though plain arrays could do the trick obviously.


That may be the case for those compilers, but not for others. For
example the Sun Studio compilers (I'd expect Intel to offer something as
well) have a number of vector processing optimisations which include simd.

I think this class of optimisation is best left as a compiler
implementation detail. The range of vector processing support
(including the likes of OpenMP) is simply too large (from nothing to a
GPU) to be considered for standardisation. If the demand is there, the
vendor will deliver.

> The container abstraction, specifically the valarray container, could be
> used to *mean* SIMD using operator overloading.


A similar approach has been used by some compilers for many years,
typically implemented as some form of "performance library" for vector
type operations.

--
Ian Collins

jacob navia 01-07-2012 08:06 AM

Re: The container abstraction and parallel programming
 
Le 07/01/12 01:04, Kaz Kylheku a écrit :
> On 2012-01-06, jacob navia<jacob@spamsink.net> wrote:
>> Today, more than 10 years tafter Intel introduced the first SIMD
>> instructions it is still not possible to use those instructions to do
>> SIMD programming when you use C and two mainstream compilers like gcc or
>> MSVC.

>
> If anything, that may be an indicator that maybe nobody cares that much about
> Intel's SIMD instructions. It's not hard to see why.
>
> SIMD instructions do not translate to a tangible benefit for the average
> consumer of Intel chips. (Marketing 101: features need to connect with
> benefits!)
>


But... how could those benefits be *USED* unless you program in assembly?

That is the whole point of my post. You can't say that users do not
want those features if there are no high level constructs that they
can use to make that happen!


> There was a time when games and video playback stretched the ability of the CPU
> to the core, and hardware for accelerating them was next to nonexistent. The
> processing-intensive tasks in those kinds of applications are handled by
> dedicated hardware these days.
>


But the problem *remains* since you can't address that dedicated
hardware from a high level language anyway!


> SIMD will not fix the major, user-visible performance issues in a Wintel PC,
> like long boot times, or browsers becoming unresponsive for seconds at a time.


SIMD needs a special kind of programming that is badly supported in
current serial languages like C or C++. Many parallel languages have
been proposed but they are of too limited scope to catch. What is needed
is a way to address SIMD programming from within a known general purpose
programming *context*.


jacob navia 01-07-2012 08:31 AM

Re: The container abstraction and parallel programming
 
Le 07/01/12 01:50, Ian Collins a écrit :
> On 01/ 7/12 12:02 PM, jacob navia wrote:
>> Today, more than 10 years tafter Intel introduced the first SIMD
>> instructions it is still not possible to use those instructions to do
>> SIMD programming when you use C and two mainstream compilers like gcc or
>> MSVC.
>>
>> From the C side there is absolutely no support for this type of
>> programming, even though plain arrays could do the trick obviously.

>
> That may be the case for those compilers, but not for others. For
> example the Sun Studio compilers (I'd expect Intel to offer something as
> well) have a number of vector processing optimisations which include simd.
>


Yes, Intel and Sun, as hardware vendors, have a real interest in making
people use the features of their hardware and will try to propose
either automatically SIMD deduction or inline assembly that supports
their features directly.

Problem is, even after dozens of years and millions and millions of
dollars in research, automatic SIMD parallelization remains VERY
difficult for the genral case. The problem, basically, is as follows:

Automatic parallelization must PROVE that no dependencies exist
between the array members, no unforeseen side effects can possibly
occur and ONLY THEN the constructs in question can be parallelized.

All this needs extensive program analysis, what is made even more
difficult with the model of separate module compilation...

High level constructs designed to allow the programmer assert that
he/she wants parallel programming FOR THIS OBJECT simplify the
task of the compiler, allowing for simpler compilers and more
widespread use.

> I think this class of optimisation is best left as a compiler
> implementation detail. The range of vector processing support (including
> the likes of OpenMP) is simply too large (from nothing to a GPU) to be
> considered for standardisation. If the demand is there, the vendor will
> deliver.
>


Well, SIMD processing is basically the same, only the SCALE changes from
an Intel SSE4 and a GPU with some thousand nodes. Using a higher level
construct you LEAVE the details to the compiler obviously, assuming a
compiler for Intel will make use of SSE4 and a compiler for a GPU will
use the GPU. What you do NOT leave to the compiler is the ANALYSIS as to
which constructs can be used with SIMD, allowing the compiler to
specialize in code generation without burdening it with a complete
analysis of all the possible data paths that could interfere with the
SIMD optimization.


>> The container abstraction, specifically the valarray container, could be
>> used to *mean* SIMD using operator overloading.

>
> A similar approach has been used by some compilers for many years,
> typically implemented as some form of "performance library" for vector
> type operations.
>


Yes, I just wanted to propose that idea not for a single compiler but
for the language in general.


Jens Gustedt 01-07-2012 08:39 AM

Re: The container abstraction and parallel programming
 
Am 01/07/2012 12:02 AM, schrieb jacob navia:
> How could those operations be incoporated into mainstream C?


macros and _Generic

Jens


All times are GMT. The time now is 05:02 PM.

Powered by vBulletin®. Copyright ©2000 - 2014, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.