Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > The container abstraction and parallel programming

Reply
Thread Tools

The container abstraction and parallel programming

 
 
ec429
Guest
Posts: n/a
 
      01-08-2012
On 08/01/12 10:48, jacob navia wrote:
> You could deduce that c must be of integer type isn't it?
> LET THE COMPILER DO THAT. Let's eliminate declarations for
> integer and let the compiler deduce from the program text
> the type of each variable.
>
> Looks ridiculous isn't it?

No, actually. There are several programming languages which have
strong, static typing, and yet have a complete system of type inference.
I believe ML is one such language.
Such a thing is not appropriate for C, because in C's model types are
/promises/ one makes to the compiler. This is C's optimisation
paradigm: you promise the compiler you won't use a value as anything but
an integer, by declaring it 'int'. You promise the compiler you won't
write through a pointer, by qualifying it 'const'. You promise the
compiler the pointer you gave it has no aliases, by qualifying it
'restrict'.
If you want code that makes more use of hardware features like SIMD,
don't add primitives that map to hardware instructions - add new ways of
promising things to the compiler. Then suggest them in comp.std.c and
don't bash other compilers for not supporting them unless and until they
get incorporated into the Standard.
-e
--
'sane', adj.: see 'unimaginative'
on the web - http://jttlov.no-ip.org
 
Reply With Quote
 
 
 
 
jacob navia
Guest
Posts: n/a
 
      01-08-2012
Le 08/01/12 12:05, ec429 a écrit :
> On 08/01/12 10:41, jacob navia wrote:
> Actually, those programs aren't "exactly the same", due to the lack of
> the aforementioned "restrict" keyword. For instance:
> double array[3]={0,1,2};
> doADD(array+1, array+1, array, 2);


Adding the "restrict" keyword doesn't change anything.


> In general, the correct approach to parallel programming is giving the
> compiler as few constraints as possible. This can sometimes be achieved
> with 'restrict', but not much can be done in procedural languages unless
> they have some explicit means of telling the compiler that normal order
> constraints can be relaxed (which would be a /much/ more useful
> extension to C than processor-specific vectorisation primitives, since
> the latter would reduce the portable glory of C to the provincial
> insularity of assembler (not that assembler isn't valuable in
> appropriate circumstances, of course!)).



Well, thanks a lot. It is EXACTLY what I proposed in the message that
started this discussion. We should have a way to tell the compiler
to use SIMD operations in a special object that we declare as SIMD.

> In C as it stands, the
> canonical way to do this is by means of threads (at least in C11; '99
> and its predecessors don't mention threaded execution, but many
> implementations support it). SIMD has its uses, of course; but unless
> your code is perversely written (and so long as you have used 'restrict'
> where appropriate) the compiler knows better than you do. Besides, your
> time as a programmer is worth enough that you generally shouldn't be
> expending it on non-portable performance hacks unless you /know/ your
> code will be run a very large number of times. YAGNI.


1) Using threads to make SIMD is not possible. The overhead of thread
creation and maintenance is impossible high.

2) Your remarks about "The compiler knows better" have no connection
with reality. Gcc, that is a highly advanced compiler, is unable
to do SIMD reliably, as I proved. It suffices a small change and
the whole thing COLLAPSES.

3) Your remarks about avoiding performance hacks are welcome since it is
exactly what I was saying: we need a portable way of using SIMD that
is supported by the language.


> If you really want to write strongly parallelisable programs (rather
> than merely hand-optimising with a few vector ops that probably don't
> gain you much anyway) try a functional language, such as LISP - with a
> properly designed data-flow model parallel processing becomes something
> that the implementation can do transparently.
>


Well, I am using C, and I do not see how LISP would be useful in this
context since it would need the same extensions that you would need in
C. Lisp is NOT SIMD per se.

> I think you just don't like gcc, because you know (and don't want to
> admit to yourself) that it's better than yours.



I have never denied that gcc generates better code than lcc.
In my posting I spoke of two mainstream compilers like gcc and MSVC
That is all. And what I said was right: they aren't able to use SIMD
programming in a useful way.

But you do not like what I say (and you do not want to admit it
to yourself) because I am NOT using gcc and even have the audacity of
trying to build a different compiler!

You see?

If you get to personal attacks or suppositions we just can't discuss.
Everything becomes a stupid war of "gcc is the best and you are wrong
because you use another compiler"
 
Reply With Quote
 
 
 
 
Nick Keighley
Guest
Posts: n/a
 
      01-08-2012
On Jan 7, 12:04*am, Kaz Kylheku <(E-Mail Removed)> wrote:
> On 2012-01-06, jacob navia <(E-Mail Removed)> wrote:
>
> > Today, more than 10 years tafter Intel introduced the first SIMD
> > instructions it is still not possible to use those instructions to do
> > SIMD programming when you use C and two mainstream compilers like gcc or
> > MSVC.

>
> If anything, that may be an indicator that maybe nobody cares that much about
> Intel's SIMD instructions. *It's not hard to see why.
>
> SIMD instructions do not translate to a tangible benefit for the average
> consumer of Intel chips. (Marketing 101: features need to connect with
> benefits!)


but there's an awful lot of Intel chips sold. Even 0.1% of that market
is huge. Just because the average desk top user doesn't need SIMD
doesn't mean its market is negligable.

> There was a time when games and video playback stretched the ability of the CPU
> to the core, and hardware for accelerating them was next to nonexistent. *The
> processing-intensive tasks in those kinds of applications are handled by
> dedicated hardware these days.
>
> SIMD will not fix the major, user-visible performance issues in a Wintel PC,
> like long boot times, or browsers becoming unresponsive for seconds at a time.


yes but there are people who need heavy number crunching. CAD for one,
I'd think.

 
Reply With Quote
 
ec429
Guest
Posts: n/a
 
      01-08-2012
On 08/01/12 11:29, jacob navia wrote:
> Le 08/01/12 12:05, ec429 a écrit :
>> Actually, those programs aren't "exactly the same", due to the lack of
>> the aforementioned "restrict" keyword. For instance:
>> double array[3]={0,1,2};
>> doADD(array+1, array+1, array, 2);

>
> Adding the "restrict" keyword doesn't change anything.

This is patently false. Adding the "restrict" keyword certainly changes
the program (from the perspective of the C standard) since it promises
to the compiler that the pointers will not be aliased, thereby making
optimisations (such as SIMD) legal (they otherwise would not be).
> Well, thanks a lot. It is EXACTLY what I proposed in the message that
> started this discussion. We should have a way to tell the compiler
> to use SIMD operations in a special object that we declare as SIMD.

No, you have missed the point. You don't tell the compiler to "use SIMD
instructions"; you tell the compiler that it /may/ operate on these data
in any order since the operators are commutative, nothing is aliased,
&c.; if the target platform has SIMD instructions the compiler may
choose to use them. I cannot stress enough that this is the compiler's job.
Mixing container abstractions with compiler internals, as your OP seems
to propose, is a pathway to infinite pain; instead, your containers
library (if such a thing be even necessary) should be written in a
manner susceptible to optimisation by SIMD-aware compilers - namely, it
should make full use of the restrict keyword, and handle arrays simply
where possible; also, data dependencies should be carefully controlled.
(Data dependencies limit all reordering)
If, after doing this, you find that the compiler does not use SIMD
instructions, you patch the compiler (in as general a way as possible).
You do _not_ patch the language.
> 1) Using threads to make SIMD is not possible. The overhead of thread
> creation and maintenance is impossible high.

I wasn't suggesting using threads for SIMD; I was merely discussing
parallelisation in general terms.
> 2) Your remarks about "The compiler knows better" have no connection
> with reality. Gcc, that is a highly advanced compiler, is unable
> to do SIMD reliably, as I proved. It suffices a small change and
> the whole thing COLLAPSES.

Ok, I'll clarify. The compiler *in principle* knows better, and if it
gets it wrong, you patch the compiler; you don't patch the language.
> 3) Your remarks about avoiding performance hacks are welcome since it is
> exactly what I was saying: we need a portable way of using SIMD that
> is supported by the language.

"Using SIMD" in your code is a performance hack. /Making the use of
SIMD by the compiler possible/ in your code is portable, and avoids
performance hacks.
> Well, I am using C, and I do not see how LISP would be useful in this
> context since it would need the same extensions that you would need in
> C. Lisp is NOT SIMD per se.

Lisp is inherently parallel; one entirely valid model of execution is to
consider every cons as being executed in parallel, blocking on its data
dependencies from its car and cdr. This is why functional languages
often incorporate monads, because I/O happens to need to be serial.
A sufficiently intelligent LISP compiler could compile (reduce),
(expand), (select), and (extend) to appropriate SIMD instructions if the
target platform has them. But this is c.l.c, not c.l.lisp.
> I have never denied that gcc generates better code than lcc.
> In my posting I spoke of two mainstream compilers like gcc and MSVC
> That is all. And what I said was right: they aren't able to use SIMD
> programming in a useful way.

I suspect your definition of "useful" is isomorphic to "anything
mainstream compilers don't do". As other posters have explained, gcc
can indeed use SIMD in a useful way, but only where sufficient promises
have been made to it (like "restrict") for such use not to be a
violation of the C standard. Since it's good practice to make those
promises wherever you can (even if you don't foresee an immediate
benefit - just the documentation effect of const and restrict justifies
their use), I don't see what you're complaining about.
> But you do not like what I say (and you do not want to admit it
> to yourself) because I am NOT using gcc and even have the audacity of
> trying to build a different compiler!

I laud your efforts to build a compiler; I'm impressed that you are able
to do so. But that doesn't necessarily mean that ideas you have for the
extension of the language are good ones.
> If you get to personal attacks or suppositions we just can't discuss.
> Everything becomes a stupid war of "gcc is the best and you are wrong
> because you use another compiler"

I'm not a High Priest of the One True Way (tm) of gcc. rather my gold
standard is The Standard. Its extension or emendation is justified only
when it can be shown to have failed. On SIMD it has not done so.
gcc may currently use less SIMD than it could. That is a problem to be
solved by work on gcc, or by creating a better compiler that matches
gcc's good qualities while avoiding its bad habits. It is not a problem
to be solved by adding non-portable incompatible extensions to the
language, particularly when the language already provides sufficient
recourse.
-e
--
'sane', adj.: see 'unimaginative'
on the web - http://jttlov.no-ip.org
 
Reply With Quote
 
Stephen Sprunk
Guest
Posts: n/a
 
      01-08-2012
On 07-Jan-12 11:27, Gene wrote:
> On Jan 7, 2:39 pm, Stephen Sprunk <(E-Mail Removed)> wrote:
>> On 07-Jan-12 02:31, jacob navia wrote:
>>> Automatic parallelization must PROVE that no dependencies exist
>>> between the array members, no unforeseen side effects can possibly
>>> occur and ONLY THEN the constructs in question can be parallelized.

>>
>>> All this needs extensive program analysis, what is made even more
>>> difficult with the model of separate module compilation...

>>
>> There are no "dependencies" between array members. There is a possible
>> aliasing problem when dealing with pointers rather than arrays, but that
>> is easily enough solved with "restrict" qualifiers.
>>
>> I don't understand what you think the problem is.

>
> He has to be referring to loop dependencies as in
>
> for (i = 0; i < n; i++) a[i] = a[i-3] + 2;
>
> The main reason I'm writing is to point out that late version C
> compilers including gcc, Intel, and I believe though am not sure,
> Microsoft all will emit vector instructions for loops coded with
> commonsense idioms. I have a signal processing code that speeds up by
> a factor of 3 with gcc when this opimization is turned on.


Right. Unfortunately, GCC seems to only recognize "commonsense idioms"
and misses opportunities to vectorize code that doesn't fit those
idioms, even when doing so would be provably safe. So, you have to know
what idioms it understands and modify your code to comply; that is not
ideal, but it's hardly a new concept--and presumably the set of idioms
will expand over time as folks complain or contribute patches.

I have no idea if ICC (or MSVC?) has the same limitation.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
 
Reply With Quote
 
jacob navia
Guest
Posts: n/a
 
      01-08-2012
Le 08/01/12 18:37, Stephen Sprunk a écrit :
>
> Right. Unfortunately, GCC seems to only recognize "commonsense idioms"
> and misses opportunities to vectorize code that doesn't fit those
> idioms, even when doing so would be provably safe. So, you have to know
> what idioms it understands and modify your code to comply; that is not
> ideal, but it's hardly a new concept--and presumably the set of idioms
> will expand over time as folks complain or contribute patches.
>
> I have no idea if ICC (or MSVC?) has the same limitation.


Not the same but SIMILAR limitations. Then, for each compiler:
do {
learnLimitations();
ModifyCode();
} while (ItsnotInStandardC());



That is the reason for my proposal. There is no standard way of doing
SIMD even if most modern CPUs have SIMD extensions.

 
Reply With Quote
 
Stephen Sprunk
Guest
Posts: n/a
 
      01-08-2012
On 08-Jan-12 04:48, jacob navia wrote:
> Le 07/01/12 14:39, Stephen Sprunk a écrit :
>> I don't understand what you think the problem is.

>
> The problem, Stephen, is as follows:
>
> gcc will vectorize this:
>
> void doADD(double *L, double *R, double *result,size_t n)
> {
> for(size_t i=0; i<n;i++) result[i]=L[i]+R[i];
> }
>
> but will NOT vectorize this:
>
> void doADD(double *L, double *R, double *result,size_t n)
> {
> for(size_t i=0; i<n;i++) result[n-i]=L[n-i]+R[n-i];
> }
>
> Why?


GCC doesn't understand that loop because it's not idiomatic. When I
crank up the verbosity, it tells me that loop has a "complicated access
pattern".

> All this huge machinery for automatic vectorizing has never worked
> correctly for dozens of years since requires an incredible amount
> of program analysis.


It works correctly if you write idiomatic code, and the amount of work
required to recognize a handful of idioms is not "incredible".

> My proposition is:
>
> Why make complicated something that you can do in a simple way?
>
> Let the programmer declare the data that should be handled in an SIMD
> way.


GCC has vector data types for that, eg. v2df, if you want them. IMHO,
it's simpler (and more portable) to just write code that can be
auto-vectorized according to whatever hardware is available on the
particular target of the day.

(I have a different perspective on this, perhaps, because my employer's
code base has to run on over a dozen different CPUs; portability trumps
maximum performance on one particular CPU.)

> If I write:
>
> c = (a+b)/2;
> if (c&1) fn(1);
>
> You could deduce that c must be of integer type isn't it?


Um, no. It could be any arithmetic type.

> Well, for SIMD the problem is the same.


Not really. A vector is just an array of some other type--it's not a
new type itself.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
 
Reply With Quote
 
Keith Thompson
Guest
Posts: n/a
 
      01-08-2012
Stephen Sprunk <(E-Mail Removed)> writes:
> On 08-Jan-12 04:48, jacob navia wrote:

[...]
>> If I write:
>>
>> c = (a+b)/2;
>> if (c&1) fn(1);
>>
>> You could deduce that c must be of integer type isn't it?

>
> Um, no. It could be any arithmetic type.


"c&1" is valid only if c is of some integer type.

--
Keith Thompson (The_Other_Keith) http://www.velocityreviews.com/forums/(E-Mail Removed) <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Reply With Quote
 
jacob navia
Guest
Posts: n/a
 
      01-08-2012
Le 08/01/12 18:52, Stephen Sprunk a écrit :
> On 08-Jan-12 04:48, jacob navia wrote:
>> If I write:
>>
>> c = (a+b)/2;
>> if (c&1) fn(1);
>>
>> You could deduce that c must be of integer type isn't it?

>
> Um, no. It could be any arithmetic type.
>


c&1 implies c is an integer since all other floating types do not
support that operation.

It could be char/short/int/long/long long/, and by reading more
code the compiler could even differentiate among them.
 
Reply With Quote
 
gwowen
Guest
Posts: n/a
 
      01-09-2012
On Jan 8, 10:41*am, jacob navia <(E-Mail Removed)> wrote:
> Nonsense, gcc doesn't have a real vectorizer. For instance:


It doesn't have a *perfect* vectorizer.

> But NO option (including -ftree-vectorize, as you suggest) will
> convince gcc to use the parallel instructions in the first case.


Yes, you can confuse vectorizers. You can also confuse optimizers.

Do you think I can't obfuscate trivial code so badly that lcc can't
optimize it?
Does that mean lcc doesn't have a real optimizer?

Of course it doesn't mean that. By writing clear C I've got a
measurable out of GCC's imperfect vectorizer, and no semantic
arguments will ever prevent that from being true.

> That was the main argument of my post and nobody has given a sensible answer to it.
> Most of the answer are like yours:


No one has agreed with you, certainly. That means one of two things:
i) You're wrong
ii) Everyone else is wrong.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Re: Parallel in, Parallel out shift register Vivek Menon VHDL 0 06-10-2011 10:15 PM
Parallel in, Parallel out shift register Vivek Menon VHDL 5 06-08-2011 03:56 PM
Parallel port control with USB->Parallel converter Soren Python 4 02-14-2008 03:18 PM
STL: container's values setup by another container Maitre Bart C++ 2 02-11-2004 12:11 AM
std::container::iterator vs std::container::pointer Vivi Orunitia C++ 11 02-04-2004 08:09 AM



Advertisments