Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Short-circuiting in C

Reply
Thread Tools

Short-circuiting in C

 
 
James Kuyper
Guest
Posts: n/a
 
      05-28-2011
On 05/26/2011 01:58 PM, Billy Mays wrote:
> Hey CLC,
>
> I was wondering if there is a way to avoid the C short circuiting for
> boolean expressions such as the following:
>
> if(a && b && c) {
> /* Do stuff */
> }
>
> I am currently doing GPU programming and it would be better to evaluate
> the entire condition, even if a is 0, since branches are slow.


I assume that, if you're willing to have the entire condition evaluated,
then neither b nor c represent expressions with side-effects. In
particular, neither of them is or involves the value of an object
defined with volatile.

Then there's two basic cases that need to be distinguished:
1. The compiler can figure out that neither b nor c are expressions with
side effects.
If this is the case, then any sufficiently good compiler targeting your
platform should generate code that evaluates both b and c, and then
discarding their values if not needed; at least, it should do so
whenever it makes sense to do so. This is allowed by the C standard,
even though it says explicitly that they should only be evaluated if
needed. That's because of the fact that they have no side-effects. As a
result, there's no way for a strictly conforming program to determine
whether or not the compiler is doing this, so the as-if rule allows the
compiler to do this.
If your compiler isn't smart enough to perform this optimization
automatically, there's no way in C to tell it explicitly to perform it.

2. The compiler cannot figure out that neither b nor c have side-effects.
About the only way this could be true is if one or both of those
expressions involve a function call. If all three expressions have
integral type, try:

if(a & b & c)
{
}

The use of '&' rather than '&&' removes the short-circuiting behavior,
but the net result is the same, as long as integer types are involved.
If any of the expressions has a floating point or pointer type, you'll
have to insert !=0, and that will probably be converted into a branch,
which is precisely what you're trying to avoid.

--
James Kuyper
 
Reply With Quote
 
 
 
 
Keith Thompson
Guest
Posts: n/a
 
      05-28-2011
James Kuyper <(E-Mail Removed)> writes:
[...]
> 2. The compiler cannot figure out that neither b nor c have side-effects.
> About the only way this could be true is if one or both of those
> expressions involve a function call. If all three expressions have
> integral type, try:
>
> if(a & b & c)
> {
> }
>
> The use of '&' rather than '&&' removes the short-circuiting behavior,
> but the net result is the same, as long as integer types are involved.


And as long as they all have values of 0 or 1.

[...]

--
Keith Thompson (The_Other_Keith) http://www.velocityreviews.com/forums/(E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Reply With Quote
 
 
 
 
blmblm@myrealbox.com
Guest
Posts: n/a
 
      05-28-2011
In article <irpr4j$q20$(E-Mail Removed)>,
Eric Sosman <(E-Mail Removed)> wrote:
> On 5/27/2011 11:28 PM, (E-Mail Removed) wrote:
> > In article<irmuo5$u9t$(E-Mail Removed)>,
> > Eric Sosman<(E-Mail Removed)> wrote:
> >> [...]
> >> C has no way to express the computation you desire. C's abstract
> >> machine has only one flow of control, if we ignore the rather mysterious
> >> effects of asynchronous signals and `volatile' variable. There is no
> >> C construct that will evaluate a,b,c in parallel execution paths and
> >> then somehow combine the three results into a single test. You've
> >> chosen the wrong implementation language.

> >
> > Hesitant though I am to argue with someone who generally seems
> > far more expert than I am ....

>
> "My blushes, Watson." (And don't be too awed by my expertise;
> the only reason I'm in the top ten of this group -- on good days --
> is that the top twenty have stopped posting.)


Well, all right, if that's too much flattery, how about

s/far more expert than I am/very clueful/

?

> > I don't get the impression at all that the OP wants a, b, and
> > c evaluated in parallel. As I understand it, he just wants to
> > be sure that there is no branching involved in the evaluation
> > of the condition (a&& b&& c). I admit I don't have a clear
> > understanding of the platform he's targeting, but it sounds like
> > it somehow involves multiple processing elements[*] executing the
> > same compiled code, and under those circumstances I can vaguely
> > imagine that performance might be better if all these processing
> > elements execute exactly the same instructions (i.e., no branches).
> >
> >[*] The ones that put the "MD" in "SIMD" is what I'm thinking here,
> > though admittedly not with as much clarity as I might like.

>
> He didn't say anything about parallel execution at first, which
> is why I gave a rather different response on the initial go-around.
> But later he wrote that "On the GPU I am using, threads execute
> simultaneously [provided...]," and I understood that as a request for
> a C construct that expresses the "provided" and guarantees parallel
> execution.


I think what I'm not getting is why you're interpreting his words as
a request to have *a, b, and c* evaluated ....

Oh, maybe we're talking at cross purposes here! I understand you to
be saying that he wants the three parts of the conditional executed
concurrently, presumably each in its own thread, and that's what
didn't/doesn't make sense to me.

But maybe in fact you're asking about how he hopes to have multiple
threads concurrently evaluating the whole expression (a && b && c)?
Now *that* would at least make sense as something one might want the
program to do.

But in general I understand him to be saying that the parallelism
(multiple threads operating concurrently, each evaluating the whole
expression) is being provided by some other part of the platform,
and he just wants to know how to come up with an expression whose
evaluation doesn't involve branching. That seems to me to be entirely
topical for this group and something one might want to do for other
reasons (such as improved performance on a platform where any kind
of branching is slow), though admittedly some of those alleged reasons
might be ill-advised.

> C has conditional constructs that may (*may*) involve no
> branches, given smooth seas and a following wind, but has nothing that
> can guarantee or even talk about parallel execution, leaving aside
> signals and volatiles.


Sure, but I don't think that's what he's asking ....

> 'course, it's possible I've misunderstood the O.P. or blundered
> in some other way, and will be summarily drummed out of the top ten.
> Again. (Anybody recall the Brain Trust of the TV series "Scrubs?"
> Them's my guys ...)


No room for error, eh? ?

--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.
 
Reply With Quote
 
BGB
Guest
Posts: n/a
 
      05-28-2011
On 5/28/2011 6:08 AM, Nobody wrote:
> On Sat, 28 May 2011 03:28:20 +0000, blmblm wrote:
>
>> I don't get the impression at all that the OP wants a, b, and
>> c evaluated in parallel. As I understand it, he just wants to
>> be sure that there is no branching involved in the evaluation
>> of the condition (a&& b&& c). I admit I don't have a clear
>> understanding of the platform he's targeting, but it sounds like
>> it somehow involves multiple processing elements[*] executing the
>> same compiled code, and under those circumstances I can vaguely
>> imagine that performance might be better if all these processing
>> elements execute exactly the same instructions (i.e., no branches).

>
> I was (initially) under the impression that he wanted consistency between
> CPU and GPU performance, and GPUs can't do short-circuit evaluation.
>
> On a SIMD architecture (e.g. a GPU), you only have one program counter for
> all processing units, so a branch where the test result can vary between
> units cannot be implemented in the conventional manner (by modifying the
> program counter). An if-else construct involves executing both branches on
> all units; each unit ignores the instructions in one of the branches
> depending upon whether the test is true or false on that particular unit.
>
> OTOH, the OP may have just misunderstood what "branches are slow" means in
> the context of GPU programming. They're slow in the sense that branching
> over a set of instructions takes just as long as executing them. That
> doesn't mean that tricks which eliminate branches will necessarily be any
> faster.
>


AFAIK, there may be a point of difference between nVidia and ATI GPUs
here, where nVidia GPUs will in many cases skip over the non-evaluated
code and ignore the instructions, and ATI GPUs will actually perform the
branch, but typically go down only one side of the branch at a time
(having different threads branch differently may mean first evaluating,
say, all threads where the jump was true, followed by all branches where
it was false, ...).

also, the GPUs are normally organized into a large number of "cores"
which operate independently of each other, but each core may operate
some number of threads in parallel, each thread in turn having to move
in lock-step with the others.


granted, I am no particular expert on GPUs, and most of my personal
experience has been with OpenGL and GLSL, rather than OpenCL and its
modified C-subset (and, sadly, the inability to use function pointers
would very much hinder many of my usual coding practices). I may use it
eventually though.

I think I remember hearing at some point about projects trying to run
both Java-ByteCode and ActionScript on the GPU as well, ...
 
Reply With Quote
 
BGB
Guest
Posts: n/a
 
      05-28-2011
On 5/27/2011 9:46 AM, ImpalerCore wrote:
> On May 27, 12:08 pm, Willem<(E-Mail Removed)> wrote:
>> ImpalerCore wrote:
>>
>> ) On May 26, 1:58?pm, Billy Mays<(E-Mail Removed)> wrote:
>> )> I was wondering if there is a way to avoid the C short circuiting for
>> )> boolean expressions such as the following:
>> )>
>> )> if(a&& b&& c) {
>> )> ? ? ?/* Do stuff */
>> )>
>> )> }
>> )>
>> )> I am currently doing GPU programming and it would be better to evaluate
>> )> the entire condition, even if a is 0, since branches are slow.
>> )
>> ) Not sure if it would work, but what about evaluating the condition
>> ) outside of the 'if' statement?
>> )
>> ) bool my_condition_is_true = a&& b&& c;
>> ) if (my_condition_is_true) {
>> ) /* Do stuff */
>> ) }
>>
>> Nope, that would make the exact same branches.
>> I wouldn't be surprised if it generated exactly the same opcodes, even.

>
> Oh, I get it now. The branches referred to are in the '&&' operator.
> Thanks.


hmm, maybe:
!((!a)|(!b)|(!c))

I don't know if any have mentioned this one.
reasoning: (!a) will reduce the value to 0/1, allowing plain bitwise
operators; if any of a/b/c is false, the result of (!a)/(!b)/(!c) is 1
(rather than 0), which will then be reversed by the final "!".


but, who knows which optimizations the GPU-specific compiler may have
done?...

a compiler may well observe that if all of the operations are clean /
non-destructive, then the as-if rule will allow optimizing away the
short-circuit.

or, knowing about this, they may well have just "reinterpreted" the &&
and || operators for their particular target, since what they have isn't
strictly C anyways, and so there is no particular need for them to
follow pedantic interpretations of the standard either, for that matter.

 
Reply With Quote
 
James Kuyper
Guest
Posts: n/a
 
      05-28-2011
On 05/28/2011 03:15 PM, Keith Thompson wrote:
> James Kuyper <(E-Mail Removed)> writes:
> [...]
>> 2. The compiler cannot figure out that neither b nor c have side-effects.
>> About the only way this could be true is if one or both of those
>> expressions involve a function call. If all three expressions have
>> integral type, try:
>>
>> if(a & b & c)
>> {
>> }
>>
>> The use of '&' rather than '&&' removes the short-circuiting behavior,
>> but the net result is the same, as long as integer types are involved.

>
> And as long as they all have values of 0 or 1.


Sorry, you're right. I was in rush when I wrote that, and I was thinking
of || and |, rather than && and &.
--
James Kuyper
 
Reply With Quote
 
Herbert Rosenau
Guest
Posts: n/a
 
      05-29-2011
On Thu, 26 May 2011 17:58:59 UTC, Billy Mays <(E-Mail Removed)> wrote:

> Hey CLC,
>
> I was wondering if there is a way to avoid the C short circuiting for
> boolean expressions such as the following:
>
> if(a && b && c) {
> /* Do stuff */
> }
>
> I am currently doing GPU programming and it would be better to evaluate
> the entire condition, even if a is 0, since branches are slow.
>

if (!(a | b | c)) {
/* do stuff on neither a, b and c are true */
} else {
/* either a, b or c are false */
}

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 2.0 ist da!
 
Reply With Quote
 
Michael Press
Guest
Posts: n/a
 
      05-30-2011
In article <irpr4j$q20$(E-Mail Removed)>,
Eric Sosman <(E-Mail Removed)> wrote:

> On 5/27/2011 11:28 PM, (E-Mail Removed) wrote:
> > In article<irmuo5$u9t$(E-Mail Removed)>,
> > Eric Sosman<(E-Mail Removed)> wrote:
> >> [...]
> >> C has no way to express the computation you desire. C's abstract
> >> machine has only one flow of control, if we ignore the rather mysterious
> >> effects of asynchronous signals and `volatile' variable. There is no
> >> C construct that will evaluate a,b,c in parallel execution paths and
> >> then somehow combine the three results into a single test. You've
> >> chosen the wrong implementation language.

> >
> > Hesitant though I am to argue with someone who generally seems
> > far more expert than I am ....

>
> "My blushes, Watson." (And don't be too awed by my expertise;
> the only reason I'm in the top ten of this group -- on good days --
> is that the top twenty have stopped posting.)
>
> > I don't get the impression at all that the OP wants a, b, and
> > c evaluated in parallel. As I understand it, he just wants to
> > be sure that there is no branching involved in the evaluation
> > of the condition (a&& b&& c). I admit I don't have a clear
> > understanding of the platform he's targeting, but it sounds like
> > it somehow involves multiple processing elements[*] executing the
> > same compiled code, and under those circumstances I can vaguely
> > imagine that performance might be better if all these processing
> > elements execute exactly the same instructions (i.e., no branches).
> >
> >[*] The ones that put the "MD" in "SIMD" is what I'm thinking here,
> > though admittedly not with as much clarity as I might like.

>
> He didn't say anything about parallel execution at first, which
> is why I gave a rather different response on the initial go-around.
> But later he wrote that "On the GPU I am using, threads execute
> simultaneously [provided...]," and I understood that as a request for
> a C construct that expresses the "provided" and guarantees parallel
> execution. C has conditional constructs that may (*may*) involve no
> branches, given smooth seas and a following wind, but has nothing that
> can guarantee or even talk about parallel execution, leaving aside
> signals and volatiles.
>
> 'course, it's possible I've misunderstood the O.P. or blundered
> in some other way, and will be summarily drummed out of the top ten.
> Again. (Anybody recall the Brain Trust of the TV series "Scrubs?"
> Them's my guys ...)


I have two coins worth thirty cents total.
One of them is not a nickel.

--
Michael Press
 
Reply With Quote
 
BGB
Guest
Posts: n/a
 
      05-31-2011
On 5/30/2011 12:28 AM, Michael Press wrote:
> In article<irpr4j$q20$(E-Mail Removed)>,
> Eric Sosman<(E-Mail Removed)> wrote:
>
>> On 5/27/2011 11:28 PM, (E-Mail Removed) wrote:
>>> In article<irmuo5$u9t$(E-Mail Removed)>,
>>> Eric Sosman<(E-Mail Removed)> wrote:
>>>> [...]
>>>> C has no way to express the computation you desire. C's abstract
>>>> machine has only one flow of control, if we ignore the rather mysterious
>>>> effects of asynchronous signals and `volatile' variable. There is no
>>>> C construct that will evaluate a,b,c in parallel execution paths and
>>>> then somehow combine the three results into a single test. You've
>>>> chosen the wrong implementation language.
>>>
>>> Hesitant though I am to argue with someone who generally seems
>>> far more expert than I am ....

>>
>> "My blushes, Watson." (And don't be too awed by my expertise;
>> the only reason I'm in the top ten of this group -- on good days --
>> is that the top twenty have stopped posting.)
>>
>>> I don't get the impression at all that the OP wants a, b, and
>>> c evaluated in parallel. As I understand it, he just wants to
>>> be sure that there is no branching involved in the evaluation
>>> of the condition (a&& b&& c). I admit I don't have a clear
>>> understanding of the platform he's targeting, but it sounds like
>>> it somehow involves multiple processing elements[*] executing the
>>> same compiled code, and under those circumstances I can vaguely
>>> imagine that performance might be better if all these processing
>>> elements execute exactly the same instructions (i.e., no branches).
>>>
>>>[*] The ones that put the "MD" in "SIMD" is what I'm thinking here,
>>> though admittedly not with as much clarity as I might like.

>>
>> He didn't say anything about parallel execution at first, which
>> is why I gave a rather different response on the initial go-around.
>> But later he wrote that "On the GPU I am using, threads execute
>> simultaneously [provided...]," and I understood that as a request for
>> a C construct that expresses the "provided" and guarantees parallel
>> execution. C has conditional constructs that may (*may*) involve no
>> branches, given smooth seas and a following wind, but has nothing that
>> can guarantee or even talk about parallel execution, leaving aside
>> signals and volatiles.
>>
>> 'course, it's possible I've misunderstood the O.P. or blundered
>> in some other way, and will be summarily drummed out of the top ten.
>> Again. (Anybody recall the Brain Trust of the TV series "Scrubs?"
>> Them's my guys ...)

>
> I have two coins worth thirty cents total.
> One of them is not a nickel.
>


errm... "what is a quarter?...".

but, yeah, the GPU is weird that way, as it tends to execute things in
parallel (rather than the more usual serial execution).
 
Reply With Quote
 
Edward A. Falk
Guest
Posts: n/a
 
      05-31-2011
In article <irm4et$66d$(E-Mail Removed)>,
Billy Mays <(E-Mail Removed)> wrote:
>Hey CLC,
>
>I was wondering if there is a way to avoid the C short circuiting for
>boolean expressions such as the following:
>
>if(a && b && c) {
> /* Do stuff */
>}
>
>I am currently doing GPU programming and it would be better to evaluate
>the entire condition, even if a is 0, since branches are slow.


Unless a, b, and c are all simple register values, you'll have a hard
time convincing me that simple short branches are slower than evaluating
them.

--
-Ed Falk, (E-Mail Removed)
http://thespamdiaries.blogspot.com/
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off




Advertisments