Velocity Reviews > Nested loop limit?

# Nested loop limit?

Dan Christensen
Guest
Posts: n/a

 07-11-2004
Peter Hansen <(E-Mail Removed)> writes:

> >>> for n in range(100):

> ... exec '\n'.join([(' ' * i) + 'for i%s in range(2):' % i for i in
> range(n)])
> + '\n' + ' ' * n + 'pass\n'

To tie two threads together, I shudder to think of how unreadable
python code could be if there was a ternary operator.

Dan

Peter Hansen
Guest
Posts: n/a

 07-11-2004
Dan Christensen wrote:

> Peter Hansen <(E-Mail Removed)> writes:
>>If you consider that statically nested for loops must in some
>>way represent different dimensions, is it really possible that
>>a problem can have more than 20 dimensions (or even nearly that
>>many!) which must all be looped over simultaneously?

>
> Well, in a computation in quantum gravity, I have the C code shown
> below. It has exactly 20 nested for loops! There are of course
> other ways to write it, but given the number of iterations, speed
> is the bottom line.

(Thanks for the _real_ real-world example.

> #define l1(a) for(Label[a] = 0; Label[a] <= cutoff; Label[a]++)
> #define l2(a,b,c,d) \

[snip code sample]

I won't claim to have tried, or even to be able, to understand the
purpose of the code , but its general appearance suggests to
me that it could also be written, perhaps with equivalent or
conceivably even better performance, and at least more generality,
with some kind of table-driven approach. I'll certainly bow
to your better judgment if you've considered that possibility.
(And I wonder if you would have written it as statically nested
loops if you didn't have macros to fall back on. One might
say that you're sort of cheating there, since the loop structure
is mostly absent by virtue of having used C.)

Anyway, trust an engineer to think the world can be simple,
and trust a physicist to show that sometimes it's not.

(Bonvenon al "Pitonujo". Kaj bv. saluti Lindi de mi.

-Peter

Ville Vainio
Guest
Posts: n/a

 07-11-2004
>>>>> "Dan" == Dan Christensen <(E-Mail Removed)> writes:

Dan> I'm converting much of this project to python, but will
Dan> probably keep this part in C and wrap it with SWIG.

While agreeing with Peter about the table driven approach (possibly
with some recursion thrown in, the number of recursion levels is not
so limited), your code seems like a textbook example of code where C
performance is going to murder Python performance, so keeping it in C
might indeed make sense if it's run often...

--
Ville Vainio http://tinyurl.com/2prnb

Dan Christensen
Guest
Posts: n/a

 07-11-2004
Peter Hansen <(E-Mail Removed)> writes:

> Dan Christensen wrote:
>
>> Well, in a computation in quantum gravity, I have the C code shown
>> below. It has exactly 20 nested for loops! There are of course
>> other ways to write it, but given the number of iterations, speed
>> is the bottom line.

>
> (Thanks for the _real_ real-world example.

....
> its general appearance suggests to
> me that it could also be written, perhaps with equivalent or
> conceivably even better performance, and at least more generality,
> with some kind of table-driven approach. I'll certainly bow
> to your better judgment if you've considered that possibility.

Yes, I've considered it, and in fact I'll be forced to use a
table-driven approach to handle more general situations, where the
geometry is read in from a file. I haven't done timing comparisons,
but I would be very surprised if it was faster. There would be some
beneficial cache effects from the shorter code, but more lookups and
index shuffling which I would expect to slow things down. But maybe
I'm not familiar with some tricks that would make this technique
faster.

As for maintainability of my method, using the macros means that
writing the code is very similar to filling in the table in the first
place!

I've actually toyed with the idea of having the program generate the
code (maybe in Python) on the fly, compile it (e.g. with psyco), and
then run it. But with Python's limit of 20 nested loops, that won't
fly.

Dan

Bengt Richter
Guest
Posts: n/a

 07-11-2004
On Sun, 11 Jul 2004 13:21:37 -0400, Dan Christensen <(E-Mail Removed)> wrote:

>Peter Hansen <(E-Mail Removed)> writes:
>
>> Dan Christensen wrote:
>>
>>> Well, in a computation in quantum gravity, I have the C code shown
>>> below. It has exactly 20 nested for loops! There are of course
>>> other ways to write it, but given the number of iterations, speed
>>> is the bottom line.

>>
>> (Thanks for the _real_ real-world example.

>...
>> its general appearance suggests to
>> me that it could also be written, perhaps with equivalent or
>> conceivably even better performance, and at least more generality,
>> with some kind of table-driven approach. I'll certainly bow
>> to your better judgment if you've considered that possibility.

>
>Yes, I've considered it, and in fact I'll be forced to use a
>table-driven approach to handle more general situations, where the
>geometry is read in from a file. I haven't done timing comparisons,
>but I would be very surprised if it was faster. There would be some
>beneficial cache effects from the shorter code, but more lookups and
>index shuffling which I would expect to slow things down. But maybe
>I'm not familiar with some tricks that would make this technique
>faster.
>
>As for maintainability of my method, using the macros means that
>writing the code is very similar to filling in the table in the first
>place!
>
>I've actually toyed with the idea of having the program generate the
>code (maybe in Python) on the fly, compile it (e.g. with psyco), and
>then run it. But with Python's limit of 20 nested loops, that won't
>fly.
>
>Dan

How many total executions of the innermost loop are you expecting?

At 2 states per nested loop, ISTM you'd have 2**20 unless some logic prunes it (not bad).
But at say 10 states per nested loop, ISTM 10**20 would imply a long wait (>3000 years
if you could do a billion loops/second , unless you have a way of parallelizing
over a LARGE farm and CONSIDERABLE pruning happens.

Perhaps a clue to your real problem might yield some ideas for alternatives
to massive nested looping, where just the repeated control overhead of the innermost
loops by itself could add up to a long wait.

Regards,
Bengt Richter

Dan Christensen
Guest
Posts: n/a

 07-11-2004
http://www.velocityreviews.com/forums/(E-Mail Removed) (Bengt Richter) writes:

> How many total executions of the innermost loop are you expecting?

In our longest runs, I think we had this many: 77 922 135 056 261.
This was distributed over a large cluster of machines, as you guessed,
and each job was coded with nested for loops.

> Perhaps a clue to your real problem might yield some ideas for alternatives
> to massive nested looping

We can get the answers with statistical methods in minutes, but we
needed to know that the statistical methods were accurate in this case
before extrapolating to other cases.

Dan

Fernando Perez
Guest
Posts: n/a

 07-11-2004
Dan Christensen wrote:

> Well, in a computation in quantum gravity, I have the C code shown
> below. It has exactly 20 nested for loops! There are of course
> other ways to write it, but given the number of iterations, speed
> is the bottom line. Unfortunately, this is only the simplest test
> case of a larger problem...

You might want to have a look at:

The approach outlined here is a fairly generic framework to tackle
high-dimensionality problems, there's another preprint with more examples
coming down the pipeline.

With d=20, since your complexity for truly nested stuff goes like N^d you are
hosed for almost any value of N. Combining the ideas from the paper above
with:

it is possible to write separated forms of many interesting kernels in
mathematical physics, providing algorithms which scale linearly with d (albeit
with big constants in front).

I'd like to know in a bit more detail what the context of your calculation is,
and whether the 20 comes from looping over dimesionalities of some
string-derived model or something else. We're very interested in finding
contexts where these dimesionality-reduction approaches may be used. While my
background is not in quantum gravity, if you keep the discussion generic enough
I should be able to follow (keep it bounded by general relativity, the standard
model and introductory stringy stuff and I should be fine).

Feel free to contact me directly at fperez AT colorado DOT edu if you don't feel
like talking about quantum gravity on c.l.py

Regards,

f

Terry Reedy
Guest
Posts: n/a

 07-11-2004

"Bengt Richter" <(E-Mail Removed)> wrote in message
news:ccrucm\$qbv\$0@216.39.172.122...
> How many total executions of the innermost loop are you expecting?
>
> At 2 states per nested loop, ISTM you'd have 2**20 unless some logic

And there are at least two simple ways to extend the limit: a) have the
innermost loop call a function with its own set of nestings; b) flatten the
nesting with explicit cross-products, as in replacing

for a in [1,2]:
for b in [1,2]: # with

for a,b in [(1,1), (1,2), (2,1), (2,2)]:

Terry J. Reedy

Paul Rubin
Guest
Posts: n/a

 07-11-2004
Dan Christensen <(E-Mail Removed)> writes:
> I've actually toyed with the idea of having the program generate the
> code (maybe in Python) on the fly, compile it (e.g. with psyco), and
> then run it. But with Python's limit of 20 nested loops, that won't fly.

Since I doubt you care very much about portability to lots of different
installations, maybe you could just recompile your copy of Python to use
a higher limit.

Peter Hansen
Guest
Posts: n/a

 07-12-2004
Dan Christensen wrote:

> I've actually toyed with the idea of having the program generate the
> code (maybe in Python) on the fly, compile it (e.g. with psyco), and
> then run it. But with Python's limit of 20 nested loops, that won't
> fly.

Maybe Pyrex does not have such a limit, or at least has a higher
one. Then you could generate the Pyrex code on the fly, and
maybe still get C performance.