Velocity Reviews > relative speed of incrementation syntaxes (or "i=i+1" VS "i+=1")

# relative speed of incrementation syntaxes (or "i=i+1" VS "i+=1")

Laurent
Guest
Posts: n/a

 08-21-2011

> With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with
> floats (0.0 and 1.0), 6%

For floats it is understandable. But for integers, seriously, 4% is a lot. I would never have thought an interpreter would have differences like this in syntax for something as fundamental as adding 1.

Laurent
Guest
Posts: n/a

 08-21-2011
Actually, a 6% difference between floats themselves is just as hard to understand.

Laurent
Guest
Posts: n/a

 08-21-2011
I did the test several times with floats on my machine and the difference is not as big as for integers:

==> "i = i + 1.0" is 0.732% faster than "i += 1.0".

This seems normal: float addition is supposed to be slower than integer addition, so the syntactic difference is comparatively less important.

Nobody
Guest
Posts: n/a

 08-21-2011
On Sun, 21 Aug 2011 09:52:23 -0700, Laurent wrote:

> I did many tests and "i = i + 1" always seems to be around 2% faster
> than "i += 1". This is no surprise as the += notation seems to be a
> syntactic sugar layer that has to be converted to i = i + 1 anyway. Am I
> wrong in my interpretation?

It depends. If the value on the left has an __iadd__ method, that will be
called; the value will be updated in-place, so all references to that
object will be affected:

>>> import numpy as np
>>> a = np.zeros(3)
>>> b = a
>>> a
array([ 0.,  0.,  0.])
>>> b
array([ 0.,  0.,  0.])
>>> a += 1
>>> a
array([ 1.,  1.,  1.])
>>> b
array([ 1.,  1.,  1.])

If the value on the left doesn't have an __iadd__ method, then addition is
performed and the name is re-bound to the result:

>>> a = a + 1
>>> a
array([ 2.,  2.,  2.])
>>> b
array([ 1.,  1.,  1.])

If you're writing code which could reasonably be expected to work with
arbitrary "numeric" values, you should decide which to use according to
whether in-place modification is appropriate rather than trivial
performance differences. If a difference of a few percent is significant,
Python is probably the wrong language in the first place.
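
The same in-place vs. rebinding distinction can be shown without numpy, using a plain list (a minimal sketch: lists define __iadd__ and mutate in place, while "=" rebinds the name to a new object):

```python
a = [0, 0, 0]
b = a          # b refers to the same list object as a

a += [1]       # calls list.__iadd__: mutates the list in place
print(a is b)  # True  -- both names still refer to the one object
print(b)       # [0, 0, 0, 1]

a = a + [2]    # builds a new list, then rebinds the name a
print(a is b)  # False -- b keeps the old (mutated) object
print(b)       # [0, 0, 0, 1]
```

Immutable types like int have no __iadd__, which is why for integers the two spellings end up doing essentially the same work.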

Andreas Löscher
Guest
Posts: n/a

 08-21-2011
On Sunday, 21.08.2011, at 14:52 -0400, Roy Smith wrote:
> In article <(E-Mail Removed)>,
> Christian Heimes <(E-Mail Removed)> wrote:
>
> > On 21.08.2011 19:27, Andreas Löscher wrote:
> > > As for using Integers, the first case (line 1319 and 1535) are true and
> > > there is no difference in Code. However, Python uses a huge switch-case
> > > construct to execute its opcodes and INPLACE_ADD comes after
> > > BINARY_ADD, hence the difference in speed.

> >
> > I don't think that's the reason. Modern compiles turn a switch statement
> > into a jump or branch table rather than a linear search like chained
> > elif statements.

>
> This is true even for very small values of "modern". I remember the
> Unix v6 C compiler (circa 1977) was able to do this.

What is the difference in speed between a jump table that is searched
from top to bottom and an ordinary if-then-elif chain? The difference
can only be in the search algorithm used on the table. Without
optimization (linear search) both are the same. If the compiler applies
some magic the difference can be relevant: linear complexity for
if-then-elif, and O(1) if you use a dictionary.

Hence the executed code for integers is the same, there must be a slower

How would such a jump table work to behave the same way as a
switch-case statement? Beware that things like

case PRINT_NEWLINE_TO:
    w = stream = POP();
    /* fall through to PRINT_NEWLINE */

case PRINT_NEWLINE:

must be supported.

By the way:
First lines of ceval.c since at least version 2.4:

/* Execute compiled code */

/* XXX TO DO:
   XXX speed up searching for keywords by using a dictionary
   XXX document it!
*/
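
The opcode difference between the two spellings is easy to check with the dis module (a quick sketch; on the CPython versions discussed here these show BINARY_ADD vs. INPLACE_ADD, while CPython 3.11+ merges both into a single BINARY_OP instruction):

```python
import dis

def plain(i):
    i = i + 1
    return i

def augmented(i):
    i += 1
    return i

# Print the bytecode of each function so the add opcodes can be compared.
dis.dis(plain)
dis.dis(augmented)
```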

Andreas Löscher
Guest
Posts: n/a

 08-21-2011
On Sunday, 21.08.2011, at 12:53 -0700, Laurent wrote:
> > With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with
> > floats (0.0 and 1.0), 6%

>
> For floats it is understandable. But for integers, seriously, 4% is a lot. I would never have thought an interpreter would have differences like this in syntax for something as fundamental as adding 1.

It's not as bad as you think. The addition of two integers is a cheap
task (in terms of computation power). If you have two ways to do it,
every little thing (jumps in the code, etc.) will have a larger impact
on the execution time than it would on an expensive operation.

But any improvement to your algorithm will easily result in a
significantly shorter execution time than replacing a+=1 with a=a+1
ever will.
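
The difference is easy to measure yourself with timeit (a quick sketch; absolute numbers, and even which spelling wins, vary by machine and interpreter version, so no winner is claimed here):

```python
import timeit

N = 1_000_000

# Time each spelling of the increment N times; setup binds i once.
t_plain = timeit.timeit("i = i + 1", setup="i = 0", number=N)
t_aug = timeit.timeit("i += 1", setup="i = 0", number=N)

print("i = i + 1: %.3fs   i += 1: %.3fs" % (t_plain, t_aug))
```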

Chris Angelico
Guest
Posts: n/a

 08-21-2011
2011/8/22 Andreas Löscher <(E-Mail Removed)-chemnitz.de>:
> How would such a jump table work to behave the same way as a
> switch-case statement?

If your switch statement uses a simple integer enum with sequential
values, then it can be done quite easily. Take this as an example:

switch (argc)
{
    case 0: printf("No args at all, this is weird\n"); break;
    case 1: printf("No args\n"); break;
    case 2: printf("Default for second arg\n");
    case 3: printf("Two args\n"); break;
    default: printf("Too many args\n"); break;
}

I compiled this using Open Watcom C, looked at the disassembly, and
hereby translate it into pseudocode (I'll email/post the full 80x86
disassembly if you like):

1) Check if argc > 3 (unsigned comparison), if so jump to default case.
2) Left shift argc two places, add a constant offset, fetch a pointer
from there, and jump to it - that's the jump table. One JMP statement.
3) Code follows for each case.

Incidentally, the Open Watcom compiler actually turned several of the
cases into offset-load of the appropriate string pointer, and then a
jump to the single call to printf. The fall-through from 'case 2' to
'case 3' works fine, although it means that 'case 2' has to be
de-optimized from that one simplification.

This type of optimization works best when the case values are
sequential. (If I remove the 'case 0', the compiler decrements argc
and proceeds to continue as above.) Otherwise, the jump table has to
have a lot of copies of the "default" pointer.
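
Python itself has no switch statement, but the O(1) behaviour of a jump table can be sketched as a dict of handler functions (the opcode names below are made up for illustration; this is not CPython's actual dispatch loop, which uses a C switch):

```python
# Handlers for two hypothetical stack-machine opcodes.
def op_add(stack):
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)

def op_dup(stack):
    stack.append(stack[-1])

# Dispatch table: opcode name -> handler. The dict lookup is O(1),
# like indexing into a jump table, regardless of how many cases exist.
DISPATCH = {
    "BINARY_ADD": op_add,
    "DUP_TOP": op_dup,
}

stack = [2, 3]
DISPATCH["BINARY_ADD"](stack)
print(stack)  # [5]
```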

Chris Angelico

Terry Reedy
Guest
Posts: n/a

 08-21-2011
On 8/21/2011 7:17 PM, Andreas Löscher wrote:
> On Sunday, 21.08.2011, at 14:52 -0400, Roy Smith wrote:
>> In article<(E-Mail Removed)>,
>> Christian Heimes<(E-Mail Removed)> wrote:
>>
>>> On 21.08.2011 19:27, Andreas Löscher wrote:
>>>> As for using Integers, the first case (line 1319 and 1535) are true and
>>>> there is no difference in Code. However, Python uses a huge switch-case
>>>> construct to execute its opcodes and INPLACE_ADD comes after
>>>> BINARY_ADD, hence the difference in speed.
>>>
>>> I don't think that's the reason. Modern compiles turn a switch statement
>>> into a jump or branch table rather than a linear search like chained
>>> elif statements.

>>
>> This is true even for very small values of "modern". I remember the
>> Unix v6 C compiler (circa 1977) was able to do this.

>
> What is the difference in speed between a jump table that is searched
> from top to bottom in comparison to an ordinary if-then-elif...? The
> difference can only be in the search algorithm regarding the table.
> Without optimization (linear search) both are the same. If the compiler
> applies some magic the difference can be relevant (linear complexity for
> if-then-elif... and O(1) if you would use a dictionary).

A jump or branch table is applicable when the case values are all
small ints, like bytes or less. For C, the table is simply an array of
pointers (addresses); entries for unused byte codes would be void
pointers. Hence, O(1) access.
https://secure.wikimedia.org/wikiped...iki/Jump_table

> Hence the executed code for integers is the same, there must be a slower
>
> How would such a jump table work to behave the same way as a
> switch-case statement? Beware that things like
>
> case PRINT_NEWLINE_TO:
>     w = stream = POP();
>     /* fall through to PRINT_NEWLINE */
>
> case PRINT_NEWLINE:
>
> must be supported.

--
Terry Jan Reedy

Chris Angelico
Guest
Posts: n/a

 08-21-2011
2011/8/22 Andreas Löscher <(E-Mail Removed)-chemnitz.de>:
> But any improvement to your algorithm will easily result in a
> significantly shorter execution time than replacing a+=1 with a=a+1
> ever will.
>

Agreed. If Python needed a faster alternative to "a=a+1", then I would
recommend an "a.inc()" call or something; some way to avoid looking up
the value of 1. (An equivalent to C's ++a operation, if you like.) But
I think you'd be hard-pressed to find any situation where improving
the speed of incrementing will be significant, that wouldn't be better
served by algorithmic improvements (even just using "for a in
range()") or dropping to C.

ChrisA