Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > error when porting C code to Python (bitwise manipulation)

Reply
Thread Tools

error when porting C code to Python (bitwise manipulation)

 
 
Jordan
Guest
Posts: n/a
 
      07-10-2008
I am trying to rewrite some C source code for a poker hand evaluator
in Python. Putting aside all of the comments such as just using the C
code, or using SWIG, etc. I have been having problems with my Python
code not responding the same way as the C version.

C verison:

unsigned find_fast(unsigned u)
{
unsigned a, b, r;
u += 0xe91aaa35;
u ^= u >> 16;
u += u << 8;
u ^= u >> 4;
b = (u >> & 0x1ff;
a = (u + (u << 2)) >> 19;
r = a ^ hash_adjust[b];
return r;
}


my version (Python, hopefully ):

def find_fast(u):
u += 0xe91aaa35
u ^= u >> 16
u += u << 8
u ^= u >> 4
b = (u >> & 0x1ff
a = (u + (u << 2)) >> 19
r = a ^ hash_adjust[b]
return r


As far as I understand the unsigned instructions in C just increase
amount of bytes the int can hold, and Python automatically converts to
longs which have infinite size when necessary, so I am not sure why I
am getting different results.

I assume that I am missing something fairly simple here, so help a
n00b out if you can

Thanks in advance,

jnb
 
Reply With Quote
 
 
 
 
Dan Stromberg
Guest
Posts: n/a
 
      07-10-2008
On Wed, 09 Jul 2008 20:56:59 -0700, Jordan wrote:

> I am trying to rewrite some C source code for a poker hand evaluator in
> Python. Putting aside all of the comments such as just using the C
> code, or using SWIG, etc. I have been having problems with my Python
> code not responding the same way as the C version.
>
> C verison:
>
> unsigned find_fast(unsigned u)
> {
> unsigned a, b, r;
> u += 0xe91aaa35;
> u ^= u >> 16;
> u += u << 8;
> u ^= u >> 4;
> b = (u >> & 0x1ff;
> a = (u + (u << 2)) >> 19;
> r = a ^ hash_adjust[b];
> return r;
> }
>
>
> my version (Python, hopefully ):
>
> def find_fast(u):
> u += 0xe91aaa35
> u ^= u >> 16
> u += u << 8
> u ^= u >> 4
> b = (u >> & 0x1ff
> a = (u + (u << 2)) >> 19
> r = a ^ hash_adjust[b]
> return r
>
>
> As far as I understand the unsigned instructions in C just increase
> amount of bytes the int can hold, and Python automatically converts to
> longs which have infinite size when necessary, so I am not sure why I am
> getting different results.
>
> I assume that I am missing something fairly simple here, so help a n00b
> out if you can
>
> Thanks in advance,
>
> jnb


What business does a poker hand evaluator have doing that kind of bitwise
arithmetic? One problem with C is not the language itself, but the
culture of using bitwise tricks where they aren't really necessary.

Anyway, I believe in C unsigned bitwise arithmetic, when overflowing an
integer will simply throw away the bits that are "too big". So if python
is converting to a long when overflowing, that would cause a different
result right there.

You could try throwing in "&= 0xffffffff" all over the place if the C
code was written for a 32 bit unsigned int. unsigned int will usually be
32 or 64 bits these days. If it's a 64 bit unsigned int in C, it'd be
"&= 0xffffffffffffffff".

 
Reply With Quote
 
 
 
 
Jordan
Guest
Posts: n/a
 
      07-10-2008
I was actually just going through an example to show what was
happening each step of the way and noticed the overflow!!! bah, stupid
tricks tricks tricks!!!

The problem is def the overflow, I notice that I start to get negative
numbers in the C version, which makes me think that the & 0xffffffff
trick won't work (because it will never evaluate to negative in
python, right?)


Seeing that the problem is the overflow and the bitwise operations
returning a negative, does anyone have any suggestions...I will look
more into C bitwise tricks in the meantime haha.

And in terms of what this is doing in a poker hand evaluator:

http://www.suffecool.net/poker/evaluator.html (an evaluator using
some nice tricks to evaluate for flushes, straights, and highcard with
LU tables then binary search for the rest)

then

http://senzee.blogspot.com/2006/06/s...fect-hash.html (does the
same thing, but uses perfect hashing instead of a binary search)

the function I am having issues with comes up in the hashing
algorithm
 
Reply With Quote
 
Jordan
Guest
Posts: n/a
 
      07-10-2008
I realize I did a pretty bad job of explaining the problem. The
problem is the python version is returning an r that is WAAAAY to big.

Here is an example run through that function in each language:

C:

u starts at 1050

u += 0xe91aaa35;

u is now -384127409

u ^= u >> 16;

u is now -384153771

u += u << 8;

u is now 56728661

u ^= u >> 4;

u is now 56067472

b = (u >> & 0x1ff;

b is now 389

a = (u + (u << 2)) >> 19;

a is now 534

r = a ^ hash_adjust[b];

r is now 6366



Python:

u starts at 1050

u += 0xe91aaa35

u is now 3910839887L

rut roh...
 
Reply With Quote
 
Jordan
Guest
Posts: n/a
 
      07-10-2008
if after the first step (u += 0xe91aaa35) you apply this function:

invert = lambda x: ~int(hex(0xffffffff - x)[0:-1],16)

it returns the correct value (corrected the overflow)

but there is still something wrong, still looking into it, if someone
knows how to do this, feel free to comment
 
Reply With Quote
 
Peter Otten
Guest
Posts: n/a
 
      07-10-2008
Jordan wrote:

> C:
>
> u starts at 1050
>
> u += 0xe91aaa35;
>
> u is now -384127409


Hm, a negative unsigned...

> Python:
>
> Â* Â*u starts at 1050
>
> Â* Â*u += 0xe91aaa35
>
> Â* Â*u is now Â*3910839887L


Seriously, masking off the leading ones is the way to go:

>>> -384127409 & 0xffffffff == 3910839887 & 0xffffffff

True

Peter
 
Reply With Quote
 
Jordan
Guest
Posts: n/a
 
      07-10-2008
Well, I have figured out something that works:

def findit(u):
u += 0xe91aaa35
u1 = ~(0xffffffff - u) ^ u >> 16
u1 += ((u1 << & 0xffffffff)
u1 ^= (u1 & 0xffffffff) >> 4
b = (u1 >> & 0x1ff
a = (u1 + (u1 << 2) & 0xffffffff) >> 19
r = int(a) ^ hash_adjust[int(b)]
return r


I feel like this cannot possibly be the best way of doing this, but it
does work!!!! haha

If anyone would care to share a more elegant solution, that would be
great
 
Reply With Quote
 
MRAB
Guest
Posts: n/a
 
      07-10-2008
On Jul 10, 4:56*am, Jordan <JordanNealB...@gmail.com> wrote:
> I am trying to rewrite some C source code for a poker hand evaluator
> in Python. *Putting aside all of the comments such as just using the C
> code, or using SWIG, etc. *I have been having problems with my Python
> code not responding the same way as the C version.
>
> C verison:
>
> unsigned find_fast(unsigned u)
> {
> * * unsigned a, b, r;
> * * u += 0xe91aaa35;
> * * u ^= u >> 16;
> * * u += u << 8;
> * * u ^= u >> 4;
> * * b *= (u >> & 0x1ff;
> * * a *= (u + (u << 2)) >> 19;
> * * r *= a ^ hash_adjust[b];
> * * return r;
>
> }
>
> my version (Python, hopefully ):
>
> def find_fast(u):
> * * u += 0xe91aaa35
> * * u ^= u >> 16
> * * u += u << 8
> * * u ^= u >> 4
> * * b *= (u >> & 0x1ff
> * * a *= (u + (u << 2)) >> 19
> * * r *= a ^ hash_adjust[b]
> * * return r
>
> As far as I understand the unsigned instructions in C just increase
> amount of bytes the int can hold, and Python automatically converts to
> longs which have infinite size when necessary, so I am not sure why I
> am getting different results.
>
> I assume that I am missing something fairly simple here, so help a
> n00b out if you can
>
> Thanks in advance,
>
> jnb


You want to restrict the values to 32 bits. The result of + or << may
exceed 32 bits, so you need to mask off the excess bits afterwards.

def find_fast(u):
mask = 0xffffffff
u = (u + 0xe91aaa35) & mask
u ^= u >> 16
u = (u + (u << ) & mask # can get away with only 1 mask here
u ^= u >> 4
b = (u >> & 0x1ff
a = ((u + (u << 2)) & mask) >> 19 # can get away with only 1 mask
here
r = a ^ hash_adjust[b]
return r

HTH
 
Reply With Quote
 
Jordan
Guest
Posts: n/a
 
      07-10-2008
On Jul 10, 1:35*pm, MRAB <goo...@mrabarnett.plus.com> wrote:
> On Jul 10, 4:56*am, Jordan <JordanNealB...@gmail.com> wrote:
>
>
>
> > I am trying to rewrite some C source code for a poker hand evaluator
> > in Python. *Putting aside all of the comments such as just using the C
> > code, or using SWIG, etc. *I have been having problems with my Python
> > code not responding the same way as the C version.

>
> > C verison:

>
> > unsigned find_fast(unsigned u)
> > {
> > * * unsigned a, b, r;
> > * * u += 0xe91aaa35;
> > * * u ^= u >> 16;
> > * * u += u << 8;
> > * * u ^= u >> 4;
> > * * b *= (u >> & 0x1ff;
> > * * a *= (u + (u << 2)) >> 19;
> > * * r *= a ^ hash_adjust[b];
> > * * return r;

>
> > }

>
> > my version (Python, hopefully ):

>
> > def find_fast(u):
> > * * u += 0xe91aaa35
> > * * u ^= u >> 16
> > * * u += u << 8
> > * * u ^= u >> 4
> > * * b *= (u >> & 0x1ff
> > * * a *= (u + (u << 2)) >> 19
> > * * r *= a ^ hash_adjust[b]
> > * * return r

>
> > As far as I understand the unsigned instructions in C just increase
> > amount of bytes the int can hold, and Python automatically converts to
> > longs which have infinite size when necessary, so I am not sure why I
> > am getting different results.

>
> > I assume that I am missing something fairly simple here, so help a
> > n00b out if you can

>
> > Thanks in advance,

>
> > jnb

>
> You want to restrict the values to 32 bits. The result of + or << may
> exceed 32 bits, so you need to mask off the excess bits afterwards.
>
> def find_fast(u):
> * * mask = 0xffffffff
> * * u *= (u + 0xe91aaa35) & mask
> * * u ^= u >> 16
> * * u *= (u + (u << ) & mask # can get away with only 1 mask here
> * * u ^= u >> 4
> * * b *= (u >> & 0x1ff
> * * a *= ((u + (u << 2)) & mask) >> 19 # can get away with only 1 mask
> here
> * * r *= a ^ hash_adjust[b]
> * * return r
>
> HTH


Well, I guess there are two problems....the masking and the fact the
in C it seems to for some reason overflow and become a negative
value....still not sure why it does it....So the code with just
masking doesn't work, you still need some sort of weird inversion like
the ~(0xFFFFFFFF - u).....weird

anyone?

haha
 
Reply With Quote
 
Harald Luessen
Guest
Posts: n/a
 
      07-10-2008
On Thu, 10 Jul 2008 Jordan wrote:

>On Jul 10, 1:35*pm, MRAB <goo...@mrabarnett.plus.com> wrote:
>> On Jul 10, 4:56*am, Jordan <JordanNealB...@gmail.com> wrote:
>>
>>
>>
>> > I am trying to rewrite some C source code for a poker hand evaluator
>> > in Python. *Putting aside all of the comments such as just using the C
>> > code, or using SWIG, etc. *I have been having problems with my Python
>> > code not responding the same way as the C version.

>>
>> > C verison:

>>
>> > unsigned find_fast(unsigned u)
>> > {
>> > * * unsigned a, b, r;
>> > * * u += 0xe91aaa35;
>> > * * u ^= u >> 16;
>> > * * u += u << 8;
>> > * * u ^= u >> 4;
>> > * * b *= (u >> & 0x1ff;
>> > * * a *= (u + (u << 2)) >> 19;
>> > * * r *= a ^ hash_adjust[b];
>> > * * return r;

>>
>> > }

>>
>> > my version (Python, hopefully ):

>>
>> > def find_fast(u):
>> > * * u += 0xe91aaa35
>> > * * u ^= u >> 16
>> > * * u += u << 8
>> > * * u ^= u >> 4
>> > * * b *= (u >> & 0x1ff
>> > * * a *= (u + (u << 2)) >> 19
>> > * * r *= a ^ hash_adjust[b]
>> > * * return r

>>
>> > As far as I understand the unsigned instructions in C just increase
>> > amount of bytes the int can hold, and Python automatically converts to
>> > longs which have infinite size when necessary, so I am not sure why I
>> > am getting different results.

>>
>> > I assume that I am missing something fairly simple here, so help a
>> > n00b out if you can

>>
>> > Thanks in advance,

>>
>> > jnb

>>
>> You want to restrict the values to 32 bits. The result of + or << may
>> exceed 32 bits, so you need to mask off the excess bits afterwards.
>>
>> def find_fast(u):
>> * * mask = 0xffffffff
>> * * u *= (u + 0xe91aaa35) & mask
>> * * u ^= u >> 16
>> * * u *= (u + (u << ) & mask # can get away with only 1 mask here
>> * * u ^= u >> 4
>> * * b *= (u >> & 0x1ff
>> * * a *= ((u + (u << 2)) & mask) >> 19 # can get away with only 1 mask
>> here
>> * * r *= a ^ hash_adjust[b]
>> * * return r
>>
>> HTH

>
>Well, I guess there are two problems....the masking and the fact the
>in C it seems to for some reason overflow and become a negative
>value....still not sure why it does it....So the code with just
>masking doesn't work, you still need some sort of weird inversion like
>the ~(0xFFFFFFFF - u).....weird
>
>anyone?


In C unsigned can not be negative. Why do you believe
the numbers are negative? If your debugger is telling
you this thow away the debugger and use printf.
If printf is telling you this then use the right format.
printf("%u", u); // for unsigned int u
printf("%lu", u); // for unsigned long u
printf("%x", u);
or
printf("0x%08x", u); // to see u in hex

Harald

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Problem with porting 1.1 code to 2.0 =?Utf-8?B?QmlsbA==?= ASP .Net 1 01-06-2006 07:06 AM
Porting Code from one machine to another DevBoy ASP .Net 0 05-06-2004 01:14 PM
Re: [porting code from c++ to java] Enis Java 2 04-03-2004 10:58 PM
Re: Java solution vs porting code from X/Motif to Windows Roedy Green Java 0 07-22-2003 07:09 PM
Re: Java solution vs porting code from X/Motif to Windows Harald Hein Java 0 07-22-2003 06:20 PM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57