Velocity Reviews


Ninds 11-28-2012 08:58 AM

bitset<32> and bitset<64> efficiency
 
Hi,

I would like to know whether using bitset<32> for bit operations on a 32-bit machine would generally be as efficient as using a 32-bit int.
Moreover, if the answer is yes, would it also hold for bitset<64> and a 64-bit int on a 64-bit architecture?
I realise the standard says nothing about the implementation, so there is no definitive answer, but what is 'likely' to be the case?


Thanks
N

Johannes Bauer 11-28-2012 10:02 AM

Re: bitset<32> and bitset<64> efficiency
 
On 28.11.2012 09:58, Ninds wrote:

> I would like to know whether using bitset<32> for bit operations on a 32-bit machine would generally be as efficient as using a 32-bit int.
> Moreover, if the answer is yes, would it also hold for bitset<64> and a 64-bit int on a 64-bit architecture?
> I realise the standard says nothing about the implementation, so there is no definitive answer, but what is 'likely' to be the case?


Why don't you try it out on your own setup? I just tried:

#include <bitset>
#include <string>
#include <iostream>

int main(int argc, char ** argv) {
    std::bitset<64> i, j;
    std::cerr << sizeof(i) << std::endl;
    __asm__ __volatile__("nop");
    i.set(argc);
    __asm__ __volatile__("nop");
    std::cerr << i << std::endl;
    __asm__ __volatile__("nop");
    j.set(64 - argc);
    __asm__ __volatile__("nop");
    std::cerr << j << std::endl;
    __asm__ __volatile__("nop");
    i ^= j;
    __asm__ __volatile__("nop");
    std::cerr << i << std::endl;
    return 0;
}

on Linux x86_64 with g++ 4.6.2. The "set" always results in a call, but
the xor is done as if it were an int, i.e.:

// First set
400ce0: ba 01 00 00 00    mov $0x1,%edx
400ce5: 48 63 f3          movslq %ebx,%rsi
400ce8: 48 89 e7          mov %rsp,%rdi
400ceb: e8 60 01 00 00    callq 400e50 <std::bitset<64ul>::set(unsigned long, bool)>

[...]
// Second set
400d07: be 40 00 00 00    mov $0x40,%esi
400d0c: 48 8d 7c 24 10    lea 0x10(%rsp),%rdi
400d11: ba 01 00 00 00    mov $0x1,%edx
400d16: 29 de             sub %ebx,%esi
400d18: 48 63 f6          movslq %esi,%rsi
400d1b: e8 30 01 00 00    callq 400e50 <std::bitset<64ul>::set(unsigned long, bool)>

[...]
// XOR operation
400d39: 48 8b 44 24 10    mov 0x10(%rsp),%rax
400d3e: 48 31 04 24       xor %rax,(%rsp)

A bit curious, I'd have thought "set" would get inlined/optimized away,
but YMMV.

Best regards,
Johannes

--
>> Where EXACTLY did you predict the quake again?

> At least not publicly!

Ah, the latest and to date most brilliant stunt of our great
cosmologist: the secret prediction.
- Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$1@speranza.aioe.org>

W Karas 11-29-2012 12:38 AM

Re: bitset<32> and bitset<64> efficiency
 
On Wednesday, November 28, 2012 5:02:55 AM UTC-5, Johannes Bauer wrote:
> On 28.11.2012 09:58, Ninds wrote:
> [...]
> on Linux x86_64 with g++ 4.6.2. The "set" always results in a call, but
> the xor is done as if it were an int, i.e.:
> [...]
> A bit curious, I'd have thought "set" would get inlined/optimized away,
> but YMMV.


It seems that typical C++ compilers will often fail to inline, even when doing so would result in object code that was BOTH smaller and faster. It's a very frustrating aspect of using C++. Can anyone comment on why this is the case? In the case of GCC, I suppose one cannot look a gift-horse in the mouth. But the problem seems to exist with compilers which must be licensed at significant cost as well.

Ian Collins 11-29-2012 01:46 AM

Re: bitset<32> and bitset<64> efficiency
 
W Karas wrote:

Please clean up the mess google makes of your replies!

> It seems that typical C++ compilers will often fail to inline, even
> when doing so would result in object code that was BOTH smaller and
> faster. It's a very frustrating aspect of using C++. Can anyone
> comment on why this is the case? In the case of GCC, I suppose one
> cannot look a gift-horse in the mouth. But the problem seems to
> exist with compilers which must be licensed at significant cost as
> well.


Typical C++ compilers do a decent job of inlining when optimisation is
enabled. Both compilers I tried (g++ and Sun CC) inline the original
example (which was too mangled in your reply to re-quote).

--
Ian Collins

Ninds 11-29-2012 12:21 PM

Re: bitset<32> and bitset<64> efficiency
 
On Wednesday, 28 November 2012 08:58:59 UTC, Ninds wrote:
> Hi,
>
> I would like to know whether using bitset<32> for bit operations on a 32bit machine would generally be as efficient as using a 32bit int.
> Moreover, if the answer is yes would it also hold for bitset<64> and 64bit int on a 64 bit arch.
> I realise the standard says nothing about the implementation so there is no definitive answer but what is 'likely' to be the case ?
>
> Thanks
> N

Hi,
Thanks for the replies, but I am afraid I am none the wiser. To put my question in context:
I wanted to write a small, concise piece of code to compute the SHA-256 hash that would be portable (hence free of asm) and isolated. I realised from the outset that with these constraints it wouldn't be the speediest kid on the block, but I didn't want it to be ridiculously inefficient either. I couldn't think of a cleaner solution than using std::bitset: this way I don't need to concern myself with the underlying architecture, and at the same time I am working with the language rather than against it.
I am, however, curious as to why rotation methods are not implemented for bitset.

I now have a prototype, implemented from the pseudocode on Wikipedia, which seems to work fine:

#include <bitset>
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

const unsigned int h[8] ={
0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19};

const unsigned int k[64] ={
0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2};

// Rotation helpers for std::bitset. M is the rotation distance in bits.
template<size_t M>
struct bit_op
{
    // Rotate left by M bits, in place.
    template<size_t N>
    static std::bitset<N> &RoL(std::bitset<N> &data)
    {
        std::bitset<N> leftShift = data << M;
        data >>= (N - M);
        data |= leftShift;
        return data;
    }

    // Rotate right by M bits, in place.
    template<size_t N>
    static std::bitset<N> &RoR(std::bitset<N> &data)
    {
        std::bitset<N> rightShift = data >> M;
        data <<= (N - M);
        data |= rightShift;
        return data;
    }

    // Rotate left by M bits, returning a copy.
    template<size_t N>
    static std::bitset<N> roL(const std::bitset<N> &data)
    {
        std::bitset<N> temp(data);
        return RoL(temp);
    }

    // Rotate right by M bits, returning a copy.
    template<size_t N>
    static std::bitset<N> roR(const std::bitset<N> &data)
    {
        std::bitset<N> temp(data);
        return RoR(temp);
    }
};

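// Clear every word of the chunk.
void clean(std::vector<std::bitset<32> > &chunk)
{
    // size() is unsigned, so use an unsigned index to avoid a signed/unsigned comparison.
    for(std::size_t i = 0; i < chunk.size(); ++i)
    {
        chunk[i].reset();
    }
}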

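// Read up to 64 bytes (one 512-bit block) from the stream and pack them
// big-endian into chunk[0..15]. Returns the number of bytes read.
int getChunck(std::istream &theStream, std::vector<std::bitset<32> > &chunk)
{
    clean(chunk);
    int Count = 0;
    char c = 0;
    // Test the result of get() itself: eof() only turns true after a read has
    // already failed. Checking Count first avoids consuming a 65th byte.
    while(Count < 512/8 && theStream.get(c))
    {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        std::bitset<32> data(static_cast<unsigned long>(static_cast<unsigned char>(c)));
        data <<= ((3 - (Count & 3)) * 8);
        chunk[(Count >> 2)] |= data;
        ++Count;
    }
    return Count;
}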

// Compress one 512-bit block into the hash state. w[0..15] hold the message
// words; w[16..63] are filled in here. hvec is the running hash state.
void process(std::vector<std::bitset<32> > &w, std::vector<std::bitset<32> > &hvec)
{
    // Extend the 16 message words into the 64-word message schedule.
    for (int i(16); i < 64; ++i)
    {
        std::bitset<32> s0 = (bit_op<7>::roR(w[i-15])) ^ (bit_op<18>::roR(w[i-15])) ^ (w[i-15] >> 3);
        std::bitset<32> s1 = (bit_op<17>::roR(w[i-2])) ^ (bit_op<19>::roR(w[i-2])) ^ (w[i-2] >> 10);
        // Additions go through to_ulong(); the bitset<32> constructor drops any carry above bit 31.
        w[i] = std::bitset<32>(w[i-16].to_ulong() + s0.to_ulong() + w[i-7].to_ulong() + s1.to_ulong());
    }

    // Working variables. Note that the local h shadows the global constant table h.
    std::bitset<32> a = hvec[0];
    std::bitset<32> b = hvec[1];
    std::bitset<32> c = hvec[2];
    std::bitset<32> d = hvec[3];
    std::bitset<32> e = hvec[4];
    std::bitset<32> f = hvec[5];
    std::bitset<32> g = hvec[6];
    std::bitset<32> h = hvec[7];

    // The 64 compression rounds.
    for(int i(0); i < 64; ++i)
    {
        std::bitset<32> S0 = (bit_op<2>::roR(a)) ^ (bit_op<13>::roR(a)) ^ (bit_op<22>::roR(a));
        std::bitset<32> maj = (a & b) ^ (a & c) ^ (b & c);
        std::bitset<32> t2(S0.to_ulong() + maj.to_ulong());
        std::bitset<32> S1 = (bit_op<6>::roR(e)) ^ (bit_op<11>::roR(e)) ^ (bit_op<25>::roR(e));
        std::bitset<32> ch = (e & f) ^ ((~e) & g);
        std::bitset<32> t1 = h.to_ulong() + S1.to_ulong() + ch.to_ulong() + k[i] + w[i].to_ulong();

        h = g;
        g = f;
        f = e;
        e = d.to_ulong() + t1.to_ulong();
        d = c;
        c = b;
        b = a;
        a = t1.to_ulong() + t2.to_ulong();
    }

    // Add the compressed block to the running hash state.
    hvec[0] = hvec[0].to_ulong() + a.to_ulong();
    hvec[1] = hvec[1].to_ulong() + b.to_ulong();
    hvec[2] = hvec[2].to_ulong() + c.to_ulong();
    hvec[3] = hvec[3].to_ulong() + d.to_ulong();
    hvec[4] = hvec[4].to_ulong() + e.to_ulong();
    hvec[5] = hvec[5].to_ulong() + f.to_ulong();
    hvec[6] = hvec[6].to_ulong() + g.to_ulong();
    hvec[7] = hvec[7].to_ulong() + h.to_ulong();
}

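int main(int argc, char* argv[])
{
    if(argc < 2)
    {
        std::cout << "Need a string to hash" << std::endl;
        return -1;
    }
    // Join all command-line arguments with single spaces.
    std::string theTestString = argv[1];
    for(int i(2); i < argc; ++i)
    {
        theTestString += " ";
        theTestString += argv[i];
    }
    std::istringstream theStream(theTestString);

    // Running hash state, seeded with the initial values h[0..7].
    std::vector<std::bitset<32> > hvec(8);
    for(int i(0); i < 8; ++i)
    {
        hvec[i] = h[i];
    }

    // 64 words: 16 message words plus the 48 schedule words filled in by process().
    std::vector<std::bitset<32> > chunk(64);
    unsigned int byteCount = 0;
    int count = 0;
    // Hash every complete 64-byte block.
    while((count = getChunck(theStream, chunk)) == 64)
    {
        process(chunk, hvec);
        byteCount += count;
    }
    byteCount += count;

    // Padding: append a single 1 bit (the byte 0x80) right after the last message byte.
    // Using an unsigned constant avoids shifting a 1 into the sign bit of an int.
    int chunkIndex = count / 4;
    int byteIndex = count % 4;
    chunk[chunkIndex] |= std::bitset<32>(0x80UL << ((3 - byteIndex) * 8));

    // Message length in bits, split into two 32-bit halves.
    std::bitset<64> length(byteCount);
    length <<= 3;
    unsigned long lower = (length & std::bitset<64>(0xFFFFFFFFUL)).to_ulong();
    // Shift the high half down; masking it in place would leave a value that
    // does not fit in 32 bits.
    unsigned long upper = (length >> 32).to_ulong();

    // If the 0x80 byte landed in word 14 or 15, the length needs a block of its own.
    if(chunkIndex > 13)
    {
        process(chunk, hvec);
        clean(chunk);
    }
    chunk[14] |= std::bitset<32>(upper);
    chunk[15] |= std::bitset<32>(lower);
    process(chunk, hvec);

    // Print the digest as zero-padded hex bytes separated by '-'.
    for(int i(0); i < 8; ++i)
    {
        for(int j(0); j < 4; ++j)
        {
            std::cout << std::hex << std::setw(2) << std::setfill('0')
                      << ((hvec[i] >> ((3 - j) * 8)) & std::bitset<32>(255)).to_ulong() << "-";
        }
    }
    std::cout << std::endl;
}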


Johannes Bauer 11-29-2012 01:24 PM

Re: bitset<32> and bitset<64> efficiency
 
On 29.11.2012 02:11, Luca Risolia wrote:

> Just make sure to compile with optimizations on:


Oh well, I'm not *that* stupid:

[~/tmp]: g++ -Wall -O3 x.cc -o x
[~/tmp]: objdump --demangle -d x | grep call | grep bitset | grep '::set'
400ceb: e8 60 01 00 00 callq 400e50 <std::bitset<64ul>::set(unsigned long, bool)>
400d1b: e8 30 01 00 00 callq 400e50 <std::bitset<64ul>::set(unsigned long, bool)>
[~/tmp]: g++ -dumpversion
4.6.2

It is definitely not inlined even though I specified -O3.

Regards,
Johannes


Johannes Bauer 11-29-2012 01:25 PM

Re: bitset<32> and bitset<64> efficiency
 
On 29.11.2012 02:46, Ian Collins wrote:

> Typical C++ compilers do a decent job of inlining when optimisation is
> enabled. Both compilers I tried (g++ and Sun CC) inline the original
> example (which was too mangled in your reply to re-quote).


Which version did you try? I used 4.6.2 on Linux x86_64 with -O3 and got
no inlining.

Best regards,
Johannes


Ninds 11-29-2012 02:58 PM

Re: bitset<32> and bitset<64> efficiency
 
On Thursday, 29 November 2012 13:25:17 UTC, Johannes Bauer wrote:
> On 29.11.2012 02:46, Ian Collins wrote:
> > Typical C++ compilers do a decent job of inlining when optimisation is
> > enabled. Both compilers I tried (g++ and Sun CC) inline the original
> > example (which was too mangled in your reply to re-quote).
>
> Which version did you try? I used 4.6.2 on Linux x86_64 with -O3 and got
> no inlining.


Is it possible that:
1. In your case, on x86_64 there is no need for inlining, since bitset<64> degenerates exactly to a native int?
2. The same case on a 32-bit machine would require a call, since the set op is no longer atomic?
3. For bitset<128>, your case makes a call?

N

Johannes Bauer 11-29-2012 05:47 PM

Re: bitset<32> and bitset<64> efficiency
 
On 29.11.2012 15:58, Ninds wrote:

[Quoting chaos]

As others have mentioned, PLEASE take care of the quoting mess that your
newsreader produces. It makes your messages hard to decipher.

> Is it possible that:
> 1. For your case, on x86_64 there would be no need for inlining since bitset<64> degenerates exactly to native int ?


Well, a bitset<64> in my example is 8 bytes wide while a native int is 4
bytes. But even if it did degenerate exactly into a native int, it would
still make a lot of sense for the compiler to inline the code in order to
simplify it and get rid of the call overhead.

> 2. The same case on a 32bit machine would require a call since the set op is no longer atomic


This doesn't make sense. How does atomicity come into play here? There's
no guarantee that a bitset alters its bits atomically.

> 3. For bitset<128> you case makes a call ?


Yes.

Regards,
Johannes



