Velocity Reviews - Computer Hardware Reviews

bitset<32> and bitset<64> efficiency

 
 
Ninds
 
      11-28-2012
Hi,

I would like to know whether using bitset<32> for bit operations on a 32bit machine would generally be as efficient as using a 32bit int.
Moreover, if the answer is yes would it also hold for bitset<64> and 64bit int on a 64 bit arch.
I realise the standard says nothing about the implementation so there is no definitive answer but what is 'likely' to be the case ?


Thanks
N
 
Johannes Bauer
 
      11-28-2012
On 28.11.2012 09:58, Ninds wrote:

> I would like to know whether using bitset<32> for bit operations on a 32bit machine would generally be as efficient as using a 32bit int.
> Moreover, if the answer is yes would it also hold for bitset<64> and 64bit int on a 64 bit arch.
> I realise the standard says nothing about the implementation so there is no definitive answer but what is 'likely' to be the case ?


Why don't you try it out for your setup? I just tried:

#include <bitset>
#include <string>
#include <iostream>

int main(int argc, char ** argv) {
std::bitset<64> i, j;
std::cerr << sizeof(i) << std::endl;
__asm__ __volatile__("nop");
i.set(argc);
__asm__ __volatile__("nop");
std::cerr << i << std::endl;
__asm__ __volatile__("nop");
j.set(64 - argc);
__asm__ __volatile__("nop");
std::cerr << j << std::endl;
__asm__ __volatile__("nop");
i ^= j;
__asm__ __volatile__("nop");
std::cerr << i << std::endl;
return 0;
}

on Linux x86_64 with g++ 4.6.2. The "set" always results in a call, but
the xor is done as if it were an int, i.e.:

// First set
400ce0: ba 01 00 00 00 mov $0x1,%edx
400ce5: 48 63 f3 movslq %ebx,%rsi
400ce8: 48 89 e7 mov %rsp,%rdi
400ceb: e8 60 01 00 00 callq 400e50 <std::bitset<64ul>::set(unsigned long, bool)>

[...]
// Second set
400d07: be 40 00 00 00 mov $0x40,%esi
400d0c: 48 8d 7c 24 10 lea 0x10(%rsp),%rdi
400d11: ba 01 00 00 00 mov $0x1,%edx
400d16: 29 de sub %ebx,%esi
400d18: 48 63 f6 movslq %esi,%rsi
400d1b: e8 30 01 00 00 callq 400e50 <std::bitset<64ul>::set(unsigned long, bool)>

[...]
// XOR operation
400d39: 48 8b 44 24 10 mov 0x10(%rsp),%rax
400d3e: 48 31 04 24 xor %rax,(%rsp)

A bit curious, I'd have thought "set" would get inlined/optimized away,
but YMMV.
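
A side note on why "set" and a plain integer OR are not quite the same operation: std::bitset::set(pos) is specified to throw std::out_of_range when pos is not a valid index, whereas operator[] performs no such check. Whether that check is what keeps g++ 4.6.2 from inlining the call here is only a guess, not something the disassembly above proves. A minimal sketch of the two variants (the helper names checked_set and unchecked_set are made up for illustration):

```cpp
#include <bitset>
#include <cstddef>
#include <stdexcept>

// Range-checked: std::bitset::set(pos) throws std::out_of_range when pos >= N.
// Returns true if the bit was set, false if the index was rejected.
bool checked_set(std::bitset<64>& bits, std::size_t pos) {
    try {
        bits.set(pos);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Unchecked: operator[] does no range check, so the caller must
// guarantee pos < 64 (anything else is undefined behaviour).
void unchecked_set(std::bitset<64>& bits, std::size_t pos) {
    bits[pos] = true;
}
```

Comparing the generated code for both helpers at the same optimisation level is one way to see whether the range check matters on a given compiler.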

Best regards,
Johannes

--
>> Where EXACTLY did you predict the quake again?

> At least not publicly!

Ah, the latest and to date most ingenious trick of our great
cosmologists: the secret prediction.
- Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$(E-Mail Removed)>
 
W Karas
 
      11-29-2012
On Wednesday, November 28, 2012 5:02:55 AM UTC-5, Johannes Bauer wrote:
> [...]
> A bit curious, I'd have thought "set" would get inlined/optimized away,
> but YMMV.


It seems that typical C++ compilers will often fail to inline, even when doing so would result in object code that was BOTH smaller and faster. It's a very frustrating aspect of using C++. Can anyone comment on why this is the case? In the case of GCC, I suppose one cannot look a gift-horse in the mouth. But the problem seems to exist with compilers which must be licensed at significant cost as well.
 
Ian Collins
 
      11-29-2012
W Karas wrote:

Please clean up the mess google makes of your replies!

> It seems that typical C++ compilers will often fail to inline, even
> when doing so would result in object code that was BOTH smaller and
> faster. It's a very frustrating aspect of using C++. Can anyone
> comment on why this is the case? In the case of GCC, I suppose one
> cannot look a gift-horse in the mouth. But the problem seems to
> exist with compilers which must be licensed at significant cost as
> well.


Typical C++ compilers do a decent job of inlining when optimisation is
enabled. Both compilers I tried (g++ and Sun CC) inline the original
example (which was too mangled in your reply to re-quote).

--
Ian Collins
 
Ninds
 
      11-29-2012
On Wednesday, 28 November 2012 08:58:59 UTC, Ninds wrote:
> I would like to know whether using bitset<32> for bit operations on a 32bit machine would generally be as efficient as using a 32bit int.
> Moreover, if the answer is yes would it also hold for bitset<64> and 64bit int on a 64 bit arch.
> I realise the standard says nothing about the implementation so there is no definitive answer but what is 'likely' to be the case ?

Hi,
Thanks for the replies, but I am afraid I am none the wiser. To put my question in context:
I wanted to write a small, concise piece of code to compute the SHA-256 hash that would be portable (hence free of asm) and self-contained. I realised from the outset that with these constraints it wouldn't be the speediest kid on the block, but I didn't want it to be ridiculously inefficient either. I couldn't think of a cleaner solution than using std::bitset: this way I didn't need to concern myself with the underlying architecture, and at the same time I would be working with the language rather than against it.
I am, however, curious as to why rotation methods are not implemented for bitset.

I now have a prototype which seems to work fine, implemented from the pseudocode on Wikipedia:

#include<iostream>
#include<string>
#include<sstream>
#include<vector>
#include<iomanip>
#include<bitset>

const unsigned int h[8] ={
0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19};

const unsigned int k[64] ={
0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2};

template<size_t M>
struct bit_op
{
template<size_t N>
static std::bitset<N> &RoL(std::bitset<N> &data)
{
std::bitset<N> leftShift = data<<M;
data>>=(N-M);
data |= leftShift;
return data;
}

template<size_t N>
static std::bitset<N> &RoR(std::bitset<N> &data)
{
std::bitset<N> rightShift = data>>M;
data<<=(N-M);
data |= rightShift;
return data;
}

template<size_t N>
static std::bitset<N> roL(const std::bitset<N> &data)
{
std::bitset<N> temp(data);
return RoL(temp);
}

template<size_t N>
static std::bitset<N> roR(const std::bitset<N> &data)
{
std::bitset<N> temp(data);
return RoR(temp);
}
};

void clean(std::vector<std::bitset<32> > &chunk)
{
for(int i(0); i< chunk.size(); ++i)
{
chunk[i].reset();
}
}

int getChunck(std::istream &theStream, std::vector<std::bitset<32> > &chunk)
{
clean(chunk);
int Count =0;
while(!theStream.eof() && Count < 512/8)
{
char c=0;
theStream.get(c);
if(c!=0)
{
std::bitset<32> data = ((unsigned long )(unsigned char)c); // go via unsigned char so bytes >= 0x80 are not sign-extended
data<<=((3-(Count & 3))*8);
chunk[(Count>>2)] |= data;
++Count;
}
}
return Count;
}

void process(std::vector<std::bitset<32> > &w,std::vector<std::bitset<32> >&hvec)
{
for (int i(16); i< 64; ++i)
{
std::bitset<32> s0 = (bit_op<7>::roR(w[i-15])) ^ (bit_op<18>::roR(w[i-15])) ^ (w[i-15]>>3);
std::bitset<32> s1 = (bit_op<17>::roR(w[i-2])) ^ (bit_op<19>::roR(w[i-2])) ^ (w[i-2]>>10);
w[i] = std::bitset<32>(w[i-16].to_ulong() + s0.to_ulong() + w[i-7].to_ulong() + s1.to_ulong());
}
std::bitset<32> a = hvec[0];
std::bitset<32> b = hvec[1];
std::bitset<32> c = hvec[2];
std::bitset<32> d = hvec[3];
std::bitset<32> e = hvec[4];
std::bitset<32> f = hvec[5];
std::bitset<32> g = hvec[6];
std::bitset<32> h = hvec[7];

for(int i(0); i< 64; ++i)
{
std::bitset<32> S0 = (bit_op<2>::roR(a)) ^ (bit_op<13>::roR(a)) ^ (bit_op<22>::roR(a));
std::bitset<32> maj = (a & b) ^ (a & c) ^ (b & c);
std::bitset<32> t2(S0.to_ulong() + maj.to_ulong());
std::bitset<32> S1 = (bit_op<6>::roR(e)) ^ (bit_op<11>::roR(e)) ^ (bit_op<25>::roR(e));
std::bitset<32> ch = (e & f) ^ ((~e) & g);
std::bitset<32> t1 = h.to_ulong() + S1.to_ulong() + ch.to_ulong() + k[i] + w[i].to_ulong() ;

h = g;
g = f;
f = e;
e = d.to_ulong() + t1.to_ulong();
d = c;
c = b;
b = a;
a = t1.to_ulong() + t2.to_ulong() ;
}
hvec[0] = hvec[0].to_ulong() + a.to_ulong();
hvec[1] = hvec[1].to_ulong() + b.to_ulong();
hvec[2] = hvec[2].to_ulong() + c.to_ulong();
hvec[3] = hvec[3].to_ulong() + d.to_ulong();
hvec[4] = hvec[4].to_ulong() + e.to_ulong();
hvec[5] = hvec[5].to_ulong() + f.to_ulong();
hvec[6] = hvec[6].to_ulong() + g.to_ulong();
hvec[7] = hvec[7].to_ulong() + h.to_ulong();
}

int main(int argc, char* argv[])
{
if(argc<2)
{
std::cout <<"Need a string to hash";
return -1;
}
std::string theTestString = argv[1] ;
for(int i(2); i< argc; ++i)
{
theTestString+=" ";
theTestString+= argv[i];
}
std::istringstream theStream(theTestString);
std::vector<std::bitset<32> > hvec(8);

for(int i(0); i< 8; ++i)
{
hvec[i] = h[i];
}

std::vector<std::bitset<32> > chunk(64);
bool finished = false;
unsigned int byteCount = 0;
unsigned int chunkCount = 0;
int count =0;
while((count=getChunck(theStream,chunk))==64)
{
process(chunk,hvec);
byteCount+=count;
}
byteCount+=count;
int chunkIndex = 0;
int byteIndex = 0;
if(count !=0)
{
chunkIndex = count/4;
byteIndex = count - (count/4)*4;
}
chunk[chunkIndex]|= (1UL<<((4-byteIndex)*8-1));
std::bitset<64> length(byteCount);
length<<=3;
unsigned long lower = (length & ((std::bitset<64>().set())>>32)).to_ulong();
unsigned long upper = (length>>32).to_ulong(); // shift the high word down so it fits a bitset<32>

if(chunkIndex > 13)
{
process(chunk,hvec);
clean(chunk);
}
chunk[14]|=upper;
chunk[15]|=lower;
process(chunk,hvec);

std::vector<unsigned char> theComputed;
for(int i(0); i< 8; ++i)
{
for(int j(0); j<4; ++j)
{
std::cout << std::hex << ((hvec[i]>>((3-j)*8))&std::bitset<32>(255)).to_ulong() <<"-";
}
}
}
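
For comparison, here is a minimal sketch of the same rotation written directly on a fixed-width integer, next to a bitset version. The names rotr32, rotr32_bitset and small_sigma0 are hypothetical, and std::uint32_t is assumed to be available on the target. No claim is made about which variant a given compiler optimises better. (C++20 adds std::rotr in <bit> for unsigned integer types.)

```cpp
#include <bitset>
#include <cstdint>

// Rotate a 32-bit word right by n bits (n is reduced modulo 32,
// so n == 0 and n == 32 both return x unchanged).
inline std::uint32_t rotr32(std::uint32_t x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}

// The same rotation on std::bitset<32>, mirroring the RoR helper above
// but taking the shift count at run time.
inline std::bitset<32> rotr32_bitset(const std::bitset<32>& x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}

// SHA-256 "small sigma 0": rotr 7 xor rotr 18 xor shr 3.
inline std::uint32_t small_sigma0(std::uint32_t x) {
    return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3);
}
```

small_sigma0 corresponds to the s0 line in process(). With std::uint32_t the additions wrap modulo 2^32 on their own, so the to_ulong() round trips are not needed in that variant.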

 
Johannes Bauer
 
      11-29-2012
On 29.11.2012 02:11, Luca Risolia wrote:

> Just make sure to compile with optimizations on:


Oh well, I'm not *that* stupid:

[~/tmp]: g++ -Wall -O3 x.cc -o x
[~/tmp]: objdump --demangle -d x | grep call | grep bitset | grep '::set'
400ceb: e8 60 01 00 00 callq 400e50 <std::bitset<64ul>::set(unsigned long, bool)>
400d1b: e8 30 01 00 00 callq 400e50 <std::bitset<64ul>::set(unsigned long, bool)>
[~/tmp]: g++ -dumpversion
4.6.2

It is definitely not inlined even though I specified -O3.

Regards,
Johannes

Johannes Bauer
 
      11-29-2012
On 29.11.2012 02:46, Ian Collins wrote:

> Typical C++ compilers do a decent job of inlining when optimisation is
> enabled. Both compilers I tried (g++ and Sun CC) inline the original
> example (which was too mangled in your reply to re-quote).


Which version did you try? I used 4.6.2 on Linux x86_64 with -O3 and got
no inlining.

Best regards,
Johannes

Ninds
 
      11-29-2012
On Thursday, 29 November 2012 13:25:17 UTC, Johannes Bauer wrote:
> Which version did you try? I used 4.6.2 on Linux x86_64 with -O3 and got
> no inlining.

Is it possible that:
1. In your case, on x86_64 there would be no need for inlining, since bitset<64> degenerates exactly to a native int?
2. The same case on a 32-bit machine would require a call, since the set op is no longer atomic?
3. For bitset<128>, your case makes a call?

N
 
Johannes Bauer
 
      11-29-2012
On 29.11.2012 15:58, Ninds wrote:

[Quoting chaos]

As others have mentioned, PLEASE take care of the quoting mess that your
newsreader produces. It makes your messages hard to decipher.

> Is it possible that:
> 1. In your case, on x86_64 there would be no need for inlining, since bitset<64> degenerates exactly to a native int?


Well, a bitset<64> in my example is 8 bytes wide while a native int is 4
bytes. But even if it did degenerate exactly into a native int, it would
still make a lot of sense for the compiler to inline the code in order to
simplify it and get rid of the call overhead.
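
The sizes are implementation-defined, so they are worth checking on the target rather than assuming. A minimal sketch (storage_bits is a hypothetical helper, not a standard facility) that reports how many bits of storage a type occupies:

```cpp
#include <bitset>
#include <climits>
#include <cstddef>

// Number of bits of storage an object of type T occupies,
// including any padding the implementation adds.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}
```

Printing storage_bits<std::bitset<32> >(), storage_bits<std::bitset<64> >() and storage_bits<int>() on a given platform shows whether a bitset is really the same width as the integer it is being compared against. The only portable guarantee is that bitset<N> needs at least N bits.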

> 2. The same case on a 32-bit machine would require a call, since the set op is no longer atomic?


This doesn't make sense. How does atomicity come into play here? There's
no guarantee that a bitset alters its bits atomically.
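
If atomic bit updates really were a requirement, std::bitset would not provide them. One option in C++11 is an explicit std::atomic word. A minimal sketch, assuming a single 64-bit word is enough (atomic_set_bit is a made-up helper name):

```cpp
#include <atomic>
#include <cstdint>

// Atomically set bit `pos` (0..63) in a shared 64-bit word.
// Returns the previous value of that bit.
inline bool atomic_set_bit(std::atomic<std::uint64_t>& word, unsigned pos) {
    const std::uint64_t mask = std::uint64_t(1) << pos;
    return (word.fetch_or(mask) & mask) != 0;
}
```

Whether std::atomic<std::uint64_t> is lock-free on a 32-bit target is implementation-defined and can be queried with is_lock_free().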

> 3. For bitset<128>, your case makes a call?


Yes.

Regards,
Johannes

