Velocity Reviews - Computer Hardware Reviews



What's more important to know: "Big-O" scientology, or Assembly Language?

 
 
joe
 
      08-22-2010
(I thought about the capitalization of the subject a little, "little"
being the key (No!!! Not you DBA, go back to sleep.) word).

What is more important? Oh? I presented it "wrong"? Well, you know what I
meant. So, which is more important? (Obviously I'm not soliciting a
definitive "answer".)


 
 
 
 
 
Phlip
 
      08-22-2010
On Aug 21, 9:29 pm, "joe" <(E-Mail Removed)> wrote:
> (I thought about the capitalization of the subject a little, "little"
> being the key (No!!! Not you DBA, go back to sleep.) word).
>
> What is more important? Oh? I presented it "wrong"? Well, you know what I
> meant. So, which is more important? (Obviously I'm not soliciting a
> definitive "answer".)


Please tell us you are just joking about complexity theory (big-O)
relating to $cientology.

However, you just asked "what tool do I use?" without explaining what
project you target.

If you want to shred (which is most of the reason to use C++), you
need both big-O and assembler. You need to be able to estimate (and
profile) your code's performance over data sets of various sizes. And
you need to understand which C++ statements map onto what general
kinds of Assembler opcodes (push, pop, branch, fetch, etc.).
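To make the estimate-and-profile half concrete, here is a minimal timing sketch (a hand-rolled harness, not a real benchmarking tool; the function names are made up for illustration):

```cpp
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Time a callable once and return the elapsed milliseconds.
template <class F>
double time_ms(F f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Measure the same operation at growing input sizes; how the timings
// grow tells you more about the big-O than any single number does.
std::vector<double> profile_accumulate() {
    std::vector<double> timings;
    for (std::size_t n = 1000; n <= 1000000; n *= 10) {
        std::vector<int> data(n, 1);
        timings.push_back(time_ms([&] {
            volatile long long sink =
                std::accumulate(data.begin(), data.end(), 0LL);
            (void)sink;  // keep the optimizer from deleting the work
        }));
    }
    return timings;
}
```

Plotting (or just eyeballing) how those four numbers scale is the cheap version of complexity analysis in practice.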

So the answer is yes.
 
 
 
 
 
Juha Nieminen
 
      08-22-2010
Phlip <(E-Mail Removed)> wrote:
> And you need to understand which C++ statements map onto what general
> kinds of Assembler opcodes (push, pop, branch, fetch, etc.).


I don't think I would fully agree with that. Exactly what does it matter
what kind of assembler opcodes the compiler is generating from your code?

In fact, I would say the exact opposite: You should not worry which
opcodes the compiler is generating, mainly because the compiler will
often generate wildly different opcodes for the same C++ code depending
on the compiler itself, the target architecture and the amount of
optimizations. The exact opcodes being generated is rather irrelevant
from the point of view of programming in C++.

That's not to say that *everything* that is extremely low-level is
completely irrelevant. For example, there may be situations where memory
locality and cache optimization may affect the efficiency of the program
very significantly (eg. your program could become an order of magnitude
faster if you do something more cache-optimally). However, this has nothing
to do with exactly what assembler opcodes the compiler is generating.
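A minimal sketch of that locality point, assuming a matrix stored as rows: both functions compute the same sum, but on large matrices the second one strides through memory and can be dramatically slower, even though a compiler may emit near-identical opcodes for the two loop bodies:

```cpp
#include <cstddef>
#include <vector>

// Row-major traversal: the inner loop walks memory sequentially,
// so each cache line fetched is fully used.
long long sum_by_rows(const std::vector<std::vector<int>>& m) {
    long long total = 0;
    for (std::size_t i = 0; i < m.size(); ++i)
        for (std::size_t j = 0; j < m[i].size(); ++j)
            total += m[i][j];
    return total;
}

// Column-major traversal of the same data: the inner loop jumps
// from row to row, defeating the cache on large matrices.
long long sum_by_columns(const std::vector<std::vector<int>>& m) {
    long long total = 0;
    if (m.empty()) return 0;
    for (std::size_t j = 0; j < m[0].size(); ++j)
        for (std::size_t i = 0; i < m.size(); ++i)
            total += m[i][j];
    return total;
}
```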
 
 
Phlip
 
      08-22-2010
On Aug 21, 11:08 pm, Juha Nieminen <(E-Mail Removed)> wrote:
> Phlip <(E-Mail Removed)> wrote:
> > And you need to understand which C++ statements map onto what general
> > kinds of Assembler opcodes (push, pop, branch, fetch, etc.).

>
> I don't think I would fully agree with that. Exactly what does it matter
> what kind of assembler opcodes the compiler is generating from your code?


C++ is a C language, which is a kind of "portable assembler".

The point of using C++ is writing high-level code that compiles very
tight and runs very fast. If you don't need small footprint and high
speed, then use a language such as Ruby, which is gentler to the
programmer, and harsher to the CPU.

> In fact, I would say the exact opposite: You should not worry which
> opcodes the compiler is generating, mainly because the compiler will
> often generate wildly different opcodes for the same C++ code depending
> on the compiler itself, the target architecture and the amount of
> optimizations. The exact opcodes being generated is rather irrelevant
> from the point of view of programming in C++.


Most of the time you do not worry about the specific opcodes.

C++ is defined in terms of a virtual architecture. That's not poetry,
it's how the Standards work. C++ calls a function by pushing arguments
onto a stack or into registers, then pushing its own address onto the
stack, then jumping into another address. That address is expected to
pop the arguments off the stack and use them, then pop the return
address off and jump back to it.
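As a concrete sketch of that abstract-machine description: a trivial function, with one plausible unoptimized x86-64 lowering in comments. The assembly is purely illustrative; real output varies with compiler, target, and flags.

```cpp
int add(int a, int b) {
    return a + b;
    // One plausible unoptimized x86-64 lowering:
    //   push rbp           ; save the caller's frame pointer
    //   mov  rbp, rsp      ; establish this function's frame
    //   mov  eax, edi      ; first argument (arrives in a register)
    //   add  eax, esi      ; add the second argument
    //   pop  rbp           ; restore the caller's frame
    //   ret                ; pop the return address and jump back
}
```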

You don't think of any of that when you code. And someone could port
C++ to an architecture, such as DNA or photonics, which does not have a
stack, or pushes, or pops, etc.

A syntax error at compile time represents an inability to create valid
opcodes, such as codes that align data types and copy them around.
Disregard that and fix the syntax error.

And the language allows many dodges, from unions to reinterpret_cast,
to allow you to turn off that syntax, and enforce the behavior that
you want. For example, indexing off the end of a minor dimension of a
multidimensional array is well-defined, if you are still within the
major dimensions.

And when you make a mistake and have to diagnose a crash,
understanding all the ways you can stomp the CPU sure helps! C++ gets
as close as possible to the Von Neumann Architecture, of pointers,
registers, accumulators, opcodes, and stacks, while remaining
portable. This also explains the 1001 "undefined behaviors" it can
generate.

> That's not to say that *everything* that is extremely low-level is
> completely irrelevant. For example, there may be situations where memory
> locality and cache optimization may affect the efficiency of the program
> very significantly (eg. your program could become an order of magnitude
> faster if you do something more cache-optimally). However, this has nothing
> to do with exactly what assembler opcodes the compiler is generating.


The point of using C++ is writing fast, tight code via some awareness
what the operations cost. Learning assembler (even for another
architecture) helps that, even when you only keep it in the back of
your mind.
 
 
Juha Nieminen
 
      08-22-2010
Phlip <(E-Mail Removed)> wrote:
> On Aug 21, 11:08 pm, Juha Nieminen <(E-Mail Removed)> wrote:
>> Phlip <(E-Mail Removed)> wrote:
>> > And you need to understand which C++ statements map onto what general
>> > kinds of Assembler opcodes (push, pop, branch, fetch, etc.).

>>
>> I don't think I would fully agree with that. Exactly what does it matter
>> what kind of assembler opcodes the compiler is generating from your code?

>
> C++ is a C language, which is a kind of "portable assembler".
>
> The point of using C++ is writing high-level code that compiles very
> tight and runs very fast. If you don't need small footprint and high
> speed, then use a language such as Ruby, which is gentler to the
> programmer, and harsher to the CPU.


I still don't see why you should worry which precise opcodes are being
generated from your C++ code by the compiler. Still feels completely
irrelevant.

>> In fact, I would say the exact opposite: You should not worry which
>> opcodes the compiler is generating, mainly because the compiler will
>> often generate wildly different opcodes for the same C++ code depending
>> on the compiler itself, the target architecture and the amount of
>> optimizations. The exact opcodes being generated is rather irrelevant
>> from the point of view of programming in C++.

>
> Most of the time you do not worry about the specific opcodes.


Most of the time? Can you give an example of a situation where knowing
the exact opcodes is relevant in any way?

> C++ is defined in terms of a virtual architecture. That's not poetry,
> it's how the Standards work. C++ calls a function by pushing arguments
> onto a stack or into registers, then pushing its own address onto the
> stack, then jumping into another address. That address is expected to
> pop the arguments off the stack and use them, then pop the return
> address off and jump back to it.


It may be interesting to know that a C/C++ program basically divides
memory into two sections: the so-called "heap" and the so-called "stack",
and that when you call a function, the return address and the function
parameters are put on the stack, from where the function in question can
access them. (This may be interesting information because then you will
know why in some cases your program is terminating due to running out of
stack space.)
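To illustrate that point: each call below costs one fresh stack frame (return address, parameter, saved registers), which is exactly why unbounded recursion exhausts the stack.

```cpp
#include <cstddef>

// Computes n by recursing n levels deep. Every level consumes one
// stack frame, so a huge n (or a missing base case) would overflow
// the stack rather than the heap.
std::size_t frames(std::size_t n) {
    if (n == 0) return 0;      // base case bounds the recursion depth
    return 1 + frames(n - 1);  // one more frame per level
}
```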

However, it's still rather irrelevant which opcodes exactly the compiler
is using to do this.

> And the language allows many dodges, from unions to reinterpret_cast,
> to allow you to turn off that syntax, and enforce the behavior that
> you want. For example, indexing off the end of a minor dimension of a
> multidimensional array is well-defined, if you are still within the
> major dimensions.


I still don't see the relevance of knowing which opcodes are being
created from your C++ code.

>> That's not to say that *everything* that is extremely low-level is
>> completely irrelevant. For example, there may be situations where memory
>> locality and cache optimization may affect the efficiency of the program
>> very significantly (eg. your program could become an order of magnitude
>> faster if you do something more cache-optimally). However, this has nothing
>> to do with exactly what assembler opcodes the compiler is generating.

>
> The point of using C++ is writing fast, tight code via some awareness
> what the operations cost. Learning assembler (even for another
> architecture) helps that, even when you only keep it in the back of
> your mind.


Knowing how the CPU architecture works doesn't require you knowing the
exact opcodes which the CPU is executing. In the example above, you can
perfectly well understand how data caching works in modern CPUs without
knowing even one single assembler opcode.
 
 
Öö Tiib
 
      08-22-2010
On 22 aug, 22:07, Juha Nieminen <(E-Mail Removed)> wrote:
> Phlip <(E-Mail Removed)> wrote:
> > Most of the time you do not worry about the specific opcodes.
>
> Most of the time? Can you give an example of a situation where knowing
> the exact opcodes is relevant in any way?


Probably not the exact opcodes. Just facts like that adding an int to a
pointer involves a multiplication. It is clear if you just think about
it, but some people realize it (and other "surprising" things) for the
first time when eyeballing the opcodes. It is easier to see than to
think, so opcodes are one way of understanding things down to the very
bottom of the actual truth.
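That pointer-scaling fact can be checked without reading any assembly at all; the multiplication is implied by the language rules, whatever opcodes end up implementing it:

```cpp
#include <cstddef>

// p + i advances the address by i * sizeof(*p) bytes, not by i bytes:
// the hidden multiplication described above. Compilers often turn it
// into a shift or fold it into an addressing mode.
std::ptrdiff_t byte_distance(const double* p, int i) {
    const double* q = p + i;  // scaled by sizeof(double)
    return reinterpret_cast<const char*>(q) -
           reinterpret_cast<const char*>(p);
}
```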

 
 
Phlip
 
      08-23-2010
On Aug 22, 12:07 pm, Juha Nieminen <(E-Mail Removed)> wrote:
> Phlip <(E-Mail Removed)> wrote:


> >> > And you need to understand which C++ statements map onto what general
> >> > kinds of Assembler opcodes (push, pop, branch, fetch, etc.).

>
> I still don't see why you should worry which precise opcodes are being
> generated from your C++ code by the compiler. Still feels completely
> irrelevant.


general... precise. Violent agreement achieved.

(BTW, just to show off the tiny bit of assembler I know, a "precise
opcode" is impossible. Each one is really the name of a whole category
of similar machine language instructions that each take different
argument types.)

> Most of the time? Can you give an example of a situation where knowing
> the exact opcodes is relevant in any way?


I declared something volatile, and it has a bug, so I debug through the
opcodes, looking to see if it accidentally got aliased.
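For context, a sketch of the kind of code where that matters (the flag name here is made up): volatile forces every read to actually touch memory, and inspecting the generated opcodes is how you confirm the compiler didn't cache or alias the access away.

```cpp
#include <cstdint>

// A flag that something outside normal control flow (an interrupt
// handler, another core, memory-mapped hardware) may change. Without
// volatile, the optimizer may read it once and reuse the stale value.
volatile std::uint32_t device_ready = 0;  // hypothetical flag

bool poll_device() {
    return device_ready != 0;  // re-loads from memory on every call
}
```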

Other than that, I said "some awareness", not exact opcodes, so you
got some straw going there.
 
 
Phlip
 
      08-23-2010
On Aug 22, 1:06 pm, Öö Tiib <(E-Mail Removed)> wrote:

> Probably not exact opcodes. Just the facts like that adding int to
> pointer involves multiplication. It is clear if just to think about
> it. Some realize it (and other as "surprising" things) first time when
> eyeballing the opcodes. It is easier to see than to think. So opcodes
> are one way of understanding things to the very bottom of actual
> truth.


And multiplying numbers that are too large causes roll-over (well
defined for unsigned types, undefined for signed ones), simply to allow
the CPU to use its most primitive and fastest multiplier as often as
possible.
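A small sketch of that roll-over, using an unsigned type where wraparound is actually guaranteed by the standard:

```cpp
#include <cstdint>

// Unsigned arithmetic is defined modulo 2^N, so an oversized product
// simply wraps instead of trapping: 16 * 16 = 256, which is 0 mod 256.
std::uint8_t wrap_mul8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a * b);  // result reduced mod 256
}
```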
 
 
James Kanze
 
      08-23-2010
On Aug 22, 7:08 am, Juha Nieminen <(E-Mail Removed)> wrote:
> Phlip <(E-Mail Removed)> wrote:
> > And you need to understand which C++ statements map onto
> > what general kinds of Assembler opcodes (push, pop, branch,
> > fetch, etc.).


> I don't think I would fully agree with that. Exactly what does
> it matter what kind of assembler opcodes the compiler is
> generating from your code?


More to the point, how can you know, since different hardware
has different sets of machine instructions. (I've used C on
a machine with no push or pop instructions.) And even on the
same machine, different compilers may map things differently,
and use different instructions.

[...]
> That's not to say that *everything* that is extremely
> low-level is completely irrelevant. For example, there may be
> situations where memory locality and cache optimization may
> affect the efficiency of the program very significantly (eg.
> your program could become an order of magnitude faster if you
> do something more cache-optimally). However, this has nothing
> to do with exactly what assembler opcodes the compiler is
> generating.


Yes, but even then, an O(n ln n) algorithm with poor locality
will outperform an O(n^2) algorithm with good locality. (And
such locality issues are often very machine dependent.)
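That trade-off can be sketched with made-up cost models (the constants are illustrative only): give the quadratic algorithm steps that are ten times cheaper, to stand in for its better locality, and it still loses once n grows.

```cpp
#include <cmath>

// Hypothetical per-element costs: the quadratic algorithm's steps are
// 10x cheaper (good locality), the n log n algorithm's 10x dearer.
double quadratic_cost(double n) { return 0.1 * n * n; }
double nlogn_cost(double n)     { return 1.0 * n * std::log2(n); }
```

At n = 16 the quadratic version wins (25.6 vs 64), but by n = 1,000,000 it loses by several orders of magnitude.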

--
James Kanze
 
 
James Kanze
 
      08-23-2010
On Aug 22, 8:07 pm, Juha Nieminen <(E-Mail Removed)> wrote:
> Phlip <(E-Mail Removed)> wrote:
> > On Aug 21, 11:08 pm, Juha Nieminen <(E-Mail Removed)> wrote:
> >> Phlip <(E-Mail Removed)> wrote:


[...]
> > Most of the time you do not worry about the specific opcodes.


> Most of the time? Can you give an example of a situation where
> knowing the exact opcodes is relevant in any way?


When you're implementing a compiler. Or when the error is due
to a bug in the compiler (especially when it only occurs when
optimization is turned on).

[...]
> It may be interesting to know that a C/C++ program basically
> divides memory into two sections: The so-called "heap" and the
> so-called "stack",


Actually, there are more parts than that: most good compilers
will use 5 or 6 distinct segments, if not more. (At a minimum:
one for code, one for the stack walkback information used when
handling an exception, one for variables with static lifetime,
a heap, a stack per thread, a place to store the exception
during stack walkback, etc., etc.)

> and that when you call a function, the return address and the
> function parameters are put in the stack,


Or in registers, or any other place where the called function
can find it.

[...]
> > And the language allows many dodges, from unions to reinterpret_cast,
> > to allow you to turn off that syntax, and enforce the behavior that
> > you want. For example, indexing off the end of a minor dimension of a
> > multidimensional array is well-defined, if you are still within the
> > major dimensions.


This is, of course, completely false. To begin with, there are
no multidimensional arrays in C++, the closest one can come is
an array of arrays. And out of bounds indexing is undefined
behavior, regardless of where it occurs.

> > The point of using C++ is writing fast, tight code via some
> > awareness what the operations cost.


The point of using C++ is that it is the best alternative you
have available for some particular project. Generally speaking,
C++ gives you the possibility of writing very well engineered
software, which is very important in large projects. The fact
that it usually produces fairly efficient code isn't always that
important; the fact that it supports large scale software
engineering, and has a wide choice of compilers readily
available is.

--
James Kanze
 