Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > sorting 5million minute data: choose std::list or std::vector

Reply
Thread Tools

sorting 5million minute data: choose std::list or std::vector

 
 
gwowen
Guest
Posts: n/a
 
      02-07-2011
On Feb 7, 1:06*pm, Leigh Johnston <(E-Mail Removed)> wrote:

>> I've written code for compilers/platforms for which [snip]

> [Your correct code fix snipped]
>*In which case on VC++ I get:


Then I think we can safely say that VC++ was not the platform to which
I was referring can't we? Sheesh, some people are so keen to be
thought of as smarter than everyone else in the whole world, they end
up looking like idiots.

Stick to arguing with Paul, at least you *are* probably smarter than
him.
 
Reply With Quote
 
 
 
 
gwowen
Guest
Posts: n/a
 
      02-07-2011
On Feb 7, 1:42*pm, Leigh Johnston <(E-Mail Removed)> wrote:

> Here the only difference is intel opcode jl versus intel opcode jb which
> IICR have the same performance characteristics.


I'm not talking about an intel processor, so the effacacy of intel
opcodes is utterly irrelevant.
 
Reply With Quote
 
 
 
 
gwowen
Guest
Posts: n/a
 
      02-07-2011
On Feb 7, 1:58*pm, Leigh Johnston <(E-Mail Removed)> wrote:
> It was simply a balanced, pertinent counter-example to your claim of
> multiple platforms


If you think "one platform agrees with me" provides a counter
example .

> Like it or not the Intel platform is quite pervasive (not some esoteric platform that nobody cares much about).


I know that. In fact, that was entirely my point. I was talking
about an esoteric platform - specifically a custom DSP Asic who's
default signed integer arithmetic saturated (because on embedded DSPs,
that is almost always what you want signed arithmetic to do - clip
your signal rather than wrap it around).

So every time you do an unsigned addition on that processor, the C
standard insists that you check for overflow and correct the results
if overflow is about to occur. If you use signed arithmetic, the
compiler just loads the registers and calls the appropriate
instruction. (In fact, if memory recalls, unsigned arithmetic was done
using a subroutine, signed arithmetic was a single opcode, although I
may have misremembered that).

As to specific cases where the compiler can use the "overflow is
undefined" to optimize, with assembler generated, your attention is
drawn to http://www.airs.com/blog/archives/120
 
Reply With Quote
 
gwowen
Guest
Posts: n/a
 
      02-07-2011
On Feb 7, 2:42*pm, Leigh Johnston <(E-Mail Removed)> wrote:

> The implementation specific behaviour of some esoteric platform (or any
> specific platform for that matter) should not necessarily direct us as
> to what constitutes good C++ coding practice.
>


Given that all I originally said was "There are sometimes good
reasons" -- the thing which you claimed to have disproved with a
counterexample [not sure how you disprove a "sometimes" with a
counterexample, but hey ho] -- its good to see you've come around.

Unsigned integral types - I use them a lot, they're handy, especially
when I'm writing code for Intel and other general purpose processors
[though I bet there are some GPUs that work like DSPs for similar
reasons]. Eschewing (good word, by the way) them without a good reason
is pretty silly, but as I originally -- and correctly, as you now
agree -- said *THERE ARE SOMETIMES GOOD REASONS*.
 
Reply With Quote
 
gwowen
Guest
Posts: n/a
 
      02-07-2011
On Feb 7, 3:47*pm, Leigh Johnston <(E-Mail Removed)> wrote:

> My original reply stated more or less what you claim I *now* agree with:


So what was your "counterexample" meant to demonstrate?
 
Reply With Quote
 
Juha Nieminen
Guest
Posts: n/a
 
      02-07-2011
Leigh Johnston <(E-Mail Removed)> wrote:
> Which is why is said "*If*" in my reply. On the implementation I use
> static_cast<int> has no overhead and unlike some people I don't mind the
> verbosity of the C++ style casts.


It's not about verbosity. It's about clarity and the maning of your
code. If you say that a variable is unsigned and then later you cast it
to signed, you *lied* in your variable declaration. That can only cause
confusion.
 
Reply With Quote
 
Juha Nieminen
Guest
Posts: n/a
 
      02-07-2011
Leigh Johnston <(E-Mail Removed)> wrote:
> On 07/02/2011 17:01, Juha Nieminen wrote:
>> Leigh Johnston<(E-Mail Removed)> wrote:
>>> Which is why is said "*If*" in my reply. On the implementation I use
>>> static_cast<int> has no overhead and unlike some people I don't mind the
>>> verbosity of the C++ style casts.

>>
>> It's not about verbosity. It's about clarity and the maning of your
>> code. If you say that a variable is unsigned and then later you cast it
>> to signed, you *lied* in your variable declaration. That can only cause
>> confusion.

>
> Wrong. The cast is sometimes required when using the unsigned variable
> in an expression involving signed variables (as clearly indicated in the
> example code I posted); a cast is not required in other contexts. It
> seems to me that the only confusion is your understanding of the issue.


But that's exactly why you need to make the original variable signed:
Because it can be used in signed expressions. It makes no sense to make
it unsigned when it will be used as a signed value anyways.

Whenever you need to cast an unsigned variable to a signed one in an
expression, that's a sign of bad design. You are using the variable in
a way that contradicts its original declaration.
 
Reply With Quote
 
Bo Persson
Guest
Posts: n/a
 
      02-07-2011
Leigh Johnston wrote:
> On 07/02/2011 16:50, gwowen wrote:
>> On Feb 7, 3:47 pm, Leigh Johnston<(E-Mail Removed)> wrote:
>>
>>> My original reply stated more or less what you claim I *now*
>>> agree with:

>>
>> So what was your "counterexample" meant to demonstrate?

>
> It was simply to add some balance to your initial reply in which you
> mention compilers/platforms (plural) producing *considerably*
> smaller code when not using unsigned which I initially viewed as
> argument.


And now we know one platform (singular) where the code is not
considerably smaller?


Bo Persson


 
Reply With Quote
 
Kaba
Guest
Posts: n/a
 
      02-07-2011
Leigh Johnston wrote:
> Wrong; it is not a sign of bad design at all. If a variable is used as
> an unsigned value more often than as a signed value and it logically
> makes sense for it to be unsigned (e.g. it is never negative) then
> making the variable unsigned makes perfect sense.


--8x--

> You might be in the "use int everywhere" camp but I am not. Perhaps
> Java is a more suitable language for you rather than C++.


For what it's worth, here's my take on this. This is a piece of text I
have written earlier (explaining the form). I am in the 'use int
everywhere' camp Ideas can't and shouldn't be forced, but they can be
shared; the following text reflects how I reason my position, not that
the other camps are wrong.

In this article I present two problems in using unsigned
integers in a program. These are the _zero boundary problem_,
and the _extended positive range problem_.

Zero boundary problem
---------------------

Most of the time we are assuming that, away from boundaries caused
by the finite representation, an integer object works like an element
of the integer ring ZZ. In addition, it seems plausible that most
integer calculations are concentrated around a neighborhood of zero.

The _zero-boundary problem_ (of unsigned integers) is that the zero
is on the boundary beyond which the assumption of working with integers
falls apart. Thus, the probability of introducing errors in computations
increases greatly. Furthermore, those errors can not be catched, since
every value is legal.

### Looping with unsigned integers

Looping over values from n to zero backwards demonstrates the zero
boundary problem with unsigned integers:

for (unsigned int i = n;i >= 0;--i)
{
// Do something.
}

Since there are no negative values to fail the loop test,
this loop never finishes. In contrast, with signed integers the problem
does not exist, since the boundaries are located far away from zero:

for (int i = n;i >= 0;--i)
{
// Do something.
}

Extended positive range problem
-------------------------------

Conversion between an unsigned integer and a signed integer
is an information destroying process in either direction.
The only way to avoid this is to use one or the other
consistently.

If the normal arithmetic is the intention, then a signed integer
represents a more general concept than an unsigned integer: the former
covers both the negative and positive integers, whereas the latter
only covers non-negative integers. In programming terms, it is
possible to create a program using signed integers alone, however,
the same can't be said about the unsigned integers. Therefore, if
only one type is to be used consistently, the choice should be a
signed integer. However, let us do some more analysis.

Since any non-trivial program must use signed integers, the use of
unsigned integers eventually leads to an unsigned-signed
conversion. In particular, because `std::size_t` in the Standard
Library is an unsigned integer, there are few programs that can
escape the unsigned-signed conversions.

Despite these conversions, programs still seem to work. The reason
for this, I reflect, is that the unsigned integers are normally
not taken into the extended positive range they allow.
The _extended positive range problem_ is that if the unsigned integers
are taken to their extended positive range, then the signed-unsigned
conversions become a reality. Ensuring correctness under such a threat
is hard, if not practically impossible.

Conclusion
----------

Using unsigned integers to model integers decreases the probability
that a program is correct.

--
http://kaba.hilvi.org
 
Reply With Quote
 
Kaba
Guest
Posts: n/a
 
      02-07-2011
Leigh Johnston wrote:
> On 07/02/2011 21:58, Kaba wrote:
> > The _zero-boundary problem_ (of unsigned integers) is that the zero
> > is on the boundary beyond which the assumption of working with integers
> > falls apart. Thus, the probability of introducing errors in computations
> > increases greatly. Furthermore, those errors can not be catched, since
> > every value is legal.

>
> It is debatable if the probability of introducing errors increases
> greatly when using unsigned integral types; this is certainly an
> assertion on your part, do you have any evidence to back it up?


Consider any function taking in an unsigned integer (e.g.
std::vector::resize()). It is possible that the computation of a new
size, as a bug, ends up being a negative number, which is then wrapped
to a large positive number. This bug can't be catched, since every value
is legal.

Something similar can happen with signed integers: a large negative
number wraps to large positive number. But this is much less probable:
the action is concentrated around zero.

Correctness is the most important property of a program. In the
described case using a signed integer _brings visible_ an often
occurring bug.

The problem of unchecked bugs gets worse with increasing levels of
abstraction. In the worst case you go through a deep layer of functions,
with an unexplained crash, although the actual problem was that an
argument of the top-most function did not fulfill its precondition (e.g.
size < 0). It's quite comparable to the thousand-lines template errors
when using Boost.Spirit.

> > ### Looping with unsigned integers
> >
> > Looping over values from n to zero backwards demonstrates the zero
> > boundary problem with unsigned integers:
> >
> > for (unsigned int i = n;i>= 0;--i)
> > {
> > // Do something.
> > }

>
> for (unsigned int i = n + 1;i-- > 0
> {
> // Do something.
> }


Nice However, the point is that to carry out a task you have to be
careful to avoid a bug. That's something that should not be left to
humans (because by Murphy's law...).

> > Despite these conversions, programs still seem to work. The reason
> > for this, I reflect, is that the unsigned integers are normally
> > not taken into the extended positive range they allow.
> > The _extended positive range problem_ is that if the unsigned integers
> > are taken to their extended positive range, then the signed-unsigned
> > conversions become a reality. Ensuring correctness under such a threat
> > is hard, if not practically impossible.

>
> You ensure correctness as you would ensure correctness using any other
> language feature.


I don't think it can be afforded, because no one can put the effort to
check each unsigned-signed conversion for trouble. The practical way,
which I also use, is to close the eyes and avoid exercising a program to
its extremes (e.g. 2^32 - 1 objects).

Interestingly, the 64-bit integers help with the extended range problem
_in practice_: the extreme values are then often avoided naturally (e.g.
no one creates an std::vector with 2^64 elements). But the zero-boundary
problem remains.

> > Conclusion
> > ----------
> >
> > Using unsigned integers to model integers decreases the probability
> > that a program is correct.
> >

>
> Bugs can be created using any language feature therefore it is, IMO,
> wrong to eschew a particular language feature based on the possibility
> of bug creation unless there is clear evidence that the language feature
> in question has a high probability of bug creation (for some definition
> of "high").


The possibility of a bug creation is not a problem. The problem is that
using unsigned integers prevents me from catching bugs, and that way
decreases my confidence on program correctness.

> What you have presented is little more than conjecture IMO.


Critical thinking appreciated.

--
http://kaba.hilvi.org
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Sorting list vs sorting vector boltar2003@boltar.world C++ 2 07-06-2010 09:40 AM
fired event Sorting which wasn't handled - sorting and SelectedIndexChanged Jason ASP .Net Web Controls 0 10-04-2006 02:19 PM
Refresh every minute, on the minute. Denebola Java 3 02-27-2006 07:39 AM
HELP!, EXACTLY every minute...... =?Utf-8?B?U3dhbXBkb25reQ==?= Wireless Networking 6 08-26-2004 08:08 PM
sorting by multiple criterias (sub-sorting) Tom Kirchner Perl Misc 3 10-11-2003 05:16 PM



Advertisments