Alf P. Steinbach /Usenet wrote:
> * Vladimir Jovic, on 08.09.2010 12:43:
>> Hello,
>>
>> I am trying to optimize integer rescaling (see example at the end).
>>
>> There is a range of integer elements (in the example, elements are of
>> type unsigned short), the current range maximum, and the desired
>> maximum. The straightforward way is to multiply by the new maximum and
>> then divide by the old maximum (taking care about overflow).
>>
>> Are there better ways of rescaling the integers?
>>
>>
[snip old code]

> As Victor noted, your 'b' is out of range. That means that you've not
> paid attention to detail. The details may or may not be important,
> depending on the problem at hand, which you do not describe.
>
Actually, the first part of the algorithm computes the histogram of an
image. Another part uses the cumulative sum of the histogram as a lookup
table.

Since the maximum value in the histogram is the number of points in the
image (one point is 16 bits), I need to rescale all values in the
histogram to the maximum value of 16 bits (0xffff).

All parts of the algorithm are optimized (at least I think so), except
for this rescaling. I do it after I calculate the cumulative sum, as
shown in the code example (one multiplication and one division). It
works fine, but I was wondering whether there is a faster way.
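For reference, the rescaling step I am talking about looks roughly like
this (names and types are illustrative, not my actual code):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Rescale every cumulative-histogram entry from [0, oldMax] to
// [0, newMax] by multiply-then-divide. The 32-bit intermediate is
// safe because both maxima fit in 16 bits, so the product fits in
// 32 bits.
void rescaleHistogram(std::vector<std::uint16_t>& hist,
                      std::uint32_t oldMax, std::uint32_t newMax)
{
    for (std::size_t i = 0; i < hist.size(); ++i)
        hist[i] = static_cast<std::uint16_t>(hist[i] * newMax / oldMax);
}
```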

> In the general case you need to first decide on the distribution you
> want. For example, are your maximums inclusive or exclusive? The value
> 0x0800 seems like an exclusive maximum, while 0x3fff seems like an
> inclusive one.
>
Yes, as I said elsethread, that is a small mistake. For completeness,
here is the corrected version:

///////////////////////////////////
#include <iostream>

unsigned short a = 0x1234; // value in the range to rescale
unsigned short b = 0x3fff; // new maximum range value
unsigned short c = 0x2800; // current maximum range value

int main()
{
    // widen before multiplying: a * b would be done in (signed) int
    // after integral promotion and could overflow for large inputs
    const unsigned int d = static_cast<unsigned int>(a) * b;
    const unsigned short e = d / c;
    std::cout << "rescaled value = 0x" << std::hex << e << std::endl;
}
///////////////////////////////////

> When you have decided /what/ your rescaling should do, then you should
> implement it correctly. That means e.g. avoiding overflow. The above
> code can easily overflow, so as portable code it's generally incorrect
> no matter the details of the desired scaling (it's like a car with
> square wheels, doesn't matter whether it's meant to be a sports car or a
> lorry, it's just wrong).
>
I didn't really think about code portability, because my target is fixed
(x86, some Pentium processor). I also wasn't thinking about the
overflow, because that is easily caught by proper unit tests.
The problem I am facing is that the algorithm cannot be used as it
stands, because it uses too much CPU.

> So, yes, there are better ways.
>
> The first step on the road to betterness is to define the desired
> effect, and the second step is to implement that correctly. Only then
> worry about micro efficiency. If at all (and if you do, first measure,
> Measure, MEASURE).
>
I did measure, and the profiler showed that 30% of the time is spent on
the rescaling.
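One direction I am considering, since the two maxima are the same for
every element: precompute a fixed-point factor once and replace the
per-element division with a multiply and a shift. This is only a
sketch, not my measured code, and the rounded-up factor can come out
one higher than exact integer division for some inputs, so it would
need checking against real data:

```cpp
#include <cstdint>

// Sketch: replace v * newMax / oldMax with a multiply by a precomputed
// 16.16 fixed-point factor. The factor is rounded up so that
// v == oldMax maps exactly to newMax; the price is that some inputs
// may come out one higher than exact truncating division.
struct FixedPointRescaler
{
    std::uint32_t factor;

    FixedPointRescaler(std::uint32_t newMax, std::uint32_t oldMax)
        : factor(((newMax << 16) + oldMax - 1) / oldMax) {}

    std::uint16_t operator()(std::uint16_t v) const
    {
        // 32x32 -> 64-bit multiply is a single MUL on x86
        return static_cast<std::uint16_t>(
            (static_cast<std::uint64_t>(v) * factor) >> 16);
    }
};
```

On a Pentium this trades one integer division per element for a
multiplication and a shift; whether the off-by-one matters for an
equalization lookup table is something I would have to verify.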

PS: Is this rather a problem for comp.programming? Is there a newsgroup
where optimization questions should be posted?