Velocity Reviews > VHDL > modulo of any number

# modulo of any number

Tim
Guest
Posts: n/a

 01-05-2007
I want to do the following

but because the base is not a factor of 2^x, it's not synthesizable.
What's the optimum way of performing this calculation? The easiest but
computationally expensive way would be

while temp>=320 loop
temp <= input - 100;
end loop;
output <= temp;

but I'm certain there's a better way but just don't know what.

Thanks,

Tim

Tim
Guest
Posts: n/a

 01-05-2007
> while temp>=320 loop
> temp <= input - 100;
> end loop;
> output <= temp;

This is what i meant:

while temp>=320 loop
temp <= input - 320;
end loop;
output <= temp;

I'm still pulling my hair out trying to find some solutions. So far,
I've uncovered some ideas,

In particular, Allan Heriman's post on Mon, Feb 14 2005 11:28 pm. It
seems that if you require a mod N function, you can split it up such
that mod N = mod a * mod b, where a is 2^n and b is 2^n-1? In which
case, could I use mod 320 = mod 64 * mod 5 or have I completely
misunderstood the posters intention? If someone could clarify that or
come up with any other ideas, I'd be grateful.

I should also mention that addr is a signal ranging from 0 to 76799, a
17 bit slv. The divisor, 320 is a constant.

Ben Jones
Guest
Posts: n/a

 01-05-2007
Hi Tim,

"Tim" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) ups.com...
>> while temp>=320 loop
>> temp <= input - 100;
>> end loop;
>> output <= temp;

>
> This is what i meant:
>
> while temp>=320 loop
> temp <= input - 320;
> end loop;
> output <= temp;
>
> I should also mention that addr is a signal ranging from 0 to 76799, a
> 17 bit slv. The divisor, 320 is a constant.

In general there is no one right answer to this. The simplest way is just a
step up from the method you identified above. Start by comparing the input
against in your case 128x320 = 40960, subtracting 40960 if it is greater.
Then compare the result of that to 40960/2 = 20480, subtracting 20480 if it
is greater. Carry on dividing down and comparing against 10240, 5120, 2560,
1280, 640 and finally 320. That way it will take you exactly 8
compare-subtract operations to reduce the address down to the range 0-319.

This method is fairly cheap in terms of circuitry, but has relatively high
latency. It has the advantage that it's pretty easy to implement, in either
an iterative (resource-shared) circuit or a pipelined version (for maximum
throughput).

What are your latency and throughput requirements for this operation?

Cheers,

-Ben-

jens
Guest
Posts: n/a

 01-05-2007
A slightly different way (no iteration, requires two multiplications
and a subtraction):

addr mod 320 = addr - 320 * ( ( addr * const1 ) / const2 )

where
const1 = round( 2^N / 320 )
const2 = 2^N
N needs to be large enough to handle the largest possible value of
const1 can probably be tweaked a little bit to reduce the value of N.

Arnim
Guest
Posts: n/a

 01-05-2007

> I want to do the following
>
>
> but because the base is not a factor of 2^x, it's not synthesizable.

I once started a design for Cyclone chips using QuartusII and the tool
happily synthesized a '<signed> mod 6' expression, fooling me into
thinking that this is common with modern tools. As the design was ported
to Spartan3 lateron, I was in deep sh*t when XST choked at the same code.

The following function is my contribution to this topic. Just a looped
generation of a look-up table. Might be easily parametrized to accept
other mods than 6.
QuartusII even yielded lower LUT count with this code than with the
original 'mod 6'.

-- Function mod_6_f
--
-- Purpose:
-- Calculate the modulo of 6.
-- Only the positive part is considered.
--
function mod_6_f(val : in hv_t) return hv_t is
variable mod_v : natural;
variable result_v : hv_t;
begin
if val(hv_t'left) = '0' then
result_v := (others => '0');
mod_v := 0;
for idx in 0 to 255 loop
if val = idx then
result_v := to_signed(mod_v, hv_t'length);
end if;

if mod_v < 5 then
mod_v := mod_v + 1;
else
mod_v := 0;
end if;
end loop;
else
result_v := (others => '-');
end if;

return result_v;
end;

David R Brooks
Guest
Posts: n/a

 01-05-2007
Tim wrote:
> I want to do the following
>
>
> but because the base is not a factor of 2^x, it's not synthesizable.
> What's the optimum way of performing this calculation? The easiest but
> computationally expensive way would be
>
> while temp>=320 loop
> temp <= input - 100;
> end loop;
> output <= temp;
>
> but I'm certain there's a better way but just don't know what.
>

From Wikipedia (art. "multiplicative inverse"):
> In modular arithmetic, the multiplicative inverse of x is also defined: it is the number a such that (a·x) mod n = 1. However, this multiplicative inverse exists only if a and n are relatively prime. For example, the inverse of 3 modulo 11 is 4 because it is the solution to (3x) mod 11 = 1. The extended Euclidean algorithm may be used to compute the multiplicative inverse modulo of a number.
>

Since (2^17 - 1) is prime, it follows that (2^17 -1) and 320 are
relatively prime. Hence a multiplicative inverse is defined in this case.
Given that, you can multiply by that inverse (many FPGAs feature a
hardware multiplier) to implement the division. You then multiply the
quotient by 320 (same multiplier?) & subtract from the original number
to get the remainder.
This method has the advantage (may not be relevant in your case) of a
constant execution time.

David M. Palmer
Guest
Posts: n/a

 01-06-2007
In article <(E-Mail Removed). com>, Tim
<(E-Mail Removed)> wrote:

> I want to do the following
>
>
> but because the base is not a factor of 2^x, it's not synthesizable.
> What's the optimum way of performing this calculation? The easiest but
> computationally expensive way would be
>
> while temp>=320 loop
> temp <= input - 100;
> end loop;
> output <= temp;
>
> but I'm certain there's a better way but just don't know what.

As you pointed out :

The only hard part is mod 5

--
David M. Palmer http://www.velocityreviews.com/forums/(E-Mail Removed) (formerly @clark.net, @ematic.com)

Tim
Guest
Posts: n/a

 01-07-2007
decided to go with jens algorithm. I have no idea why it works but
following D. Brooks post, I'll look it up on wikipedia.

> What are your latency and throughput requirements for this operation?

I don't have performance requirements per se as I am trying to get it
working for starters. Obviously, the faster the better though.

Tim
Guest
Posts: n/a

 01-08-2007

jens wrote:
> A slightly different way (no iteration, requires two multiplications
> and a subtraction):
>
> addr mod 320 = addr - 320 * ( ( addr * const1 ) / const2 )
>
> where
> const1 = round( 2^N / 320 )
> const2 = 2^N
> N needs to be large enough to handle the largest possible value of
> const1 can probably be tweaked a little bit to reduce the value of N.

I must be doing something wrong becuase my calculations are not
yielding the right remainder!
This is my working:

const1 = round(2^17/320) = 409.6 ~= 410
const1 = 2^17 = 131 072

Now, if I substitute the address 321, the remainder should be 1 as 321
mod 320 = 1.
But,
addr - 320 * ( ( addr * const1 ) / const2 ) = 321 - 320 * ( ( 321 * 410
) / 131 072 )
= 0.313 which is not remainder 1...

Can someone tell me what's wrong?

jens
Guest
Posts: n/a

 01-08-2007
> Now, if I substitute the address 321, the remainder should be 1 as 321
> mod 320 = 1.
> But,
> addr - 320 * ( ( addr * const1 ) / const2 ) = 321 - 320 * ( ( 321 * 410
> ) / 131 072 )
> = 0.313 which is not remainder 1...
>
> Can someone tell me what's wrong?

Sorry, I was a little vague on the divide operation- it's actually a
shift right by N, so you would lose the fraction, and the result of the
"divide" is 1, then 321 mod 320 = 1.

A spreadsheet is a good way to simulate this, optimize N and tweak the
constants, and verify every required value.