Velocity Reviews > Long number base conversion

Long number base conversion

Eyegor
Guest
Posts: n/a

 11-25-2010
Hi,
I am working on a project, just for fun of it, so there is no 3rd
party restictions.
What I am trying to do is to write a simple bignum library, simmilar
to GMP. I know GMP exists and it works and I am not going to write
anything better than what it is, this is just for learning purposes.
Anyhow, I creaed several basic functions to add and subtract large
numbers which are basically Array Encoded Integers. I use 16-bit
signed integers as elements in the array, and the whole array
represents a very large integer, say 16000 bits or so.

The problem I ran into is writing a replica of scanf function.
Basically I want to comvert a decimal input like
67627457272523772428537372395768 into the binary array. Any suggestion
on how to do this would be greatly appreciated. I dont actually need
the code, just an algorithm suggestion. I wrote a printf function like
this about 5 years ago, but I dont have that code anymore and want to
replicate it.

thanks

BartC
Guest
Posts: n/a

 11-25-2010

"Eyegor" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...

> What I am trying to do is to write a simple bignum library, simmilar

> The problem I ran into is writing a replica of scanf function.
> Basically I want to comvert a decimal input like
> 67627457272523772428537372395768 into the binary array. Any suggestion
> on how to do this would be greatly appreciated. I dont actually need
> the code, just an algorithm suggestion. I wrote a printf function like
> this about 5 years ago, but I dont have that code anymore and want to
> replicate it.

s = "67627457272523772428537372395768"

n = 0
for each character c in s:
n = n*10 + (c-'0')

You need to use your bigint arithmetic for operations here.

--
Bartc

Eyegor
Guest
Posts: n/a

 11-25-2010
On Nov 24, 8:36*pm, "BartC" <(E-Mail Removed)> wrote:
> "Eyegor" <(E-Mail Removed)> wrote in message
>
> news:(E-Mail Removed)...
>
> > What I am trying to do is to write a simple bignum library, simmilar
> > The problem I ran into is writing a replica of scanf function.
> > Basically I want to comvert a decimal input like
> > 67627457272523772428537372395768 into the binary array. Any suggestion
> > on how to do this would be greatly appreciated. I dont actually need
> > the code, just an algorithm suggestion. I wrote a printf function like
> > this about 5 years ago, but I dont have that code anymore and want to
> > replicate it.

>
> s = "67627457272523772428537372395768"
>
> n = 0
> for each character c in s:
> * *n = n*10 + (c-'0')
>
> You need to use your bigint arithmetic for operations here.
>
> --
> Bartc

Enginious, Thank you.

Keith Thompson
Guest
Posts: n/a

 11-25-2010
Eyegor <(E-Mail Removed)> writes:
> I am working on a project, just for fun of it, so there is no 3rd
> party restictions.
> What I am trying to do is to write a simple bignum library, simmilar
> to GMP. I know GMP exists and it works and I am not going to write
> anything better than what it is, this is just for learning purposes.
> Anyhow, I creaed several basic functions to add and subtract large
> numbers which are basically Array Encoded Integers. I use 16-bit
> signed integers as elements in the array, and the whole array
> represents a very large integer, say 16000 bits or so.

Why are you using *signed* integers? I'd think that unsigned
integers would be more convenient. If you have a bignum represented
as an array of 1000 16-bit integers, you only need 1 sign bit,
not a thousand of them.

> The problem I ran into is writing a replica of scanf function.
> Basically I want to comvert a decimal input like
> 67627457272523772428537372395768 into the binary array. Any suggestion
> on how to do this would be greatly appreciated. I dont actually need
> the code, just an algorithm suggestion. I wrote a printf function like
> this about 5 years ago, but I dont have that code anymore and want to
> replicate it.

Presumably you've already got functions that implement the arithmetic
operators (+, -, *, /, %) for your Array Encoded Integers. Just use
them to implement the string-to-bignum conversion in the same way
you'd implement a string-to-integer conversion. It's the same
algorithm (which I presume you already know or can find or invent),
just using different operations.

--
Keith Thompson (The_Other_Keith) http://www.velocityreviews.com/forums/(E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

bert
Guest
Posts: n/a

 11-25-2010
On Nov 25, 1:21*am, Eyegor <(E-Mail Removed)> wrote:
> Hi,
> I am working on a project, just for fun of it, so there is no 3rd
> party restictions.
> What I am trying to do is to write a simple bignum library, simmilar
> to GMP. I know GMP exists and it works and I am not going to write
> anything better than what it is, this is just for learning purposes.
> Anyhow, I creaed several basic functions to add and subtract large
> numbers which are basically Array Encoded Integers. I use 16-bit
> signed integers as elements in the array, and the whole array
> represents a very large integer, say 16000 bits or so.
>
> The problem I ran into is writing a replica of scanf function.
> Basically I want to comvert a decimal input like
> 67627457272523772428537372395768 into the binary array. Any suggestion
> on how to do this would be greatly appreciated. I dont actually need
> the code, just an algorithm suggestion. I wrote a printf function like
> this about 5 years ago, but I dont have that code anymore and want to
> replicate it.

Yes, BartC is perfectly correct.

You will find that a couple of useful additions to
your range of functions are bignum * int -> bignum,
and bignum / int -> (quot bignum, rem int). The
latter is particularly handy for the inverse of
scanf, churning out the decimal digits of a bigint.
--

Tom St Denis
Guest
Posts: n/a

 11-25-2010
On Nov 24, 8:21*pm, Eyegor <(E-Mail Removed)> wrote:
> Hi,
> I am working on a project, just for fun of it, so there is no 3rd
> party restictions.
> What I am trying to do is to write a simple bignum library, simmilar
> to GMP. I know GMP exists and it works and I am not going to write
> anything better than what it is, this is just for learning purposes.
> Anyhow, I creaed several basic functions to add and subtract large
> numbers which are basically Array Encoded Integers. I use 16-bit
> signed integers as elements in the array, and the whole array
> represents a very large integer, say 16000 bits or so.

Why not check out LibTomMath [disclosure I was the original developer/
maintainer]? It's public domain, has a whole suite of typical bignum
functions (and some required for crypto), even comes with a book on
bignum math in PDF format in the archive.

I'm not trying to discourage you from writing your own code, by all
means do that. You'll also learn more if you try then to explain the
algorithms (worked for me).

http://libtom.org/?page=features&new...5&whatfile=ltm

Tom

Mark Wooding
Guest
Posts: n/a

 11-25-2010
"BartC" <(E-Mail Removed)> writes:

> "Eyegor" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>
> > What I am trying to do is to write a simple bignum library, simmilar

>
> > The problem I ran into is writing a replica of scanf function.
> > Basically I want to comvert a decimal input like
> > 67627457272523772428537372395768 into the binary array. Any suggestion
> > on how to do this would be greatly appreciated. I dont actually need
> > the code, just an algorithm suggestion. I wrote a printf function like
> > this about 5 years ago, but I dont have that code anymore and want to
> > replicate it.

>
> s = "67627457272523772428537372395768"
>
> n = 0
> for each character c in s:
> n = n*10 + (c-'0')

This is a really sucky way of doing it. Good large-integer
multiplication algorithms work best when their inputs are approximately
the same length. The above algorithm is more or less pathological.

Here's a sketch of a fairly good algorithm. The algorithm will read a
sequence V of base-R digits of unknown length L, most significant first,
and output the integer represented by V

V[0] R^(L-1) + V[1] R^(L-2) + ... + V[L - 1]

The algorithm is nonrecursive, but makes use of a stack of pairs (X, N)
such that 0 <= X < R^N. The stack maintains an invariant: if (X0, N0)
is nearer the top than (X1, N1) then N0 < N1. The algorithm to push an
item (X, N) onto the stack is therefore like this:

if the stack is empty, push (X, N) and return
let (X', N') be the topmost item on the stack
if N' > N then push (X, N) and return
pop (X', N')
set X <- X' R^N + X and N <- N + N' and go back to the top

Now, to convert the sequence V:

if V is exhausted, finish up (below)
read a digit D from V
push the pair (D, 1) onto the stack (above)

Finally, finish up as follows:

if the stack is empty, return 0 (or report an error: the
sequence was empty, which you might consider invalid)
let (Y, N) be the topmost item on the stack
while the stack is not empty:
pop (X, N')
set Y <- X R^N + Y and N <- N + R^N
output Y

It's not hard to compute a (fairly low) upper bound on the size of stack
required, so this can be allocated statically. All of the values of N
stored on the stack are powers of 2 (easy proof). You can save a number
of multiplications by caching R^N for the various values of N you
encounter rather than computing them afresh each time; again, the number
of items in this cache is bounded above by a small number you can
determine statically.

Plausible bounds are simple functions of CHAR_BIT and sizeof whatever
type you're using to store lengths of your bignums (this will be an
overestimate if there are padding bits, but that doesn't matter).

(I have an implementation of this in portable C, licensed under LGPL.)

-- [mdw]

BartC
Guest
Posts: n/a

 11-25-2010

"Mark Wooding" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) ...
> "BartC" <(E-Mail Removed)> writes:
>
>> "Eyegor" <(E-Mail Removed)> wrote in message
>> news:(E-Mail Removed)...

>> > The problem I ran into is writing a replica of scanf function.
>> > Basically I want to comvert a decimal input like
>> > 67627457272523772428537372395768 into the binary array. Any suggestion
>> > on how to do this would be greatly appreciated. I dont actually need
>> > the code, just an algorithm suggestion.

>> s = "67627457272523772428537372395768"
>>
>> n = 0
>> for each character c in s:
>> n = n*10 + (c-'0')

>
> This is a really sucky way of doing it. Good large-integer
> multiplication algorithms work best when their inputs are approximately
> the same length. The above algorithm is more or less pathological.

It'll do to get started. I like to test code (even pseudo-code), before I
post, the above took me a few minutes (not in C though).

> Here's a sketch of a fairly good algorithm. The algorithm will read a
> sequence V of base-R digits of unknown length L, most significant first,
> and output the integer represented by V
>
> V[0] R^(L-1) + V[1] R^(L-2) + ... + V[L - 1]

I would just assume R is 10. If bigints are stored in binary form, and R is
2 or 16 (the mostly likely apart from 10), then a string can be translated
very efficiently with a few bit-operations, without multiplies or additions.

(As it happens, my own bigint library stores numbers in decimal format,
allowing a similarly fast conversion, but the entire library is so slow
anyway that it's not worth bothering with.)

> The algorithm is nonrecursive, but makes use of a stack of pairs (X, N)
> such that 0 <= X < R^N. The stack maintains an invariant: if (X0, N0)
> is nearer the top than (X1, N1) then N0 < N1. The algorithm to push an
> item (X, N) onto the stack is therefore like this:

....

I'll have to study this further.

--
Bartc

Eyegor
Guest
Posts: n/a

 11-25-2010
On Nov 25, 12:20*pm, "BartC" <(E-Mail Removed)> wrote:
> "Mark Wooding" <(E-Mail Removed)> wrote in message
>
> news:(E-Mail Removed) ...
>
>
>
> > "BartC" <(E-Mail Removed)> writes:

>
> >> "Eyegor" <(E-Mail Removed)> wrote in message
> >>news:(E-Mail Removed)....
> >> > The problem I ran into is writing a replica of scanf function.
> >> > Basically I want to comvert a decimal input like
> >> > 67627457272523772428537372395768 into the binary array. Any suggestion
> >> > on how to do this would be greatly appreciated. I dont actually need
> >> > the code, just an algorithm suggestion.
> >> s = "67627457272523772428537372395768"

>
> >> n = 0
> >> for each character c in s:
> >> * n = n*10 + (c-'0')

>
> > This is a really sucky way of doing it. *Good large-integer
> > multiplication algorithms work best when their inputs are approximately
> > the same length. *The above algorithm is more or less pathological.

>
> It'll do to get started. I like to test code (even pseudo-code), before I
> post, the above took me a few minutes (not in C though).
>
> > Here's a sketch of a fairly good algorithm. *The algorithm will read a
> > sequence V of base-R digits of unknown length L, most significant first,
> > and output the integer represented by V

>
> > * * * *V[0] R^(L-1) + V[1] R^(L-2) + ... + V[L - 1]

>
> I would just assume R is 10. If bigints are stored in binary form, and R is
> 2 or 16 (the mostly likely apart from 10), then a string can be translated
> very efficiently with a few bit-operations, without multiplies or additions.
>
> (As it happens, my own bigint library stores numbers in decimal format,
> allowing a similarly fast conversion, but the entire library is so slow
> anyway that it's not worth bothering with.)
>
> > The algorithm is nonrecursive, but makes use of a stack of pairs (X, N)
> > such that 0 <= X < R^N. *The stack maintains an invariant: if (X0, N0)
> > is nearer the top than (X1, N1) then N0 < N1. *The algorithm to push an
> > item (X, N) onto the stack is therefore like this:

>
> ...
>
> I'll have to study this further.
>
> --
> Bartc

You don't want to have base10 library, because it will by about 100
times slower than what it can be, in multiplication routine. The trick
is to make library base multiple of 2 so you can use left bit shift in

I am a Nuclear Engineer with M.Sc. in Physics, but I have a really
boring job babysitting technicians, so occasionally I get into a self
improvement mood, this happens when I feel like I am getting dumber at
work day after day. Anyhow, during these "self improvement moods I
usually code in C or do some other learning project. This is one of
those learning projects. I did this before, but lost code couple years
back during pc upgrade. Last time I wrote a factorial calculator which
used my own bignum library and could calculate 1,000,000 factorial
with no problems.

BartC
Guest
Posts: n/a

 11-25-2010
"Eyegor" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On Nov 25, 12:20 pm, "BartC" <(E-Mail Removed)> wrote:

>> (As it happens, my own bigint library stores numbers in decimal format,
>> allowing a similarly fast conversion, but the entire library is so slow
>> anyway that it's not worth bothering with.)

> You don't want to have base10 library, because it will by about 100
> times slower than what it can be, in multiplication routine. The trick
> is to make library base multiple of 2 so you can use left bit shift in

Binary allows you to use 16x16->32-bit multiplies (or even 32x32->64-bit
ones), instead of 3.3x3.3 bits->6.6 bits (as I stored one decimal digit per
byte).

A decimal-based library isn't that terrible (some things are easier, such as
printing results), but my library was only quickly put together to solve a
problem, and is not too serious (otherwise I wouldn't have implemented it in
an interpreted language...)

> I am a Nuclear Engineer with M.Sc. in Physics, but I have a really
> boring job babysitting technicians, so occasionally I get into a self
> improvement mood, this happens when I feel like I am getting dumber at
> work day after day. Anyhow, during these "self improvement moods I
> usually code in C or do some other learning project. This is one of
> those learning projects. I did this before, but lost code couple years
> back during pc upgrade. Last time I wrote a factorial calculator which
> used my own bignum library and could calculate 1,000,000 factorial
> with no problems.

How long did it take? My slow library took 90 minutes just to work out
100,000! (and you can read that exclamation mark either way...)

Might be time to upgrade it I think...

--
Bartc