# Re: word to digit module

Stephen Thorne

 12-22-2004
On Wed, 22 Dec 2004 10:27:16 +0530, Gurpreet Sachdeva wrote:
> Is there any module available that converts words like 'one', 'two',
> 'three' to the corresponding digits 1, 2, 3?

This seemed like an interesting problem! So I decided to solve it.

I started with
http://www.python.org/pycon/dc2004/papers/42/ex1-C/ which allowed me
to create a nice test suite:

import num2eng
for i in range(40000):
    e = num2eng.num2eng(i)
    if toNumber(e) != i:
        print(e, i, toNumber(e))

Once this all-important test suite was created, I was able to knock up
the following script. This is tested up to 'ninety nine thousand nine
hundred and ninety nine'. It won't do 'one hundred thousand', and isn't
exceptionally agile. If I were to go any higher than 'one hundred
thousand' I would probably pull out http://dparser.sf.net/ and write a
parser.

translation = {
    'and': 0,
    'zero': 0,
    'one': 1,
    'two': 2,
    'three': 3,
    'four': 4,
    'five': 5,
    'six': 6,
    'seven': 7,
    'eight': 8,
    'nine': 9,
    'ten': 10,
    'eleven': 11,
    'twelve': 12,
    'thirteen': 13,
    'fourteen': 14,
    'fifteen': 15,
    'sixteen': 16,
    'seventeen': 17,
    'eighteen': 18,
    'nineteen': 19,
    'twenty': 20,
    'thirty': 30,
    'forty': 40,
    'fifty': 50,
    'sixty': 60,
    'seventy': 70,
    'eighty': 80,
    'ninety': 90,
    'hundred': 100,
    'thousand': 1000,
}

def toNumber(s):
    items = s.replace(',', '').split()
    numbers = [translation.get(item.strip(), -1) for item in items
               if item.strip()]
    if -1 in numbers:
        raise ValueError("Invalid string '%s'" % (s,))

    # Scale everything before 'thousand' by 1000 and move it to the end.
    if 1000 in numbers:
        idx = numbers.index(1000)
        hundreds = numbers[:idx]
        numbers = numbers[idx+1:] + [1000*x for x in hundreds]

    # Likewise, scale whatever precedes 'hundred' by 100.
    if 100 in numbers:
        idx = numbers.index(100)
        hundreds = numbers[:idx]
        numbers = numbers[idx+1:] + [100*x for x in hundreds]

    return sum(numbers)
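
For anyone without num2eng to hand, the round-trip test can be sketched in a self-contained way. This is only a rough Python 3 illustration, not the posted script: to_number repeats the logic above, while ONES, TENS, below_thousand and to_words are hypothetical helpers standing in for num2eng, and to_words only claims to cover 0 to 99999.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven "
        "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
        "nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

# Same mapping as the translation dict in the post, built compactly.
translation = {word: value for value, word in enumerate(ONES)}
translation.update({word: 20 + 10 * i for i, word in enumerate(TENS)})
translation.update({"and": 0, "hundred": 100, "thousand": 1000})


def to_number(s):
    """Same logic as the posted toNumber, written for Python 3."""
    numbers = [translation.get(w, -1) for w in s.replace(",", "").split()]
    if -1 in numbers:
        raise ValueError("Invalid string %r" % (s,))
    for scale in (1000, 100):
        if scale in numbers:
            idx = numbers.index(scale)
            # Multiply what came before the scale word, keep the rest.
            numbers = numbers[idx + 1:] + [scale * x for x in numbers[:idx]]
    return sum(numbers)


def below_thousand(n):
    """Words for 0 <= n < 1000, e.g. 342 -> three hundred and forty two."""
    words = []
    if n >= 100:
        words += [ONES[n // 100], "hundred"]
        n %= 100
        if n:
            words.append("and")
    if n >= 20:
        words.append(TENS[n // 10 - 2])
        if n % 10:
            words.append(ONES[n % 10])
    elif n or not words:
        words.append(ONES[n])
    return words


def to_words(n):
    """Hypothetical stand-in for num2eng.num2eng, for 0 <= n < 100000."""
    thousands, rest = divmod(n, 1000)
    words = []
    if thousands:
        words += below_thousand(thousands) + ["thousand"]
    if rest or not thousands:
        words += below_thousand(rest)
    return " ".join(words)


# Round trip: every number should survive words -> number unchanged.
failures = [n for n in range(100000) if to_number(to_words(n)) != n]
print(failures)                           # prints []
print(to_number("one hundred thousand"))  # prints 101000, not 100000
```

The last line shows the stated limit: with 'hundred' sitting in front of 'thousand', both the 1 and the 100 get multiplied by 1000 separately and then summed.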

Stephen Thorne

John Machin

 12-22-2004
Stephen Thorne wrote:
> On Wed, 22 Dec 2004 10:27:16 +0530, Gurpreet Sachdeva wrote:
> > Is there any module available that converts words like 'one', 'two',
> > 'three' to the corresponding digits 1, 2, 3?
>
> This seemed like an interesting problem! So I decided to solve it.
>
> I started with
> http://www.python.org/pycon/dc2004/papers/42/ex1-C/ which allowed me
> to create a nice test suite:
>
> import num2eng
> for i in range(40000):
>     e = num2eng.num2eng(i)
>     if toNumber(e) != i:
>         print(e, i, toNumber(e))
>
> Once this all-important test suite was created, I was able to knock up
> the following script. This is tested up to 'ninety nine thousand nine
> hundred and ninety nine'. It won't do 'one hundred thousand', and isn't
> exceptionally agile. If I were to go any higher than 'one hundred
> thousand' I would probably pull out http://dparser.sf.net/ and write a
> parser.

Parser?

The following appears to work, with appropriate dict entries for
'million', 'billion', etc.:

def toNumber2(s):
    items = s.replace(',', '').split()
    numbers = [translation.get(item.strip(), -1) for item in items
               if item.strip()]
    stack = [0]
    for num in numbers:
        if num == -1:
            raise ValueError("Invalid string '%s'" % (s,))
        if num >= 100:
            stack[-1] *= num
            if num >= 1000:
                stack.append(0)
        else:
            stack[-1] += num
    return sum(stack)
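
To see why no parser is needed, it helps to follow the stack for 'one hundred thousand': it goes [1], then [100], then [100000, 0], and the sum is 100000. The sketch below is a self-contained Python 3 rendering of the same idea, not a definitive version. The 'million' and 'billion' entries are assumptions added for illustration (short scale, 10**9 for billion), and ONES and TENS are just a compact way of building the dict.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven "
        "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
        "nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

translation = {word: value for value, word in enumerate(ONES)}
translation.update({word: 20 + 10 * i for i, word in enumerate(TENS)})
# 'million' and 'billion' are assumed extra entries (short scale).
translation.update({"and": 0, "hundred": 100, "thousand": 1000,
                    "million": 10**6, "billion": 10**9})


def to_number2(s):
    stack = [0]  # one running total per group (billions, millions, ...)
    for word in s.replace(",", "").split():
        num = translation.get(word, -1)
        if num == -1:
            raise ValueError("Invalid string %r" % (s,))
        if num >= 100:
            stack[-1] *= num       # scale the group built so far
            if num >= 1000:
                stack.append(0)    # group is finished, start the next one
        else:
            stack[-1] += num
    return sum(stack)


print(to_number2("one hundred thousand"))  # prints 100000
print(to_number2("two million three hundred and four thousand five"))
# prints 2304005
```

One caveat of this approach: a scale word with nothing in front of it multiplies an empty group, so a bare 'hundred' or 'thousand' comes out as 0 rather than raising an error.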

Scott David Daniels

 12-22-2004
John Machin wrote:
> Stephen Thorne wrote:
> def toNumber2(s):
>     items = s.replace(',', '').split()
>     numbers = [translation.get(item.strip(), -1) for item in items
>                if item.strip()]
>     stack = [0]
>     for num in numbers:
>         if num == -1:
>             raise ValueError("Invalid string '%s'" % (s,))
>         if num >= 100:
>             stack[-1] *= num
>             if num >= 1000:
>                 stack.append(0)
>         else:
>             stack[-1] += num
>     return sum(stack)
>

Can I play too?
Let's replace the top with a little bit of error handling:

def toNumber3(text):
    # Turn hyphens into spaces so 'twenty-three' splits into two words.
    s = text.replace(',', '').replace('-', ' ')
    items = s.split()
    try:
        numbers = [translation[item] for item in items]
    except KeyError as e:
        raise ValueError("Invalid element %r in string %r" % (
            e.args[0], text))
    stack = [0]
    for num in numbers:
        if num >= 100:
            stack[-1] *= num
            if num >= 1000:
                stack.append(0)
        else:
            stack[-1] += num
    return sum(stack)
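
A quick way to check the error handling is to feed it one hyphenated input and one bogus word. The following is a self-contained Python 3 sketch under a couple of assumptions: ONES and TENS are only a compact way to build the dict, and `from None` is an addition that hides the internal KeyError from the traceback. Note that the hyphen has to become a space rather than vanish, because 'twentythree' is not a key in translation.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven "
        "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
        "nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

translation = {word: value for value, word in enumerate(ONES)}
translation.update({word: 20 + 10 * i for i, word in enumerate(TENS)})
translation.update({"and": 0, "hundred": 100, "thousand": 1000})


def to_number3(text):
    # Drop commas, split hyphenated words such as 'twenty-three'.
    items = text.replace(",", "").replace("-", " ").split()
    try:
        numbers = [translation[item] for item in items]
    except KeyError as e:
        # e.args[0] is the first word that was not found in the dict.
        raise ValueError("Invalid element %r in string %r"
                         % (e.args[0], text)) from None
    stack = [0]
    for num in numbers:
        if num >= 100:
            stack[-1] *= num
            if num >= 1000:
                stack.append(0)
        else:
            stack[-1] += num
    return sum(stack)


print(to_number3("twenty-three"))  # prints 23
try:
    to_number3("one hundred and eleventy")
except ValueError as err:
    print(err)
    # prints Invalid element 'eleventy' in string 'one hundred and eleventy'
```

Compared with the -1 sentinel, the exception names the offending word instead of only echoing the whole input back.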

--Scott David Daniels

M.E.Farmer

 12-22-2004
Cool script, just one little thing:
toNumber('One thousand') bites the dust, since the lookup is
case-sensitive.
Guess you should add another test, and an s.lower() call.

Stephen Thorne wrote:
{code snip}
> def toNumber(s):
+     s = s.lower()
>     items = s.replace(',', '').split()
>     numbers = [translation.get(item.strip(), -1) for item in items
>                if item.strip()]
>     if -1 in numbers:
>         raise ValueError("Invalid string '%s'" % (s,))
>
>     if 1000 in numbers:
>         idx = numbers.index(1000)
>         hundreds = numbers[:idx]
>         numbers = numbers[idx+1:] + [1000*x for x in hundreds]
>
>     if 100 in numbers:
>         idx = numbers.index(100)
>         hundreds = numbers[:idx]
>         numbers = numbers[idx+1:] + [100*x for x in hundreds]
>
>     return sum(numbers)
>
> Stephen Thorne

M.E.Farmer

Stephen Thorne

 12-22-2004
On Wed, 22 Dec 2004 11:41:26 -0800, Scott David Daniels wrote:
> John Machin wrote:
> > Stephen Thorne wrote:
> > def toNumber2(s):
> >     items = s.replace(',', '').split()
> >     numbers = [translation.get(item.strip(), -1) for item in items
> >                if item.strip()]
> >     stack = [0]
> >     for num in numbers:
> >         if num == -1:
> >             raise ValueError("Invalid string '%s'" % (s,))
> >         if num >= 100:
> >             stack[-1] *= num
> >             if num >= 1000:
> >                 stack.append(0)
> >         else:
> >             stack[-1] += num
> >     return sum(stack)
>
> Can I play too?
> Let's replace the top with a little bit of error handling:
>
> def toNumber3(text):
>     # Turn hyphens into spaces so 'twenty-three' splits into two words.
>     s = text.replace(',', '').replace('-', ' ')
>     items = s.split()
>     try:
>         numbers = [translation[item] for item in items]
>     except KeyError as e:
>         raise ValueError("Invalid element %r in string %r" % (
>             e.args[0], text))
>     stack = [0]
>     for num in numbers:
>         if num >= 100:
>             stack[-1] *= num
>             if num >= 1000:
>                 stack.append(0)
>         else:
>             stack[-1] += num
>     return sum(stack)

Thank you for your feedback, both of you.

Stephen.

John Machin

 12-22-2004

Stephen Thorne wrote:
> Thank you for your feedback, both of you.

No wuckas.

> http://thorne.id.au/users/stephen/scripts/eng2num.py contains your
> suggestions.

This points to some JavaScript which prints "No."