Velocity Reviews - Computer Hardware Reviews

Scan for Tokens

 
 
Raul Parolari
      11-11-2007
I am looking for the best way to break an input string into individual
tokens (I do not want to use a lexer library); I found some Ruby
programs that do it by "nibbling" at the string, like this (for
simplicity, the tokens are simply printed):
str = "20 * sin(x) + ..."

while str.length > 0
  if    str.sub!(/\A\s*(\d+)/) { puts "nr: #{$1}" ; '' }
  elsif str.sub!(/\A\s*(\w+)/) { puts "func: #{$1}" ; '' }
  # .. (other token patterns)
  end
end

This works, but it is very inefficient, as the string has to be
continuously modified (a variation is to use str.match and then set
str = post_match, which is probably even worse).
I was looking for the equivalent of what Perl calls "walking the string"
(if $str =~ /\G ../gcxms), picking up one token at a time, starting at
the point where the previous match ended.

I saw in the Pickaxe the mention of \G with scan, but I could not make
scan work 'one token at a time'; I had to list all the tokens in the
argument, and then I had to find out which token had matched, i.e.:

str.scan(/\G\s* (\d+ | \*\* | [+] | [(] | ..)/xm) do |m|
  if    m[0].match(/\A\d+\z/)  then puts "number: #{m[0]}"
  elsif m[0].match(/\A\*\*\z/) then puts "power: #{m[0]}"
  ..
  end
end

It worked perfectly (almost to my surprise!), but it seems odd (and
un-Ruby-like) to have to repeat the tokens (even though in my real code
I used regexp variables to avoid hardcoding them twice, it is still a
repetition).

I looked at 4 Ruby books and found only platitudes on the subject (or
references to libraries). I would love to hear an elegant way to solve
this,

thanks!

Raul
--
Posted via http://www.ruby-forum.com/.

 
 
 
 
 
Phrogz
      11-11-2007
On Nov 10, 6:07 pm, Raul Parolari wrote:
> I am looking for the best way to break an input string into individual
> tokens (I do not want to use a lexer library)


Look at the StringScanner library[1] included with Ruby. It's simple,
and it's fast. It's the basis of my TagTreeScanner library[2], which
is specialized for parsing arbitrary text and converting it into
hierarchically nested markup (e.g. XML).

[1] http://ruby-doc.org/stdlib/libdoc/st...doc/index.html
[2] http://phrogz.net/RubyLibs/OWLScribble/doc/tts.html
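
For the expression in the original question, a minimal StringScanner sketch might look like the following. The token table, its labels, and the tokenize method are hypothetical names chosen for illustration, and the set of patterns is an assumption about which tokens matter; only StringScanner and its scan, skip, eos?, pos and matched methods come from the standard library. Each pattern is listed once, and the scanner keeps its own position, so the string is never modified.

```ruby
require 'strscan'

# Hypothetical token table: [label, pattern] pairs, tried in order.
# Order matters: :power must come before :op so "**" is not read as two "*".
TOKEN_PATTERNS = [
  [:number, /\d+/],
  [:power,  /\*\*/],
  [:op,     /[-+*\/]/],
  [:lparen, /\(/],
  [:rparen, /\)/],
  [:ident,  /\w+/],
]

# Hypothetical helper: returns an array of [label, text] pairs.
def tokenize(input)
  scanner = StringScanner.new(input)
  tokens  = []
  until scanner.eos?
    scanner.skip(/\s+/)            # drop whitespace between tokens
    break if scanner.eos?
    # scan only matches at the current position and advances past the match
    label, _ = TOKEN_PATTERNS.find { |_, pattern| scanner.scan(pattern) }
    raise ArgumentError, "unexpected input at offset #{scanner.pos}" unless label
    tokens << [label, scanner.matched]
  end
  tokens
end

tokenize("20 * sin(x) + 3 ** 2").each do |label, text|
  puts "#{label}: #{text}"
end
```

Because scan is anchored at the scanner's current position, this behaves much like Perl's \G loop: each call picks up where the previous match ended.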

 
 
 
 
 
Raul Parolari
      11-11-2007
Gavin Kistner wrote:
> On Nov 10, 6:07 pm, Raul Parolari wrote:
>> I am looking for the best way to break an input string into individual
>> tokens (I do not want to use a lexer library)

>
> Look at the StringScanner library[1] included with Ruby. It's simple,
> and it's fast. It's the basis of my TagTreeScanner library[2], which
> is specialized for parsing arbitrary text and converting it into
> hierarchically nested markup (e.g. XML).
>
> [1] http://ruby-doc.org/stdlib/libdoc/st...doc/index.html
> [2] http://phrogz.net/RubyLibs/OWLScribble/doc/tts.html


Gavin

I was surprised at first that this basic capability lives in a library,
but StringScanner works beautifully, and it is indeed extremely fast.

I will try your TagTreeScanner at the first chance.

Thank you

Raul
--
Posted via http://www.ruby-forum.com/.

 