Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Ruby > Building the finite state machine of a Ruby regexp

Reply
Thread Tools

Building the finite state machine of a Ruby regexp

 
 
Gilles Lacroix
Guest
Posts: n/a
 
      08-01-2006

Does anyone know if there is a nice way to build the Finite State Machine
(FSM) of a Ruby regexp ?

Assuming the native ruby regexp parser used a (more or less explicit) FSM
I first thought that there was some builtin way to access or re-build it
but, after reading some reference docs (Regexp class) and posts I can't
see any useful methods to do this. But I would be happy to be wrong about
this...

As an alternative solution, I wonder if someone knows about a module that
could parse Ruby regexps (accepting exactly the same regexp grammar as
Ruby) and let me read the associated FSM (or whatever graph-like structure
representing the regexp).

Any help will be greatly appreciated. Thanks sincerely.
 
Reply With Quote
 
 
 
 
Caleb Clausen
Guest
Posts: n/a
 
      08-02-2006
On 8/1/06, Gilles Lacroix <(E-Mail Removed)> wrote:
>
> Does anyone know if there is a nice way to build the Finite State Machine
> (FSM) of a Ruby regexp ?


I don't know of anything specifically oriented to Ruby Regexps that
does this, but surely such tools exist for (say) Perl...?

> As an alternative solution, I wonder if someone knows about a module that
> could parse Ruby regexps (accepting exactly the same regexp grammar as
> Ruby) and let me read the associated FSM (or whatever graph-like structure
> representing the regexp).


Given how notoriously opaque they are, I've actually found regular
expressions to be pretty easy to parse. I can send you a partial
Regexp parser from one of my projects that just extracted the
information I needed to know at the time.... but probably what you
really want is Simon Strandgaard's regexp-engine. You can find it
here:

http://rubyforge.org/projects/aeditor/

 
Reply With Quote
 
 
 
 
Matthew Smillie
Guest
Posts: n/a
 
      08-03-2006
On Aug 2, 2006, at 22:44, Caleb Clausen wrote:

> On 8/1/06, Gilles Lacroix <(E-Mail Removed)> wrote:
>>
>> Does anyone know if there is a nice way to build the Finite State
>> Machine
>> (FSM) of a Ruby regexp ?

>
> I don't know of anything specifically oriented to Ruby Regexps that
> does this, but surely such tools exist for (say) Perl...?


Strictly speaking, there isn't anything that does this, since modern
'extended' regexes (a set which includes Ruby's regexes) aren't
always representable as finite state automata (i.e., they can
recognise more than the class of regular languages).

>> As an alternative solution, I wonder if someone knows about a
>> module that
>> could parse Ruby regexps (accepting exactly the same regexp
>> grammar as
>> Ruby) and let me read the associated FSM (or whatever graph-like
>> structure
>> representing the regexp).


Sorry, can't help you there either - I don't know and couldn't dig up
anything that exposes Ruby's regex internals (or Oniguruma's) in the
way you're asking for, which strikes me as the only way to use
*exactly* the same grammar as Ruby does. Like Caleb Clausen
suggested, though, there are probably partial solutions out there,
depending on how flexible your criteria are and exactly what
information you need - are you looking for an aid in building
regexes? just visualising them? some other sort of analysis?

matthew smillie.

 
Reply With Quote
 
Gilles Lacroix
Guest
Posts: n/a
 
      08-03-2006
On Thu, 03 Aug 2006 09:22:14 +0900, Matthew Smillie wrote*:

> (...) Like Caleb Clausen
> suggested, though, there are probably partial solutions out there,
> depending on how flexible your criteria are and exactly what
> information you need - are you looking for an aid in building
> regexes? just visualising them? some other sort of analysis?


What I'd like to do is, given any regexp :
1. To test *quickly* if a particular string matches this regexp : that's
why I'd prefer to use native Ruby regexes. They are the natural choice
when programming in Ruby and offer very good performance (because the
parser is compiled into the Ruby interpreter from a well-established C
code base (GNU regexps) that should be reasonably well optimized after
decades of public exposure).
2. To generate "random" strings that match the same regexp : that's why I
was thinking about using a Finite State Machine (or kind of it since, as
you pointed out 'modern' regexp are not alway representable as FSM) :
starting from the initial state and choosing randomly an outbound edge,
I can add a (first) character to the string. Repeating the process from
the node I arrived on, I can add a second character, and so on... (then
I will also have to make sure that I finally reach the final state in a
reasonable time but that's another problem).

Gilles Lacroix.



 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
RegExp as Finite State Machine Georg Jaehnig Javascript 21 01-07-2009 05:29 PM
Safe finite state machine design SomeDude VHDL 3 08-14-2006 12:47 PM
regexp finite state machine? Walter Roberson Perl Misc 0 03-08-2005 07:21 PM
Gate Level model of a Finite state machine Inderkal VHDL 8 12-09-2004 11:17 PM
HELP PLEASE!! - Finite State Machine - Automaton - Microprogrammed System deejayfred VHDL 0 10-02-2003 01:23 AM



Advertisments