Writing parsers?

 
 
Paatsch, Bernd
 
      02-13-2006

Hello,

I have been assigned the great task of writing a parser, and judging by the
files to parse, it is not trivial. Does anybody know of a website that
explains the steps involved and the pitfalls to avoid when writing a parser?
Any suggestions or help would be appreciated.

Thanks,
Bernd



 
 
 
 
 
David Vallner
 
      02-14-2006
On Tuesday 14 February 2006 00:59, Paatsch, Bernd wrote:
> Hello,
>
> I got this great task assigned to write a parser and looking at the files
> to parse is not very trivial. Does anybody know where to find a website
> that would explain steps and pitfalls to avoid writing a parser?
> Any suggestion/help in is appreciated.
>
> Thanks,
> Bernd


http://epaperpress.com/lexandyacc/ seems to be a useful resource, provided you
already know some theory behind formal grammars and such. The two tools and
their derivatives are pretty much the open source standard for writing
parsers. I believe there are Ruby bindings / variants of both.
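
As a rough illustration of the lexing half of the lex/yacc split, here is a minimal sketch of a tokenizer built on StringScanner from Ruby's standard library. It is not generated by lex or any Ruby binding of it; the TOKEN_PATTERNS table and the tokenize method are hypothetical names chosen for this example, and the tiny arithmetic token set is an assumption.

```ruby
require 'strscan'

# Hypothetical token table for a tiny arithmetic language; the names and
# patterns are illustrative assumptions, not taken from lex or yacc.
TOKEN_PATTERNS = [
  [:number, /\d+/],
  [:ident,  /[A-Za-z_]\w*/],
  [:op,     /[-+*\/()=]/]
].freeze

# Split +source+ into [type, text] pairs, skipping whitespace.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)
    # The first pattern that matches at the current position wins.
    type, _pattern = TOKEN_PATTERNS.find { |_, pattern| scanner.scan(pattern) }
    raise ArgumentError, "unexpected input at offset #{scanner.pos}" unless type
    tokens << [type, scanner.matched]
  end
  tokens
end

p tokenize('x = 3 + 42')
```

A parser, whether hand-written or generated, would then consume this token list instead of raw characters.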

ANTLR is also somewhat used, but you're probably looking at Java there.

David Vallner


 
 
 
 
 
Doug H
 
      02-14-2006
http://www.google.com/search?hl=en&q=parser

No, seriously, check out ANTLR, unless you are supposed to write the
parser from scratch.
If you want to do it in Ruby, there are options like:
http://split-s.blogspot.com/2005/12/antlr-for-ruby.html
http://www.zenspider.com/ZSS/Products/CocoR/
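
If the parser does have to be written from scratch, one common hand-written technique is recursive descent, where each grammar rule becomes one method. The following is only a sketch under assumptions: the arithmetic grammar, the CalcParser class, and its method names are all invented for this example and have nothing to do with ANTLR or Coco/R.

```ruby
# Grammar (an illustrative assumption):
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'
class CalcParser
  def initialize(source)
    # Crude tokenizer: unrecognised characters are silently dropped.
    @tokens = source.scan(%r{\d+|[-+*/()]})
    @pos = 0
  end

  # Parse the whole input and return the computed integer value.
  def parse
    value = expr
    raise ArgumentError, "unexpected #{peek.inspect}" if peek
    value
  end

  private

  def peek
    @tokens[@pos]
  end

  def advance
    token = @tokens[@pos]
    @pos += 1
    token
  end

  def expr
    value = term
    while ['+', '-'].include?(peek)
      value = advance == '+' ? value + term : value - term
    end
    value
  end

  def term
    value = factor
    while ['*', '/'].include?(peek)
      value = advance == '*' ? value * factor : value / factor
    end
    value
  end

  def factor
    token = advance
    if token == '('
      value = expr
      raise ArgumentError, "expected ')'" unless advance == ')'
      value
    elsif /\A\d+\z/.match?(token.to_s)
      token.to_i
    else
      raise ArgumentError, "unexpected #{token.inspect}"
    end
  end
end

p CalcParser.new('2 + 3 * (4 - 1)').parse
```

Operator precedence falls out of the call structure here: term is called from expr, so multiplication and division bind tighter than addition and subtraction.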

 
 
Timothy Goddard
 
      02-14-2006
I just whipped this up in a bit of free time. It may be a decent
starting point for a pure-Ruby parser. Note that it has no lookahead
ability.

class ParseError < StandardError; end

class Parser

  # Note: class variables are shared by Parser and all of its subclasses,
  # so every subclass adds to the same set of tokens and rules.
  @@reductions = {}
  @@reduction_procs = {}
  @@tokens = {}
  @@token_values = {}

  # Parse either a string or an IO object (read all at once) using the
  # rules defined for this parser.
  def parse(input)
    stack = []
    value_stack = []
    text = input.is_a?(IO) ? input.read : input.dup
    loop do
      token, value = retrieve_token(text)
      stack << token
      value_stack << value
      reduce_stack(stack, value_stack)
      if text.empty?
        if stack.length == 1
          return stack[0], value_stack[0]
        else
          raise ParseError, 'Stack failed to reduce'
        end
      end
    end
  end

  protected

  # Retrieve a single token from the input text and return an array of
  # it and its value.
  def retrieve_token(text)
    @@tokens.each do |regexp, token|
      if (md = text.match(regexp))
        # The pattern is anchored with \A, so only the leading match is removed.
        text.sub!(regexp, '')
        value_proc = @@token_values[token]
        return [token, value_proc ? value_proc.call(md.to_s) : nil]
      end
    end
    raise ParseError, "Invalid token in input near #{text}"
  end

  # Compare the stack to reduction rules to reduce any matches found
  def reduce_stack(stack, value_stack)
    loop do
      matched = false
      @@reductions.each do |tokens, result|
        # A rule longer than the stack cannot match.
        next if tokens.length > stack.length
        start_pos = stack.length - tokens.length
        if tokens == stack[start_pos, tokens.length]
          reduction_proc = @@reduction_procs[tokens]
          value = reduction_proc ? reduction_proc.call(value_stack[start_pos, tokens.length]) : nil
          # Wrap in an array so each slice is replaced by exactly one element,
          # even when the value is nil or is itself an array.
          stack[start_pos, tokens.length] = [result]
          value_stack[start_pos, tokens.length] = [value]
          matched = true
          break
        end
      end
      return unless matched
    end
  end

  # Register a token: a pattern, a token name, and an optional block that
  # turns the matched text into the token's value.
  def self.token(regexp, token, &block)
    @@tokens[Regexp.new('\A' + regexp.to_s)] = token
    @@token_values[token] = block
  end

  # Register a reduction rule, e.g. rule :foo, :bar => :foobar
  def self.rule(*tokens, &block)
    final = tokens.pop
    tokens += final.keys
    result = final.values.first
    @@reductions[tokens] = result
    @@reduction_procs[tokens] = block
  end
end

class TestParser < Parser
  token(/foo/i, :foo) do |s|
    s.upcase
  end
  token(/bar/i, :bar) do |s|
    s.downcase
  end
  token(/mega/i, :mega) do |s|
    3
  end
  rule :foo, :bar => :foobar do |foo, bar|
    foo + bar
  end
  rule :mega, :foobar => :megafoobar do |mega, foobar|
    foobar * mega
  end
end
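
To show what the missing lookahead costs, here is a separate, self-contained sketch of a shift-reduce loop over flat integer arithmetic. It is not part of the Parser class above; PRECEDENCE, reducible?, evaluate, and the use_lookahead flag are hypothetical names invented for the illustration. With lookahead switched off, the loop reduces as soon as it can, as a parser with no lookahead must, and gets operator precedence wrong.

```ruby
# Hypothetical precedence table; a higher number binds tighter.
PRECEDENCE = { '+' => 1, '-' => 1, '*' => 2, '/' => 2 }.freeze

# True when the top of the stack is "operand operator operand" and the
# upcoming token does not bind tighter than the operator on the stack.
def reducible?(stack, lookahead)
  return false if stack.length < 3
  left, op, right = stack.last(3)
  return false unless PRECEDENCE.key?(op)
  return false if PRECEDENCE.key?(left) || PRECEDENCE.key?(right)
  lookahead.nil? || PRECEDENCE.fetch(lookahead, 0) <= PRECEDENCE[op]
end

# Evaluate integers combined with + - * / (no parentheses).
def evaluate(source, use_lookahead: true)
  tokens = source.scan(%r{\d+|[-+*/]})
  stack = []
  until tokens.empty?
    stack << tokens.shift                            # shift
    lookahead = use_lookahead ? tokens.first : nil   # peek, do not consume
    while reducible?(stack, lookahead)               # reduce
      right = stack.pop.to_i
      op = stack.pop
      left = stack.pop.to_i
      stack << left.public_send(op, right)
    end
  end
  unless stack.length == 1 && !PRECEDENCE.key?(stack.first)
    raise ArgumentError, 'malformed expression'
  end
  stack.first.to_i
end

p evaluate('1 + 2 * 3')                        # peeks at '*' before reducing '1 + 2'
p evaluate('1 + 2 * 3', use_lookahead: false)  # reduces '1 + 2' immediately
```

The first call gives 7 and the second gives 9. A single token of lookahead is enough for this small case; real grammars may need more elaborate conflict resolution.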

 
 
Robert Klemme
 
      02-14-2006
Paatsch, Bernd wrote:
> Hello,
>
> I got this great task assigned to write a parser and looking at the
> files to parse is not very trivial. Does anybody know where to find a
> website that would explain steps and pitfalls to avoid writing a
> parser?
> Any suggestion/help in is appreciated.


http://raa.ruby-lang.org/project/racc/
http://raa.ruby-lang.org/project/ruby-yacc/

robert

 
 
Phil Tomson
 
      02-15-2006
Timothy Goddard wrote:
>I just whipped this up in a bit of free time. It may be a decent
>starting point for a pure ruby parser. Note that there is no lookahead
>ability.


This is a bit like Grammar:
http://grammar.rubyforge.org/0.5/

Phil

 
 
Timothy Goddard
 
      02-15-2006
Grammar looks much more like Spirit, a C++ parser framework that looks
really simple to use. It uses a very simple domain-specific language
for writing grammars directly in C++ code, and it's part of the Boost
libraries. It would be my first choice for a medium-speed parser that could
be used quite easily from Ruby with just a few joining bits of C. Parsers in
the style of YACC or Bison are faster still, but the added
complexity of defining the grammar probably makes using them a premature
optimisation for most tasks.
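
For a feel of what a grammar embedded in the host language looks like, here is a minimal parser-combinator sketch in plain Ruby. It is not the API of Spirit or of the Grammar library; the Combinator class, the token helper, and the operators chosen (>> for sequence, | for choice) are assumptions made for this example, loosely borrowing the idea of building grammars out of overloaded operators.

```ruby
# A parser wraps a block taking (text, pos) and returning
# [value, new_pos] on success or nil on failure.
class Combinator
  def initialize(&matcher)
    @matcher = matcher
  end

  def call(text, pos = 0)
    @matcher.call(text, pos)
  end

  # Sequence: self followed by other; yields both values as a pair.
  def >>(other)
    Combinator.new do |text, pos|
      first = call(text, pos)
      second = first && other.call(text, first[1])
      second && [[first[0], second[0]], second[1]]
    end
  end

  # Ordered choice: try self, fall back to other.
  def |(other)
    Combinator.new { |text, pos| call(text, pos) || other.call(text, pos) }
  end

  # Zero or more repetitions, collected into an array.
  def many
    Combinator.new do |text, pos|
      values = []
      # Stop on failure, or on a zero-width match that would loop forever.
      while (result = call(text, pos)) && result[1] > pos
        values << result[0]
        pos = result[1]
      end
      [values, pos]
    end
  end

  # Transform the parsed value with the given block.
  def map(&transform)
    Combinator.new do |text, pos|
      result = call(text, pos)
      result && [transform.call(result[0]), result[1]]
    end
  end
end

# Match a regular expression exactly at the current position.
def token(pattern)
  anchored = /\G#{pattern}/
  Combinator.new do |text, pos|
    match = anchored.match(text, pos)
    match && [match[0], match.end(0)]
  end
end

# Grammar: list := number (',' number)*
number = token(/\d+/).map { |digits| digits.to_i }
comma  = token(/\s*,\s*/)
list   = (number >> (comma >> number).map { |_, n| n }.many)
         .map { |first, rest| [first, *rest] }

p list.call('1, 2, 3')
```

The call returns the parsed value together with the position where parsing stopped, so a caller can check whether the whole input was consumed.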

 