[QUIZ] Bytecode Compiler (#100)

 
 
Marcel Ward
      11-06-2006
On 03/11/06, Ruby Quiz <(E-Mail Removed)> wrote:
> Note: This quiz isn't really as much work as it might seem!


Ouch! Oh well, I don't think my Ruby is up to some of the more
cunning techniques I see many have employed.

> Your compiler should support all basic arithmetic operations and explicit
> precedence (parenthesis). As standard, syntax/precedence/ associativity/etc.
> should follow Ruby itself.


By associativity, does this also mean that "+--+1" - which could be
rewritten as "0+(0-(0-(0+1)))" - should be parsed into bytecode
that, however well optimised, results in the answer "1" when executed?
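
For what it's worth, an optimal compiler could collapse all of those signs
away entirely; assuming the quiz's CONST opcode (0x01) with a 2-byte
big-endian operand, "+--+1" could come out as nothing more than

[0x01, *[1].pack('n').unpack('C*')]   # => [1, 0, 1]   i.e. just CONST 1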

I should warn that my solution below is rather long. I went down a
possibly more traditional route of writing a generic tokenizer/lexer.
I don't know if these are still commonly used but I couldn't find an
existing implementation in the Ruby Standard Library.

I also tried to document the functions using rdoc so someone else
might make use of it. For those who haven't tried it yet, just type
'rdoc' at the command prompt and it makes a nice doc directory under
the current directory with an index.html to start browsing the
file/classes/methods. Nice!

Back to the task...

Wanting a solution that coped with all the difficult expressions that
Ruby itself can deal with (within the allowed lexicon) meant having to
get things like the aforementioned negation with parentheses working:

-(---3) # => 3

...and power precedence (as others have pointed out) combined with
negation and parentheses turned out to be tricky:

64**-(-(-3+5)**3**2) #=> a big number

There's a big list of test cases such as these in my unit tests (included).

So having written the lexer class, I now set up the state transition
table and ask the lexer to tokenize the expression. The tokens are
then parsed and the bytecode is generated using a simple mapping.
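
In use it looks just like the unit tests further down, e.g. (assuming the
quiz's interp.rb is on the load path):

bytecode = Compiler.compile("2+3*4")
Interpreter.new(bytecode).run   # => [14]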

The code follows; there are two files in total.

Thanks for another fun challenge and congrats all round for reaching
the 100th Ruby Quiz!

Marcel

#!/usr/bin/env ruby
##################################################################
# = compiler_mw.rb - bytecode compiler
#
# Author:: Marcel Ward <wardies ^a-t^ gmaildotcom>
# Documentation:: Marcel Ward
# Last Modified:: Monday, 06 November 2006

require 'interp'
require 'lexer_mw'

module Compiler
# The lexer needs to know the character sets involved in deciding
# which state transition will be fired...
CHAR_SETS = {
:plus => [?+], :minus => [?-],
:digit => /\d/,
:div_mod => [?/, ?%], # matches '/' or '%'
:asterisk => [?*],
:open_paren => [?(], :close_paren => [?)]
}

# Tell the lexer how to parse a datastream: which tokens to
# generate, what state to switch to, etc.
# This table was designed according to my vague recollection of
# the dragon book on compiler construction by Aho/Sethi/Ullman.
STATE_TRANS_TABLE = {
:s_start => {
:plus => {:next_s_skip => :s_start},
:minus => {:next_s => :s_negate},
:digit => {:next_s => :s_numeric},
:open_paren => {:next_s => :s_start,
:token => :tok_open_paren}
},
:s_negate => {
:plus => {:next_s_skip => :s_negate},
:minus => {:next_s => :s_start},
:digit => {:next_s => :s_numeric},
:open_paren => {:next_s_backtrack => :s_start,
:token => :tok_negate}
},
:s_numeric => {
:plus => {:next_s_backtrack => :s_operator,
:token => :tok_int},
:minus => {:next_s_backtrack => :s_operator,
:token => :tok_int},
:digit => {:next_s => :s_numeric},
:div_mod => {:next_s_backtrack => :s_operator,
:token => :tok_int},
:asterisk => {:next_s_backtrack => :s_operator,
:token => :tok_int},
:close_paren => {:next_s_backtrack => :s_operator,
:token => :tok_int},
:eof => {:next_s_backtrack => :s_operator,
:token => :tok_int},
},
:s_operator => {
:plus => {:next_s => :s_start,
:token => :tok_add},
:minus => {:next_s => :s_start,
:token => :tok_subtract},
:div_mod => {:next_s => :s_start,
:token => :tok_div_mod},
:asterisk => {:next_s => :s_mult_or_power},
:close_paren => {:next_s => :s_operator,
:token => :tok_close_paren},
:eof => {} # when :next_s... is absent, finish
},
:s_mult_or_power => {
:plus => {:next_s_backtrack => :s_start,
:token => :tok_multiply},
:minus => {:next_s_backtrack => :s_start,
:token => :tok_multiply},
:digit => {:next_s_backtrack => :s_start,
:token => :tok_multiply},
:asterisk => {:next_s => :s_start,
:token => :tok_power},
:open_paren => {:next_s_backtrack => :s_start,
:token => :tok_multiply}
}
}

# Compiles a string expression _sum_ into bytecode and returns
# the bytecode array (as per Ruby Quiz 100 requirements).
def self.compile(sum)
lexer = LexerMW.new()
lexer.init_char_sets(CHAR_SETS)
lexer.init_state_transitions(STATE_TRANS_TABLE)

toks = lexer.tokenize(sum)

puts toks.inspect + "\n\n" + toks.map {|a,b| b}.join(' ') \
if $DEBUG == 1

# Get the mnemonic stack by parsing the tokens.
mnemonic_stack = parse(toks)
puts "\nParsed toks => #{mnemonic_stack.inspect}" if $DEBUG == 1

# Last stage now, we convert our internal mnemonics directly
# to a byte stack in the required bytecode format.
mnemonics_to_bytecode(mnemonic_stack)
end

MNEMONIC_TO_BYTECODE = {
:tok_add => Interpreter::Ops::ADD,
:tok_subtract => Interpreter::Ops::SUB,
:tok_multiply => Interpreter::Ops::MUL,
:tok_divide => Interpreter::Ops::DIV,
:tok_modulo => Interpreter::Ops::MOD,
:tok_power => Interpreter::Ops::POW
}


# This exception is raised by the mnemonic-to-bytecode method when
# an integer constant cannot be pushed onto the interpreter
# bytecode stack because it is too big to fit the
# Interpreter::Ops::LCONST instruction.
class OutOfRangeError < StandardError
end

# Convert our internal _mnemonics_ directly to a byte array and
# return this in the required bytecode format, ready to execute.
def self.mnemonics_to_bytecode(mnemonics)
bc = []
mnemonics.each do
|mnem|
if MNEMONIC_TO_BYTECODE.has_key? mnem
bc << MNEMONIC_TO_BYTECODE[mnem]
else
# Try packing this value as a 2- or 4-byte signed string
# and ensure we get back the same value on unpacking it.
if [mnem] == [mnem].pack('s').unpack('s')
# 2-bytes will be enough
bc << Interpreter::Ops::CONST
bc.concat([mnem].pack('n').unpack('C*'))
elsif [mnem] == [mnem].pack('l').unpack('l')
# 4-bytes will be enough
bc << Interpreter::Ops::LCONST
bc.concat([mnem].pack('N').unpack('C*'))
else
# It could be dangerous to silently fail when a
# number will not fit in a 4-byte signed int.
raise OutOfRangeError
end
end
end
bc
end

# If there is a mismatch in the number of parenthesis, this
# exception is raised by the #parse routine.
# E.g. "3+(4-2" and "(3-10))" are both considered invalid.
class ParenthesisError < Exception
end

# The operator precedence hash helps the #parse method decide
# when to store up operators and when to flush them out;
# higher values bind more tightly.
PAREN_PRECEDENCE = 0
OP_PRECEDENCE = {
:tok_end => -1,
:tok_open_paren => PAREN_PRECEDENCE,
:tok_close_paren => PAREN_PRECEDENCE,
:tok_add => 1, :tok_subtract => 1,
:tok_multiply => 2, :tok_div_mod => 2,
:tok_power => 3,
:tok_negate => 4
}

# Parse an array of [token,value] pairs as returned by
# LexerMW::tokenize. Returns our own internal quasi-bytecode
# mnemonic array.
def self.parse(tokens)
operator_stack = []
ops = []

# Push the bottom-most element with precedence equivalent to that
# of :tok_end so when we see :tok_end all pending operation
# tokens on the stack get popped
precedence_stack = [OP_PRECEDENCE[:tok_end]]

tokens.each do
|tok, val|
if tok == :tok_int
# "--3".to_i => 0 is bad, so use eval("--3") => 3 instead.
ops << eval(val)
else
precedence = OP_PRECEDENCE[tok]
if not tok == :tok_open_paren
while precedence <= precedence_stack.last &&
precedence_stack.last > PAREN_PRECEDENCE
# Workaround for the fact that the ** power operation
# is calculated Right-to-left,
# i.e. 2**3**4 == 2**(3**4) /= (2**3)**4
break if tok == :tok_power &&
precedence_stack.last == OP_PRECEDENCE[:tok_power]

precedence_stack.pop
ops << operator_stack.pop
end
end

# Divide and modulo come out of the lexer as the same token
# so override tok according to its corresponding value
tok == :tok_div_mod && \
tok = (val == '/') ? :tok_divide : :tok_modulo

case tok
when :tok_close_paren
precedence_stack.pop == PAREN_PRECEDENCE \
or raise ParenthesisError
when :tok_negate
# val contains just the minuses ('-', '--', '---', etc.)
# Optimise out (x) === --(x) === ----(x), etc.
if val.size % 2 == 1
# No negate function for -(x) so simulate using 0 - (x)
precedence_stack.push precedence
operator_stack.push :tok_subtract
ops << 0
end
when :tok_end
raise ParenthesisError if precedence_stack.size != 1
else
precedence_stack.push precedence
operator_stack.push tok unless tok == :tok_open_paren
end
end
end
ops
end
end

if $0 == __FILE__
eval DATA.read, nil, $0, __LINE__+4
end

__END__

require 'test/unit'

class TC_Compiler < Test::Unit::TestCase
def test_simple
@test_data = [
'8', '124', '32767', # +ve CONST
'-1', '-545', '-32768', # -ve CONST
'32768', '294833', '13298833', # +ve LCONST
'-32769', '-429433', '-24892810', # -ve LCONST
'4+5', '7-3', '30+40+50', '14-52-125', # ADD, SUB
'512243+1877324', '40394-12388423', # LCONST, ADD, SUB
'3*6', '-42*-90', '94332*119939', # MUL
'8/3', '-35/-15', '593823/44549', # DIV
'8%3', '243%-59', '53%28%9', # MOD
'531%-81%14', '849923%59422', #
'-2147483648--2147483648', # SUB -ve LCONST
'2**14', '-4**13+2' # POW
]
@test_data.each do
|sum|
assert_equal [eval(sum)],
Interpreter.new(Compiler.compile(sum)).run,
"whilst calculating '#{sum}'"
end
end

def test_advanced
@test_data = [
'-(423)', '-(-523*32)', '---0',
'-(-(-(16**--++2)))',
'3**(9%5-1)/3+1235349%319883+24*-3',
'+42', '((2*-4-15/3)%16)', '4**3**((2*-4-15/3)%16)',
'64**-(-(-3+5)**3**2)', '4*165%41*341/7/2/15%15%13',
'--(---(4**3**((2*-4-15/3)%16))+++-410--4)'
]
@test_data.each do
|sum|
assert_equal [eval(sum)],
Interpreter.new(Compiler.compile(sum)).run,
"whilst calculating '#{sum}'"
end
end
end


#!/usr/bin/env ruby
##################################################################
# = lexer_mw.rb - generic lexical analyser
#
# Author:: Marcel Ward <wardies ^a-t^ gmaildotcom>
# Documentation:: Marcel Ward
# Last Modified:: Monday, 06 November 2006
#
# Solution for Ruby Quiz number 100 - http://www.rubyquiz.com/

$DEBUG = 0

# If the lexer fails to find an appropriate entry in the state
# transition table for the current character and state, it
# raises this exception.
class LexerFailure < StandardError
end

# If the lexer encounters a character for which no matching charset
# has been supplied then it raises this exception.
#
# This exception will never be raised if #init_state_transitions
# has been called with an appropriate catch-all charset id.
class InvalidLexeme < StandardError
end

class LexerMW
# Creates an instance of the lexer class.
#
# _lexer_eof_ascii_::
# defines the ASCII byte value that the lexer considers as
# end-of-file when it is encountered. When #tokenize is called,
# the supplied datastream is automatically appended with this
# character.
def initialize(lexer_eof_ascii = 0)
@s_trans = {}
@columns = {}
@lex_eof = lexer_eof_ascii
end

# Initialize the character set columns to be used by the lexer.
#
# _cs_defs_::
# a hash containing entries of the form <tt>id => match</tt>,
# where _match_ defines the characters to be matched and _id_
# is the id that will be passed to the finite state machine
# to indicate the character grouping encountered.
# _eof_charset_id_::
# defines the character set identifier which the lexer will
# attempt to match in the state machine table when the
# end-of-file character defined in #new is encountered.
#
# The content of _match_ falls into one of two main categories:
#
# _regexp_:: e.g. <tt>/\d/</tt> will match any digit 0..9; or
# _enum_:: an enumeration that describes the set of allowed
# character byte values, e.g.
# the array <tt>[?*, ?/, ?%]</tt> matches
# <b>*</b>, <b>/</b> or <b>%</b>, while the range
# <tt>(?a..?z)</tt> matches lowercase alphas.
#
# e.g.
#
# init_char_sets({
# :alphanum => /[A-Z0-9]/,
# :underscore => [?_],
# :lower_vowel => [?a, ?e, ?i, ?o, ?u],
# :special => (0..31)
# },
# :end_line)
#
# It is the responsibility of the caller to ensure that the
# match sets for each column are mutually exclusive.
#
# If a 'catch-all' set is needed then it is not necessary
# to build the set of all characters not already matched.
# Instead, see #init_state_transitions parameter list.
#
# Note, the contents of the hash is duplicated and stored
# internally to avoid any inadvertent corruption from outside.
def init_char_sets(cs_defs, eof_charset_id = :eof)
@charsets = {}
# Make a verbatim copy of the lexer charset columns
cs_defs.each_pair do
|charset_id, match|
@charsets[charset_id] = match.dup # works for array/regexp
end
# Add an end-of-file charset column for free
@charsets[eof_charset_id] = [@lex_eof]
puts "@charsets =\n#{@charsets.inspect}\n\n" if $DEBUG == 1
end

# Initialize the state transition table that will be used by the
# finite state machine to convert incoming characters to tokens.
#
# _st_::
# a hash that defines the state transition table to be used
# (see below).
# _start_state_::
# defines the starting state for the finite state machine.
# _catch_all_charset_id_::
# defines an optional charset id to be tried if the character
# currently being analysed matches none of the charsets
# in the charset table. The default +nil+ ensures that the
# InvalidLexeme exception is raised if no charsets match.
#
# The state transition table hash _st_ maps each valid original
# state to a hash containing the _rules_ to match when in that
# state.
#
# Each hash entry _rule_ maps one of the character set ids
# (defined in the call to #init_char_sets) to the _actions_ to be
# carried out if the current character being analysed by the lexer
# matches.
#
# The _action_ is a hash of distinct actions to be carried out for
# a match. The following actions are supported:
#
# <tt>:next_s => _state_</tt>::
# sets the finite state machine next state to be _state_ and
# appends the current character to the lexeme string being
# prepared, absorbing the current character in the datastream.
#
# <tt>:next_s_skip => _state_</tt>::
# as above but the lexeme string being prepared remains static.
#
# <tt>:next_s_backtrack => _state_</tt>::
# as for _next_s_skip_ above but does not absorb the current
# character (it will be used for the next state test).
#
# <tt>:token => _tok_</tt>::
# appends a hash containing a single entry to the array of
# generated tokens, using _tok_ as the key and a copy of the
# prepared lexeme string as the value.
#
# When the end of the datastream is reached, the lexer looks for
# a match against charset <tt>:eof</tt>.
#
# When the performed actions contain no +next_s+... action, the
# lexer assumes that a final state has been reached and returns
# the accumulated array of tokens up to that point.
#
# e.g.
#
# init_state_transitions({
# :s1 => {:alpha => {:next_s => :s2},
# :period => {:token => :tok_period}},
# :s2 => {:alphanum => {:next_s => :s2},
# :underscore => {:next_s_skip => :s2},
# :period => {:next_s_backtrack => :s1},
# :eof => {}}, # final state, return tokens
# }, :s1, :other_chars)
#
# Note, the contents of the hash is duplicated and stored
# internally to avoid any inadvertent corruption from outside.
def init_state_transitions(st, start_state = :s_start,
catch_all_charset_id = nil)
@start_state = start_state
@others_key = catch_all_charset_id
@s_trans = {}
# Make a verbatim copy of the state transition table
st.each_pair do
|orig_state, lexer_rules|
@s_trans[orig_state] = state_rules = {}
lexer_rules.each_pair do
|lexer_charset, lexer_actions|
state_rules[lexer_charset] = cur_actions = {}
lexer_actions.each_pair do
|action, new_val|
cur_actions[action] = new_val
end
end
end
puts "@s_trans =\n#{@s_trans.inspect}\n\n" if $DEBUG == 1
end

# Tokenize the datastream in _str_ according to the specific
# character set and state transition table initialized through
# #init_char_sets and #init_state_transitions.
#
# Returns an array of token elements where each element is
# a pair of the form:
#
# [:token_name, "extracted lexeme string"]
#
# The end token marker [:tok_end, nil] is appended to the end
# of the result on success, e.g.
#
# tokenize(str)
# # => [[:tok_a, "123"], [:tok_b, "abc"], [:tok_end, nil]]
#
# Raises the LexerFailure exception if no matching state
# transition is found for the current state and character.
def tokenize(str)
state = @start_state
lexeme = ''
tokens = []
# Append our end of file marker to the string to be tokenized
str += "%c" % @lex_eof
str.each_byte do
|char|
char_as_str = "%c" % char
loop do
match = @charsets.find {
|id, match|
(match.kind_of? Regexp) ? \
(match =~ char_as_str) : (match.include? char)
} || [@others_key, @charsets[@others_key]] or \
raise InvalidLexeme

# Look for the action matching our current state and the
# character set id for our current char.
action = @s_trans[state][match.first] or raise LexerFailure

# If found, action contains our hash of actions, e.g.
# {:next_s_backtrack => :s_operator, :token => :tok_int}
puts "#{char==@lex_eof?'<eof>':char_as_str}: " \
"#{state.inspect} - #{action.inspect}" if $DEBUG == 1

# Build up the lexeme unless we're backtracking or skipping
lexeme << char_as_str if action.has_key? :next_s

tokens << [action[:token], lexeme.dup] && lexeme = '' if \
action.has_key? :token

# Set the next state, or - when there is no specified next
# state - we've finished, so return the tokens.
state = action[:next_s] || action[:next_s_skip] ||
action[:next_s_backtrack] or
return tokens << [:tok_end, nil]

break unless action.has_key? :next_s_backtrack
end
end
tokens
end
end


if $0 == __FILE__
eval DATA.read, nil, $0, __LINE__+4
end

__END__

require 'test/unit'

class TC_LexerMW < Test::Unit::TestCase
def test_simple
@lexer = LexerMW.new()

@char_sets = {
:letter => (?a..?z),
:digit => (/\d/),
:space => [?\s, ?_]
}

@lexer.init_char_sets(@char_sets)

@st = {
:extract_chars => {
:letter => {:next_s => :extract_chars},
:digit => {:next_s => :extract_chars},
:space => {:next_s_skip => :extract_chars,
:token => :tok_text},
:eof => {:token => :tok_text}
},
:extract_alpha => {
:letter => {:next_s => :extract_alpha},
:digit => {:next_s_backtrack => :extract_num,
:token => :tok_alpha},
:space => {:next_s_skip => :extract_alpha,
:token => :tok_alpha},
:others => {:next_s_skip => :extract_alpha},
:eof_exit => {}
},
:extract_num => {
:letter => {:next_s_backtrack => :extract_alpha,
:token => :tok_num},
:digit => {:next_s => :extract_num},
:space => {:next_s_skip => :extract_num},
:others => {:next_s_skip => :extract_alpha,
:token => :tok_num}
}
}
@lexer.init_state_transitions(@st, :extract_chars)
assert_equal [
[:tok_text, "123"], [:tok_text, "45"],
[:tok_text, "6"], [:tok_text, "78"],
[:tok_text, "abcd"], [:tok_text, "efghi"],
[:tok_text, "jklmn"], [:tok_end, nil]
], @lexer.tokenize("123 45 6_78 abcd efghi_jklmn")

@lexer = LexerMW.new(?$)
@lexer.init_char_sets(@char_sets, :eof_exit)
@lexer.init_state_transitions(@st, :extract_num, :others)
assert_equal [
[:tok_num, "12345678"], [:tok_alpha, "abcd"],
[:tok_alpha, "efghi"], [:tok_num, "445"],
[:tok_alpha, ""], [:tok_num, "1222"], [:tok_end, nil]
], @lexer.tokenize("123 45 6_78 abcd efghi445!12_22!ab$45")

end
end

 
Bill Dolinar
      11-07-2006
Changed my previous solution to use some of the things others have
used that need to be pounded into my skull, like \d in regular
expressions and pack/unpack. Also added unary + and cleaned up the code.
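
The gsub/eval trick means Ruby's own parser does all the precedence work.
A quick illustration of what the rewrite produces (an aside, not part of
the submission itself):

"2+3".gsub(/(\d+)/, 'Value.new(stack, \1)')
# => "Value.new(stack, 2)+Value.new(stack, 3)"
# eval'ing that against an empty stack array leaves the bytecode
# [1, 0, 2, 1, 0, 3, 10]   i.e. CONST 2, CONST 3, ADD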

############
require 'interp'

module Compiler

# compile expression into bytecode array
def Compiler.compile(s)
stack = []
eval(s.gsub(/(\d+)/, 'Value.new(stack, \1)'))
stack
end

class Value
attr_reader :value # constant value or nil for on stack
ON_STACK = nil

def initialize(stack, value)
@stack = stack
@value = value
end

# generate code for each binary operator
{'+' => Interpreter::Ops::ADD,
'-' => Interpreter::Ops::SUB,
'*' => Interpreter::Ops::MUL,
'**'=> Interpreter::Ops::POW,
'/' => Interpreter::Ops::DIV,
'%' => Interpreter::Ops::MOD}.each do |operator, byte_code|
Value.module_eval <<-OPERATOR_CODE
def #{operator}(rhs)
push_if_value(@value)
push_if_value(rhs.value)
# swap stack items if necessary
#{if operator != "+"
"@stack << Interpreter::Ops::SWAP if rhs.value == nil &&
@value != nil"
end}
@stack << #{byte_code}
Value.new(@stack, ON_STACK)
end
OPERATOR_CODE
end

def -@
if @value != ON_STACK
push_if_value(-@value)
else
push_if_value(0)
@stack << Interpreter::Ops::SWAP << Interpreter::Ops::SUB
end
Value.new(@stack, ON_STACK)
end

def +@
push_if_value(@value)
Value.new(@stack, ON_STACK)
end

def push_if_value(value)
if value != ON_STACK
if (-32768..32767).include?(value)
@stack << Interpreter::Ops::CONST
@stack.concat([value].pack("n").unpack("C*"))
else
@stack << Interpreter::Ops::LCONST
@stack.concat([value].pack("N").unpack("C*"))
end
end
end

end
end


 
Robert Conn
      11-07-2006
Here's my effort. I really enjoyed this quiz but spent way more than
the 1 hour on it (and it's still not correct)!

I tokenized the expression with a regex and then used the Shunting
Yard Algorithm to do the parsing. However, I fail some of the tests:

1. Unary minus [-2+(2-2)]

2. Order of evaluation in this expression is right-->left as opposed
to left-->right and gives the wrong result [(1+3)/(2/2)*(10-8)]

3. The optional efficient storage test.

I'll try to fix these if I have more time.
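
For what it's worth, the shunting yard part already gives sensible RPN for
the cases that do work; a quick check, calling the class methods directly:

Compiler.parse(Compiler.tokenize("2+3*4"))   # => [2, 3, 4, "*", "+"]

For the unary minus case, a couple of the other solutions in this thread
treat -x as 0 - x (pushing a literal 0 and emitting a subtract, with a
SWAP when the operand is already on the stack), which might be the least
invasive fix here too.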

Thanks

Bob

require "Interpreter.rb"

class Compiler

@PRECEDENCE = {
'(' => -1,
')' => -1,
'+' => 0,
'-' => 0,
'*' => 1,
'/' => 1,
'**' => 3,
'%' => 3
}

@BYTECODE = {
'CONST' => 0x01,
'LCONST' => 0x02,
'+' => 0x0a,
'-' => 0x0b,
'*' => 0x0c,
'**' => 0x0d,
'/' => 0x0e,
'%' => 0x0f,
'SWAP' => 0xa0
}

def Compiler.compile(expression)
te = self.tokenize(expression)
be = self.parse(te)
self.to_bytecode(be)
end

def Compiler.tokenize(expression)
tokenized_expression = []

expression.gsub!(/\s+/, "")
expression.scan(/(\d+|\(|\)|\+|-|\*\*|\*|\/|\%)/) { |e|
tokenized_expression.push(e) }
tokenized_expression
end

def Compiler.parse(tokenized_expression)
output_queue, operator_stack = [], []
operator = nil

tokenized_expression.each { |token|

# If token is a number, place on output queue
if (token[0] =~ /^\d+$/)
output_queue.push(token[0].to_i)
elsif (token[0] == '(')
operator_stack.push(token[0])
elsif (token[0] == ')')
# Pop operators off stack and onto output queue until left
# bracket encountered
while (operator != '(')
if ((operator = operator_stack.pop) != '(')
output_queue.push(operator)
end
end
else
# If there are any operators, check precedence of current token
# against last operator on queue. If the operator on queue is
# more important, add it to the output before pushing the
# current operator on
if (operator_stack.any? && (@PRECEDENCE[token[0]] <=
@PRECEDENCE[operator_stack.last]))
output_queue.push(operator_stack.pop)
end
operator_stack.push(token[0])
end
}

# Add the remaining operators to end of the output queue
operator_stack.reverse_each { |operator|
output_queue.push(operator)
}

output_queue
end

def Compiler.to_bytecode(bnf_expression)
stack = []

bnf_expression.delete("(")
bnf_expression.each { |token|
case token
when Integer
# If number is small enough, use smaller 2 byte storage

if ((token >= -32768) && (token <= 32767))
stack.push(@BYTECODE['CONST'])
stack.push(token >> 8, token)
else
stack.push(@BYTECODE['LCONST'])
stack.push(token >> 24, token >> 16, token >> 8, token)
end
else
stack.push(@BYTECODE[token])
end
}
stack
end

end

require "TestByteCode.rb"


 
Justin Bailey
      11-07-2006
Some people started doing the Ruby quiz problems using Haskell, and
this was a perfect opportunity for me to learn some Haskell. So here's
my solution below, in Haskell. It's hard to test the byte code
interpretation but all the expression do evaluate to the correct
values.

If anyone has questions about the Haskell code, please let me know.
I'm just learning it and it's really cool!

BTW, I too spent way more time on this than I should have!

(this solution, along with others, can be found on
http://www.haskell.org/haskellwiki/H...ecode_Compiler)

This solution should work correctly. I was unable to test the byte
codes generated, for obvious reasons. However, all test strings from
the quiz do evaluate to the correct values.

To see the (symbolic) byte codes generated, run generate_tests. To see
the actual byte codes, run compile_tests. To see that the values
produced by each expression match those expected, run eval_tests. The
tests are contained in the variables test1, test2, ..., test6, which
correspond to the six "test_n" methods found in the quiz's test
program.

The byte codes aren't optimized. For example, SWAP is never used.
However, they should produce correct results (even for negative and
LCONST/CONST values).

The code below is literate Haskell.

\begin{code}
import Text.ParserCombinators.Parsec hiding (parse)
import qualified Text.ParserCombinators.Parsec as P (parse)
import Text.ParserCombinators.Parsec.Expr
import Data.Bits

-- Represents various operations that can be applied
-- to expressions.
data Op = Plus | Minus | Mult | Div | Pow | Mod | Neg
deriving (Show, Eq)

-- Represents expression we can build - either numbers or expressions
-- connected by operators.
data Expression = Statement Op Expression Expression
| Val Integer
| Empty
deriving (Show)

-- Define the byte codes that can be generated.
data Bytecode = NOOP | CONST Integer | LCONST Integer
| ADD
| SUB
| MUL
| POW
| DIV
| MOD
| SWAP
deriving (Show)


-- Using imported Parsec.Expr library, build a parser for expressions.
expr :: Parser Expression
expr =
buildExpressionParser table factor
<?> "expression"
where
-- Recognizes a factor in an expression
factor =
do{ char '('
; x <- expr
; char ')'
; return x
}
<|> number
<?> "simple expression"
-- Recognizes a number
number :: Parser Expression
number = do{ ds <- many1 digit
; return (Val (read ds))
}
<?> "number"
-- Specifies operator, associativity, precedence, and constructor to execute
-- and build the AST with.
table =
[[prefix "-" (Statement Mult (Val (-1)))],
[binary "^" (Statement Pow) AssocRight],
[binary "*" (Statement Mult) AssocLeft, binary "/" (Statement
Div) AssocLeft, binary "%" (Statement Mod) AssocLeft],
[binary "+" (Statement Plus) AssocLeft, binary "-" (Statement
Minus) AssocLeft]
]
where
binary s f assoc
= Infix (do{ string s; return f}) assoc
prefix s f
= Prefix (do{ string s; return f})


-- Parses a string into an AST, using the parser defined above
parse s = case P.parse expr "" s of
Right ast -> ast
Left e -> error $ show e

-- Take AST and evaluate (mostly for testing)
eval (Val n) = n
eval (Statement op left right)
| op == Mult = eval left * eval right
| op == Minus = eval left - eval right
| op == Plus = eval left + eval right
| op == Div = eval left `div` eval right
| op == Pow = eval left ^ eval right
| op == Mod = eval left `mod` eval right

-- Takes an AST and turns it into a byte code list
generate stmt = generate' stmt []
where
generate' (Statement op left right) instr =
let
li = generate' left instr
ri = generate' right instr
lri = li ++ ri
in case op of
Plus -> lri ++ [ADD]
Minus -> lri ++ [SUB]
Mult -> lri ++ [MUL]
Div -> lri ++ [DIV]
Mod -> lri ++ [MOD]
Pow -> lri ++ [POW]
generate' (Val n) instr =
if abs(n) > 32768
then instr ++ [LCONST n]
else instr ++ [CONST n]

-- Takes a statement and converts it into a list of actual bytes to
-- be interpreted
compile s = toBytes (generate $ parse s)

-- Convert a list of byte codes to a list of integer codes. If LCONST or CONST
-- instructions are seen, the correct byte representation is produced
toBytes ((NOOP):xs) = 0 : toBytes xs
toBytes ((CONST n):xs) = 1 : (toConstBytes (fromInteger n)) ++ toBytes xs
toBytes ((LCONST n):xs) = 2 : (toLConstBytes (fromInteger n)) ++ toBytes xs
toBytes ((ADD):xs) = 0x0a : toBytes xs
toBytes ((SUB):xs) = 0x0b : toBytes xs
toBytes ((MUL):xs) = 0x0c : toBytes xs
toBytes ((POW):xs) = 0x0d : toBytes xs
toBytes ((DIV):xs) = 0x0e : toBytes xs
toBytes ((MOD):xs) = 0x0f : toBytes xs
toBytes ((SWAP):xs) = 0x0a : toBytes xs
toBytes [] = []

-- Convert number to CONST representation (2 element list)
toConstBytes n = toByteList 2 n
toLConstBytes n = toByteList 4 n

-- Convert a number into a list of 8-bit bytes (big-endian/network byte order).
-- Make sure final list is size elements long
toByteList :: Bits Int => Int -> Int -> [Int]
toByteList size n =
if (length bytes) < size
then (replicate (size - (length bytes)) 0) ++ bytes
else bytes
where
bytes = reverse $ toByteList' n
-- for negative, and with signed bit and remove negative. Then
-- continue recursion.
toByteList' 0 = []
toByteList' a | a < 0 = (a .&. 511) : toByteList' (abs(a) `shiftR` 8)
| otherwise = (a .&. 255) : toByteList' (a `shiftR` 8)

-- All tests defined by the quiz, with the associated values they
-- should evaluate to.
test1 = [(2+2, "2+2"), (2-2, "2-2"), (2*2, "2*2"), (2^2, "2^2"), (2
`div` 2, "2/2"),
(2 `mod` 2, "2%2"), (3 `mod` 2, "3%2")]

test2 = [(2+2+2, "2+2+2"), (2-2-2, "2-2-2"), (2*2*2, "2*2*2"), (2^2^2,
"2^2^2"), (4 `div` 2 `div` 2, "4/2/2"),
(7`mod`2`mod`1, "7%2%1")]

test3 = [(2+2-2, "2+2-2"), (2-2+2, "2-2+2"), (2*2+2, "2*2+2"), (2^2+2, "2^2+2"),
(4 `div` 2+2, "4/2+2"), (7`mod`2+1, "7%2+1")]

test4 = [(2+(2-2), "2+(2-2)"), (2-(2+2), "2-(2+2)"), (2+(2*2),
"2+(2*2)"), (2*(2+2), "2*(2+2)"),
(2^(2+2), "2^(2+2)"), (4 `div` (2+2), "4/(2+2)"), (7`mod`(2+1), "7%(2+1)")]

test5 = [(-2+(2-2), "-2+(2-2)"), (2-(-2+2), "2-(-2+2)"), (2+(2 * -2),
"2+(2*-2)")]

test6 = [((3 `div` 3)+(8-2), "(3/3)+(8-2)"), ((1+3) `div` (2 `div`
2)*(10-8), "(1+3)/(2/2)*(10-8)"),
((1*3)*4*(5*6), "(1*3)*4*(5*6)"), ((10`mod`3)*(2+2),
"(10%3)*(2+2)"), (2^(2+(3 `div` 2)^2), "2^(2+(3/2)^2)"),
((10 `div` (2+3)*4), "(10/(2+3)*4)"), (5+((5*4)`mod`(2+1)),
"5+((5*4)%(2+1))")]

-- Evaluates the tests and makes sure the expressions match the expected values
eval_tests = map eval_tests [test1, test2, test3, test4, test5, test6]
where
eval_tests ((val, stmt):ts) =
let eval_val = eval $ parse stmt
in
if val == eval_val
then "True" : eval_tests ts
else (stmt ++ " evaluated incorrectly to " ++ show eval_val ++
" instead of " ++ show val) : eval_tests ts
eval_tests [] = []

-- Takes all the tests and displays symbolic bytes codes for each
generate_tests = map generate_all [test1,test2,test3,test4,test5,test6]
where generate_all ((val, stmt):ts) = generate (parse stmt) : generate_all ts
generate_all [] = []

-- Takes all tests and generates a list of bytes representing them
compile_tests = map compile_all [test1,test2,test3,test4,test5,test6]
where compile_all ((val, stmt):ts) = compile stmt : compile_all ts
compile_all [] = []

\end{code}

 
Mike Harris
      11-08-2006
My 2nd solution, it's the same as the first except i stole the
pack/unpack stuff.

class Compiler
def self.compile(s)
eval(s.gsub(/(\d+)([^\d])/,'\1.bc\2').gsub(/([^\d])(\d+)$/,'\1\2.bc'))
end
end

class Fixnum
def bc
lead,pt = ( (-2**15...2**15)===self ? [1,'n'] : [2,'N'] )
[lead].concat([self].pack(pt).unpack('C*'))
end
end

class Array
{:+ => 10,:- => 11,:* => 12,:** => 13,:/ => 14,:% => 15}.each do |op,code|
define_method(op) { |x| self.concat(x).concat([code]) }
end
end
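
For example, a quick check of what this produces (an aside, not from the
original post):

Compiler.compile("2+3")   # => [1, 0, 2, 1, 0, 3, 10]   i.e. CONST 2, CONST 3, ADD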

 
Louis J Scoras
      11-09-2006
#!/usr/bin/env ruby
#
# compiler.rb - Byte-code compiler for simple arithmetic expressions
#
# Lou Scoras <(E-Mail Removed)>
# Wed Nov 8 20:33 EST 2006
#
# Here's my solution for Rubyquiz 100. Nothing too fancy on this one: I went
# for trying to make the shunting algorithm as readable as possible.
#
# As for the parsing, StringScanner is very nice, although it might have been
# overkill for this problem.
#
# Thanks again to Ross and James for another fun quiz.
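#
# For anyone who hasn't met StringScanner, a tiny sketch of the bits used
# below (an aside, not part of the solution itself):
#
#   s = StringScanner.new("2+34")
#   s.scan(/\d+/)    # => "2"
#   s.scan(/[+]/)    # => "+"
#   s.scan(/\d+/)    # => "34"
#   s.eos?           # => true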

require 'enumerator'
require 'interp'
require 'optparse'
require 'strscan'

class Token
attr_reader :type, :value

def initialize(value)
@value = value
end

%w|number lparen rparen op|.each do |a|
module_eval %{ def #{a}?; false end }
end
end

class Paren < Token
def initialize(value, type)
super(value)
@type = type
end
def lparen?; @type == :lparen end
def rparen?; @type == :rparen end
end

class Number < Token
def initialize(value)
super(value.to_i)
end

def to_bc
code, fmt = ((-32768..32767).include? value) ? [0x01, 'n'] : [0x02, 'N']
[code, *[value].pack(fmt).to_enum(:each_byte).to_a]
end
def number?; true end
end

class Op < Token
attr_reader :precedence

CodeTable = [:+, :-, :*, :**, :/, :%].to_enum(:each_with_index).
inject({}) {|h, (op,i)| h[op] = i + 0x0a; h}

def initialize(value,assoc,prec)
super(value.to_sym)
@assoc, @precedence = assoc, prec
end

%w|assoc lassoc rassoc|.each do |a|
module_eval %{
def #{a}?
@assoc == :#{a}
end
}
end

def op?; true end

def to_bc
CodeTable[value]
end
end

class Compiler
class << self

def compile(exp)
shunting_yard(exp).collect {|t| t.to_bc }.flatten
end

def tokens(i)
input = StringScanner.new(i)
until input.eos?
case
when t = input.scan(/\d+/) : yield Number.new(t)
when t = input.scan(/[(]/) : yield Paren.new(t, :lparen)
when t = input.scan(/[)]/) : yield Paren.new(t, :rparen)
when t = input.scan(/\*\*/) : yield Op.new(t, :rassoc, 3)
when t = input.scan(%r<[%/]>) : yield Op.new(t, :lassoc, 2)
when t = input.scan(%r<[*]>) : yield Op.new(t, :assoc, 2)
when t = input.scan(%r<[-]>) : yield Op.new(t, :lassoc, 1)
when t = input.scan(%r<[+]>) : yield Op.new(t, :assoc, 1)
when input.scan(/\s+/) : # skip ws
else
raise RuntimeError, "Parse Error: near '#{input.peek(8)}'"
end
end
end

def shunting_yard(s)
stack, queue = [] , []
last_tok, negate = nil, false # detect unary minus
tokens(s) do |token|
case
when token.number?
queue << (negate ? Number.new(-token.value) : token)
negate = false
when token.op?
if !last_tok || (last_tok.op? || last_tok.lparen?) &&
(token.value == :-)
negate = true
else
while stack.size > 0 and stack.last.op?
other_op = stack.last
if ( token.assoc? || token.lassoc? and
token.precedence <= other_op.precedence) ||
(token.rassoc? and token.precedence < other_op.precedence)
queue << stack.pop
else
break
end
end
stack << token
end
when token.lparen?
stack << token
when token.rparen?
while stack.size != 0 and op = stack.pop
break if op.lparen?
queue << op
end
end
last_tok = token
end
stack.reverse.each do |op|
queue << op
end
queue
end

def to_rpn(exp)
shunting_yard(exp).collect{|t| t.value}.join(' ')
end

DCBin = '/usr/bin/dc'

def dc_eval(exp)
if File.executable?(DCBin)
exp = to_rpn(exp)
IO.popen(DCBin, "w+") do |f|
f.write(exp.gsub(/\*\*/, '^') + ' p')
f.close_write
f.read
end
end
end

end
end


if $0 == __FILE__
opt = OptionParser.new do |opt|
opt.banner = "Usage: #$0 compile_method"
opt.separator ''

opt.on('-c', '--compile [expression]',
'prints bytecode sequence for [expression]') do |exp|
p Compiler.compile(exp)
end

opt.on('-d', '--dc-eval [expression]',
'tries to evaluate [expression] using dc(1)') do |exp|
puts Compiler.dc_eval(exp)
end

opt.on('-i', '--interpret [expression]',
'uses the byte-code interpreter to process [expression]') do |exp|
puts Interpreter.new(Compiler.compile(exp)).run
end

opt.on('-r', '--show-rpn [expression]',
'prints out an RPN translated version of [expression]') do |exp|
puts Compiler.to_rpn(exp)
end

opt.on('-h', '--help') { puts opt }
end
if ARGV.empty?
puts opt
else
opt.parse(ARGV)
end
end
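
For example, loading this next to the quiz's interp.rb:

Compiler.to_rpn("2+3*4")                         # => "2 3 4 * +"
Interpreter.new(Compiler.compile("2+3*4")).run   # => [14]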

 
Louis J Scoras
      11-09-2006
On 11/9/06, Ross Bamford <(E-Mail Removed)> wrote:
>
> Aww man, now I'm gonna have to look into Haskell a bit more. I've been
> putting it off for a while, but this is too cool to not have a play
> with!


You won't be disappointed. Haskell is an amazingly cool language.

> Because I'm currently Haskell-ignorant, however, I don't know how to run
> your solution. What interpreter (or compiler?) should be used? I
> have Hugs installed, but that complains about a "Syntax error in input
> (unexpected backslash (lambda))" at line 1.


It looks like its supposed to be embedded in LaTeX. Just remove the
first and last lines since the TeX directives look like lambdas to
Haskell.

\x -> x * x
<==>
lambda {|x| x*x}

Also, Haskell is very picky about formatting. You're going to get a
bunch of errors because of line wraps.


--
Lou.

 
Justin Bailey
      11-09-2006
On 11/9/06, Ross Bamford <(E-Mail Removed)> wrote:
> Aww man, now I'm gonna have to look into Haskell a bit more. I've been
> putting it off for a while, but this is too cool to not have a play
> with!


Awesome!

>
> Because I'm currently Haskell-ignorant, however, I don't know how to run
> your solution. What interpreter (or compiler?) should be used? I
> have Hugs installed, but that complains about a "Syntax error in input
> (unexpected backslash (lambda))" at line 1.


Like someone else pointed out, it's meant to be in a LaTeX document.
It's called "literate coding" and Haskell is the first place I
encountered it. The basic idea is you flip the normal source/comments
order and make comments the number 1 element in the document. Code is
then embedded in \begin{code} & \end{code} directives. (throwaway
comment: enabling ruby to run "literate code" would make a cool quiz).
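
A literate-Ruby runner could be as small as this sketch (just the idea,
assuming LaTeX-style \begin{code} blocks like the Haskell posts use):

# extract everything between \begin{code} and \end{code}, then run it
source = File.read(ARGV[0])
chunks = source.scan(/\\begin\{code\}(.*?)\\end\{code\}/m).flatten
eval chunks.join("\n")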

Anyways, to run a literate haskell file in hugs, just save it with a
"lhs" extension. At least, that works for me in WinHugs.

To get a properly formatted version, it's available on the web at:

http://www.haskell.org/haskellwiki/H..._Justin_Bailey

That version actually works (I'll be resubmitting in a minute). To run
all the tests there, type "interpret_tests" after you load the file.

Justin

p.s. One nice thing about literate haskell is you can copy that entire
page (i.e. don't worry about trying to grab only the code), paste it
into an lhs file, and run it. Works great on blogs too!

 
Justin Bailey
      11-09-2006
My solution in Haskell - and this time it actually works. The previous
implementation didn't work well with negative numbers and CONST/LCONST
weren't generating correctly.
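
For reference, the big-endian two's-complement operand bytes the
interpreter expects are easy to cross-check from Ruby with pack/unpack
(an aside):

[-2].pack('n').unpack('C*')       # => [255, 254]            CONST  -2
[-40000].pack('N').unpack('C*')   # => [255, 255, 99, 192]   LCONST -40000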

The code is "literate" haskell, which means it must be saved in a file
with "lhs" extension to run under WinHugs. To test the generated byte
codes, run "interpret_tests" after loading the file. Other functions
which demonstrate what is generated are:

compile_tests - Spits out byte codes for all test expressions
generate_tests - Spits out symbolic byte codes for all test expressions
evaluate_tests - Evaluates ASTs generated (not bytecode) for all
test expressions.

This solution is also posted at

http://www.haskell.org/haskellwiki/H..._Justin_Bailey

Thanks again for a great quiz!

Justin

\begin{code}
import Text.ParserCombinators.Parsec hiding (parse)
import qualified Text.ParserCombinators.Parsec as P (parse)
import Text.ParserCombinators.Parsec.Expr
import Data.Bits
import Data.Int

-- Represents various operations that can be applied
-- to expressions.
data Op = Plus | Minus | Mult | Div | Pow | Mod | Neg
deriving (Show, Eq)

-- Represents expression we can build - either numbers or expressions
-- connected by operators. This structure is the basis of the AST built
-- when parsing
data Expression = Statement Op Expression Expression
| Val Integer
| Empty
deriving (Show)

-- Define the byte codes that can be generated.
data Bytecode = NOOP | CONST Integer | LCONST Integer
| ADD
| SUB
| MUL
| POW
| DIV
| MOD
| SWAP
deriving (Show)


-- Using imported Parsec.Expr library, build a parser for expressions.
expr :: Parser Expression
expr =
buildExpressionParser table factor
<?> "expression"
where
-- Recognizes a factor in an expression
factor =
do{ char '('
; x <- expr
; char ')'
; return x
}
<|> number
<?> "simple expression"
-- Recognizes a number
number :: Parser Expression
number = do{ ds <- many1 digit
; return (Val (read ds))
}
<?> "number"
-- Specifies operator, associativity, precedence, and constructor to execute
-- and build the AST with.
table =
[[prefix "-" (Statement Mult (Val (-1)))],
[binary "^" (Statement Pow) AssocRight],
[binary "*" (Statement Mult) AssocLeft, binary "/" (Statement
Div) AssocLeft, binary "%" (Statement Mod) AssocLeft],
[binary "+" (Statement Plus) AssocLeft, binary "-" (Statement
Minus) AssocLeft]
]
where
binary s f assoc
= Infix (do{ string s; return f}) assoc
prefix s f
= Prefix (do{ string s; return f})

-- Parses a string into an AST, using the parser defined above
parse s = case P.parse expr "" s of
Right ast -> ast
Left e -> error $ show e

-- Take AST and evaluate (mostly for testing)
eval (Val n) = n
eval (Statement op left right)
| op == Mult = eval left * eval right
| op == Minus = eval left - eval right
| op == Plus = eval left + eval right
| op == Div = eval left `div` eval right
| op == Pow = eval left ^ eval right
| op == Mod = eval left `mod` eval right

-- Takes an AST and turns it into a byte code list
generate stmt = generate' stmt []
where
generate' (Statement op left right) instr =
let
li = generate' left instr
ri = generate' right instr
lri = li ++ ri
in case op of
Plus -> lri ++ [ADD]
Minus -> lri ++ [SUB]
Mult -> lri ++ [MUL]
Div -> lri ++ [DIV]
Mod -> lri ++ [MOD]
Pow -> lri ++ [POW]
generate' (Val n) instr =
if abs(n) > 32768
then LCONST n : instr
else CONST n : instr

-- Takes a statement and converts it into a list of actual bytes to
-- be interpreted
compile s = toBytes (generate $ parse s)

-- Convert a list of byte codes to a list of integer codes. If LCONST or CONST
-- instructions are seen, the correct byte representation is produced
toBytes ((NOOP):xs) = 0 : toBytes xs
toBytes ((CONST n):xs) = 1 : (toConstBytes (fromInteger n)) ++ toBytes xs
toBytes ((LCONST n):xs) = 2 : (toLConstBytes (fromInteger n)) ++ toBytes xs
toBytes ((ADD):xs) = 0x0a : toBytes xs
toBytes ((SUB):xs) = 0x0b : toBytes xs
toBytes ((MUL):xs) = 0x0c : toBytes xs
toBytes ((POW):xs) = 0x0d : toBytes xs
toBytes ((DIV):xs) = 0x0e : toBytes xs
toBytes ((MOD):xs) = 0x0f : toBytes xs
toBytes ((SWAP):xs) = 0x0a : toBytes xs
toBytes [] = []

-- Convert number to CONST representation (2 element list)
toConstBytes n = toByteList 2 n
toLConstBytes n = toByteList 4 n

-- Convert a number into a list of 8-bit bytes (big-endian/network byte order).
-- Make sure final list is size elements long
toByteList :: Bits Int => Int -> Int -> [Int]
toByteList size n = reverse $ take size (toByteList' n)
where
toByteList' a = (a .&. 255) : toByteList' (a `shiftR` 8)

-- All tests defined by the quiz, with the associated values they
-- should evaluate to.
test1 = [(2+2, "2+2"), (2-2, "2-2"), (2*2, "2*2"), (2^2, "2^2"), (2
`div` 2, "2/2"),
(2 `mod` 2, "2%2"), (3 `mod` 2, "3%2")]

test2 = [(2+2+2, "2+2+2"), (2-2-2, "2-2-2"), (2*2*2, "2*2*2"), (2^2^2,
"2^2^2"), (4 `div` 2 `div` 2, "4/2/2"),
(7`mod`2`mod`1, "7%2%1")]

test3 = [(2+2-2, "2+2-2"), (2-2+2, "2-2+2"), (2*2+2, "2*2+2"), (2^2+2, "2^2+2"),
(4 `div` 2+2, "4/2+2"), (7`mod`2+1, "7%2+1")]

test4 = [(2+(2-2), "2+(2-2)"), (2-(2+2), "2-(2+2)"), (2+(2*2),
"2+(2*2)"), (2*(2+2), "2*(2+2)"),
(2^(2+2), "2^(2+2)"), (4 `div` (2+2), "4/(2+2)"), (7`mod`(2+1), "7%(2+1)")]

test5 = [(-2+(2-2), "-2+(2-2)"), (2-(-2+2), "2-(-2+2)"), (2+(2 * -2),
"2+(2*-2)")]

test6 = [((3 `div` 3)+(8-2), "(3/3)+(8-2)"), ((1+3) `div` (2 `div`
2)*(10-8), "(1+3)/(2/2)*(10-8)"),
((1*3)*4*(5*6), "(1*3)*4*(5*6)"), ((10`mod`3)*(2+2),
"(10%3)*(2+2)"), (2^(2+(3 `div` 2)^2), "2^(2+(3/2)^2)"),
((10 `div` (2+3)*4), "(10/(2+3)*4)"), (5+((5*4)`mod`(2+1)),
"5+((5*4)%(2+1))")]

-- Evaluates the tests and makes sure the expressions match the expected values
eval_tests = concat $ map eval_tests [test1, test2, test3, test4, test5, test6]
where
eval_tests ((val, stmt):ts) =
let eval_val = eval $ parse stmt
in
if val == eval_val
then ("Passed: " ++ stmt) : eval_tests ts
else ("Failed: " ++ stmt ++ "(" ++ show eval_val ++ ")") : eval_tests ts
eval_tests [] = []

-- Takes all the tests and displays symbolic bytes codes for each
generate_tests = concat $ map generate_all [test1,test2,test3,test4,test5,test6]
where generate_all ((val, stmt):ts) = (stmt, generate (parse stmt))
: generate_all ts
generate_all [] = []

-- Takes all tests and generates a list of bytes representing them
compile_tests = concat $ map compile_all [test1,test2,test3,test4,test5,test6]
where compile_all ((val, stmt):ts) = (stmt, compile stmt) : compile_all ts
compile_all [] = []

interpret_tests = concat $ map f' [test1, test2, test3, test4, test5, test6]
where
f' tests = map f'' tests
f'' (expected, stmt) =
let value = fromIntegral $ interpret [] $ compile stmt
in
if value == expected
then "Passed: " ++ stmt
else "Failed: " ++ stmt ++ "(" ++ (show value) ++ ")"

fromBytes n xs =
let int16 = (fromIntegral ((fromIntegral int32) :: Int16)) :: Int
int32 = byte xs
byte xs = foldl (\accum byte -> (accum `shiftL` 8) .|. (byte))
(head xs) (take (n - 1) (tail xs))
in
if n == 2
then int16
else int32

interpret [] [] = error "no result produced"
interpret (s1:s) [] = s1
interpret s (o:xs) | o < 10 = interpret ((fromBytes (o*2) xs):s) (drop (o*2) xs)
interpret (s1:s2:s) (o:xs)
| o == 16 = interpret (s2:s1:s) xs
| otherwise = interpret (((case o of 10 -> (+); 11 -> (-); 12 ->
(*); 13 -> (^); 14 -> div; 15 -> mod) s2 s1):s) xs

\end{code}

 
James Edward Gray II
      11-09-2006
On Nov 9, 2006, at 12:58 PM, Justin Bailey wrote:

> (throwaway comment: enabling ruby to run "literate code" would make
> a cool quiz).


Great, write it up for us so we can all learn how it works!


James Edward Gray II



 