Yet another attempt at a safe eval() call

 
 
Grant Edwards
 
      01-04-2013
On 2013-01-04, Michael Torrie <(E-Mail Removed)> wrote:
> On 01/04/2013 08:53 AM, Grant Edwards wrote:
>> That's obviously the "right" thing to do. I suppose I should figure
>> out how to use the ast module.

>
> Or PyParsing.
>
> As for your program being "secure" I don't see that there's much to
> exploit.


There isn't.

> You're not running as a service, and you're not running your
> assembler as root, called from a normal user. The user has your code
> and can "exploit" it anytime he wants.


I'm just trying to prevent surprises for people who are running the
assembler. We have to assume that they trust the assembler code to
not cause damage intentionally. But, one would not expect them to
have to worry that assembly language input fed to the assembler code
might cause some sort of collateral damage.

Sure, I can change the source code for gcc so that it wreaks havoc
when I invoke it. But, using the stock gcc compiler there shouldn't
be any source file I can feed it that will cause it to mail my bank
account info to somebody in Eastern Europe, install a keylogger, and
then remove all my files.

--
Grant Edwards (grant.b.edwards at gmail.com)
Yow! I have a TINY BOWL in my HEAD
 

Grant Edwards
 
      01-04-2013
On 2013-01-04, Steven D'Aprano <(E-Mail Removed)> wrote:
> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
>
>> I've written a small assembler in Python 2.[67], and it needs to
>> evaluate integer-valued arithmetic expressions in the context of a
>> symbol table that defines integer values for a set of names.


[...]

[ my attempt at a safer eval() ]

> So, here's my probably-not-safe-either "safe eval":
>
>
> def probably_not_safe_eval(expr):
>     if 'import' in expr.lower():
>         raise ParseError("'import' prohibited")
>     for c in '_"\'.':
>         if c in expr:
>             raise ParseError('prohibited char %r' % c)
>     if len(expr) > 120:
>         raise ParseError('expression too long')
>     globals = {'__builtins__': None}
>     locals = symbolTable
>     return eval(expr, globals, locals) # fingers crossed!
>
> I can't think of any way to break out of these restrictions, but that may
> just mean I'm not smart enough.


I've added equals, backslash, commas, square/curly brackets, colons and semicolons to the
prohibited character list. I also reduced the maximum length to 60
characters. It's unfortunate that parentheses are overloaded for both
expression grouping and for function calling...

def lessDangerousEval(expr):
    if 'import' in expr.lower():
        raise ParseError("'import' prohibited in expression")
    for c in '_"\'.,;:[]{}=\\':
        if c in expr:
            raise ParseError("prohibited char %r in expression" % c)
    if len(expr) > 60:
        raise ParseError('expression too long')
    globals = {'__builtins__': None}
    locals = symbolTable
    return eval(expr, globals, locals) # fingers crossed!

Exploits anyone?
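
For anyone who wants to poke at it, here is a minimal self-contained sketch of the same blacklist idea. It is an adaptation, not the assembler's code: ParseError, the symbol table contents and the split into check_expr() are assumptions made so that it runs on its own (written to run under Python 3), and the comma is included in the prohibited list because the description above says commas are prohibited.

```python
class ParseError(Exception):
    """Hypothetical stand-in for the assembler's own exception type."""


# Hypothetical symbol table; the real one belongs to the assembler.
symbolTable = {'start': 0x100, 'count': 8}

# Characters rejected outright, as listed in the post (comma included).
PROHIBITED = '_"\'.,;:[]{}=\\'


def check_expr(expr):
    """Apply the text-level checks only; raise ParseError on a violation."""
    if 'import' in expr.lower():
        raise ParseError("'import' prohibited in expression")
    for c in PROHIBITED:
        if c in expr:
            raise ParseError("prohibited char %r in expression" % c)
    if len(expr) > 60:
        raise ParseError('expression too long')


def less_dangerous_eval(expr):
    check_expr(expr)
    # Same trick as the post: no builtins, names come from the symbol table.
    return eval(expr, {'__builtins__': None}, symbolTable)


print(less_dangerous_eval('(start + 4) * count'))  # 2080

# The filter is purely textual, so it cannot judge cost: this passes the
# checks even though evaluating it would try to build an enormous integer.
check_expr('9**9**9**9')  # deliberately checked only, never evaluated
```

The last line is the kind of surprise a character blacklist does not address: nothing in it is prohibited, yet evaluating it would tie up the machine.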

--
Grant Edwards (grant.b.edwards at gmail.com)
Yow! I'm ZIPPY the PINHEAD and I'm totally committed to the festive mode.
 

Chris Angelico
 
      01-04-2013
On Sat, Jan 5, 2013 at 3:38 AM, Grant Edwards <(E-Mail Removed)> wrote:
> I've added equals, backslash, commas, square/curly brackets, colons and semicolons to the
> prohibited character list. I also reduced the maximum length to 60
> characters. It's unfortunate that parentheses are overloaded for both
> expression grouping and for function calling...


I have to say that an expression evaluator that can't handle parens
for grouping is badly flawed. Can you demand that open parenthesis be
preceded by an operator (or beginning of line)? For instance:

(1+2)*3+4 # Valid
1+2*(3+4) # Valid
1+2(3+4) # Invalid, this will attempt to call 2

You could explain it as a protection against mistaken use of algebraic
notation (in which the last two expressions have the same meaning and
evaluate to 15). Or, alternatively, you could simply insert the
asterisk yourself, though that could potentially be VERY confusing.
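
The rule "an open parenthesis must follow an operator or start the line" can be sketched as a simple character scan. This is only an illustration under assumptions: the function name and the set of operator characters are made up here, and a text scan like this is a pre-filter, not a parser.

```python
# Characters that may legitimately precede '(' when it is used for grouping.
# Assumption: these are the operator characters the evaluator accepts.
OPERATOR_CHARS = set('+-*/%&|^~<>(')


def parens_are_grouping_only(expr):
    """Return True if every '(' follows an operator, another '(' or nothing."""
    prev = None  # last non-space character seen so far
    for ch in expr:
        if ch.isspace():
            continue
        if ch == '(' and prev is not None and prev not in OPERATOR_CHARS:
            return False  # looks like a call: name( or literal(
        prev = ch
    return True


for text in ('(1+2)*3+4', '1+2*(3+4)', '1+2(3+4)'):
    print(text, parens_are_grouping_only(text))
```

With the three expressions above it accepts the first two and rejects the third.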

Without parentheses, your users will be forced to store intermediate
results in variables, which gets tiresome fast.

discriminant = b*b-4*a*c
denominator = 2*a
# Okay, this expression demands a square rooting, but let's pretend that's done.
sol1 = -b+discriminant
sol2 = -b-discriminant
sol1 = sol1/denominator
sol2 /= denominator # if they know about augmented assignment

You can probably recognize the formula I'm working with there, but
it's far less obvious and involves six separate statements rather than
two. And this is a fairly simple formula. It'll get a lot worse in
production.
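
For comparison, a small sketch of the same calculation written both ways, keeping the simplification above (no square root taken). The coefficient values are arbitrary sample numbers chosen for illustration.

```python
a, b, c = 1, -3, 2  # arbitrary sample coefficients

# Without parentheses: six statements and two helper variables.
discriminant = b*b-4*a*c
denominator = 2*a
step1 = -b+discriminant
step2 = -b-discriminant
step1 = step1/denominator
step2 /= denominator

# With parentheses: two statements.
sol1 = (-b+(b*b-4*a*c))/(2*a)
sol2 = (-b-(b*b-4*a*c))/(2*a)

print(step1 == sol1, step2 == sol2)
```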

ChrisA
 

Grant Edwards
 
      01-04-2013
On 2013-01-04, Chris Angelico <(E-Mail Removed)> wrote:
> On Sat, Jan 5, 2013 at 3:38 AM, Grant Edwards <(E-Mail Removed)> wrote:


>> I've added equals, backslash, commas, square/curly brackets, colons
>> and semicolons to the prohibited character list. I also reduced the
>> maximum length to 60 characters. It's unfortunate that parentheses
>> are overloaded for both expression grouping and for function
>> calling...

>
> I have to say that an expression evaluator that can't handle parens
> for grouping is badly flawed.


Indeed. That's why I didn't disallow parens.

What I was implying was that since you have to allow parens for
grouping, there's no simple way to disallow function calls.

> Can you demand that open parenthesis be preceded by an operator (or
> beginning of line)?


Yes, but once you've parsed the expression to the point where you can
enforce rules like that, you're probably most of the way to doing the
"right" thing and evaluating the expression using ast or pyparsing or
similar.

> You can probably recognize the formula I'm working with there, but
> it's far less obvious and involves six separate statements rather than
> two. And this is a fairly simple formula. It'll get a lot worse in
> production.


In the general case, yes. For this assembler I could _probably_ get
by with expressions of the form <symbol> <op> <literal> where op is
'+' or '-'. But, whenever I try to come up with a minimal solution
like that, it tends to get "enhanced" over the years until it's a
complete mess, doesn't work quite right, and took more total man-hours
than a general and "permanent" solution would have.
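
To make the minimal form concrete, here is a rough sketch of a <symbol> <op> <literal> evaluator using a regular expression. The function name, the accepted literal syntax (decimal or 0x hex) and the use of ValueError are assumptions for illustration, not anything from the assembler.

```python
import re

# <symbol> <op> <literal>, where op is '+' or '-' and the literal is
# decimal or 0x-prefixed hex (an assumed literal syntax).
_SIMPLE = re.compile(
    r'^\s*([A-Za-z_]\w*)\s*([+-])\s*(0[xX][0-9a-fA-F]+|\d+)\s*$')


def eval_simple(expr, symbols):
    """Evaluate 'name + literal' or 'name - literal' against symbols."""
    m = _SIMPLE.match(expr)
    if m is None:
        raise ValueError('expected <symbol> <op> <literal>: %r' % expr)
    name, op, literal = m.groups()
    if name not in symbols:
        raise ValueError("name '%s' undefined" % name)
    base = 16 if literal[:2].lower() == '0x' else 10
    value = int(literal, base)
    return symbols[name] + value if op == '+' else symbols[name] - value


print(eval_simple('start + 0x10', {'start': 0x100}))  # 272
```

Nothing here touches eval(), which is the attraction; the cost is that every later "enhancement" means growing the pattern by hand.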

Some might argue that repeated tweaking of and adding limitations to
a "safe eval" is just heading down that same road in a different car.
They'd probably be right: in the end, it will probably have been less
work to just do it with ast. But it's still interesting to try.

--
Grant Edwards (grant.b.edwards at gmail.com)
Yow! Are you the self-frying president?
 

Chris Angelico
 
      01-04-2013
On Sat, Jan 5, 2013 at 4:14 AM, Grant Edwards <(E-Mail Removed)> wrote:
> On 2013-01-04, Chris Angelico <(E-Mail Removed)> wrote:
>> On Sat, Jan 5, 2013 at 3:38 AM, Grant Edwards <(E-Mail Removed)> wrote:

>
>>> I've added equals, backslash, commas, square/curly brackets, colons
>>> and semicolons to the prohibited character list. I also reduced the
>>> maximum length to 60 characters. It's unfortunate that parentheses
>>> are overloaded for both expression grouping and for function
>>> calling...

>>
>> I have to say that an expression evaluator that can't handle parens
>> for grouping is badly flawed.

>
> Indeed. That's why I didn't disallow parens.
>
> What I was implying was that since you have to allow parens for
> grouping, there's no simple way to disallow function calls.


Yeah, and a safe evaluator that allows function calls is highly vulnerable.

>> Can you demand that open parenthesis be preceded by an operator (or
>> beginning of line)?

>
> Yes, but once you've parsed the expression to the point where you can
> enforce rules like that, you're probably most of the way to doing the
> "right" thing and evaluating the expression using ast or pyparsing or
> similar.
>
> Some might argue that repeated tweaking of and adding limitations to
> a "safe eval" is just heading down that same road in a different car.
> They'd probably be right: in the end, it will probably have been less
> work to just do it with ast. But it's still interesting to try.


Yep, have fun with it. As mentioned earlier, though, security isn't
all that critical; so in this case, chances are you can just leave
parens permitted and let function calls potentially happen.

ChrisA
 

Grant Edwards
 
      01-04-2013
On 2013-01-04, Chris Angelico <(E-Mail Removed)> wrote:
> On Sat, Jan 5, 2013 at 4:14 AM, Grant Edwards <(E-Mail Removed)> wrote:
>> On 2013-01-04, Chris Angelico <(E-Mail Removed)> wrote:
>>> On Sat, Jan 5, 2013 at 3:38 AM, Grant Edwards <(E-Mail Removed)> wrote:

>>
>>>> I've added equals, backslash, commas, square/curly brackets, colons
>>>> and semicolons to the prohibited character list. I also reduced the
>>>> maximum length to 60 characters. It's unfortunate that parentheses
>>>> are overloaded for both expression grouping and for function
>>>> calling...
>>>
>>> I have to say that an expression evaluator that can't handle parens
>>> for grouping is badly flawed.

>>
>> Indeed. That's why I didn't disallow parens.
>>
>> What I was implying was that since you have to allow parens for
>> grouping, there's no simple way to disallow function calls.

>
> Yeah, and a safe evaluator that allows function calls is highly vulnerable.
>
>>> Can you demand that open parenthesis be preceded by an operator (or
>>> beginning of line)?

>>
>> Yes, but once you've parsed the expression to the point where you can
>> enforce rules like that, you're probably most of the way to doing the
>> "right" thing and evaluating the expression using ast or pyparsing or
>> similar.
>>
>> Some might argue that repeated tweaking of and adding limitations to
>> a "safe eval" is just heading down that same road in a different car.
>> They'd probably be right: in the end, it will probably have been less
>> work to just do it with ast. But it's still interesting to try.

>
> Yep, have fun with it. As mentioned earlier, though, security isn't
> all that critical; so in this case, chances are you can just leave
> parens permitted and let function calls potentially happen.


An ast-based evaluator wasn't as complicated as I first thought: the
examples I'd been looking at implemented far more features than I
needed. This morning I found a simpler example at

http://stackoverflow.com/questions/2...on-in-a-string

The error messages are still pretty cryptic, so improving
that will add a few more lines. One nice thing about the ast code is
that it's simple to add code to allow C-like character constants such
that ('A' == 0x41). Here's the first pass at ast-based code:

import ast,operator

operators = \
    {
    ast.Add:    operator.iadd,
    ast.Sub:    operator.isub,
    ast.Mult:   operator.imul,
    ast.Div:    operator.idiv,
    ast.BitXor: operator.ixor,
    ast.BitAnd: operator.iand,
    ast.BitOr:  operator.ior,
    ast.LShift: operator.lshift,
    ast.RShift: operator.rshift,
    ast.Invert: operator.invert,
    ast.USub:   operator.neg,
    ast.UAdd:   operator.pos,
    }

def _eval_expr(node):
    global symbolTable
    if isinstance(node, ast.Name):
        if node.id not in symbolTable:
            raise ParseError("name '%s' undefined" % node.id)
        return symbolTable[node.id]
    elif isinstance(node, ast.Num):
        return node.n
    elif isinstance(node, ast.operator) or isinstance(node, ast.unaryop):
        return operators[type(node)]
    elif isinstance(node, ast.BinOp):
        return _eval_expr(node.op)(_eval_expr(node.left), _eval_expr(node.right))
    elif isinstance(node, ast.UnaryOp):
        return _eval_expr(node.op)(_eval_expr(node.operand))
    else:
        raise ParseError("error parsing expression at node %s" % node)

def eval_expr(expr):
    return _eval_expr(ast.parse(expr).body[0].value)
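
For readers on current Python, here is a rough sketch of how the same idea might look under Python 3.8 or later, including the C-like character constants mentioned above. It is an adaptation under assumptions, not the code from the assembler: ParseError is a stand-in, the symbol table is passed as a parameter, '/' is mapped to floor division to keep results integer-valued, and ast.Constant replaces ast.Num.

```python
import ast
import operator


class ParseError(Exception):
    """Hypothetical stand-in for the assembler's own exception type."""


# Assumption: integer-only arithmetic, so '/' means floor division here.
BINARY_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.floordiv,
    ast.BitXor: operator.xor, ast.BitAnd: operator.and_,
    ast.BitOr: operator.or_,
    ast.LShift: operator.lshift, ast.RShift: operator.rshift,
}
UNARY_OPS = {
    ast.Invert: operator.invert,
    ast.USub: operator.neg,
    ast.UAdd: operator.pos,
}


def _eval_node(node, symbols):
    if isinstance(node, ast.Name):
        if node.id not in symbols:
            raise ParseError("name '%s' undefined" % node.id)
        return symbols[node.id]
    if isinstance(node, ast.Constant):
        value = node.value
        if isinstance(value, int) and not isinstance(value, bool):
            return value
        if isinstance(value, str) and len(value) == 1:
            return ord(value)  # C-like character constant: 'A' is 0x41
        raise ParseError("unsupported constant %r" % (value,))
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        return BINARY_OPS[type(node.op)](_eval_node(node.left, symbols),
                                         _eval_node(node.right, symbols))
    if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
        return UNARY_OPS[type(node.op)](_eval_node(node.operand, symbols))
    # Calls, attribute access, subscripts, etc. all end up here.
    raise ParseError("unsupported syntax: %s" % type(node).__name__)


def eval_expr(expr, symbols):
    try:
        tree = ast.parse(expr.strip(), mode='eval')
    except SyntaxError as exc:
        raise ParseError("syntax error in expression: %s" % exc.msg)
    return _eval_node(tree.body, symbols)


print(eval_expr("(start + 4) * 2", {'start': 0x100}))  # 520
print(eval_expr("'A' + 1", {}))                        # 66
```

Because only whitelisted node types are evaluated, a function call such as __import__('os') is rejected with ParseError rather than executed.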


--
Grant Edwards (grant.b.edwards at gmail.com)
Yow! A can of ASPARAGUS, 73 pigeons, some LIVE ammo, and a FROZEN DAQUIRI!!
 

Chris Angelico
 
      01-04-2013
On Sat, Jan 5, 2013 at 5:09 AM, Grant Edwards <(E-Mail Removed)> wrote:
> The error messages are still pretty cryptic, so improving
> that will add a few more lines. One nice thing about the ast code is
> that it's simple to add code to allow C-like character constants such
> that ('A' == 0x41). Here's the first pass at ast-based code:


Looks cool, and fairly neat! Now I wonder, is it possible to use that
to create new operators, such as the letter d? Binary operator, takes
two integers...

ChrisA
 

Grant Edwards
 
      01-04-2013
On 2013-01-04, Chris Angelico <(E-Mail Removed)> wrote:
> On Sat, Jan 5, 2013 at 5:09 AM, Grant Edwards <(E-Mail Removed)> wrote:
>> The error messages are still pretty cryptic, so improving
>> that will add a few more lines. One nice thing about the ast code is
>> that it's simple to add code to allow C-like character constants such
>> that ('A' == 0x41). Here's the first pass at ast-based code:

>
> Looks cool, and fairly neat! Now I wonder, is it possible to use that
> to create new operators, such as the letter d? Binary operator, takes
> two integers...


I don't think you can define new operators. AFAICT, the
lexing/parsing is done using the built-in Python grammar. You can
control the behavior of the predefined operators and reject operators
you don't like, but you can't add new ones or change precedence/syntax
or anything like that.

If you want to tweak the grammar itself, then I think you need to use
something like pyparsing.

--
Grant Edwards (grant.b.edwards at gmail.com)
Yow! I own seven-eighths of all the artists in downtown Burbank!
 

Chris Angelico
 
      01-04-2013
On Sat, Jan 5, 2013 at 5:43 AM, Grant Edwards <(E-Mail Removed)> wrote:
> On 2013-01-04, Chris Angelico <(E-Mail Removed)> wrote:
>> On Sat, Jan 5, 2013 at 5:09 AM, Grant Edwards <(E-Mail Removed)> wrote:
>>> The error messages are still pretty cryptic, so improving
>>> that will add a few more lines. One nice thing about the ast code is
>>> that it's simple to add code to allow C-like character constants such
>>> that ('A' == 0x41). Here's the first pass at ast-based code:

>>
>> Looks cool, and fairly neat! Now I wonder, is it possible to use that
>> to create new operators, such as the letter d? Binary operator, takes
>> two integers...

>
> I don't think you can define new operators. AFAICT, the
> lexing/parsing is done using the built-in Python grammar. You can
> control the behavior of the predefined operators and reject operators
> you don't like, but you can't add new ones or change precedence/syntax
> or anything like that.
>
> If you want to tweak the grammar itself, then I think you need to use
> something like pyparsing.


Oh well, hehe. I've not seen any simple parsers that let you
incorporate D&D-style dice notation ("2d6" means "roll two 6-sided
dice and sum the rolls" - "d6" implies "1d6").
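
One workaround that needs no custom grammar is to expand the dice terms in the text before handing the expression to an ordinary arithmetic evaluator. The sketch below is an assumption-laden illustration: expand_dice is a made-up helper name, and it would clobber any symbol that happens to be spelled like a dice term (for example a variable named d6).

```python
import random
import re

# Matches "2d6" or "d6" as a whole word; the count defaults to 1.
_DICE = re.compile(r'\b(\d*)d(\d+)\b')


def expand_dice(expr, rng=None):
    """Replace each NdM term in expr with the sum of N rolls of an M-sided die."""
    if rng is None:
        rng = random.Random()

    def _roll(match):
        count = int(match.group(1) or 1)
        sides = int(match.group(2))
        if sides < 1:
            raise ValueError('a die needs at least one side')
        return str(sum(rng.randint(1, sides) for _ in range(count)))

    return _DICE.sub(_roll, expr)


print(expand_dice('2d6 + 3'))  # e.g. '7 + 3'; the rolled value varies
```

The resulting string contains only integers and ordinary operators, so it can go through whatever restricted evaluator is already in place.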

ChrisA
 

Oscar Benjamin
 
      01-05-2013
On 4 January 2013 15:53, Grant Edwards <(E-Mail Removed)> wrote:
> On 2013-01-04, Steven D'Aprano <(E-Mail Removed)> wrote:
>> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
>>
>> * But frankly, you should avoid eval, and write your own mini-integer
>> arithmetic evaluator which avoids even the most remote possibility
>> of exploit.

>
> That's obviously the "right" thing to do. I suppose I should figure
> out how to use the ast module.


Someone has already created a module that does this called numexpr. Is
there some reason why you don't want to use that?

>>> import numexpr
>>> numexpr.evaluate('2+4*5')
array(22, dtype=int32)
>>> numexpr.evaluate('2+a*5', {'a':4})
array(22L)


Oscar
 