Velocity Reviews - Computer Hardware Reviews

using regular expressions to analyze lisp code

 
 
Kelie
      10-04-2007
hello,

i've spent a couple of hours trying to figure out the correct regular
expression to catch a VisualLisp function body (VisualLisp is for
AutoCAD and has a syntax that's similar to Common Lisp). VisualLisp is
case-insensitive. Any line beginning with ";" is a comment (there can
be spaces before the ";").

here is an example of VisualLisp function:

(defun get_obj_app_names (obj / rv)
  (foreach app (get_registered_apps (vla-get-document obj))
    (if (get_xdata obj app)
      (setq rv (cons app rv))
    )
  )
  (if rv
    ;;"This line is comment (comment)"
    ; This line is also comment
    (acad_strlsort rv)
    nil
  )
)

for a function named foo, it is easy to find the beginning of the
function, "(defun foo", but it is hard to find the ")" at the end of
the code block. if in the end i can't come up with a solution using
regular expressions only, my plan is: after finding the beginning
part, which is "(defun foo" in this case, count the parentheses,
ignoring anything inside "" and any comment line, until i find the
closing ")".

not sure if i've made myself understood. thanks for reading.

kelie

 
Dan
      10-04-2007
So, paren matching is the canonical example of a problem that is
context-free but not regular: it needs unbounded counting, which a
true regular expression cannot do. Now, many regex libraries have
*some* not-purely-regular features, but I doubt you're going to find
anything to match parens in a single regex. If you want to go all out
you can use a parser generator (for python parser generators, see
http://python.fyxm.net/topics/parsing.html). Otherwise, you can go
about it the quick-and-dirty way you describe: scan for matching open
and close parens, and ignore things in quotes and comments.
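
For what it's worth, the quick-and-dirty scan might look roughly like
the sketch below. It is only an illustration, not tested against real
VisualLisp files. The function name extract_defun is made up for this
example, and two things are assumed rather than taken from the thread:
that a ";" anywhere outside a string starts a comment running to the
end of the line, and that a backslash escapes the next character inside
a string. It also assumes the "(defun foo" header itself does not occur
inside a comment or a string.

```python
import re


def extract_defun(source, name):
    """Return the text of `(defun <name> ...)` through its matching ')'.

    Returns None if the function is not found or the parens never balance.
    `extract_defun` is a hypothetical helper written for this example.
    """
    # VisualLisp is case-insensitive, so ignore case when finding the header.
    # The lookahead stops "foo" from matching the start of "foobar".
    header = re.compile(
        r"\(\s*defun\s+" + re.escape(name) + r"(?![^\s()])", re.IGNORECASE
    )
    match = header.search(source)
    if match is None:
        return None

    depth = 0
    in_string = False
    i = match.start()
    while i < len(source):
        ch = source[i]
        if in_string:
            if ch == "\\":
                i += 1  # assumption: backslash escapes the next character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == ";":
            # assumption: ";" outside a string comments out the rest of the line
            newline = source.find("\n", i)
            if newline == -1:
                break
            i = newline
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return source[match.start():i + 1]
        i += 1
    return None  # ran out of input before the parens balanced
```

Calling extract_defun(text, "foo") on the contents of a .lsp file would
then give back the whole "(defun foo ...)" block as one string, ready to
be written to another file.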

-Dan

 
Tim Chase
      10-04-2007
> i've spent a couple of hours trying to figure out the correct regular
> expression to catch a VisualLisp

[snipped]
> "(defun foo", but it is hard to find the ")" at the end of the code block.
> if in the end i can't come up with a solution using regular
> expressions only, my plan is: after finding the beginning
> part, which is "(defun foo" in this case, count the parentheses,
> ignoring anything inside "" and any comment line, until i find the
> closing ")".



"""
Some people, when confronted with a problem, think
"I know, I'll use regular expressions!"
Now they have two problems
"""


Regular expressions are a wonderful tool when the domain is
correct. However, when your domain involves processing
arbitrarily nested syntax, regexps are not your friend. It is
sometimes feasible to mung them into a fixed-depth-nesting
parser, but it's always fairly painful, and the fixed-depth is an
annoying limitation.
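
To make the fixed-depth point concrete, here is a rough sketch of how
such a regex can be built by wrapping a pattern around itself a fixed
number of times. The helper name nested_paren_pattern is invented for
this illustration, and the pattern deliberately ignores strings and
comments, so it is a demonstration of the limitation rather than
something to use on real Lisp source.

```python
import re


def nested_paren_pattern(max_depth):
    """Build a regex for one (...) group nested at most `max_depth` levels.

    `nested_paren_pattern` is a hypothetical helper for this example.
    """
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    atom = r"[^()]"  # any single character that is not a paren
    pattern = rf"\((?:{atom})*\)"  # depth 1: parens with no parens inside
    for _ in range(max_depth - 1):
        # each pass allows one more level of nesting inside the group
        pattern = rf"\((?:{atom}|{pattern})*\)"
    return pattern


two_deep = re.compile(nested_paren_pattern(2))
print(bool(two_deep.fullmatch("(a (b) c)")))    # two levels: fits
print(bool(two_deep.fullmatch("(a (b (c)))")))  # three levels: too deep
```

The pattern has to be rebuilt for every extra level you want to allow,
and input nested one level deeper than planned simply fails to match.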

Use a parsing lib. I've tinkered a bit with PyParsing[1] which
is fairly easy to pick up, but powerful enough that you're not
banging your head against limitations. There are a number of
other parsing libraries[2] with various domain-specific features
and audiences, but I'd go browsing through them only if PyParsing
doesn't fill the bill.

You don't detail what you want to do with the content or how
pathological the input can be, but you might be able to get away
with just skimming through the input and counting open-parens and
close-parens, stopping when they've been balanced, skipping lines
with comments.

-tkc

[1] http://pyparsing.wikispaces.com/
[2] http://nedbatchelder.com/text/python-parsers.html
 
 
Kelie
      10-04-2007
On Oct 4, 7:50 am, Tim Chase wrote:
> Use a parsing lib. I've tinkered a bit with PyParsing[1] which
> is fairly easy to pick up, but powerful enough that you're not
> banging your head against limitations. There are a number of
> other parsing libraries[2] with various domain-specific features
> and audiences, but I'd go browsing through them only if PyParsing
> doesn't fill the bill.
>
> You don't detail what you want to do with the content or how
> pathological the input can be, but you might be able to get away
> with just skimming through the input and counting open-parens and
> close-parens, stopping when they've been balanced, skipping lines
> with comments.


thanks Tim. following your and Dan's advice i visited
http://python.fyxm.net/topics/parsing.html and picked pyparsing
after a brief reading of the descriptions of a couple of packages. now
that you've recommended it, it seems i made a good choice.

btw, the content found will be copied to a new text file.




 
 
Kelie
      10-04-2007
On Oct 4, 7:28 am, Dan wrote:
> So, paren matching is the canonical example of a problem that is
> context-free but not regular: it needs unbounded counting, which a
> true regular expression cannot do. Now, many regex libraries have
> *some* not-purely-regular features, but I doubt you're going to find
> anything to match parens in a single regex. If you want to go all out
> you can use a parser generator (for python parser generators, see
> http://python.fyxm.net/topics/parsing.html). Otherwise, you can go
> about it the quick-and-dirty way you describe: scan for matching open
> and close parens, and ignore things in quotes and comments.
>
> -Dan


Dan, thanks for suggesting parser generators.

 