Xah Lee 07-17-2011 07:47 AM

a little parsing challenge ☺
 
2011-07-16

Folks, this will be an interesting one.

The problem is to write a script that checks a dir of text files
(and all subdirs) and reports whether any file has mismatched
brackets.

• The files will be UTF-8 encoded (Unix-style line endings).

• If a file has mismatched matching pairs, the script will display the
file name, and the line number and column number of the first
instance where a mismatched bracket occurs. (Or just the char number
instead, as in emacs's “point”.)

• The matching pairs are all single Unicode chars. They are these and
nothing else: () {} [] “” ‹› «» 【】 〈〉 《》 「」『』
Note that ‘single curly quote’ is not considered a matching pair here.

• Your script must be standalone. It must not use parser tools,
but it can call any lib that's part of the standard distribution of your lang.
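The directory and encoding requirements above can be handled with the standard library alone. The following is only a sketch in Python, not any poster's solution; the name iter_text_files and the choice to skip files that are not valid UTF-8 are assumptions made for illustration.

```python
import os


def iter_text_files(root):
    """Yield (path, text) for each file under root, including subdirectories."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    text = handle.read()
            except (UnicodeDecodeError, OSError):
                # Assumption: files that cannot be read as UTF-8 are skipped.
                continue
            yield path, text
```

Each (path, text) pair can then be handed to whatever bracket checker is used, so the file traversal stays separate from the parsing.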

Here are some examples of mismatched brackets: ([)], (“[[”), ((, 】, etc.
(And yes, the brackets may be nested. There is usually text between these
chars.)
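One common way to meet this spec is a stack of expected closers. The sketch below is in Python and is not any poster's actual solution; the names PAIRS, OPENERS, CLOSERS and first_mismatch are made up for illustration, and reporting an unclosed opener at the opener's own position is an assumption, since the spec does not say where such a case should be reported.

```python
# Bracket pairs copied from the challenge text; all names here are illustrative.
PAIRS = "(){}[]“”‹›«»【】〈〉《》「」『』"
OPENERS = dict(zip(PAIRS[0::2], PAIRS[1::2]))  # opener -> expected closer
CLOSERS = set(PAIRS[1::2])


def first_mismatch(text):
    """Return (line, column, char) of the first problem, or None if balanced.

    Line and column are 1-based.
    """
    stack = []  # one (expected closer, line, column, opener) per open bracket
    line, col = 1, 0
    for ch in text:
        if ch == "\n":
            line, col = line + 1, 0
            continue
        col += 1
        if ch in OPENERS:
            stack.append((OPENERS[ch], line, col, ch))
        elif ch in CLOSERS:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop()[0] != ch:
                return (line, col, ch)
    if stack:
        # Assumption: report the earliest opener that was never closed.
        _, open_line, open_col, opener = stack[0]
        return (open_line, open_col, opener)
    return None
```

With this sketch, first_mismatch("([)]") reports the ")" at line 1, column 3, and a lone 】 is reported where it occurs.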

I'll be writing an Emacs Lisp solution and will post it in 2 days. I welcome
implementations in other langs, in particular Perl, Python, PHP, Ruby,
Tcl, Lua, Haskell, OCaml. I'll also be able to eval Common Lisp
(clisp), Scheme (scsh), and Java. Other langs such as Clojure,
Scala, C, C++, or any others are all welcome, but I won't be able to
eval them. A JavaScript implementation would be very interesting too, but
please indicate which command-line version to use and where to get it.

I hope you'll find this an interesting “challenge”. This is a parsing
problem. I haven't studied parsers beyond some Wikipedia reading, so
my solution will probably be naive. I hope to see and learn from your
solutions too.

i hope you'll participate. Just post solution here. Thanks.

Xah

Raymond Hettinger 07-17-2011 09:48 AM

Re: a little parsing challenge ☺
 
On Jul 17, 12:47 am, Xah Lee <xah...@gmail.com> wrote:
> i hope you'll participate. Just post solution here. Thanks.


http://pastebin.com/7hU20NNL


Raymond

Robert Klemme 07-17-2011 01:20 PM

Re: a little parsing challenge ☺
 
On 07/17/2011 11:48 AM, Raymond Hettinger wrote:
> On Jul 17, 12:47 am, Xah Lee<xah...@gmail.com> wrote:
>> i hope you'll participate. Just post solution here. Thanks.

>
> http://pastebin.com/7hU20NNL


Ruby solution: https://gist.github.com/1087583

Kind regards

robert

mhenn 07-17-2011 01:55 PM

Re: a little parsing challenge ☺
 
Am 17.07.2011 15:20, schrieb Robert Klemme:
> On 07/17/2011 11:48 AM, Raymond Hettinger wrote:
>> On Jul 17, 12:47 am, Xah Lee<xah...@gmail.com> wrote:
>>> i hope you'll participate. Just post solution here. Thanks.

>>
>> http://pastebin.com/7hU20NNL

>
> Ruby solution: https://gist.github.com/1087583


I actually don't know Ruby that well, but it looks like your program
accepts "[(])" as correct although it is not, because you translate
"[(])" to "(())" (which is indeed balanced, but no longer reflects the
input).
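The effect described here can be reproduced in a few lines. This is a Python sketch of the general pitfall, not a reconstruction of the Ruby gist (whose contents are not shown in this thread); both function names are hypothetical and only three bracket kinds are used to keep it short.

```python
OPEN, CLOSE = "([{", ")]}"


def balanced_after_collapse(text):
    """Map every opener to '(' and every closer to ')', then track depth."""
    collapsed = text.translate(str.maketrans(OPEN + CLOSE, "(((" + ")))"))
    depth = 0
    for ch in collapsed:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def balanced_with_stack(text):
    """Keep the bracket kinds apart by remembering each expected closer."""
    expected = []
    for ch in text:
        if ch in OPEN:
            expected.append(CLOSE[OPEN.index(ch)])
        elif ch in CLOSE:
            if not expected or expected.pop() != ch:
                return False
    return not expected
```

For "[(])" the collapsed check returns True while the stack check returns False, because collapsing throws away which kind of bracket was opened.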

>
> Kind regards
>
> robert



Thomas Jollans 07-17-2011 03:31 PM

Re: a little parsing challenge ☺
 
On Jul 17, 9:47 am, Xah Lee <xah...@gmail.com> wrote:
> i hope you'll participate. Just post solution here. Thanks.


I thought I'd have some fun with multi-processing:

https://gist.github.com/1087682

Thomas Boell 07-17-2011 03:49 PM

Re: a little parsing challenge ☺
 
On Sun, 17 Jul 2011 02:48:42 -0700 (PDT)
Raymond Hettinger <python@rcn.com> wrote:

> On Jul 17, 12:47 am, Xah Lee <xah...@gmail.com> wrote:
> > i hope you'll participate. Just post solution here. Thanks.

>
> http://pastebin.com/7hU20NNL


I'm new to Python. I think I'd have done it in a similar way (in any
language). Your use of openers/closers looks nice though. In the
initialization of openers, I guess you implicitly create a kind of
hash, right? Then the 'in' operator checks for the keys. That is elegant
because you have the openers and closers right next to each other, not
in separate lists.
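The pastebin code is not reproduced in this thread, so the following is only a guess at the kind of idiom being described, sketched in Python with made-up names: a dict built from opener/closer pairs, where 'in' tests the keys.

```python
# Illustrative only: each two-character string is an opener followed by its closer.
pairs = ["()", "{}", "[]", "“”"]

# Iterating a two-character string unpacks it into (opener, closer).
openers = {opener: closer for opener, closer in pairs}
closers = set(openers.values())

# 'in' on a dict checks its keys, so this asks "is the char an opener?"
is_opener = "[" in openers
expected_closer = openers["["]
```

Here is_opener is True and expected_closer is "]", while "]" itself is found in closers but not in openers.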

But why do you enumerate with start=1? Shouldn't you start with index 0?



Robert Klemme 07-17-2011 04:01 PM

Re: a little parsing challenge ☺
 
On 07/17/2011 03:55 PM, mhenn wrote:
> Am 17.07.2011 15:20, schrieb Robert Klemme:
>> On 07/17/2011 11:48 AM, Raymond Hettinger wrote:
>>> On Jul 17, 12:47 am, Xah Lee<xah...@gmail.com> wrote:
>>>> i hope you'll participate. Just post solution here. Thanks.
>>>
>>> http://pastebin.com/7hU20NNL

>>
>> Ruby solution: https://gist.github.com/1087583

>
> I actually don't know Ruby that well, but it looks like your program
> recognizes "[(])" as correct although it is not, because you translate
> "[(])" to "(())" (which is indeed correct, but does not resemble the
> input correctly anymore).


Right you are. The optimization breaks the logic. Good catch!

Kind regards

robert

Robert Klemme 07-17-2011 04:54 PM

Re: a little parsing challenge ☺
 
On 07/17/2011 06:01 PM, Robert Klemme wrote:
> On 07/17/2011 03:55 PM, mhenn wrote:
>> Am 17.07.2011 15:20, schrieb Robert Klemme:
>>> On 07/17/2011 11:48 AM, Raymond Hettinger wrote:
>>>> On Jul 17, 12:47 am, Xah Lee<xah...@gmail.com> wrote:
>>>>> i hope you'll participate. Just post solution here. Thanks.
>>>>
>>>> http://pastebin.com/7hU20NNL
>>>
>>> Ruby solution: https://gist.github.com/1087583

>>
>> I actually don't know Ruby that well, but it looks like your program
>> recognizes "[(])" as correct although it is not, because you translate
>> "[(])" to "(())" (which is indeed correct, but does not resemble the
>> input correctly anymore).

>
> Right you are. The optimization breaks the logic. Good catch!


Turns out that with a little possessiveness I can fix my original approach,
which has the added benefit of not needing three passes through the file
(the two #tr calls are obsolete now).

https://gist.github.com/1087583

Cheers

robert

Raymond Hettinger 07-17-2011 07:16 PM

Re: a little parsing challenge ☺
 
On Jul 17, 8:49 am, Thomas Boell <tbo...@domain.invalid> wrote:
> But why do you enumerate with start=1? Shouldn't you start with index 0?


The problem specification says that the char number should match
the emacs goto-char function, which is indexed from one, not from
zero. This is testable by taking the output of the program and
running it through emacs to see that the cursor gets moved exactly to
the location of the mismatched delimiter.
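As a small illustration of the one-based numbering, here is a Python sketch (the helper name char_position is hypothetical and this is not the pastebin code) that uses enumerate with start=1, so the first character of the text gets position 1.

```python
def char_position(text, target):
    """Return the 1-based position of the first occurrence of target, or None."""
    for position, ch in enumerate(text, start=1):
        if ch == target:
            return position
    return None
```

With start=1, the "]" in "ab]" is at position 3, whereas a zero-based index would give 2.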


Raymond


rantingrick 07-18-2011 01:52 AM

Re: a little parsing challenge ☺
 
On Jul 17, 2:47 am, Xah Lee <xah...@gmail.com> wrote:
> 2011-07-16
>
> folks, this one will be interesting one.
>
> the problem is to write a script that can check a dir of text files
> (and all subdirs) and reports if a file has any mismatched matching
> brackets.
>
>[...]
>
> You script must be standalone. Must not be using some parser tools.
> But can call lib that's part of standard distribution in your lang.


I stopped reading here and did...

>>> from idlelib.HyperParser import HyperParser # python2.x


...and called it a day. ;-) This module is part of the stdlib
(idlelib\HyperParser), so as per your statement it is legal (though it
may not be the fastest solution).


