Velocity Reviews


bolega 08-29-2009 04:35 AM

Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
This braggart admits that he had to put this code in TWO books and
visit it twice to explain it. I am putting the excerpt from pp. 2-4 of
this book and the C code. The C code will become indented and syntax
highlighted once you paste it into emacs etc. It is my belief and
observation, from a lot of problems by these so-called "demigods", that
they are actually all average and no more intelligent. It's just that
they got opportunities to study some things at an opportune time with
good facilities, and also spent a lot of time in a productive
environment and team.

I know that lisp eval is written more clearly than this recursion below
because I am able to read it easily, and that code is almost
self-explanatory. C is more quirky. When you really mean to recursively
call another function, you are using return so you can have tail
recursion ???? .

Anyway, it's your chance to show how clear C/C++/lisp/Scheme code you
can write. Also, I don't exclude pseudocode, but it
should be clear enough to be instantly translatable to a programming
language. The real goal is to learn how to write or DERIVE recursion,
how to enumerate cases, order them, and build recursion. You may even
put up an introductory tutorial, and you don't have to reply here in
ascii but can upload a pdf at some link on rapidshare etc. Look how he
puts the code after the problem statement and then has the need to
explain it. Ideally, the discussion should be such that the student or
reader himself jumps to the solution. That is why I give this unix
school of Pike/Thompson/Kernighan a low grade in almost all their
expositions, except that they wrote the earliest books to make millions
of dollars in royalties, and since now they are nobody due to linux,
they are poorly regurgitating old material.

Enjoy .............

============================

The Practice of Programming

In 1998, Rob Pike and I were writing The Practice of Programming
(Addison-Wesley). The last chapter of the book, “Notation,” collected a
number of examples where good notation led to better programs and
better programming. This included the use of simple data specifications
(printf, for instance), and the generation of code from tables.

Because of our Unix backgrounds and nearly 30 years of experience with
tools based on regular expression notation, we naturally wanted to
include a discussion of regular expressions, and it seemed mandatory to
include an implementation as well. Given our emphasis on tools, it also
seemed best to focus on the class of regular expressions found in
grep—rather than, say, those from shell wildcards—since we could also
then talk about the design of grep itself.

The problem was that any existing regular expression package was far
too big. The local grep was over 500 lines long (about 10 book pages)
and encrusted with barnacles. Open source regular expression packages
tended to be huge—roughly the size of the entire book—because they were
engineered for generality, flexibility, and speed; none were remotely
suitable for pedagogy.

I suggested to Rob that we find the smallest regular expression package
that would illustrate the basic ideas while still recognizing a useful
and nontrivial class of patterns. Ideally, the code would fit on a
single page.

Rob disappeared into his office. As I remember it now, he emerged in no
more than an hour or two with the 30 lines of C code that subsequently
appeared in Chapter 9 of The Practice of Programming. That code
implements a regular expression matcher that handles the following
constructs.

Character   Meaning
c           Matches any literal character c.
. (period)  Matches any single character.
^           Matches the beginning of the input string.
$           Matches the end of the input string.
*           Matches zero or more occurrences of the previous character.
This is quite a useful class; in my own experience of using regular
expressions on a day-to-day basis, it easily accounts for 95 percent of
all instances. In many situations, solving the right problem is a big
step toward creating a beautiful program. Rob deserves great credit for
choosing a very small yet important, well-defined, and extensible set
of features from among a wide set of options.

Rob’s implementation itself is a superb example of beautiful code:
compact, elegant, efficient, and useful. It’s one of the best examples
of recursion that I have ever seen, and it shows the power of C
pointers. Although at the time we were most interested in conveying the
important role of good notation in making a program easier to use (and
perhaps easier to write as well), the regular expression code has also
been an excellent way to illustrate algorithms, data structures,
testing, performance enhancement, and other important topics.

Implementation
In The Practice of Programming, the regular expression matcher is part
of a standalone program that mimics grep, but the regular expression
code is completely separable from its surroundings. The main program is
not interesting here; like many Unix tools, it reads either its
standard input or a sequence of files, and prints those lines that
contain a match of the regular expression.

This is the matching code:
/* match: search for regexp anywhere in text */
int match(char *regexp, char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp+1, text);
    do {    /* must look even if string is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* matchhere: search for regexp at beginning of text */
int matchhere(char *regexp, char *text)
{
    if (regexp[0] == '\0')
        return 1;
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp+2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text!='\0' && (regexp[0]=='.' || regexp[0]==*text))
        return matchhere(regexp+1, text+1);
    return 0;
}

/* matchstar: search for c*regexp at beginning of text */
int matchstar(int c, char *regexp, char *text)
{
    do {    /* a * matches zero or more instances */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}
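
A minimal sketch for anyone who wants to compile and poke at the excerpt. It is the same three functions with two changes that are assumptions of this sketch, not the book's text: prototypes are added (match calls matchhere before it is defined), and the pointers are const so that string literals can be passed without warnings. The run_examples helper is hypothetical and only exercises each construct from the table once.

```c
#include <assert.h>

/* Prototypes (added for this sketch): the three functions call each other. */
int match(const char *regexp, const char *text);
int matchhere(const char *regexp, const char *text);
int matchstar(int c, const char *regexp, const char *text);

/* match: search for regexp anywhere in text */
int match(const char *regexp, const char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp+1, text);
    do {    /* must look even if string is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* matchhere: search for regexp at beginning of text */
int matchhere(const char *regexp, const char *text)
{
    if (regexp[0] == '\0')
        return 1;
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp+2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text != '\0' && (regexp[0] == '.' || regexp[0] == *text))
        return matchhere(regexp+1, text+1);
    return 0;
}

/* matchstar: search for c*regexp at beginning of text */
int matchstar(int c, const char *regexp, const char *text)
{
    do {    /* a * matches zero or more instances */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}

/* run_examples: hypothetical helper, one or two checks per construct */
void run_examples(void)
{
    assert(match("abc", "xxabcxx"));      /* literal, found anywhere */
    assert(match("a.c", "abc"));          /* . matches any single character */
    assert(match("^ab", "abc"));          /* ^ anchors at the start */
    assert(!match("^bc", "abc"));
    assert(match("bc$", "abc"));          /* $ anchors at the end */
    assert(!match("ab$", "abc"));
    assert(match("ab*c", "ac"));          /* * allows zero occurrences */
    assert(match("ab*c", "abbbc"));       /* ... or several */
    assert(match("^a.*c$", "abxyzc"));    /* constructs combined */
    assert(match("", ""));                /* empty pattern, empty text */
}
```

Calling run_examples() from a main of your own should return silently if the matcher behaves as the table describes.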


bolega 08-29-2009 04:50 AM

Re: Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
Let me paste the code separately, since some garbage got inserted in
the middle although it was just one block of text.

This is the matching code:
/* match: search for regexp anywhere in text */
int match(char *regexp, char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp+1, text);
    do {    /* must look even if string is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* matchhere: search for regexp at beginning of text */
int matchhere(char *regexp, char *text)
{
    if (regexp[0] == '\0')
        return 1;
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp+2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text!='\0' && (regexp[0]=='.' || regexp[0]==*text))
        return matchhere(regexp+1, text+1);
    return 0;
}

/* matchstar: search for c*regexp at beginning of text */
int matchstar(int c, char *regexp, char *text)
{
    do {    /* a * matches zero or more instances */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}


By the way, did you note the spicy style of narration by which they
promote each other? And can you abstract the technique of narration
for image building? Another scum in electronics, Bob Pease of National
Semiconductor, writes in the same style. I know all their basic ideas
in that field, which are trivial. The only art is the art of writing
and propaganda.

The one man I have real respect for, and who is humble, is McCarthy. He
really put some original ideas together but still did not give any
clue how he constructed them. However, in lisp recursion is lucidly
clear. You test if a thing is nil. Else you check if it is an atom and
act appropriately by calling a handler, and otherwise you recurse.
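
The nil / atom / recurse shape described above can be written in C as well. This is only a sketch under assumptions: struct node is a made-up stand-in for a Lisp cons cell (nothing like it appears in the book excerpt), and counting atoms stands in for "calling a handler".

```c
#include <assert.h>
#include <stddef.h>

/* node: hypothetical cons-cell-like type; atom != NULL marks an atom */
struct node {
    const char *atom;           /* non-NULL for an atom */
    const struct node *car;     /* first element when this is a pair */
    const struct node *cdr;     /* rest of the list when this is a pair */
};

/* count_atoms: the three cases in the order given in the post */
int count_atoms(const struct node *n)
{
    if (n == NULL)              /* nil: nothing to do */
        return 0;
    if (n->atom != NULL)        /* atom: handle it directly */
        return 1;
    /* otherwise a pair: recurse on both halves */
    return count_atoms(n->car) + count_atoms(n->cdr);
}

/* demo_count_atoms: builds the list (a (b c)) by hand and counts its atoms */
void demo_count_atoms(void)
{
    static const struct node a = {"a", NULL, NULL};
    static const struct node b = {"b", NULL, NULL};
    static const struct node c = {"c", NULL, NULL};
    static const struct node inner2 = {NULL, &c, NULL};        /* (c) */
    static const struct node inner1 = {NULL, &b, &inner2};     /* (b c) */
    static const struct node outer2 = {NULL, &inner1, NULL};   /* ((b c)) */
    static const struct node outer1 = {NULL, &a, &outer2};     /* (a (b c)) */

    assert(count_atoms(&outer1) == 3);
    assert(count_atoms(NULL) == 0);
}
```

The matcher has the same outline: matchhere tests the empty pattern first (its "nil"), then the special cases, and only then recurses on the rest of the pattern.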


spinoza1111 08-29-2009 09:51 AM

Re: Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
On Aug 29, 12:35 pm, bolega <gnuist...@gmail.com> wrote:
> This braggart admits that he had to put this code in TWO books and
> visit it twice to be explained. I am putting the excerpt from pp2-4 of
> this book and the C code.
>
> [...]


Many people seem to have been upset with this article in Beautiful
Code. I emailed Kernighan about it and received a reply (I'd met the
guy), but no real answer to the basic problem, which is that C cannot
express programming ideas clearly.

The first problem is that by being so "simple" it fails to implement a
recognizable "regular expression" processor.

I thought a second problem was that it used a value parameter as a
work area, which isn't something I would do in my languages of choice
(C Sharp and VB in recent years), but I found myself doing this in C
when I returned to the language mostly to kick the **** out of it.
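
To make "value parameter as a work area" concrete: match and matchstar advance their own text parameter with *text++. A minimal sketch of the idiom and of the alternative with a separate local cursor follows. The function names are made up for illustration and are not from the book; the point is only that the caller's pointer is unaffected either way, because C passes the pointer by value.

```c
#include <assert.h>
#include <stddef.h>

/* length_by_param: hypothetical example that advances the parameter itself */
size_t length_by_param(const char *s)
{
    size_t n = 0;
    while (*s++ != '\0')    /* s is the callee's private copy of the pointer */
        n++;
    return n;
}

/* length_by_cursor: same result, parameter untouched, separate local cursor */
size_t length_by_cursor(const char *s)
{
    const char *p;
    for (p = s; *p != '\0'; p++)
        ;                   /* empty body: the loop header does the work */
    return (size_t)(p - s);
}

/* demo_value_parameter: the caller's pointer does not move in either case */
void demo_value_parameter(void)
{
    const char *text = "hello";
    const char *before = text;

    assert(length_by_param(text) == 5);
    assert(text == before);
    assert(length_by_cursor(text) == 5);
    assert(text == before);
}
```

Which of the two reads better is a matter of taste; the second keeps the original argument available inside the function, at the cost of one extra variable.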

A third problem is praising a programmer for fast work. There's enough
slipshod work in our industry, and enough programmer overwork as it
is.

I think Brian Kernighan would be the first to admit that Lisp as
opposed to C made a more far-reaching contribution to computer science.

JustBoo 08-29-2009 02:10 PM

Re: Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
bolega wrote:
> This braggart admits that he had to put this code in TWO books and
> visit it twice to be explained. I am putting the excerpt from pp2-4
> of this book and the C code.

[...]

Well, witness yet another way, boys and girls, to market a book. We'll
be seeing a lot of it from now on I'm sure. Meh. Enjoy.

bolega 08-29-2009 02:38 PM

Re: Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
On Aug 29, 2:51 am, spinoza1111 <spinoza1...@yahoo.com> wrote:

> Many people seem to have been upset with this article in Beautiful
> Code. I emailed Kernighan about it and received a reply (I'd met the
> guy), but no real answer to the basic problem, which is that C cannot
> express programming ideas clearly.


I admire the fact that someone brought all this into public
discussion, because a lot of sleazy people (not referring in particular
to this troika) have been able to propagate crap and keep it immune to
academic critique, just because people are afraid of criticizing it,
which leads to a lot of newbie sheeple.

Do you think C syntax is brain damaged? Can you explain your second
point in more detail and more clearly, as I could not understand any of
it.

> The first problem is that by being so "simple" it fails to implement a
> recognizable "regular expression" processor.
>
> I thought a second problem was that it used a value parameter as a
> work area, which isn't something I would do in my languages of choice
> (C Sharp and VB in recent years), but I found myself doing this in C
> when I returned to the language mostly to kick the **** out of it.
>
> A third problem is praising a programmer for fast work. There's enough
> slipshod work in our industry, and enough programmer overwork as it
> is.
>
> I think Brian Kernighan would be the first to admit that Lisp as
> opposed to C made a more far-reaching contribution to computer science.


Actually, in C all they did was borrow parens and braces from math
to delimit blocks - BUT lisp already had the idea in its prefix
notation. The other contribution is the for loop with all the loop
elements at the top in the interface. The declaration syntax of
pointers is poor and ugly. switch-case has fall-through. break and
continue are new unless PL/I or BCPL had them.

BUT YOU HAVE NOT SHOWN how to structure the recursion. Does it need
an NFA? I need a tutorial because C is here to stay and I can avoid
some of its ugliness by using C++ as pretty C.



I admire the fact that you


Ed Morton 08-29-2009 03:11 PM

Re: Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
On Aug 28, 11:35 pm, bolega <gnuist...@gmail.com> wrote:
> This braggart admits that he had to put this code in TWO books and
> visit it twice to be explained. I am putting the excerpt from pp2-4 of
> this book and the C code.


You should post this to comp.lang.c.

Ed.

bolega 08-29-2009 05:16 PM

Re: Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
On Aug 29, 8:53 am, A.L. <alewa...@aol.com> wrote:
> On Sat, 29 Aug 2009 07:10:59 -0700, JustBoo <B...@boowho.com> wrote:
> >bolega wrote:
> >> This braggart admits that he had to put this code in TWO books and
> >> visit it twice to be explained. I am putting the excerpt from pp2-4
> >> of this book and the C code.

> >[...]

>
> >Well, witness yet another way, boys and girls, to market a book. We'll
> >be seeing a lot of it from now on I'm sure. Meh. Enjoy.

>
> Very good way to market a book. Something criticized on c.l.l. must be
> really good.
>
> Ordered from Amazon 5 minutes ago.
>
> A.L.


Please don't derail my thread, BUT, you ordered the wrong book. The one
you SHOULD have ordered is this one:

Book Review
--------------------------------------------------------------------------------
Advanced C Struct Programming by John W.L. Ogilvie
Highly Recommended
ISBN: 0-471-51943-X Publisher: Wiley Pages: 405pp
Price: £22.95

Categories: advanced c data structures
Reviewed by Francis Glassborow in C Vu 3-2 (Jan 1991)


This is the kind of book that I might easily miss on a casual visit to
my local bookshop. In all honesty my first impression when I opened it
was not that good. It is aimed at people who wish to take programming
seriously yet it first seemed more like the kind of text that I am
used to finding on the hobbyists' shelves. It is much better than that.
When you have mastered the foothills of programming in C, have a good
runtime library reference on your shelf and, perhaps, have invested in
a book on data structures and another on programming algorithms or
techniques, what do you get to help you develop good medium size
programs (all right, large ones if you must)?

If you had asked my advice a month ago, I would have hummed and hawed
and come up with a couple of titles and then suggested that what you
really wanted was a book on program/data design rather than one on C.

Advanced C Struct Programming tackles this need. The author's declared
intent is to present a practical method for designing and implementing
complex (complicated) data structures in C. In doing so he leads you
through experience (sometimes of false trails) in tackling a number of
different programming problems.

The book does not include complete applications or libraries of source
code. It is a book to be worked through. By the time you have finished
it you should be a much better programmer. Let me warn you that it is
not a book to dip into in that odd spare moment. If that is all you
have time for then go and do something else.

On the other hand it is not like some of the books above that will
take you years to fully grasp (if ever). Buy this book, set aside a
regular time to work at it, stick to your routine and find yourself
becoming far more professional in your programming.

Yes, I like it and it is not machine dependent. For once I am glad
that the supporting discs are relatively expensive ($39.95 in IBM and
Apple Mac formats) as I think that you will only get the full benefit
by grafting at the keyboard yourself.


--------------------------------------------------------------------------------
Last Update - 13 May 2001.


To link to this review, please use the URL:
http://www.accu.org/bookreviews/publ.../a/a000142.htm


Copyright © The Association of C & C++ Users 1998-2000. All rights
reserved.
Mirrored from http://www.accu.org/



Chris McDonald 08-29-2009 11:02 PM

Re: Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
Ed Morton <mortonspam@gmail.com> writes:

>On Aug 28, 11:35 pm, bolega <gnuist...@gmail.com> wrote:
>> This braggart admits that he had to put this code in TWO books and
>> visit it twice to be explained. I am putting the excerpt from pp2-4 of
>> this book and the C code.


>You should post this to comp.lang.c.


> Ed.


He did, and I smelt a Bilges sock puppet.

--
Chris.

Francesco 08-30-2009 11:20 AM

Re: Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
On 29 Aug, 19:16, bolega <gnuist...@gmail.com> wrote:
> On Aug 29, 8:53 am, A.L. <alewa...@aol.com> wrote:
>
> > On Sat, 29 Aug 2009 07:10:59 -0700, JustBoo <B...@boowho.com> wrote:
> > >bolega wrote:
> > >> This braggart admits that he had to put this code in TWO books and
> > >> visit it twice to be explained. I am putting the excerpt from pp2-4
> > >> of this book and the C code.
> > >[...]

>
> > >Well, witness yet another way, boys and girls, to market a book. We'll
> > >be seeing a lot of it from now on I'm sure. Meh. Enjoy.

>
> > Very good way to market a book. Something criticized on c.l.l. must be
> > really good.

>
> > Ordered from Amazon 5 minutes ago.

>
> > A.L.

>
> please dont derail my thread, BUT, you ordered the wrong book. The one
> you SHOULD have ordered is this one:


Just out of curiosity bolega, how could you know which book A.L.
ordered?

Regards,
Francesco

Chris Dollin 09-01-2009 07:17 AM

Re: Can anyone write this recursion for simple regexp more beautifully and clearly than the braggarts
 
spinoza1111 wrote:

(re: the Beautiful Code RE matcher)

> The first problem is that by being so "simple" it fails to implement a
> recognizable "regular expression" processor.


False. It implements a /useful subset/ of REs and provides a framework
for adding more features -- it's an educational tool, not a library
component.
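
As an illustration of "a framework for adding more features", here is one possible sketch of a + operator (one or more occurrences) bolted onto the posted matcher. The added branch in matchhere is an assumption of this sketch, not code from the book; the rest is the posted code with prototypes and const added so it compiles cleanly. The demo_plus helper is hypothetical.

```c
#include <assert.h>

int match(const char *regexp, const char *text);
int matchhere(const char *regexp, const char *text);
int matchstar(int c, const char *regexp, const char *text);

/* match: search for regexp anywhere in text */
int match(const char *regexp, const char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp+1, text);
    do {    /* must look even if string is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* matchhere: search for regexp at beginning of text */
int matchhere(const char *regexp, const char *text)
{
    if (regexp[0] == '\0')
        return 1;
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp+2, text);
    /* Added branch (not in the book's code): c+ is one c followed by c* */
    if (regexp[1] == '+')
        return *text != '\0' && (regexp[0] == '.' || regexp[0] == *text)
            && matchstar(regexp[0], regexp+2, text+1);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text != '\0' && (regexp[0] == '.' || regexp[0] == *text))
        return matchhere(regexp+1, text+1);
    return 0;
}

/* matchstar: search for c*regexp at beginning of text */
int matchstar(int c, const char *regexp, const char *text)
{
    do {    /* a * matches zero or more instances */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}

/* demo_plus: hypothetical checks for the added operator */
void demo_plus(void)
{
    assert(match("ab+c", "abbc"));    /* one or more b */
    assert(!match("ab+c", "ac"));     /* zero b is not enough for + */
    assert(match("ab*c", "ac"));      /* ... but is still fine for * */
    assert(match("a.+c", "abxc"));    /* . combines with + */
    assert(!match("a.+c", "ac"));
}
```

One cost of this choice: a + that follows another pattern character is no longer a literal, so a pattern such as a+b stops matching the text "a+b" unless an escape mechanism is added as well.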

--
"I have travelled far and wide upon this journey." - The Reasoning,
/A Musing Dream/

Hewlett-Packard Limited registered office: Cain Road, Bracknell, Berks RG12 1HN
registered no: 690597 England


