the de-facto way to "parse" input

 
 
Krumble Bunk
 
      06-11-2008
Hi all,

I am trying my hand at writing a shell for unix. A very rubbish
shell, but nonetheless, I have come to a point where I am confused.

I would like to have something like

shell> stop xyz

whereupon the command "stop" will take the argument "xyz" and perform
foo action on it. What in your opinion is the best (easiest?) way to
validate the input (perhaps iterate over a "valid commands" table),
and what calls would you use? (getc(), scanf(), a big while loop and
pointer arithmetic, ...)

I hope this question is not too ambiguous

many thanks
kb
 
 
 
 
 
Chris Dollin
 
      06-11-2008
Krumble Bunk wrote:

> I am trying my hands at writing a shell for unix. A very rubbish
> shell, but nonetheless, I come to a point where I am confused.
>
> I would like to have something like
>
> shell> stop xyz
>
> whereupon the command "stop" will take the argument "xyz" and perform
> foo action on it. What in your opinion is the best (easiest?) way to
> validate the input (perhaps iterate over a "valid commands" table),
> and what calls would you use? (getc(), scanf(), a big while loop and
> pointer arithmetic, ...)


I'd use `fgets` to read the line, expanding the buffer as necessary,
carve the line up into space-separated chunks (if we're doing a rubbish
shell, we won't worry about quoting ...) and then I can look the first
chunk up in a table.
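
A minimal sketch of that approach, not a definitive implementation. The names `read_line`, `split_words`, `find_command`, `do_stop` and the one-entry `commands` table are hypothetical, chosen only for illustration. It assumes words are separated by spaces or tabs and that there is no quoting.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of any length with fgets, doubling the buffer as needed.
   Returns a malloc'd string without the trailing newline, or NULL at end
   of input or on allocation failure. The caller frees the result. */
char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';
            return buf;
        }
        if (len + 1 == cap) {            /* buffer full, line not finished */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
    }
    if (len > 0) return buf;             /* last line had no newline */
    free(buf);
    return NULL;
}

/* Split line in place on spaces and tabs. Stores at most max-1 word
   pointers in argv, followed by NULL. Returns the number of words. */
int split_words(char *line, char **argv, int max)
{
    int argc = 0;
    char *tok = strtok(line, " \t");
    while (tok != NULL && argc < max - 1) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t");
    }
    argv[argc] = NULL;
    return argc;
}

typedef int (*handler_fn)(int argc, char **argv);

struct command { const char *name; handler_fn run; };

/* Hypothetical handler for the "stop" command from the question. */
static int do_stop(int argc, char **argv)
{
    printf("stopping %s\n", argc > 1 ? argv[1] : "(nothing)");
    return 0;
}

static const struct command commands[] = {
    { "stop", do_stop },
};

/* Linear search of the table; returns NULL for an unknown command. */
const struct command *find_command(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof commands / sizeof commands[0]; i++)
        if (strcmp(commands[i].name, name) == 0)
            return &commands[i];
    return NULL;
}
```

A main loop would call `read_line(stdin)`, pass the result to `split_words`, look up `argv[0]` with `find_command`, and call `run(argc, argv)` if the lookup succeeds. A linear search is enough for a handful of commands.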

If we want something less rubbish, I'd write a recursive-descent
parser for commands. That would force me to be explicit about the
grammar I'm using and what my tokens are supposed to be. I'd build
an abstract-syntax tree for the commands; on no account would I
try and execute them while parsing them.
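
As a rough illustration of what that could look like, here is a sketch of a recursive-descent parser that only builds a tree and executes nothing. The grammar is an assumption made up for this example, not something specified in the thread: `sequence := pipeline (';' pipeline)*`, `pipeline := command ('|' command)*`, `command := WORD+`. All names (`struct node`, `parse_line`, `free_tree`, `MAX_WORDS`) are hypothetical, and there is no quoting.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 16

enum node_kind { NODE_CMD, NODE_PIPE, NODE_SEQ };

struct node {
    enum node_kind kind;
    char *words[MAX_WORDS];     /* NODE_CMD only: NULL-terminated words */
    struct node *left, *right;  /* NODE_PIPE and NODE_SEQ only */
};

static const char *p;           /* cursor into the line being parsed */

static int is_word_char(int c)
{
    return c != '\0' && c != ';' && c != '|' && !isspace(c);
}

static void skip_space(void)
{
    while (isspace((unsigned char)*p)) p++;
}

static struct node *new_node(enum node_kind kind, struct node *l, struct node *r)
{
    int i;
    struct node *n = malloc(sizeof *n);
    if (n == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
    n->kind = kind;
    n->left = l;
    n->right = r;
    for (i = 0; i < MAX_WORDS; i++) n->words[i] = NULL;
    return n;
}

void free_tree(struct node *n)
{
    int i;
    if (n == NULL) return;
    for (i = 0; i < MAX_WORDS; i++) free(n->words[i]);
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}

/* command := WORD+ ; returns NULL if no word is present */
static struct node *parse_command(void)
{
    struct node *n;
    int argc = 0;

    skip_space();
    if (!is_word_char((unsigned char)*p)) return NULL;
    n = new_node(NODE_CMD, NULL, NULL);
    while (is_word_char((unsigned char)*p) && argc < MAX_WORDS - 1) {
        const char *start = p;
        size_t len;
        while (is_word_char((unsigned char)*p)) p++;
        len = (size_t)(p - start);
        n->words[argc] = malloc(len + 1);
        if (n->words[argc] == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
        memcpy(n->words[argc], start, len);
        n->words[argc][len] = '\0';
        argc++;
        skip_space();
    }
    return n;
}

/* pipeline := command ('|' command)* */
static struct node *parse_pipeline(void)
{
    struct node *left = parse_command();
    while (left != NULL && *p == '|') {
        struct node *right;
        p++;
        right = parse_command();
        if (right == NULL) { free_tree(left); return NULL; }
        left = new_node(NODE_PIPE, left, right);
    }
    return left;
}

/* sequence := pipeline (';' pipeline)* */
static struct node *parse_sequence(void)
{
    struct node *left = parse_pipeline();
    while (left != NULL && *p == ';') {
        struct node *right;
        p++;
        right = parse_pipeline();
        if (right == NULL) { free_tree(left); return NULL; }
        left = new_node(NODE_SEQ, left, right);
    }
    return left;
}

/* Parse a whole line. Returns the tree, or NULL on a syntax error
   (including an empty line or leftover input). */
struct node *parse_line(const char *line)
{
    struct node *tree;
    p = line;
    tree = parse_sequence();
    if (tree != NULL && *p != '\0') { free_tree(tree); tree = NULL; }
    return tree;
}
```

With this grammar, `parse_line("stop xyz | grep y ; ls")` yields a `NODE_SEQ` whose left child is a `NODE_PIPE` and whose right child is the `ls` command. A separate function can then walk the tree to execute it, which keeps parsing and execution apart.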

And I'd write unit tests. Lots of unit tests. And get something
working end-to-end as soon as possible. (Because it's very
disheartening spending a day or more writing a Super Duper
Program That Does It All, and then spending a week or more
debugging it until it does /something/, as opposed to writing
the smallest program one can manage that recognisably does
something right. Like, read a command line in, and print out
the tokens, /and do nothing else/.)

And likely throw away the first attempt, as a learning exercise.

--
"I don't make decisions. I'm a bird." /A Fine and Private Place/

Hewlett-Packard Limited registered office: Cain Road, Bracknell, Berks RG12 1HN
registered no: 690597 England

 
 
 
 
 
Krumble Bunk
 
      06-11-2008
On Jun 11, 2:46 pm, Chris Dollin <(E-Mail Removed)> wrote:
> Krumble Bunk wrote:



[.....]


> the tokens, /and do nothing else/.)
>
> And likely throw away the first attempt, as a learning exercise.
>
> --
> "I don't make decisions. I'm a bird." /A Fine and Private Place/
>
> Hewlett-Packard Limited registered office: Cain Road, Bracknell, Berks RG12 1HN
> registered no: 690597 England



Very good advice - I will investigate using lex/yacc.

thanks

kb
 
 
rahul
 
      06-12-2008
On Jun 11, 7:12 pm, Krumble Bunk <(E-Mail Removed)> wrote:
> On Jun 11, 2:46 pm, Chris Dollin <(E-Mail Removed)> wrote:
>
> > Krumble Bunk wrote:

>
> [.....]
>
> > the tokens, /and do nothing else/.)

>
> > And likely throw away the first attempt, as a learning exercise.

>
> > --
> > "I don't make decisions. I'm a bird." /A Fine and Private Place/

>
> > Hewlett-Packard Limited registered office: Cain Road, Bracknell, Berks RG12 1HN
> > registered no: 690597 England

>
> Very good advice - I will investigate using lex/yacc.
>
> thanks
>
> kb


lex is by and large the de facto tool for tokenizing. I believe gcc makes
extensive use of lex/yacc (or maybe flex/bison, but that does not
make a hell of a difference).
 
 
Bartc
 
      06-13-2008

"Chris Dollin" <(E-Mail Removed)> wrote in message
news:g2ol0g$7kd$(E-Mail Removed)...
> Krumble Bunk wrote:
>
>> I am trying my hands at writing a shell for unix. A very rubbish
>> shell, but nonetheless, I come to a point where I am confused.
>>
>> I would like to have something like
>>
>> shell> stop xyz


> If we want something less rubbish, I'd write a recursive-descent
> parser for commands. That would force me to be explicit about the
> grammar I'm using and what my tokens are supposed to be. I'd build
> an abstract-syntax tree for the commands; on no account would I
> try and execute them while parsing them.


An AST for a simple command-line interpreter?

How complex would this shell have to be to make this worthwhile?

>(Because it's very disheartening spending a day or more writing


Might be more disheartening to start a huge project that doesn't finish
because it's overspecified.

I would have suggested starting with something like the following,
replacing the system() call (and perhaps adjusting the parameter) with the
local equivalent. This would require the handler for each 'command' to be a
separate C program, but has the advantage of the parameters being already
separated.

A bit more work and the commands and parameters can be identified and
executed in the same program.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void main()
{
#define llength 1000
    char line[llength];
    size_t n;

    puts("Type exit to exit.");
    puts("");

    while (1) {

        printf("Prompt>");
        fflush(stdout);

        if (fgets(line,llength,stdin)==NULL) break;

        n=strlen(line); /* get rid of troublesome trailing \n */
        if (n>0 && line[n-1]=='\n') line[n-1]=0;

        if (strcmp(line,"exit")==0) break;

        if (line[0])
            system(line);
    }

}
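
On a Unix system, one possible "local equivalent" of the system() call is fork() followed by execvp() and waitpid(). The sketch below assumes POSIX, which is outside standard C, and assumes the line has already been split into a NULL-terminated argument array. The name `run_command` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the arguments in argv, which must end with NULL.
   Returns the child's exit status, or -1 if the child could not be
   created, could not be waited for, or did not exit normally. */
int run_command(char **argv)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process with the requested program. */
        execvp(argv[0], argv);
        perror(argv[0]);          /* only reached if execvp failed */
        _exit(127);
    }
    /* Parent: wait for the child to finish. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike system(), this does not pass the line through another shell, so the calling program decides how the line is split into arguments. The exit code 127 for a failed exec follows the convention used by common shells.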

--
Bartc


 
 
vippstar@gmail.com
 
      06-14-2008
On Jun 13, 7:07 pm, "Bartc" <(E-Mail Removed)> wrote:
> "Chris Dollin" <(E-Mail Removed)> wrote in message
>
> news:g2ol0g$7kd$(E-Mail Removed)...
>
> > Krumble Bunk wrote:

>
> >> I am trying my hands at writing a shell for unix. A very rubbish
> >> shell, but nonetheless, I come to a point where I am confused.

>
> >> I would like to have something like

>
> >> shell> stop xyz

> > If we want something less rubbish, I'd write a recursive-descent
> > parser for commands. That would force me to be explicit about the
> > grammar I'm using and what my tokens are supposed to be. I'd build
> > an abstract-syntax tree for the commands; on no account would I
> > try and execute them while parsing them.

>
> An AST for a simple command-line interpreter?
>
> How complex would this shell have to be to make this worthwhile?
>
> >(Because it's very disheartening spending a day or more writing

>
> Might be more disheartening to start a huge project that doesn't finish
> because it's overspecified.
>
> I would have suggesting starting with something like the following,
> replacing the system() call (and perhaps adjusting the parameter) with the
> local equivalent. This would require the handler for each 'command' to be a
> separate C program, but has the advantage of the parameters being already
> separated.
>
> A bit more work and the commands and parameters can be identified and
> executed in the same program.
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> void main()

<snip>
Bartc, haven't you been here long enough to remember main returns int?
 
 
Bart
 
      06-14-2008
On Jun 14, 3:41 pm, (E-Mail Removed) wrote:
> On Jun 13, 7:07 pm, "Bartc" <(E-Mail Removed)> wrote:
>
>
>
> > "Chris Dollin" <(E-Mail Removed)> wrote in message

>
> >news:g2ol0g$7kd$(E-Mail Removed)...

>
> > > Krumble Bunk wrote:

>
> > >> I am trying my hands at writing a shell for unix. A very rubbish
> > >> shell, but nonetheless, I come to a point where I am confused.

>
> > >> I would like to have something like

>
> > >> shell> stop xyz
> > > If we want something less rubbish, I'd write a recursive-descent
> > > parser for commands. That would force me to be explicit about the
> > > grammar I'm using and what my tokens are supposed to be. I'd build
> > > an abstract-syntax tree for the commands; on no account would I
> > > try and execute them while parsing them.

>
> > An AST for a simple command-line interpreter?

>
> > How complex would this shell have to be to make this worthwhile?

>
> > >(Because it's very disheartening spending a day or more writing

>
> > Might be more disheartening to start a huge project that doesn't finish
> > because it's overspecified.

>
> > I would have suggesting starting with something like the following,
> > replacing the system() call (and perhaps adjusting the parameter) with the
> > local equivalent. This would require the handler for each 'command' to be a
> > separate C program, but has the advantage of the parameters being already
> > separated.

>
> > A bit more work and the commands and parameters can be identified and
> > executed in the same program.

>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <string.h>

>
> > void main()

>
> <snip>
> Bartc, haven't you been here long enough to remember main returns int?


Yes, but I suspect I didn't write that bit. Probably the remnants of a
copy&paste of someone else's code. Not my fault at all.

--
Bartc
 
 
Chris Dollin
 
      06-16-2008
Bartc wrote:

>
> "Chris Dollin" <(E-Mail Removed)> wrote in message
> news:g2ol0g$7kd$(E-Mail Removed)...
>> Krumble Bunk wrote:
>>
>>> I am trying my hands at writing a shell for unix. A very rubbish
>>> shell, but nonetheless, I come to a point where I am confused.
>>>
>>> I would like to have something like
>>>
>>> shell> stop xyz

>
>> If we want something less rubbish, I'd write a recursive-descent
>> parser for commands. That would force me to be explicit about the
>> grammar I'm using and what my tokens are supposed to be. I'd build
>> an abstract-syntax tree for the commands; on no account would I
>> try and execute them while parsing them.

>
> An AST for a simple command-line interpreter?


"something less rubbish" allows for something that isn't simple.

ASTs aren't complicated, even in C.

> How complex would this shell have to be to make this worthwhile?


Pipes, sequencing, commands. Brackets and built-in commands,
definitely.

>>(Because it's very disheartening spending a day or more writing

>
> Might be more disheartening to start a huge project that doesn't finish
> because it's overspecified.


Where did "huge" come from? And "overspecified"?

--
"Tells of trouble and warns of change to come." /Lothlorien/

Hewlett-Packard Limited registered office: Cain Road, Bracknell, Berks RG12 1HN
registered no: 690597 England

 
 
Bart
 
      06-16-2008
On Jun 16, 8:04 am, Chris Dollin <(E-Mail Removed)> wrote:
> Bartc wrote:
>
> > "Chris Dollin" <(E-Mail Removed)> wrote in message
> >news:g2ol0g$7kd$(E-Mail Removed)...
> >> Krumble Bunk wrote:

>
> >>> I am trying my hands at writing a shell for unix. A very rubbish
> >>> shell, but nonetheless, I come to a point where I am confused.

>
> >>> I would like to have something like

>
> >>> shell> stop xyz

>
> >> If we want something less rubbish, I'd write a recursive-descent
> >> parser for commands.

> > An AST for a simple command-line interpreter?

>
> "something less rubbish" allows for something that isn't simple.
>
> ASTs aren't complicated, even in C.


> > How complex would this shell have to be to make this worthwhile?

>
> Pipes, sequencing, commands. Brackets and built-in commands,
> definitely.


I'm not familiar with unix shells. But I don't remember seeing
anything more complicated than a linear series of commands, filenames,
numbers and switches in Windows' shell. But then, maybe Windows' shell
is rubbish.

> >>(Because it's very disheartening spending a day or more writing

>
> > Might be more disheartening to start a huge project that doesn't finish
> > because it's overspecified.

>
> Where did "huge" come from? And "overspecified"?


OK, not huge. But I associate ASTs with compilers, and that would seem
overkill for this task.

Perhaps the OP should start by writing the specifications of his/her
syntax, then it might become clearer which approach is best.

--
Bart
 
 
 
 