LaTeX-Like Parsing in C

 
 
nedelm@po-box.mcgill.ca
      07-26-2007
My problem's with parsing. I have this (arbitrary, from a file) string,
let's say:

"Directory: /file{File:/filename(/size) }"

I would like it to behave similarly to LaTeX. I parse it, and then I
write it out for different variables, like:

"Directory: File:.(0) File:..(0) File:a.out(12) File:foo(1) "

But I keep getting into a mess of complication. I'm using C (of
course). How do I parse it? strpbrk(,"/{}") (what then?) How can I get
the string into a data structure that I could write out? Algorithms?

-Neil

 
Richard Heathfield
      07-26-2007
nedelm@po-box.mcgill.ca said:

> My problem's with parsing. I have this (arbitrary, from a file) string,
> let's say:
>
> "Directory: /file{File:/filename(/size) }"
>
> I would like it to behave similarly to LaTeX. I parse it, and then I
> write it out for different variables, like:
>
> "Directory: File:.(0) File:..(0) File:a.out(12) File:foo(1) "
>
> But I keep getting into a mess of complication. I'm using C (of
> course). How do I parse it? strpbrk(,"/{}") (what then?) How can I get
> the string into a data structure that I could write out? Algorithms?


Start with a lexing stage, where you simply break the input into lexical
tokens, doing your best to identify them as you go but not worrying too
much about odd cases. Store your lexical tokens in some kind of dynamic
data structure such as a linked list. Yes, strpbrk will work for this,
or even strtok if your input is writeable.
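A minimal sketch of such a lexing stage, under assumptions the thread does not confirm: '/' introduces a variable whose name is a run of alphanumeric characters, '{' and '}' are tokens of their own, and everything else is literal text. The names tok_kind, token, lex and free_tokens are invented for illustration. It uses strcspn, a close relative of strpbrk that returns a length instead of a pointer, so running off the end of the string needs no special case.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token kinds; the real grammar may need more. */
enum tok_kind { TOK_TEXT, TOK_VAR, TOK_OPEN, TOK_CLOSE };

struct token {
    enum tok_kind kind;
    char *text;             /* literal text or variable name; NULL for braces */
    struct token *next;
};

/* Copy n bytes of s into a fresh NUL-terminated string. */
static char *copy_span(const char *s, size_t n)
{
    char *p = malloc(n + 1);

    if (p != NULL) {
        memcpy(p, s, n);
        p[n] = '\0';
    }
    return p;
}

void free_tokens(struct token *t)
{
    while (t != NULL) {
        struct token *next = t->next;

        free(t->text);
        free(t);
        t = next;
    }
}

/* Split src into a linked list of tokens. Returns NULL for empty input
   or if an allocation fails. */
struct token *lex(const char *src)
{
    struct token *head = NULL, **tail = &head;

    assert(src != NULL);
    while (*src != '\0') {
        struct token *t = malloc(sizeof *t);
        size_t n;

        if (t == NULL) { free_tokens(head); return NULL; }
        t->text = NULL;
        t->next = NULL;
        if (*src == '{' || *src == '}') {
            t->kind = (*src == '{') ? TOK_OPEN : TOK_CLOSE;
            src++;
        } else {
            if (*src == '/') {          /* assumed: name is alphanumeric */
                t->kind = TOK_VAR;
                src++;
                for (n = 0; isalnum((unsigned char)src[n]); n++)
                    ;
            } else {                    /* literal text up to next special */
                t->kind = TOK_TEXT;
                n = strcspn(src, "/{}");
            }
            t->text = copy_span(src, n);
            if (t->text == NULL) { free(t); free_tokens(head); return NULL; }
            src += n;
        }
        *tail = t;                      /* append, keeping input order */
        tail = &t->next;
    }
    return head;
}
```

Under these assumptions the example string from the question lexes to nine tokens: text "Directory: ", variable "file", an opening brace, text "File:", variable "filename", text "(", variable "size", text ") ", and a closing brace. The input is copied rather than modified, so it need not be writeable.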

That will massively reduce the complexity of the parsing stage, since
you won't have to worry about tokenisation (because each token is
simply the next node on the linked list), and so you can focus purely
on the grammar that you are trying to implement.
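As a sketch of what the parsing stage can then look like, here is a small recursive-descent parser that walks a token list and builds a tree. The grammar is a guess at what the original poster wants, not something the thread specifies: a template is a sequence of items, and an item is literal text, a variable, or a variable followed by a brace-enclosed template that is repeated for each element. The token type is a hypothetical stand-in for whatever your lexer produces, and node, parse and free_nodes are likewise invented names.

```c
#include <assert.h>
#include <stdlib.h>

enum tok_kind { TOK_TEXT, TOK_VAR, TOK_OPEN, TOK_CLOSE };

struct token {                  /* stand-in for the lexer's output */
    enum tok_kind kind;
    const char *text;
    struct token *next;
};

enum node_kind { NODE_TEXT, NODE_VAR, NODE_LOOP };

struct node {
    enum node_kind kind;
    const char *text;           /* borrowed from the token, not copied */
    struct node *body;          /* NODE_LOOP only: the repeated items */
    struct node *next;          /* next sibling */
};

void free_nodes(struct node *n)
{
    while (n != NULL) {
        struct node *next = n->next;

        free_nodes(n->body);
        free(n);
        n = next;
    }
}

/* Parse items until the list ends or a '}' is reached. *tp is advanced
   past everything consumed; *ok is cleared on any error. */
static struct node *parse_items(const struct token **tp, int *ok)
{
    struct node *head = NULL, **tail = &head;

    while (*ok && *tp != NULL && (*tp)->kind != TOK_CLOSE) {
        const struct token *t = *tp;
        struct node *n = malloc(sizeof *n);

        if (n == NULL || t->kind == TOK_OPEN) {
            free(n);            /* out of memory, or '{' with no variable */
            *ok = 0;
            break;
        }
        n->kind = (t->kind == TOK_VAR) ? NODE_VAR : NODE_TEXT;
        n->text = t->text;
        n->body = NULL;
        n->next = NULL;
        *tail = n;
        tail = &n->next;
        *tp = t->next;

        if (t->kind == TOK_VAR && *tp != NULL && (*tp)->kind == TOK_OPEN) {
            n->kind = NODE_LOOP;
            *tp = (*tp)->next;                  /* skip '{' */
            n->body = parse_items(tp, ok);
            if (*ok && *tp != NULL && (*tp)->kind == TOK_CLOSE)
                *tp = (*tp)->next;              /* skip '}' */
            else
                *ok = 0;                        /* missing '}' */
        }
    }
    return head;
}

/* Returns 0 and stores the tree in *out on success; returns -1 on a
   syntax error or allocation failure. */
int parse(const struct token *toks, struct node **out)
{
    int ok = 1;
    struct node *tree = parse_items(&toks, &ok);

    if (!ok || toks != NULL) {  /* leftover token means a stray '}' */
        free_nodes(tree);
        *out = NULL;
        return -1;
    }
    *out = tree;
    return 0;
}
```

Writing the output is then a walk over the tree: emit text nodes as they are, look up variable nodes in the current record, and for a loop node walk its body once per element. Because the parser only ever looks at the current token and the one after it, none of the string-scanning details leak into the grammar code.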

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
 
Chris Dollin
      07-27-2007
Richard Heathfield wrote:

> Start with a lexing stage, where you simply break the input into lexical
> tokens, doing your best to identify them as you go but not worrying too
> much about odd cases. Store your lexical tokens in some kind of dynamic
> data structure such as a linked list. Yes, strpbrk will work for this,
> or even strtok if your input is writeable.


And if your tokenisation rules are sufficiently bizarre [1], you can
resort to tools such as [f]lex, which [typically|can] generate C
code/tables for you.

> That will massively reduce the complexity of the parsing stage, since
> you won't have to worry about tokenisation (because each token is
> simply the next node on the linked list), and so you can focus purely
> on the grammar that you are trying to implement.


And again, if you end up with a sufficiently complex grammar [1again],
there are tools that will help. But if you're in control of the grammar,
such complexity may be a grammar smell ...

(Also helpful: existing books. And writing unit tests.)

[1] What counts as "sufficiently" is variable.

--
Far-Fetched Hedgehog
"It took a very long time, much longer than the most generous estimates."
- James White, /Sector General/

 