Choosing the right parser for parsing C headers

 
 
Jean de Largentaye
 
      02-08-2005
Hi,

I need to parse a subset of C (a header file), and generate some unit
tests for the functions listed in it. I thus need to parse the code,
then rewrite function calls with wrong parameters. What I call "shaking
the broken tree"
I chose to make my UT-generator in Python 2.4. However, I am now
encountering problems in choosing the right parser for the job. I
struggle in choosing between the inappropriate, the out-of-date, the
alpha, or the too-big-for-the-task...
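
A rough sketch of the "shaking the broken tree" idea described above: pull simple prototypes out of header text, then emit one call per suspicious argument value. This is only an illustration and not a real C parser. The regular expression handles a narrow subset (single-line prototypes with named parameters, no macros, no function pointers), and the names parse_prototypes, bad_values, bad_calls, the sample header, and the valid_<name> placeholders are all made up for this example.

```python
import re

# Matches simple one-line prototypes such as "int add(int a, int b);".
# Deliberately limited: no macros, function pointers, or varargs.
PROTOTYPE_RE = re.compile(
    r"^[ \t]*(?P<ret>[A-Za-z_][\w \t\*]*?)[ \t]*\b(?P<name>[A-Za-z_]\w*)[ \t]*"
    r"\((?P<params>[^()]*)\)[ \t]*;",
    re.MULTILINE,
)

# Hypothetical sample input, standing in for a real header file.
HEADER = """
int add(int a, int b);
char *dup(const char *s);
void reset(void);
"""


def parse_prototypes(header_text):
    """Return a list of (return_type, name, [(param_type, param_name), ...])."""
    prototypes = []
    for m in PROTOTYPE_RE.finditer(header_text):
        params = []
        raw = m.group("params").strip()
        if raw and raw != "void":
            for p in raw.split(","):
                # Assumes every parameter is named: the last word is the name.
                pm = re.match(r"(.*?)(\w+)$", p.strip())
                if pm is None:
                    continue  # e.g. "..." is skipped in this sketch
                params.append((pm.group(1).strip(), pm.group(2)))
        prototypes.append((m.group("ret").strip(), m.group("name"), params))
    return prototypes


def bad_values(c_type):
    """Pick illustrative 'suspicious' values for a C type (arbitrary choices)."""
    if "*" in c_type:
        return ["NULL"]
    if "int" in c_type.split():
        return ["-1", "0", "INT_MAX"]
    return ["0"]


def bad_calls(name, params):
    """Generate calls where exactly one argument is replaced by a bad value."""
    calls = []
    for i, (c_type, _) in enumerate(params):
        for bad in bad_values(c_type):
            args = [
                bad if j == i else "valid_" + pname
                for j, (_, pname) in enumerate(params)
            ]
            calls.append("%s(%s);" % (name, ", ".join(args)))
    return calls


if __name__ == "__main__":
    for ret, name, params in parse_prototypes(HEADER):
        for call in bad_calls(name, params):
            print(call)
```

Anything beyond this subset (typedefs, preprocessor conditionals, nested parentheses) is where a real parser or compiler front end becomes necessary.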

So far I've identified 9(!) potential candidates (mostly taken from
the http://www.python.org/moin/LanguageParsing page):

- Plex:
Only a lexical analyser as far as I understand. Kinda RE++, no syntax
processing
- ply:
Lex / Yacc for python! Tackle the Beast! Syntax processing looks
complex..
- Pyggy:
Lex / Yacc -styled too. More recent, but will a 0.4 version be good
enough?
- PyLR:
fast parser with core functions in C... hasn't moved since '97
- Pyparsing:
quick and easy parser... but I don't think it does more than lexical
analysis
- spark:
Here's some wood. Now build your house.
- yapps2 :
yapps2+ (I hesitate to call it yapps3):
chosen by http://www.python.org/sigs/parser-si...-standard.html.
Is the choice up-to-date?
But will it do for parsing C?
- TPG (Toy Parser Generator):
looks cool
- ANTLR (latest version from Jan 28 produces Python code) :
Seems powerful and has a lot of support, but I don't want to have to
use an exterior Java tool. Furthermore, does it let me control what
happens at each stage easily, or does it just make me a compiler?

I've omitted these: shlex, kwparsing (webpage?), PyBison, Trap
(webpage?), DParser, and SimpleParse (I don't want the extra
dependency).

I was hoping for a quick and easy choice, but got caught in the tar pit
of Too Much Information. Parsing is a large and complex field. As an
added handicap, I'm new to the dark minefield of parsers... I've had
some experience with Lex/Yacc, and have some knowledge of parser
theory, through a course on compilers. I am thus used to EBNF-style
grammar.
I was disappointed to see that Parser-SIG has died out.
Would you have any ideas on which parser is best suited for the task?

John

 
Fredrik Lundh
 
      02-08-2005
Jean de Largentaye wrote:

> I need to parse a subset of C (a header file), and generate some unit
> tests for the functions listed in it. I thus need to parse the code,
> then rewrite function calls with wrong parameters. What I call "shaking
> the broken tree"
>
> I chose to make my UT-generator in Python 2.4. However, I am now
> encountering problems in choosing the right parser for the job. I
> struggle in choosing between the inappropriate, the out-of-date, the
> alpha, or the too-big-for-the-task...


why not use a real compiler?

http://www.boost.org/libs/python/pyste/
http://www.gccxml.org/HTML/Index.html

</F>



 
Thomas Heller
 
      02-08-2005
Jean de Largentaye writes:

> Hi,
>
> I need to parse a subset of C (a header file), and generate some unit
> tests for the functions listed in it. I thus need to parse the code,
> then rewrite function calls with wrong parameters. What I call "shaking
> the broken tree"


IMO, for parsing 'real-world' C header files, nothing can beat gccxml.

Thomas
 
Miki Tebeka
 
      02-08-2005
Hello Jean,

> - ply:
> Lex / Yacc for python! Tackle the Beast! Syntax processing looks

mini_c is a C compiler written using ply. You can just use it as is.
http://people.cs.uchicago.edu/~varmaa/mini_c/

HTH.
--
------------------------------------------------------------------------
Miki Tebeka
http://tebeka.bizhat.com
The only difference between children and adults is the price of the toys
 
Fredrik Lundh
 
      02-08-2005
Thomas Heller wrote:

> IMO, for parsing 'real-world' C header files, nothing can beat gccxml.


no free tool, at least. if a budget is involved, I'd recommend checking
out the Edison Design Group stuff.

</F>



 
Jean de Largentaye
 
      02-08-2005
GCC-XML looks like a very interesting alternative, as Python includes
tools to parse XML.
The mini-C compiler looks like a step in the right direction for me.
I'm going to look into that.
I'm not comfortable with C++ yet, and am not sure how I'd use Pyste.

Thanks for the information guys, you've been quite helpful!

John
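
As a sketch of what "parsing the XML from Python" might look like, the snippet below lists function names, return types, and argument types from a GCC-XML-style document using the standard library's xml.etree.ElementTree. The sample XML is hand-written for illustration and only modelled on the general shape of GCC-XML output (Function elements with Argument children, and types referenced by id); real output carries many more attributes and type kinds and should be checked before relying on this. The helper names type_name and list_functions are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real tool output; real files are much larger.
SAMPLE_XML = """<GCC_XML>
  <Function id="_3" name="add" returns="_5">
    <Argument name="a" type="_5"/>
    <Argument name="b" type="_5"/>
  </Function>
  <Function id="_4" name="length" returns="_5">
    <Argument name="s" type="_7"/>
  </Function>
  <FundamentalType id="_5" name="int"/>
  <FundamentalType id="_6" name="char"/>
  <PointerType id="_7" type="_6"/>
</GCC_XML>"""


def type_name(type_id, by_id):
    """Resolve a type id to a readable C type string (two type kinds only)."""
    node = by_id[type_id]
    if node.tag == "PointerType":
        # A pointer refers to its pointee through another id.
        return type_name(node.get("type"), by_id) + " *"
    return node.get("name")


def list_functions(xml_text):
    """Return [(return_type, name, [(arg_type, arg_name), ...]), ...]."""
    root = ET.fromstring(xml_text)
    # Index every top-level element by its id so type references resolve.
    by_id = {node.get("id"): node for node in root}
    functions = []
    for fn in root.findall("Function"):
        args = [
            (type_name(arg.get("type"), by_id), arg.get("name"))
            for arg in fn.findall("Argument")
        ]
        functions.append((type_name(fn.get("returns"), by_id), fn.get("name"), args))
    return functions


if __name__ == "__main__":
    for ret, name, args in list_functions(SAMPLE_XML):
        print(ret, name, args)
```

The appeal of this route is that the compiler front end has already dealt with macros, typedefs, and includes, so the Python side only walks a flat list of elements.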

 
Fredrik Lundh
 
      02-08-2005
Jean de Largentaye wrote:

> GCC-XML looks like a very interesting alternative, as Python includes
> tools to parse XML.
> The mini-C compiler looks like a step in the right direction for me.
> I'm going to look into that.
> I'm not comfortable with C++ yet, and am not sure how I'd use Pyste.


to clarify, Pyste is a Python tool that uses GCCXML to generate bindings; it might
not be something that you can use out of the box for your project, but it's definitely
something you should study, and perhaps borrow implementation ideas from.

</F>



 
Roman Yakovenko
 
      02-08-2005
try http://sourceforge.net/projects/pygccxml
There are a few examples and nice ( for me ) documentation.

Roman

On Tue, 8 Feb 2005 13:35:57 +0100, Fredrik Lundh wrote:
> Jean de Largentaye wrote:
>
> > GCC-XML looks like a very interesting alternative, as Python includes
> > tools to parse XML.
> > The mini-C compiler looks like a step in the right direction for me.
> > I'm going to look into that.
> > I'm not comfortable with C++ yet, and am not sure how I'd use Pyste.

>
> to clarify, Pyste is a Python tool that uses GCCXML to generate bindings; it might
> not be something that you can use out of the box for your project, but it's definitely
> something you should study, and perhaps borrow implementation ideas from.
>
> </F>

 
Jean de Largentaye
 
      02-08-2005
That looks cool Roman, however, I'm behind a Corporate Firewall, is
there any chance you could send me a cvs snapshot?

John

 
Paddy McCarthy
 
      02-08-2005
Jean de Largentaye wrote:
> Hi,
>
> I need to parse a subset of C (a header file), and generate some unit
> tests for the functions listed in it. I thus need to parse the code,
> then rewrite function calls with wrong parameters. What I call "shaking
> the broken tree"
> I chose to make my UT-generator in Python 2.4. However, I am now
> encountering problems in choosing the right parser for the job. I
> struggle in choosing between the inappropriate, the out-of-date, the
> alpha, or the too-big-for-the-task...


Why not see if the output from a tags file generator such as ctags or
etags will do what you want.

I often find that some simpler tools do 95% of the work and it is easier
to treat the other five percent as broken-input.

try http://ctags.sourceforge.net/


- Paddy.
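
To illustrate the tags-file suggestion, here is a small sketch that reads tags-format text and keeps only function prototypes with their signatures. It assumes the tags were produced by Exuberant Ctags with something like `ctags --c-kinds=+p --fields=+S header.h`, so that prototypes appear with kind `p` and a `signature:` field. The sample lines are hand-written in that format, not actual tool output, and the function names parse_tags and prototypes are made up for this example.

```python
# Hand-written sample in extended tags format:
#   name <TAB> file <TAB> ex-command;" <TAB> kind <TAB> key:value ...
SAMPLE_TAGS = (
    '!_TAG_FILE_FORMAT\t2\t/extended format/\n'
    'add\tmath.h\t/^int add(int a, int b);$/;"\tp\tsignature:(int a, int b)\n'
    'reset\tmath.h\t/^void reset(void);$/;"\tp\tsignature:(void)\n'
    'MAX_N\tmath.h\t/^#define MAX_N 10$/;"\td\n'
)


def parse_tags(text):
    """Parse tags-format text into a list of dicts, one per tag entry."""
    entries = []
    for line in text.splitlines():
        # Skip blank lines and the pseudo-tags that form the file header.
        if not line or line.startswith("!_TAG_"):
            continue
        name, path, rest = line.split("\t", 2)
        # Everything after ;" is the list of extension fields.
        _, _, ext = rest.partition(';"\t')
        entry = {"name": name, "file": path}
        for item in ext.split("\t"):
            key, sep, value = item.partition(":")
            if sep:
                entry[key] = value
            elif item:
                # A bare field without a key is the one-letter kind.
                entry["kind"] = item
        entries.append(entry)
    return entries


def prototypes(entries):
    """Keep only function prototypes (kind 'p') as (name, signature) pairs."""
    return [
        (e["name"], e.get("signature", ""))
        for e in entries
        if e.get("kind") == "p"
    ]


if __name__ == "__main__":
    for name, signature in prototypes(parse_tags(SAMPLE_TAGS)):
        print(name, signature)
```

The signature is still a raw string, so argument types would need a little extra splitting, but that may well be the "other five percent" that can be handled case by case.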
 