Velocity Reviews - Computer Hardware Reviews

Is there a command that can take C source code as input and output a token tree?

 
 
learner1020
 
      08-31-2010
I know gcc compiles by converting C source code into a token
tree, but I don't know if there is a command option to make it output just
the token tree (in, say, XML format).

Thanks in advance.
 
 
Nobody
 
      09-01-2010
On Tue, 31 Aug 2010 15:32:01 -0700, learner1020 wrote:

> I know gcc compiles by converting C source code into a token
> tree, but I don't know if there is a command option to make it output just
> the token tree (in, say, XML format).


Not in XML. You can use e.g. -fdump-tree-original-raw to get the parse
tree as a list of nodes.
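
A minimal sketch of trying that flag on a small input. The file name example.c and the function add are assumptions made up for illustration; only the -fdump-tree-original-raw option comes from the post above. The dump lands in a separate file rather than on standard output, and its exact name depends on the GCC version.

```c
/* example.c: a hypothetical input file for experimenting with the dump.
 *
 * Compile without linking and ask GCC for the raw "original" tree dump:
 *
 *     gcc -c -fdump-tree-original-raw example.c
 *
 * GCC writes the dump to a separate file whose name typically ends in
 * ".original"; the numeric part of the name varies between GCC versions.
 * The dump is a flat list of numbered nodes that refer to each other,
 * not XML.
 */
int add(int a, int b)
{
    return a + b;
}
```

Dropping the -raw suffix (-fdump-tree-original) gives a more C-like rendering of the same tree, which is easier to read but harder to process mechanically.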

 
 
Jorgen Grahn
 
      09-01-2010
On Wed, 2010-09-01, Nobody wrote:
> On Tue, 31 Aug 2010 15:32:01 -0700, learner1020 wrote:
>
>> I know gcc compiles by converting C source code into a token
>> tree, but I don't know if there is a command option to make it output just
>> the token tree (in, say, XML format).
>
> Not in XML. You can use e.g. -fdump-tree-original-raw to get the parse
> tree as a list of nodes.


And if I recall correctly there are people experimenting with the gcc
source code in this area. People are interested in using gcc as a C++
parser for use in static analysis, because it's so hard to write one
from scratch. (This might not apply to the C compiler; I don't know
much about this.)

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
 
Gene
 
      09-02-2010
On Aug 31, 6:32 pm, learner1020 wrote:
> I know gcc compiles by converting C source code into a token
> tree, but I don't know if there is a command option to make it output just
> the token tree (in, say, XML format).
>
> Thanks in advance.


If you are not tied to gcc, look at clang. I recall that one of the
project's efforts is to emit abstract syntax trees as XML for C,
Objective-C, and C++. I don't know where that effort stands. Clang is a
newer compiler, built with the benefit of "going to school" on gcc and on
lots of recent research and experience. The code looks much easier to get
a handle on than gcc's.
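
For reference, a small sketch of asking clang for its AST. The file name shapes.c and its contents are assumptions for illustration. The commands in the comment use clang's -ast-dump frontend option, which prints a text tree; the JSON variant exists only in newer clang releases. Neither is the XML output mentioned above, whose status is not known here.

```c
/* shapes.c: hypothetical input file for looking at clang's AST.
 *
 * Text dump of the abstract syntax tree, without generating code:
 *
 *     clang -fsyntax-only -Xclang -ast-dump shapes.c
 *
 * Newer clang releases can emit the same tree as JSON:
 *
 *     clang -fsyntax-only -Xclang -ast-dump=json shapes.c
 */
struct point { int x; int y; };

int manhattan(struct point p)
{
    int ax = p.x < 0 ? -p.x : p.x;   /* |x| without needing <stdlib.h> */
    int ay = p.y < 0 ? -p.y : p.y;   /* |y| */
    return ax + ay;
}
```

Note that this is an AST, not a token tree: punctuation such as parentheses and semicolons does not appear as nodes.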
 
 
BGB / cr88192
 
      09-02-2010

"Jorgen Grahn" <(E-Mail Removed)> wrote in message
> On Wed, 2010-09-01, Nobody wrote:
>> On Tue, 31 Aug 2010 15:32:01 -0700, learner1020 wrote:
>>
>>> I know gcc compiles by converting C source code into a token
>>> tree, but I don't know if there is a command option to make it output just
>>> the token tree (in, say, XML format).
>>
>> Not in XML. You can use e.g. -fdump-tree-original-raw to get the parse
>> tree as a list of nodes.

>
> And if I recall correctly there are people experimenting with the gcc
> source code in this area. People are interested in using gcc as a C++
> parser for use in static analysis, because it's so hard to write one
> from scratch. (This might not apply to the C compiler; I don't know
> much about this.)
>


parsing C is not particularly difficult...

a few kloc of code can do the trick, although it may be a little work to
understand how to write it (it helps to first have experience with simpler
languages, like Scheme and JavaScript, as each will give the experience and
a foundation to build on).


(the real evils are deeper in the compiler internals...).

if my server were up right now (it is down recently because internet
bandwidth here is too limited and others complain if I "waste" the bandwidth
over something so trivial as having a webserver running...), I could post a
link to my parser, which can parse C (and also Java and C#), and emits an
XML-based AST (not a token-tree / CST though, if this is what the OP
wanted).


personally my bias is to avoid things like parser generators, as to me they
seem like more of a trick to make people *think* they are making the task
easier for themselves, while actually setting themselves up for much pain
once they get past simple languages, and into languages with all sorts of
bizarre stuff going on (such as tokens which may or may not exist or may be
parsed differently depending on context, as may exist in languages such as
C++ or C#, or syntax which is ambiguous apart from knowing prior
declarations, such as in C and C++, ...).
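
The "ambiguous apart from knowing prior declarations" case can be shown with the token sequence `T * x`. This is a small sketch with made-up names (T, declaration_case, expression_case): the same three tokens are a pointer declaration when T names a type, and a multiplication when T names a variable, so the parser has to track declarations as it goes.

```c
typedef int T;

int declaration_case(void)
{
    int n = 7;
    T * x;              /* T is a typedef name here: declares x as int* */
    x = &n;
    return *x;
}

int expression_case(void)
{
    int T = 6, x = 7;   /* inner-scope variable T hides the typedef */
    return T * x;       /* same tokens "T * x", now a multiplication */
}
```

A grammar alone cannot tell these two apart; a hand-written parser typically consults a table of typedef names in scope when it sees an identifier at the start of a statement.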

personally, I am a fan of hand-written recursive descent, as IME it seems to
work fairly well, and I just haven't really run into problems where parser
generators would seem to be the right tool for the job.

a lexer may make sense to generate from a tool, although personally I don't
really think this is necessary either.
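
To make the hand-written approach concrete, here is a minimal sketch of a recursive-descent parser with a hand-written character-level lexer. It only handles integer expressions with +, -, * and parentheses, which is nowhere near C. The names Node, parse and print_tree are made up for illustration, and this is not the parser mentioned above. Each grammar rule becomes one function, and operator precedence falls out of which function calls which.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    char op;                 /* '+', '-', '*', or 'n' for a number leaf */
    long value;              /* meaningful only when op == 'n' */
    struct Node *lhs, *rhs;  /* operands of a binary operator */
} Node;

static Node pool[64];        /* fixed node pool, so nothing needs freeing */
static size_t used;
static const char *cursor;   /* lexer state: next unread character */

static Node *parse_expr(void);

static Node *new_node(char op, long value, Node *lhs, Node *rhs)
{
    if (used == sizeof pool / sizeof pool[0]) return NULL;  /* pool full */
    Node *n = &pool[used++];
    n->op = op; n->value = value; n->lhs = lhs; n->rhs = rhs;
    return n;
}

/* Lexer: skip whitespace, then return the next character unconsumed. */
static char peek(void)
{
    while (isspace((unsigned char)*cursor)) cursor++;
    return *cursor;
}

/* primary := NUMBER | '(' expr ')' */
static Node *parse_primary(void)
{
    if (peek() == '(') {
        cursor++;
        Node *inner = parse_expr();
        if (inner == NULL || peek() != ')') return NULL;
        cursor++;
        return inner;
    }
    if (isdigit((unsigned char)peek())) {
        char *end;
        long value = strtol(cursor, &end, 10);
        cursor = end;
        return new_node('n', value, NULL, NULL);
    }
    return NULL;             /* unexpected character or end of input */
}

/* term := primary ('*' primary)* */
static Node *parse_term(void)
{
    Node *left = parse_primary();
    while (left != NULL && peek() == '*') {
        cursor++;
        Node *right = parse_primary();
        left = right ? new_node('*', 0, left, right) : NULL;
    }
    return left;
}

/* expr := term (('+' | '-') term)* */
static Node *parse_expr(void)
{
    Node *left = parse_term();
    while (left != NULL && (peek() == '+' || peek() == '-')) {
        char op = *cursor++;
        Node *right = parse_term();
        left = right ? new_node(op, 0, left, right) : NULL;
    }
    return left;
}

/* Parse a whole string; NULL means a syntax error or trailing junk. */
Node *parse(const char *text)
{
    cursor = text;
    used = 0;                /* reuse the pool: earlier trees become invalid */
    Node *tree = parse_expr();
    return (tree != NULL && peek() == '\0') ? tree : NULL;
}

/* Print the tree in prefix form. */
void print_tree(const Node *n, FILE *out)
{
    assert(n != NULL);
    if (n->op == 'n') {
        fprintf(out, "%ld", n->value);
        return;
    }
    fprintf(out, "(%c ", n->op);
    print_tree(n->lhs, out);
    fputc(' ', out);
    print_tree(n->rhs, out);
    fputc(')', out);
}
```

For example, print_tree(parse("1 + 2 * 3"), stdout) prints (+ 1 (* 2 3)). Swapping the fprintf calls for ones that write tags would give an XML-style AST instead.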

or such...


 