How do I make my own custom C compiler?

 
 
Roberto Waltman
 
      06-08-2006
"smnoff" <(E-Mail Removed)> wrote:
>So, if I wanted to make my "custom" C compiler that's different that the
>current C99 or ANSI C, where would I start?


Others have already given you good advice. Here is a short bibliography
you may find useful. All of these books take a practical approach, as
opposed to a theoretical one (the Dragon book).

Holub: "Compiler Design in C"
Wirth: "Compiler Construction" (Free on-line. Oberon subset)
Pemberton & Daniels: "Pascal Implementation: The P4 Compiler and
Interpreter" (Free on-line)
Hendrix: "The Small-C Handbook" (C subset)
Brinch Hansen: "Brinch Hansen on Pascal Compilers" (Pascal subset)
Crenshaw: "Let's Build a Compiler" (Free articles on-line. Basic(?) )
Appel: "Modern Compiler Implementation in C"
Wirth & Gutknecht: "Project Oberon - The Design of an Operating System
and Compiler" (Free on-line)

I agree that gcc is *not* a good choice for a beginning compiler
writer. I would recommend starting with Wirth's or Brinch Hansen's
books. They implement compilers for "toy" languages using recursive
descent parsers, so there is no need (at least at this stage) to learn
about additional parsing tools. LCC (a full C compiler) could follow.
Try also posting in comp.compilers.
 
 
 
 
 
Giannis Papadopoulos
 
      06-08-2006
Ben Pfaff wrote:
> Giannis Papadopoulos <(E-Mail Removed)> writes:
>
>> Keith Thompson wrote:
>>> "smnoff" <(E-Mail Removed)> writes:
>>>> So, if I wanted to make my "custom" C compiler that's different that the
>>>> current C99 or ANSI C, where would I start?
>>> I'd start with an existing open-source compiler, such as gcc or lcc.

>> Isn't a bit risky to start with such a behemoth?

>
> Why? Hacking simple features into GCC is not that difficult.
> I've done it a couple of times and so have my officemates.


Yes, but since the question is being asked, I'd expect that the OP does
not have the necessary experience to pursue such a quest.

--
one's freedom stops where others' begin

Giannis Papadopoulos
Computer and Communications Engineering dept. (CCED)
University of Thessaly
http://dop.freegr.net/
 
 
 
 
 
Keith Thompson
 
      06-08-2006
Giannis Papadopoulos <(E-Mail Removed)> writes:
> Ben Pfaff wrote:
>> Giannis Papadopoulos <(E-Mail Removed)> writes:
>>
>>> Keith Thompson wrote:
>>>> "smnoff" <(E-Mail Removed)> writes:
>>>>> So, if I wanted to make my "custom" C compiler that's different that the
>>>>> current C99 or ANSI C, where would I start?
>>>> I'd start with an existing open-source compiler, such as gcc or lcc.
>>> Isn't a bit risky to start with such a behemoth?

>>
>> Why? Hacking simple features into GCC is not that difficult.
>> I've done it a couple of times and so have my officemates.

>
> Yes, but since this question is asked I'd expect that the OP does not
> have the necessary experience to pursue such a quest.


I'll concede that hacking gcc is probably not a good starting point
for a beginner. (I've never really looked at the gcc sources.)

As someone else mentioned, lcc is said to be reasonably easy to hack
-- and it even has its own newsgroup.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
 
 
Peter Shaggy Haywood
 
      06-11-2006
Groovy hepcat smnoff was jivin' on Wed, 7 Jun 2006 22:49:37 -0500 in
comp.lang.c.
How do I make my own custom C compiler?'s a cool scene! Dig it!

>Ok, I am think I am a little more knowledgeable about C and pointers, ughh.
>
>And likewise, I want to fix C.....and not so much to make a C++ or Java or
>C# or even D like language.
>
>So, if I wanted to make my "custom" C compiler that's different that the
>current C99 or ANSI C, where would I start?


This would probably be best asked in comp.compilers. But anyhow...
Writing a C compiler is no mean feat. It is quite a complex language.
My advice is to start with an easier language.
Others have mentioned the "Dragon Book", also known as Compilers:
Principles, Techniques & Tools by Aho, Sethi & Ullman. This is
generally considered *the* book on compiler design, but is very dry
and technical. I'm currently reading it.
I highly recommend Compiler Construction by Wirth
(http://www.oberon.ethz.ch/books.html). It's an excellent work, and
quite hands-on. Wirth takes you through the construction of a compiler
for a subset of the Oberon language (similar to Pascal). I didn't
really feel fully confident about writing my own compiler until I read
this one. (Actually, it's an assembler I'm writing. I'll write
compilers for high level languages later.)
Crenshaw's series of articles entitled Let's Build a Compiler (URL
unavailable at this time) is aimed squarely at the rank beginner, and
is intended to get you writing compilers quickly. Unfortunately it has
its problems. For one thing the series was never finished. For another
thing it's rather haphazard, chopping and changing all over the place,
going over the same ground repeatedly, looking like he was making it
all up as he went along. There is much useful information in it,
though. This series takes you through the process of building a
compiler for a subset of a language the author made up, called KISS.

--

Dig the even newer still, yet more improved, sig!

http://alphalink.com.au/~phaywood/
"Ain't I'm a dog?" - Ronny Self, Ain't I'm a Dog, written by G. Sherry & W. Walker.
I know it's not "technically correct" English; but since when was rock & roll "technically correct"?
 
 
Morris Dovey
 
      06-11-2006
smnoff (in n7Nhg.5643$f76.4621@dukeread06) said:

| Ok, I am think I am a little more knowledgeable about C and
| pointers, ughh.
|
| And likewise, I want to fix C.....and not so much to make a C++ or
| Java or C# or even D like language.
|
| So, if I wanted to make my "custom" C compiler that's different
| that the current C99 or ANSI C, where would I start?

There are several ways to approach the problem: modify the source
for an existing C compiler, or start from scratch and write the whole
thing in the language of your choosing.

Either way you'll learn much more than you expect. Some time back I
approached a similar goal by creating an intermediate compiler (which
compiled PL/C, a superset of BNF) - but by the time the PL/C compiler
was running cleanly, I'd lost interest in the original problem (mostly
because I'd learned enough that the original problem looked trivial.)

Go for it. I predict that you won't arrive at the originally intended
destination - but you will have learned a lot getting wherever you do
arrive.


--
Morris Dovey
DeSoto Solar
DeSoto, Iowa USA
http://www.iedu.com/DeSoto


 
 
Allan Adler
 
      06-11-2006
jacob navia <(E-Mail Removed)> writes:

> gcc is impossible to understand unles you spend at least 2-3 YEARS working
> in it full time. [...]
> The first problem is to know RTL. You have to completely understand
> RTL to understand the flow of things.


I've already pointed out that I am not qualified to give advice about
this, but I will give some anyway.

I spent some time about 20 years ago trying to read some of the
source code for GCC and to configure it for a hypothetical machine.
I was singularly unqualified to do that and am no less so now.
However, it was very educational and I would be glad to have an
excuse to do something like that again. I do remember some of the
things I learned. I thought RTL was a lot of fun since it was
conceptually simple and fairly self-contained. Where I got into
trouble was in filling in the machine description files. To the
extent that it just described hardware and big- vs. little-
endianness, it was no problem, but there are places where you
have to give exact details about the calling sequence the operating
system uses to load a program on the target machine. I didn't know
enough about operating systems to guess what the calling sequence
would be on the machine I was trying to imagine.

Even if you fail to understand the code for GCC, it probably won't
do you any harm to try. You might find yourself going back to the
source code again and again for guidance and inspiration as you learn
more about compilers in other ways.

> Second, the sheer size of the code base. There are 13-15 MB
> of C source code to understand. And the code is mostly very sparsely
> commented. Macros everywhere hide from you what is going on.


One way of getting around that problem is to download an old version
of GCC, before it was ported to so many machines and before it supported
so many languages.

> Accessing data structures is always done with macros, to easy
> things when structure layout changes, but this makes it very
> hard for newcomers to understand what the hell those macros
> are DOING...


How about this: GCC is full of interesting data structures. You can
just take their definitions in isolation and try to figure out what
to do with them, even if their relevance to compilers is not immediately
apparent. Maybe the original code uses macros for greater efficiency,
but there are certain things you would always want to be able to do
with a given data structure and you can just write them yourself using
functions. Once you have a set of functions that will create or modify
or copy one of these data structures, or print one of them out in some
way, you can then try these macros out on them and see exactly what their
effects are, since you will know exactly what the data structure looks
like before you feed it to the macro.

In other words, as long as you are patient and don't mind studying the
code for its own sake, it seems to me that there are a lot of ways to
understand it. If you are in a hurry because you need to use the code
or modify it, or if you want to learn it quickly and then go write your
own, then the code appears as an obstacle and that might get in the way
of studying it. Just get what you can out of it and be glad that you got
that much.

> Third, you have to find your way in a mess of #ifdefs that defies
> the imagination. gcc runs in many machines, and "portability"
> has been taken to ridiculous extremes (the assembler, for instance).
> This means that the same macro can have several interpretations
> depending on which combination of machine/os you are running.


I am not very good at GCC but I vaguely recall that it has a lot of options
that let you print out the results of various stages of processing a program.
For example, you can tell GCC to give you RTL output. Maybe if you compile
GCC with GCC and look at the output at the right stage (e.g. after cpp gets
through with it) you can get rid of all the #ifdefs by compiling with all
the things defined that need to be defined. As Jacob Navia points out,
that may not give you the meaning of a given macro on all possible platforms,
but for starters I think one would be happy to know what it means on one
platform.
--
Ignorantly,
Allan Adler <(E-Mail Removed)>
* Disclaimer: I am a guest and *not* a member of the MIT CSAIL. My actions and
* comments do not reflect in any way on MIT. Also, I am nowhere near Boston.
 
 
spibou@gmail.com
 
      06-11-2006

Allan Adler wrote:

> > Third, you have to find your way in a mess of #ifdefs that defies
> > the imagination. gcc runs in many machines, and "portability"
> > has been taken to ridiculous extremes (the assembler, for instance).
> > This means that the same macro can have several interpretations
> > depending on which combination of machine/os you are running.

>
> I am not very good at GCC but I vaguely recall that it has a lot of options
> that let you print out the results of various stages of processing a program.
> For example, you can tell GCC to give you RTL output. Maybe if you compile
> GCC with GCC and look at the output at the right stage (e.g. after cpp gets
> through with it) you can get rid of all the #ifdefs by compiling with all
> the things defined that need to be defined. As Jacob Navia points out,
> that may not give you the meaning of a given macro on all possible platforms,
> but for starters I think one would be happy to know what it means on one
> platform.
> --


You can get the output of the preprocessor using the -E option. But the
horrendous format will very likely make this output unreadable by a
human.

By the way, since no one has mentioned it, doesn't one need to be fairly
proficient in the assembly of some processor before writing a compiler?

 
 
Morris Dovey
 
      06-11-2006
(E-Mail Removed) (in (E-Mail Removed).com) said:

| By the way , since noone has mentioned it , doesn't one need to be
| fairly
| proficient in the assembly of some processor before writing a
| compiler ?

Only if the compiler is to output assembly code.

[ Imagine a compiler that translated its source language into C, or
COBOL, or APL... ]

--
Morris Dovey
DeSoto Solar
DeSoto, Iowa USA
http://www.iedu.com/DeSoto


 
 
Giannis Papadopoulos
 
      06-12-2006
(E-Mail Removed) wrote:
> By the way , since noone has mentioned it , doesn't one need to be
> fairly
> proficient in the assembly of some processor before writing a compiler ?



If one needs a full-featured compiler, yes. But one might stop the
compiler just before the generation of assembly language.
 
 
 
 