Follow Slashdot stories on Twitter

 



Forgot your password?
typodupeerror
×
Programming Technology

Which Compiler to Extend for a Small Project? 89

Andreas(R) asks: "While planning the design of my small programming language, and would appreciate some lessons learned from experienced programmers which have already tried this. I was investigating whether to start from an existing compiler and extend it. The compiler will be based on yacc, or bison. The programming language will be interpreted, object oriented and have higher order programming. Perl 1 seems like a decent starting point, as it's yacc based, and 5000 lines of code. Later versions of Perl are too large to get a good understanding of the whole program in a short period of time. Perl also has the right license (GPL). Is Python out of the question for such a project, since it's not GPL? What other small languages can be used instead? How do I go about designing a small programming language in practice, using what I already know about compiler theory?"
This discussion has been archived. No new comments can be posted.

Which Compiler to Extend for a Small Project?

Comments Filter:
  • Holy cow! (Score:5, Insightful)

    by sidecut ( 126820 ) on Wednesday November 10, 2004 @11:15PM (#10783987) Homepage Journal
    Can you give us a few more specifics on what the language will be used for? Will it be embedded? Database connected? Real-time? Interactive?

    While this seems like beaucoup fun, I'd question the need to extend an existing language by altering the compiler. Towards that end, you might want to use LISP or Scheme, as language extension is built into the language. ( See what Paul Graham has to say about the subject) [yahoo.com]

  • by Dancin_Santa ( 265275 ) <DancinSanta@gmail.com> on Wednesday November 10, 2004 @11:21PM (#10784012) Journal
    Look. The source code to Perl 1 is only 5000 lines long. The source code is open and available for anyone who wants to use it for research and investigation as well as for educational purposes. If we had to release all our source code because at one time in our lives we saw some source code that was covered by the GPL, none of us could ever get jobs programming.

    The source is there. Use it as a base for your own program. Change it enough that you aren't blatantly copying it. Release it under your own license. If you like the GPL, then do that. Some of us like less restrictive licenses like BSD or the original (non-GPL compatible) Artistic License.

    Definitely, go with Perl 1. From the look of it, it seems to have some pretty good foundations to build upon. Taking a look at where the language itself is now, there's much improvement needed in v1. It seems like a pretty good place to start.

    Don't be restricted by licenses on small projects, much less ones that are essentially abandoned.
  • Perl isn't just GPL. (Score:5, Informative)

    by hattmoward ( 695554 ) on Wednesday November 10, 2004 @11:33PM (#10784078)
    Also available under The Artistic License [perl.com]. You might like it over the GPL.
  • by Tom7 ( 102298 ) on Wednesday November 10, 2004 @11:34PM (#10784083) Homepage Journal
    Perl might not be the best starting point if you want to learn something about desining a programming language. If you're thinking of using that, I'd say you're better off starting from scratch.

    Functional languages, especially those with pattern matching primitives (like ML or Haskell), are really good at this kind of program -- in fact, writing compilers and interpreters is really their shining point. I highly recommend using one of these languages rather than C. Lots of undergraduate computer science classes (like 15-212 and 15-312 [cmu.edu] at my university) write interpreters in functional languages as part of the curriculum. You could try to find some course notes... I speak from experience when I say that this beats the hell out of mucking around in C or C++, not really knowing what you're doing. ;)
    • Rewrite! (Score:5, Insightful)

      by acidrain ( 35064 ) on Thursday November 11, 2004 @12:23AM (#10784429)
      Adapting something that is not really what you are making is just taking on a crutch. You get moving faster to start, but become dependant on something that slows you down.

      If you have what it takes to write a new language, then you would be best starting from scratch. Read 3-5 codebases, make a list of the things you liked/didin't like and start out on you own. In the long run, having written the thing your self will give you the advantage. You will know intuitively how everyhting works and how extensions will fit in. That will give you a 2x advantage.

      Don't be afraid to read over two other existing implementations as you go. Sharing ideas is very important.

      An approach I have also taken is re-write a program in parts. You pick a major component, and replace it with what you need. This gives you testing check-points. The more often you get to a working state and test, the less time you will spend debugging overall. If you can look at perl 1 and determine you can add to it to make the parser a super-set of what you want you could start there. Then you can write something that interprets the byte-code output (the subset generated by your language) to what you want, and write your own interpreter. Then you can tackle replacing the parser and byte code generator... With flex and bison, that should be easy enough. But plan to replace the entire thing. Otherwise you will spend a lot of time reworking things that are not really what you want. If you discover a few gems along the way that you want to keep/port all the better, just don't take on any crutches.

      Oh, I would recommed the STL if you are in a hurry. I don't know why functional is better, personally it just gave me a headache in school. But some people claim great results with it...

      Finally, a real language must be able to compile itself. Or at least generate it's own byte-code in the case of interpreted langauges. Think about it! You could hack perl 1 to generate your byte code, and then write your parser in perl 1 and have a self compiling language.
      • Oh, I would recommed the STL if you are in a hurry. I don't know why functional is better, personally it just gave me a headache in school.

        Among the many reasons that functional languages are superior for this kind of task is algebraic data types and pattern matching. Since C++ and Java don't even have sum types, the only way to simulate this is with a load of objects and "instanceof" stuff. For example, here's a simple calculator interpreter in ML:

        datatype exp =
        Int of int
        | Plus of exp * exp
        | Tim
      • Re:Rewrite! (Score:3, Insightful)

        a real language must be able to compile itself.

        Not necessarily.
        A more accurate statement would be "a real general purpose language must be able to compile itself.".
        Some languages fulfill narrow requirements that may not include compiling.
        Adding compiling ability to such a language may make it less efficient for fulfilling its primary purpose.
        Some examples of languages that would probably be made worse if self-compiling ability were added: SQL, APL, and most "descriptive" languages (e.g., HTML, XML, and ot

  • Two quick comments (Score:5, Insightful)

    by poincaraux ( 114797 ) on Wednesday November 10, 2004 @11:37PM (#10784096)
    I can't really tell when you're asking questions and when you're stating project requirements, but ..
    I was investigating whether to start from an existing compiler and extend it. The compiler will be based on yacc, or bison.

    you might want to check out ANTLR [antlr.org].
    Perl also has the right license (GPL). Is Python out of the question for such a project, since it's not GPL?

    Did you just say "I can only use GPL'd things. Python isn't GPL'd. Can I use it?" I'll assume you meant something like "I want something with a nice license. Does Python have a nice license?" instead. If that's what you meant, you should check out the Python license [python.org] for yourself. Summary: it's a nice license. It's a certified Open Source license, imposes fewer restrictions than the GPL and is compatible with the GPL.

    Python definitely doesn't fit into 5000 lines of source, though, so Perl1 might be a better bet. PyPy [codespeak.net] is pretty cool, if you're looking for something smallish and Pythonish.

    Sorry if you're looking for something else and these comments turn out to be totally useless.
    • More quick comments (Score:5, Informative)

      by DavidNWelton ( 142216 ) on Thursday November 11, 2004 @05:57AM (#10785816) Homepage
      A cool site for people interested in languages and their design is: http://lambda-the-ultimate.org/

      I agree completely that only looking for GPL code is a mistaken approach, as the GPL is more restrictive than other licenses. You can almost always include BSD code in GPL, but not vice versa. In any case, the GPL is probably not an ideal license if you want your language to be very widely used.

      Thirdly, if you want to look at some nice source code and an interesting way of doing things, have a look at Tcl. It's C sources (modulo the regexp package which came from somewhere else) are some of the nicest I have read. Beautifully commented, and very clear to read. Also, Tcl is a smart approach for a language designed to be extensible.

      The article's author really needs to state his purpose, though... just for fun? To 'get famous'? For work? As a homework assignment?;-)

      • If you're designing a new language, I don't see that the compiler and/or interpreter being GPL'ed would be a hindrance. For example, gcc is GPL'ed, but places no restrictions on what you do with code you compile using it.

        If, OTOH, you want to encourage people to modify the compiler/interpreter and sell *that*, then the GPL might not be the best choice.
        • It has to do with economics. Languages are really something you want people to spread as widely as possible, because there are positive network externalities. And part of spreading things widely is also getting them used commercially...
          • I agree, but I think you missed my point: that if what he wants is for people to write programs in his language, choosing to use GPL for his compiler/interpreter needn't affect commercialization at all. I can write code, compile it with a GPLed compiler, gcc for example, and sell the resulting binary as a commercial product just fine, without having to release the source code for my program.

            If what the article author wants is for companies to develop the compiler/interpreter itself and sell *that*, then yo
  • Object BF (Score:5, Funny)

    by forkazoo ( 138186 ) <wrosecrans@@@gmail...com> on Wednesday November 10, 2004 @11:44PM (#10784123) Homepage
    You know, i was wondering, earlier today, if anybody was ever going to make an object oriented Brainfuck. Object BF (of B F++ if you prefer) would be a delightfully horrible lanfuage. And, since it is such a compact language (though I make no promises about programs written in it) you should have no problem acquianting yourself fully with the existing compiler in about the length of a coffe break.
    • How would you suggest making a Turing Machine, for that is what BF is, object oriented?
      • Uh. (Score:3, Interesting)

        by warrax_666 ( 144623 )
        All Turing Complete languages (which BF is) are equivalent. That means that any OO Turing Complete language can be implemented in a non-OO Turing Complete language. Case in point: C++ compiles to assembly/machine code which is certainly not OO.

        How to implement an OO language on a Turing Machine: Do something equivalent to how OO is implemented in C, ie. store "function pointers" (tape addresses) along with the data and use indirection to call those every time a virtual method is needed.
        • Re:Uh. (Score:3, Interesting)

          You may be interested in this guy [brown.edu]'s thoughts on the matter of Turing Completeness.
          • ... but AFAICT it doesn't really counter anything I said about TC language equivalence. Granted, the languages might actually be supersets of (classical) TMs, but that's nitpicking... they're still computationally equivalent since their only non-TM mechanism is I/O and they would both posses equivalent I/O mechanisms.
        • Re:Uh. (Score:3, Insightful)

          by keesh ( 202812 ) *
          They're only 'equivalent' at a really basic level. Sure, from an academic "what can be calculated?" POV they're the same (although *not* when time complexity is considered), but for any practical purpose they are not equivalent. Compare how long it takes to write, say, a program which adds up ten numbers read from stdin in BF or INTERCAL with how long it takes to write the same program in Ruby or Haskell...
          • ... don't enter into it (to paraphrase a famous skit w/John Cleese). :)

            The GP was suggesting that OO was somehow not implementable on a TM and I felt compelled to correct him/her/it.

            IIRC, it has also been shown that algorithms always stay within their "class" (P, NP, P-space, etc.) when transformed from a TC language to another TC language. So your observation about complexity is only half true in that an O(n) algorithm may turn into an O(n^10), but it will never turn into an exponential-time algorithm wh
    • You could make your signature more nerdy by changing it to "three long days."
    • Uh, ++ increments twice in brainfuck. I think you mean bf+

      (Though in fact,
      Add one to brainfuck giving brainfuck+1
      would have the same effect (anything except +-,.>[] gets ignored by the compiler).
  • by blackcoot ( 124938 ) on Wednesday November 10, 2004 @11:44PM (#10784126)

    if you have never taken a compiler class before or written a compiler on your own, i suggest the following:

    • while i encourage re-use, if your purpose is to learn how to write a compiler, don't extend someone else's. find a grammar for a language you're comfortable with (e.g. pascal) and start from there. you'll find that getting just plain pascal to compile properly will be quite a challenge. oo-ness just adds another layer of complexity to the compiler.
    • to aid your debugging, you'll want to spend some time thinking up good ways of a) visualizing your parse tree, b) representing your IL in a human readable format, c) representing the entire state of your interpreter in human readable format. these just by themselves can be very challenging projects.
    • acquire a copy of the dragon book [amazon.com]. this is only a starting point, you may also want to peruse some of andrew appel's compiler books (such as this one [amazon.com] or this one [amazon.com])
    • lex/flex and bison/yacc are rather antiquated, you may want to check out terence parr's antlr [antlr.org] (formerly pccts) instead. this allows you to implement your compiler+interpreter in your language of choice, rather than being forced to use c. my compiler classes all required that we used lex/yacc, so that's what i did; however, i would have really liked to have the option of doing it all in java or c++.
    • how you setup your symbol table will have a large determining factor in your success. i've used trees of hashtables in the past quite successfully (each node in the tree corresponding to a lexical context such that all symbols visible in the current scope are on the path between the present node and the root). i expect that extending this to support an OO language shouldn't be too hard. e.g: augment nodes representing a type to include back pointers to parent types. you will have to modify search to do lookups as appropriate.

    there are several "toy" grammars out there which allow you to do useful stuff (recursion, 'interesting' data structures [i.e. self referential], etc.) without wading through a lot of useless cruft (implementing huge amounts of runtime support, for example). i'd go with one of those. once you're comfortable that you can make one of these learning languages work, then try to hack one of your own.

    this all said, good luck! i am by no means an expert on compiler construction (worked on a custom in-house scripting language as an intern a couple years back and had to take compiler classes to satisfy breadth requirements for my m.s. c.s.) but i do hope this is a little bit useful.

  • Licenses (Score:4, Informative)

    by Mr.Ned ( 79679 ) on Thursday November 11, 2004 @12:09AM (#10784304)
    "Is Python out of the question for such a project, since it's not GPL?"

    You seem to be concerned about the limitations the license puts on you. Python's license is less restrictive than the GPL - read about it (http://python.org/psf/license.html).

    One of many things this means is that if you decide the Python License isn't restrictive enough for you, you can relicense the combination of Python plus your changes under the GPL, as long as you adhere to Python's license (leaving its copyright and other required information intact).
  • by crmartin ( 98227 ) on Thursday November 11, 2004 @12:11AM (#10784318)
    You don't really say what sort of problems you're talking about or why you want to build the language.

    If you just want to build a language that will teach you about programming languages, look at old fashioned Pascal [inf.ethz.ch] not Delphi or Kylix or even Turbo Pascal, but good old-fashioned Jensen and Wirth 1974 Pascal.

    If you want to design a programming language, the best advice is to write some code in the proposed language. Remember Tony Hoare's rule, and keep it simple. Most programming languages, from Perl to Python to Java 5, suffer from being accumulations of features.

    Have a look at Ruby, Modula-n, Oberon, and so forth.

    If you're looking to learn lots about programming in general, think about things you want to do, and construct a lanaguage that does them. Icon is a nice example. Look at SNOBOL, if only because you'll appreciate the "five miles through the snow" stories we old farts tell.
  • there are plenty of tiny implementations of either. if I were you I'd use a higher level parser/lexical analyzer generator than lex/flex/yacc/bison. also check out the 'Introduction to Compiler Construction With Unix' from prentice hall, sadly out of print, and 'Compilers' by Aho.
  • A few ideas (Score:5, Insightful)

    by RevAaron ( 125240 ) <revaaron AT hotmail DOT com> on Thursday November 11, 2004 @12:38AM (#10784538) Homepage
    First, there are two kinds of small languages:
    1. small languages like lua [lua.org], io [iolanguage.com], and scheme [mit.edu] that are small in the built-in libraries and in the total distro. These three are great places to start- both are small, OOPish, allow higher-order programming by passing classes, objects, functions and methods as objects.

    2. Then there are languages that are big in some ways, but small in syntax. Some of these are easier to extend than so-called "little languages." The reason is usually that their syntax is small, in an isolated place, easy to get at, and meant to be modified. The two best examples for this are Smalltalk [smalltalk.org] and Lisp. Both of these languages satisfy your other requirements and really kick ass for extention. Unlike the above languages, the so-called little-languages, most Smalltalk and Lisp dialects have big, useful libraries. Unlike a big fat language like perl or C++, having a useful library doesn't mean that the language is a huge pain in the ass to extend.

    Both Lisp and Smalltalk have a number of implementations. I am a big fan of Squeak Smalltalk [squeak.org], though systems like Little Smalltalk [smalltalk.org] or even GNU Smalltalk [gnu.org] maybe worth checking out.

    A lot of people here have bad feelings about Lisp-like languages. It's a shame, since Scheme, ISLISP (OpenLisp is a great implementation) and Common Lisp are all *very* powerful languages. You can be quite productive with them once you get over the part about whining about parens. But Lisp may very well be the best option here, there is a long history of people writing custom-syntaxes and language extensions. Look up Common Lisp macros- power almost beyond comprehension, a lot of fun to play with, and with an elegance all its own.

    There are examples of people writing a C-like syntaxes [umin.ac.jp] for various Scheme implementations. IIRC, Gambit-C [umontreal.ca] (a Scheme to C compiler) comes with one. On Cliki, there are a bunch of other alternative Scheme syntaxes [tunes.org] listed.

    To, one of the big advantages to using a language in the second category is that syntax extension/modification is done in the language itself, rather than in C. With that comes the familiarity of the language you're creating and the other benefits you gain by using a high-level language like Smalltalk or Common Lisp.

    Just some thoughts...
  • Try a compiled Lisp (Score:1, Interesting)

    by Anonymous Coward
    You're building a compiler? A compiler requires a parser in the front and a code generator in the rear, tied together by some sort of syntax tree. Various language constructs will require transformations upon this syntax tree.

    As it happens, there already exists a class of languages that are strong at manipulating syntax trees, and at writing parsers. Several of them also support dynamic compilation, meaning that your language implementation can choose when to stop dicking with the syntax tree and instead

  • Reason One, trivial to write a parser for the core language.

    Reason Two, lots of nice literature available online. Check out Shriram Krishnamurthhi's text
    Programming Languages: Applications and Interpretation on his website at http://www.cs.brown.edu/~sk [brown.edu]

    Reason Three, its already conceptually an AST, so you can get involved in the more interesting work sooner
  • by Piquan ( 49943 ) on Thursday November 11, 2004 @01:58AM (#10785009)

    I'm not really sure what your goal here is. Are you wanting to write a compiler for the sheer joy of writing a compiler? That's a good goal, for sure, and I recommend that every programmer write a compiler or two during their careers, or at least some interpreters.

    On the other hand, maybe you want to extend an existing language because you need some specific language feature. Also a good goal, but I do want to caution you to evaluate existing languages first; you may find that some language does what you want, or makes it easy to write a language library that does what you want.

    Or maybe you want to write an interpreter to script a bigger program. Then I'd say that you may be better off using something that's already there.

    In the first case, if you want to learn how to write a compiler, I generally recommend writing Scheme, or some other simple Lisp. Scheme has advantages in that its parse tree representation is obvious (that's why Lisp looks like it does), the structure of an interpreter and of a compiler are quite similar, and it covers the fundamentals of compilers without burdening you with a bunch of cruft. (If this is your first compiler, you may want to leave out continuations and garbage collection for now.) The very excellent book Structure and Interpretation of Computer Programming [mit.edu] has what you need: chapter 4 is about writing interpreters (for Scheme, some modified Schemes, and Prolog), and chapter 5 is all about writing a compiler. All the code in SICP is in Scheme, but the book starts from the beginning with (+ 1 1).

    SICP may be too academic for your taste, so you may prefer Paradigms of Artificial Intelligence Programming [norvig.com] instead. It uses Common Lisp, and has a little bit more practical feel that SICP. Chapter 22 is about writing a Scheme interpreter, and chapter 23 is about writing a Scheme compiler. Unlike SICP, PAIP doesn't cover the garbage collector. PAIP uses Common Lisp, and although it has enough of an introduction to be a "refresher" for somebody already familiar with Lisp, it's not really suitable for learning the language.

    It's really simple to write Lisp in Lisp. Indeed, a month ago I wrote one for a comp.lang.lisp post [google.com], just to get a silly quine to work! That's why I keep talking about Lisp and Scheme: it's easy to do.

    If Scheme isn't your thing, then Pascal may be a good alternative. Don't try to get fancy with heap allocation, pointers, objects, and other new add-ons; I'd start with plain old Wirth-designed Pascal, and get fancy later if you want. Pascal is really designed for a classroom setting. I have a dim memory of Ada being used for this purpose in some books, but I can't speak very authoratively there.

    Of course, the definitive book on writing compilers is the Dragon Book [wikipedia.org]. But you may want to be familiar with some basic CS theory about FSMs [wikipedia.org] first.

    Now, that's if you want to learn about compilers for the joy of learning about compilers. But what if you just need one particular language feature for your problem, and that's why you want to write a compiler? Well, then I'd suggest you make sure you've looked at a lot of different languages first. Some languages have surprising features that may let you write a small in-language library to do what you need, instead of needing to extend the compiler.

    Lisp is a good candidate for this. John Foderaro once described Lisp as "a programmable programming language". You can alter the language to suit the problem at hand, instead of having to work the other way around. At work now, I use Lisp as a bridge between two very different programming languages because I can extend it in both directions to cover what I nee

  • Lua (Score:4, Informative)

    by rbright ( 54766 ) on Thursday November 11, 2004 @02:00AM (#10785020)
    I highly recommend Lua. It's a brilliant language. The code base is clean, portable, and easy to read.

    http://www.lua.org [lua.org]

    In fact, I recommend you not create a new language at all and just use Lua. It's a deceptively simple language that allows you to extend it through certain meta-constructs to become pretty much anything you need-- from simple data description to full Object-Orientation.

    I once read through the entire dragon book with the intention of creating my own language; I gave it all up when I found Lua.

    Good stuff.
    • by Kz ( 4332 )

      I once read through the entire dragon book with the intention of creating my own language; I gave it all up when I found Lua.

      That's my case too. Lua is absolutely great!

      for a long time, i've been looking at Lisp and Scheme for more expressive power than usual languages. (in fact, my first C program was a stupidly slow lisp interpreter). but the syntax is just too awkward.

      i also looked at lazy evaluation languages, but couldn't wrap my mind on that model.

      a few months ago i found Lua... and it's perfe

    • There's IO langauge [iolanguage.com] too. Prototype based, very small, looks very interesting to play with.

      Otherwise, I would recomment looking into Lisp or Haskell. There are nice books like Essentials of programming languages [amazon.com] dealing with Lisp/Scheme and how to implement a new language with them. No previous experience of Lisp/Scheme is required for this book.

      Another book which is more technical and performance oriented is Lisp In Small Pieces [amazon.com].

      Good luck.

      Regards, Tommy

  • by avida ( 683037 )
    Use lisp. Lisp has been called the programmable programming language. Think XSLT for code. You can define your own mini languages that work your way but are themselves valid lisp and can use all the cool lisp stuff.
  • OCaml (Score:3, Interesting)

    by Markus Registrada ( 642224 ) on Thursday November 11, 2004 @02:47AM (#10785200)
    OCaml was pretty much designed specifically for this sort of thing. Every part of the language system accepts plug-ins for your private variants. If you can bear to use a parsing language that is far more powerful than yacc, you might consider it.

    That said, interpreted languages are stupidly easy to do, so much so that it's hard to learn much -- they forgive every mistake until long after it's too late to fix it. (Witness Perl.) A high-performance language is a Good Thing not particularly because the programs run faster, although that's a nice side benefit, but rather because it enforces a fundamental rigor. There's no faking performance. When you face that kind of problem squarely, God speaks to you, and if you listen carefully enough you can learn deep truths.

    But that's not for everybody. Most people have had any desire to learn anything, deep or otherwise, beaten out of them. By all means go for easy comfort, the economic vitality of the nation depends on people like you. Forget I said anything. Garbage collection, bytecodes, regexps, yeah!

  • Have A Look At LLVM (Score:3, Interesting)

    by swagora ( 729659 ) on Thursday November 11, 2004 @02:59AM (#10785246)
    Andreas(R):

    While you haven't provided enough details to comment in length, I do have some experience with what you're planning. A couple of years back I started a programming system (XPS) which was rather audacious in scope. After two years of working on it, I realized that I too needed a "back end" compilation system. I looked at various alternatives like GCC (too complex), research compilers (low quality), open source virtual machines like Mono (immature at the time). I was quite surprised when I looked at UIUC's LLVM compiler toolkit. I thought it would be just another half-baked compiler system from a University that never got finished when the Ph.D student left. Instead, I found a well designed, working, *toolkit* for compiler construction. While LLVM still lacks some features, its core is very solid and easily extensible. I've been working with it for a year now and its been quite a pleasure. Check it out at http://llvm.cs.uiuc.edu/ [uiuc.edu]
  • by Anonymous Coward on Thursday November 11, 2004 @03:23AM (#10785329)

    Honestly, its easier to write a recursive descent parser by hand for a programming language than you think, and interpreters are ridiculously easy unless you're worried about making it fast, which is way overrated too. It mattered with 640KB of RAM at 20MHz, but these days, its just stupid to care unless you notice its insanely slow.

    First off, if you've not found this link: http://compilers.iecc.com/crenshaw/ [iecc.com], then I recommend you start with it. While its about writing a compiler, it really help make parsing much clearer.

    Scheme is a good language to check out if you want to start with another design(a scheme interpreter can be written in a few hours, even in C, if you're slick, even if you're not, it would be short project to get 90%).

    Some other reference material: Parsing Techniques [cs.vu.nl](free online). Also: Modern Compiler Design [cs.vu.nl] by the same guys and well worth the investment. Concepts, Techniques, and Models of Programming Languages [ucl.ac.be], teaches kernal theory of language design, and may open your mind to some other techniques you may not be aware of.

    Checking out the archives on Lambda The Ultimate [lambda-the-ultimate.org] would be wise too. Also, if you're in Boston on December the 4th, you might check out the Lightweight Languages Workshop [mit.edu] at MIT.

  • Use a simple parser (Score:3, Interesting)

    by Frans Faase ( 648933 ) on Thursday November 11, 2004 @03:51AM (#10785437) Homepage
    Consider to use a parser that is easier and more powerful than yacc and lex. Have a look at IParse [planet.nl], a simple, small interpretting parser. The whole source is in a single 92 Kbyte file.

    Forget about using a Virtual Machine. That is nice for speed, but it requires a lot of work. Beter make an interpretter that interprets the abstract program tree. I once started doing this for a JavaScript interpretter, but I never came to finish it.

  • Parrot (Score:5, Informative)

    by Per Wigren ( 5315 ) on Thursday November 11, 2004 @04:09AM (#10785493) Homepage
    Maybe you should have a look at Parrot [parrotcode.org]? The CVS includes a lot of working and non-working language-implementations [perl.org] which you could have a look at..

    I also recommend you to have a look at Lua [lua.org] which is a minimalistic yet beautiful language..

  • Choose something that requires some extra work to get compatible with some other language.

    Don't even bother too much to write an initial compiler, and go straight to the horrible task of making something compatible.

    While you are making an own language, dealing with compability issues will lead you to more pitfalls than a straight clean 1st order approach of some language than.
  • Parrot! (Score:5, Informative)

    by cyberkreiger ( 463962 ) on Thursday November 11, 2004 @05:56AM (#10785812) Homepage
    Write a compiler for parrot [parrotcode.org].

    Lots of people are doing that right now, and you could learn a lot from the included compilers for different languages, all in varying stages of completion.
  • .. use the only REAL hardcore programming language remaining - machine code
  • Original poster: What, pray tell, is "compiler theory"? I'm a little perplexed that someone who knows something about "compiler theory" is asking Slashdot how to write a compiler. Most of the answers you will get here are from people who don't really know any "compiler theory".

    People suggesting Lisp/Scheme: Sure, these languages are extensible, but any extension of Lisp/Scheme you can create with macros or whatever will still look like Lisp/Scheme. If the point of this exercise is to design one's own l

One way to make your old car run better is to look up the price of a new model.

Working...