Pros and Cons of Garbage Collection?

ers asks: "Most new programming languages use garbage collection rather than programmer-controlled memory management. The advantages are obvious: programmers no longer have to worry about forgetting to delete allocated memory, leading to far fewer memory leaks. The disadvantages are often glossed over by programming language designers - aside from the performance issues, predictable memory management can be used for controlling access to files and similar resources, creating safer thread-locking code, and even providing better error messages. Some programming languages that usually use predictable memory management can also be made to behave as if they were garbage collected - for example, Boost provides various C++ smart pointer classes. So, given the choice between garbage collection and manual memory management, which would you choose, and why? When using a manual memory management language, when do you consider the performance and syntactic overhead of faked garbage collection to be worthwhile?"
This discussion has been archived. No new comments can be posted.

  • by keesh ( 202812 ) on Monday November 28, 2005 @11:04PM (#14134761) Homepage
    The C++ model is basically correct. It doesn't treat the programmer like an idiot (which admittedly may be a problem if you have idiot programmers), and it gives you the choice of how to handle memory allocation. The lack of a reference-counted pointer in the standard library is a bit of a bitch, but the Boost shared pointer templates will likely make it into C++0x, and it's only a hundred lines of code or so to make your own in the meantime.

    Of course, the C++ model is not perfect either. Lack of virtual and const constructors can be a nuisance (the workaround being the pimpl idiom and a shared pointer), and not being able to use shared pointers to functions without nasty syntactic hackery occasionally breaks the "stuff pretending to be a pointer" illusion. Still, the power it gives over the Java model is definitely worth the occasional bit of extra effort.

    Then again, if you're coding some quick scripting hack rather than a proper program, who cares about memory allocation?
    • I don't think garbage collection implies treating the programmer like an idiot. The programmer's attention is a finite resource that is often better spent on something other than memory management, especially given that garbage collection performs quite adequately for many programs. A Perl, Java, or Lisp programmer isn't an idiot for not doing his own memory management any more than a person who doesn't make his own shoes is an idiot.
      • The problem with most garbage-collected languages is that they don't have the concept of a limited-lifetime object. Stacks are a tool, and a damned useful one. We still mostly structure our code in stacks. But the direction of Java and co is to tell the programmer that since he needs some help with non-stack-allocated stuff, well, then he just doesn't get to use a stack at all. That's just not a useful tradeoff; I'd rather have a proper stack.
        • But the direction of Java and co is to tell the programmer that since he needs some help with non-stack-allocated stuff, well, then he just doesn't get to use a stack at all. That's just not a useful tradeoff; I'd rather have a proper stack.

          Um... Just where do you think the method-local object references are stored to ?-) And anyway, the decision on whether or not to store the fields (data) of an object to stack or to heap is completely mechanical: if you pass references to other functions, the object's data has to go on the heap.

          • The thing is, it is _very_ common to pass a reference to an object to another method, so nearly all objects are allocated on the heap. But most methods that take an object reference as an argument don't actually keep any copies of it after the method returns, meaning many objects that are allocated on the heap using your 'purely mechanical decision' could actually be allocated on the stack. But figuring that out requires escape analysis, which is both expensive and error prone, which is why Java is only getting it now.
            • Thinking some more on this subject, I realized that all objects have a reference to them passed to another method (the constructor) on creation, so using your simple rules, not a single object would ever be allocated on the stack. And there is no way around that without using at least some simple escape analysis, since a constructor may pass on a reference to the newly created object to some other method.
              • Thinking some more on this subject, I realized that all objects have a reference to them passed to another method (the constructor) on creation, so using your simple rules, not a single object would ever be allocated on the stack.

                True. However, you can simply analyze the constructor too - it only needs to be done once per constructor. And you could recurse through all the methods the object gets passed to, although this might be less useful, since the exact same calling sequence is unlikely to repeat often.

        • This is not correct:

          Under .NET you can choose whether you want to allocate an object on the stack (short-term allocation)
          or on the heap (long-term allocation).

          Disadvantage: this is in most cases class-based, so all objects of a given class are value objects allocated on the stack, or "reference" objects allocated on the heap.

          In Java the GC is a generational GC, so it automatically distinguishes between short- and long-term allocation by promoting young surviving objects to older generations.
      • I don't think garbage collection implies treating the programmer like an idiot. The programmer's attention is a finite resource that is often better spent on something other than memory management.

        I have mixed feelings regarding garbage collection. Sure enough, when people are learning how to write programs, it's far better to use a language without garbage collection, so that one really has to understand what's happening. Also, having to manually keep track of your data can lead to cleaner code (I know one can write clean code with GC too).
    • by Pseudonym ( 62607 ) on Tuesday November 29, 2005 @12:46AM (#14135262)
      The C++ model is basically correct.

      On the contrary, the C++ model is basically correct for some applications.

      A "proper program" is programmed in the appropriate language for the job. Sometimes this is a domain-specific language. Sometimes you need the close-to-the-metal-yet-still-maintainable-for-large-applications qualities that C++ provides. And sometimes you don't.

      Very few people write web applications in C++, and for good reason. Web servers run at the speed of the network card, not the speed of the L1 cache. Pulling out extra cycles is pointless, especially if you lose the maintainability that a general purpose language like C++ provides. And yet you wouldn't call many of these "quick scripting hacks".

  • Depends (Score:5, Insightful)

    by Apreche ( 239272 ) on Monday November 28, 2005 @11:09PM (#14134785) Homepage Journal
    It depends on what you are trying to make, duh.

    If you are trying to make something where performance is important, like a 3D game, then manage memory yourself. If you are making a simple business application where reliability and security are important, use garbage collection. If your program uses lots of RAM and you need every last drop, either find an expert at RAM management to get every last bit, or use garbage collection if your programmers are not so awesome.

    And so on and so on...
    • Re:Depends (Score:3, Insightful)

      by ivan256 ( 17499 ) *
      Actually, it seems to me that if you want reliability, maintainability, and perhaps most important, debugability, you want to manage your memory yourself.

      When debugging a program with a leak (yes, garbage-collected programs have leaks too; they're just nastier, because they don't look like bugs: a reference is persisting somewhere), if memory is program-managed, finding the leak is a deterministic process. You're guaranteed success in a well-defined and finite amount of time (the amount of time it takes to audit every allocation against its matching free).
      • Re:Depends (Score:3, Insightful)

        by alienw ( 585907 )
        Are you joking? A large fraction of bugs in software are due to mismanaged memory. This is one of the main reasons Java apps have much, much better reliability than C++ ones. Without a garbage collector, many types of (perfectly legitimate) structures become very difficult to use. When you create objects in one module and give them to someone else, you create bugs. Then you have to come up with some kind of reference counting system anyway.

        Yes, garbage collected programs have leaks too, they're just nastier, because they don't look like bugs because a reference is persisting somewhere.
        • Re:Depends (Score:2, Interesting)

          by Mr. Slippery ( 47854 )

          When you create objects in one module and give them to someone else, you create bugs.

          No, the caller of your module creates a bug when they fail to free the object that you have clearly defined in the interface to be their responsibility. It's no different than any other violation of an interface condition. (If you don't clearly define your interfaces, then yes, you have of course created a bug.)

          That's not a leak, it's sloppy programming.

          Are you saying that leaks are not a form of sloppy programming?

      • Re:Depends (Score:3, Insightful)

        by Spy Hunter ( 317220 )
        I fail to see how following a chain of references to a memory hog is harder than finding a memory leak which has nothing pointing to it at all. In a garbage-collected application, with a proper debugger and profiler you should not have any trouble figuring out exactly what's taking up every byte of your memory, and once you've done that you can easily figure out who has the references to it. I recommend you take a look at Microsoft's awesome CLR profiler; I'm sure a similar tool exists for Java, but it may not be as polished.
        • Re:Depends (Score:3, Insightful)

          by ivan256 ( 17499 ) *
          so C++ can have the same types of so-called "hard to debug" leaks you blame on garbage collection

          I said they become more difficult to debug because of garbage collection. They're certainly not caused by the garbage collection. They're caused (usually) by poor programming.

          Garbage collection is a tool. It makes your job as a programmer easier, but it does not free you from the need to understand things like scope. Just because you don't have to worry about the mechanics of managing your memory doesn't mean you don't need to understand which references are keeping your objects alive.
      • Re:Depends (Score:3, Interesting)

        Actually, it seems to me that if you want reliability, maintainability, and perhaps most important, debugability, you want to manage your memory yourself.

        And try to pinpoint which of the hundred thousand totally unrelated functions has modified my data because it happens to use a bad pointer?

        I had to debug a C program that started crashing after an unused variable declaration had been removed. The reason? - a dangling pointer.

        The program was compiled without any optimization, so the memory for the variable was still reserved on the stack, and the dangling pointer happened to land harmlessly in it; removing the declaration changed the stack layout and exposed the bug.

        • There are so many poor practices involved in that situation you just described that I don't even know where to begin.

          There are dozens of simple rules you can follow when you write C code, any one of which would have prevented that problem. Either way, having your memory managed for you doesn't imply that you don't have access to the raw data anyway. Protection is only enforced in some languages.
    • Re:Depends (Score:5, Insightful)

      by swillden ( 191260 ) <shawn-ds@willden.org> on Tuesday November 29, 2005 @01:03AM (#14135353) Journal

      It depends on what you are trying to make, duh.

      Agreed.

      If you are trying to make something where performance is important, like a 3d game, then manage memory yourself.

      It's not that simple.

      In most cases, the total run-time cost of garbage collection is lower than that of malloc/free memory management, at the cost of higher on-average memory usage (which can obviously destroy performance if you end up having to swap). On the other hand, application-tuned manual memory management using pooled allocation is generally faster than GC. Whether or not pooled allocation increases memory usage as much or more than GC depends on many things. Another consideration is that although GC often consumes fewer total CPU cycles than malloc/free, non-incremental collectors tend to use those cycles in big batches, which can produce GC 'pauses'. That's bad for some applications. Incremental collectors can minimize this effect, but only with some cost in CPU cycles.

      Then there's also the whole issue of the effect of different approaches on the multi-tiered memory caching in modern systems.

      In short: yes it depends on what you're trying to make. No, it's not nearly as simple an analysis as you describe.

      Not only that, in practice other constraints usually dictate the choice anyway. Using GC generally means using something like Java, C#, Python, etc. rather than C or C++, which brings in a whole raft of other considerations, many of them more important than the memory management discussion. Platform, target environment and libraries will often dictate language selection, which will dictate much of memory management approach.

      • This is a common claim, but it is an apples to oranges comparison. No one (including the compiler) dynamically allocates objects in C/C++ when they can place them on the stack instead. Garbage collected languages like Java, on the other hand, require practically everything to be managed on the heap.

        In addition, an array of objects on the heap requires only a single memory allocation in C or C++, where Java has to allocate and track each separately. As one luminary once said, "C++ is better because there i
        • by swillden ( 191260 ) <shawn-ds@willden.org> on Tuesday November 29, 2005 @06:51AM (#14136471) Journal

          No one (including the compiler) dynamically allocates objects in C/C++ when they can place them on the stack instead.

          Are you certain of that? Here:

          void foo()
          {
              // ...
              auto_ptr<Foo> f(new Foo);
              // ...
          }

          What would the compiler do? What *could* it do, if it were smarter? And have you really never seen any code that does this? Or written it?

          Lots of C and C++ programs dynamically allocate many objects that could be stack allocated. In particular, many C++ objects that are placed on the stack immediately allocate storage on the heap. Think std::string. Many programmers do make an attempt to allocate as much on the stack as possible, but I think most don't really consider it. And keep in mind when I say this that I've been writing C and C++ (mostly C++) professionally for nearly 15 years -- I've seen more than a little code.

          Garbage collected languages like Java, on the other hand, require practically everything to be managed on the heap.

          Interestingly, Java does *not* require that at all... it's just the most obvious way to implement it. In fact, I read a while back that the next generation of Java compilers will perform escape analysis, looking for objects whose lifetime is associated with a stack frame. Here's a link [ibm.com]. When they find such an object, it will be allocated on the stack. If such an object creates other objects, as long as the analysis can prove that their lifetimes are also frame-associated, they will also be allocated on the stack.

          The same analysis will often allow Java objects and their sub-objects to be allocated as a single block. Since the compiler can see that the constructor of class Foo always allocates objects of Bar and Baz, all of fixed size, it can allocate a single block, just like a C++ compiler would be able to for a class like:

          class Foo
          {
              // ...
              Bar bar;
              Baz baz;
          };

          The same sort of analysis should also allow your other point to be addressed: An array of objects can be allocated as a single block. The compiler can recognize code like:

          Foo[] f = new Foo[n];
          for (int i = 0; i < n; ++i)
              f[i] = new Foo();

          And allocate a single block that is n*(sizeof(Foo)+sizeof(Bar)+sizeof(Baz)) in size, and if 'f' has a stack-associated lifetime, allocate the whole pile on the stack.

          All of the above is still theoretical, of course, but it's coming quickly.

          That might be acceptable, but the worst part is random application pauses of arbitrary duration for garbage collection. Unless that problem can be resolved, garbage collected languages will always be a poor match for latency sensitive applications, even where the net throughput is otherwise adequate.

          As I pointed out in my previous post, whether or not that problem exists depends on the GC implementation. Incremental GCs keep the pauses small, and there are GCs designed for real-time usage that further guarantee maximum latencies. It's worth pointing out also that normal malloc() and free() implementations don't provide any run-time guarantees. Real-time code that uses a heap uses special versions that do provide guaranteed latencies, at the expense of worse average performance.

    • Actually, while you may want to make certain parts of your game in C for speed purposes, you're probably going to want a simple scripting engine that everyone from designers to artists can use with little additional training... And you can't expect them to deallocate variables intelligently.

    • In the mainframe-based online transaction application environment where I work, each program is given a fixed block of memory to play with. Period. After the program terminates, that memory is usually freed, but often a transaction has either its code or its data memory area(s) locked into core to speed up program load times (it remains resident but idle until the next time the program is activated).

      Transaction programs are event-driven entities, though, and they have very short lifetimes -- they are loaded, do their work, and terminate almost immediately.
  • by 2starr ( 202647 ) on Monday November 28, 2005 @11:09PM (#14134787) Homepage
    In general I prefer having a GC because most of the time I don't want to have to worry about memory management... there's no need. However, sometimes it would really be nice to have more direct control. Not being a VM expert myself, it seems like it should be possible (though I can imagine the types of problems that would arise) to allow specifying that you're assuming manual memory control either over certain objects or while inside a particular context.
    • What I don't understand is why they couldn't have written Java (and .NET) so we could have our cake and eat it too.

      For the most part, GC for memory is a good thing. (I do business apps, so the immediate performance of memory typically isn't a problem.) But why couldn't they give us a default "going-out-of-scope" method? I love the whole C++ constructor/destructor idiom because it makes using the native classes for resource management a breeze. Want a class to wrap a file handle? Sure, no problem, we'll just close the handle in the destructor.

      • But why couldn't they give us a default "going-out-of-scope" method?

        They did.

        protected void finalize()

      • But why couldn't they give us a default "going-out-of-scope" method?

        That would be the "IDisposable" interface and the "using" statement in C#.
        • Or "try/finally" in C# or Java.
          • try/finally isn't automatically run when you go out of scope. Yes, the finally block is always run when you leave the scope, but you must remember to wrap blocks of code with it, and put the proper contents in it. With C++, a destructor is always run when you go out of scope. Always, barring an exception happening during exception handling.

            I wish Java had that, even if I didn't get to clean up memory with it. If only so I could use it to close files that go out of scope. Even if I had to declare the variable in some special way.

      • What I don't understand is why they couldn't have written Java (and .NET) so we could have our cake and eat it too.

        They have. Visual Studio 2005 adds syntax to Managed C++ (C++/CLI) to let you manage object lifetime and memory separately. Herb Sutter has been talking about this for at least a year, IIRC. Dinkumware even made the STL work with it.

        See for instance this article [codeguru.com]. I'm not currently developing on .NET, but I'm hoping that these extensions can be considered at some point for standard C++.

        • Thanks for the link. I saw Kate's presentation on C++/CLI at TechEd earlier this year. It sounded really good, but she also pointed out that if you do any C++ things, your code is no longer "verifiable" (in terms of .NET). It was really cool that she was able to recompile any old C++ program as a .NET assembly (no changes). But the reason those things work is that the code becomes "mixed", with some machine language, some MSIL. The benefits of verifiable .NET assemblies (things like assured correctness of memory management) are lost in that case.
  • by itistoday ( 602304 ) on Monday November 28, 2005 @11:22PM (#14134850) Homepage
    Garbage collection does not equal poor performance. In some instances, it actually speeds things up--when done properly. Take, for example, the D Programming language [digitalmars.com]. It's just as fast as C (faster in some cases) yet it has a garbage collector. The reason is that most programmers tend not to realize that the free() operation actually takes up a decent amount of CPU cycles, and when you're freeing a bunch of little things all over the place, the overhead tends to add up. With a well-designed garbage collector, however, memory is freed all in one big chunk in a single go, thereby decreasing that overhead. The myth that garbage collection = poor performance is just that, a myth, and most likely started by people who associate Java's performance issues with garbage collection.
    • Which is only true when you're dealing with badly written manual memory management. C++ custom allocators avoid that problem, and let the programmer specify the best behaviour for every given situation.
      • by be-fan ( 61476 ) on Monday November 28, 2005 @11:57PM (#14135039)
        In theory C++ custom allocators let the programmer specify the best behavior for any given situation. In practice, very few people use it except for the simple case of pool allocation (which is an optimization you can make in the more sophisticated GC systems). The problem with the C++ mechanism is that it always exposes 100% of the complexity, even in the 99% of the time that you absolutely don't need it.
        • ...and you can ignore it whenever the code profiler shows you that you don't need to care about it, and you can switch allocators extremely late on in the development process.
          • You can't ignore the complexity of manual memory management. You must free all your allocations, and you must police dangling pointers. C++ exposes that complexity all of the time, even though you only need it occasionally, if ever. You can use a smart pointer class, but the more sophisticated of those are simply slow unsafe reference-counting garbage collectors...
            • You can't ignore the complexity of manual memory management.

              A lot of posts in this discussion almost imply that there is 100% manual memory management, or some sort of super-generational-buzzwordy-GC, and nothing in between. That simply isn't the case.

              I write C++ for a living. I work with intricate, graph-like data structures, using performance-sensitive algorithms, with pointers all over the place. And yet I can't remember the last time I had to use the delete operator, nor any sort of super-ref-counting smart pointer.

    • not true (Score:3, Funny)

      by doug ( 926 )
      The myth that garbage collection = poor performance is just that, a myth, and most likely started by people who associate Java's performance issues with garbage collection.
      Sorry, but I've thought GC was slow since the 80s. Java had nothing to do with it. - doug
    • The reason is that most programmers tend to not realize that the free() operation actually takes up a decent amount of CPU cycles, and when you're freeing a bunch of little things all over the place, the overhead tends to add up.

      This depends entirely on the underlying memory manager. Using pooled allocation or other "zone-based" allocators can obviate the hit of these frees. As with many things, it's a tradeoff between the time spent putting a block back on its free list (naive implementation) and storing blocks in zones that can all be released together.

  • Pros and cons (Score:5, Insightful)

    by studerby ( 160802 ) on Monday November 28, 2005 @11:24PM (#14134864)
    As someone who works on long-lived projects with a mid-sized team (a dozen or so developers), I prefer a GC-based language. The biggest pro is the great reduction in memory leaks, closely followed by the productivity increase by not having to think about allocation/deallocation (very much). The biggest con is that far too many "young whippersnappers" seem to think memory allocation/deallocation is therefore "free" in a GC-based language and will take absolutely no care at all about when they allocate (e.g. will allocate a largish object inside a very tight loop instead of allocating it outside and reusing it...). And the 2nd biggest con is that a lot of developers can't believe you can have memory leaks in a GC-based language, won't look for them until you rub their nose in them, and don't really know how to find them when they look.
    • Re:Pros and cons (Score:5, Insightful)

      by metamatic ( 202216 ) on Tuesday November 29, 2005 @12:12AM (#14135106) Homepage Journal
      And the 2nd biggest con is that a lot of developers can't believe you can have memory leaks in a GC-based language, won't look for them until you rub their nose in them, and don't really know how to find them when they look.

      I've always thought that the use of the term "memory leak" to describe resource management problems in Java is a really poor choice, as it's quite a different problem from a memory leak in (say) C.

      Keeping memory allocated and referenced for longer than you need it isn't really a leak, to my mind. It's just bad programming. To me, a memory leak is when you lose the pointers to a piece of allocated memory, so the code is no longer able to deallocate it.

      In other words, your developers might give a better answer if you ask "Are there objects you keep around longer than necessary?", rather than "Are there memory leaks?"

      Or maybe I'm the only one.

      • When a Real Java Programmer refers to a Java "memory leak", (s)he's being a little bit snarky. Real Java Programmers know that there's no such thing as a traditional memory leak in a Java program, so they use the term generically to refer to inadvertently unfreed memory.

        Beware, for by openly objecting to this usage, you open yourself to the Real Java Programmer for characterization as an old-school programmer (the bad, bit-flipping kind) or worse, a n00b in need of a lecture. The best approach, rather, is
      • You can still "leak" in Java, for instance by forgetting to null object references that then keep a huge section of the heap graph lying around without realising it. Common ways to do that include not disconnecting event listeners, and having an object provided by a library refer to one of your objects, not realising that the library object is being tracked in some global variable somewhere.
      • The literature tends to define garbage as any object that cannot affect the results of any future computation. Pointer reachability is simply a useful conservative approximation. There are some proposed GC systems (I don't believe any have been implemented in a real system) that attempt to do better and collect some objects that are still reachable.

        So those objects in your program are garbage, and you do have a memory leak.
      • It's not just you. A lot of people in this discussion are confusing fundamentally different concepts whose implementations often happen to coincide.

        In particular, whether or not something is ever cleaned up is different from whether or not it is cleaned up promptly. Also, releasing memory is not the same as destroying/finalising an object that happens to be stored in that memory.

        Garbage collection addresses exactly one of the four possible combinations: making sure that memory is always released.

        The m

    • On the other hand, it's so cool that "young whippersnappers" can write a perl one-liner that productively creates a several-megabyte hash table these days, as compared to 10 years ago when in the "640kb should be enough for anybody" days, doing anything that took any memory at all was exceedingly painful.
    • ``And the 2nd biggest con is that a lot of developers can't believe you can have memory leaks in a GC-based language''

      If you can, your garbage collector is broken. The whole point of garbage collection is that it reclaims the memory used by objects that can no longer be accessed. If your collector is doing its job, your program can't leak memory.
      • That's a great theory... Unfortunately it relies on a very narrow definition of "memory leak". In the real world, running out of memory 'cause you screwed up is a real problem (and the basis for a set of development products) in Java, and it's not the fault of the GC algorithm. You leak memory in a GC-based language by having things that can be accessed, but shouldn't be (and because they shouldn't be, you usually don't know about it until your big development effort starts crashing with out-of-memory errors).
        • We can argue back and forth about the definition of memory leak, but let's not do that. Instead, I'll argue a different point with you, namely that you can't blame the language for retaining objects that you don't need anymore. These are programming errors, and if programmers keep making them, because they are not aware of the fact that they can, they are just incompetent.

          The article you link to provides a nice example of how to write incredibly bad software. Instead of fixing the problem at its root by put
  • Manual memory management is similar to assembly language in a certain way: everybody should know how to use it, but they should strongly avoid actually having to use it in most cases. Even though I like to write code in C, I still understand the value of garbage collection. This goes back to the old adage: "Programmer time is more valuable than processor time." On the other hand, there are still a few instances where the manual method is the best tool for the job.

    But then, the question is rather amb
  • It all depends (Score:3, Interesting)

    by unr_stuart ( 883885 ) on Monday November 28, 2005 @11:27PM (#14134884)
    I've always had the philosophy, "use what makes the job easiest." Typically, this involves garbage-collection. However, one of the biggest problems I have with garbage collection is that you can't have your cake and eat it too. Meaning, you can get all the memory you want, but you can only access it at a high level (think Objects in Java). In C/C++ however, you can call malloc/new, create a big pool of memory (or just a single object), and then do whatever the heck you want with it. But again, as the subject says, it all depends on which method helps get the job done, and so far neither has been perfect for everything.
  • As a programmer-cum-sysadmin, I vote both. Which has issues all its own...

    When I programmed professionally, I craved the control of memory management. Objects did _exactly_ what they were _explicitly_ told to do.

    Now I'm a ruby junkie, and love the OO, GC, Etc.

    Still, yes, for performance reasons, there are good reasons to do it yourself.

    For programming reasons, there are reasons to go GC.

    All in all, GC tends to be great; I wouldn't want to work without it. But there are times I'm mystified as to why an object left
    • Short run applications generally want GC

      Short run applications don't necessarily need any memory management - since the application is going to exit shortly, you can just leak the memory and let the kernel reclaim it on exit. Might actually be faster that way than wasting time freeing memory that's going to be reclaimed soon anyway.

      After all, if the application is short-run enough, the GC doesn't necessarily have time to run even once.

      Long running, RAM intensive, frequent paging, or frequently sh

  • Check this out. (Score:3, Informative)

    by dslauson ( 914147 ) on Monday November 28, 2005 @11:41PM (#14134961) Journal
    There was a pretty good discussion and article [slashdot.org] on slashdot not too awful long ago about dispelling common myths regarding garbage collection and performance in Java.

    It's definitely worth checking out before people go spouting off the traditional rants against garbage collection.

    Of course, determining which one is best always depends on your application and your available resources, among other things. There are good arguments for both in various situations. I code C++ for embedded devices for a living, which means that I am working with the new/delete/malloc/free model, but for school projects I really like to work with Java, because it lets me focus entirely on implementing an algorithm without having to spend any time thinking about memory allocation or the underlying hardware.


  • aside from the performance issues, predictable memory management can be used for controlling access to files and similar resources, creating safer thread locking code and even providing better error messages.

    This is silly. None of these have any connection to garbage collection; you can write "destructors" in a garbage collected language, and do everything in them just as you would have in a non-GC language.

    The advantage comes from the RAII style of coding, not from the absence of a garbage collector.

    • >> you can write "destructors" in a garbage collected language

      You mean like Java's Object.finalize()?

      The same one that causes significant performance problems fundamental to how GCs work, and is not guaranteed to execute in any specific order, or even at all?
  • C++ and others.... (Score:5, Interesting)

    by try_anything ( 880404 ) on Monday November 28, 2005 @11:48PM (#14135004)
    C++'s constructor/destructor paradigm with predictable object destruction has the benefit of enabling the RAII (Resource Acquisition Is Initialization) idiom. RAII and exceptions greatly simplify resource management in the presence of error handling. Still, even as someone who knows C++ better than I know any other language, I have to admit that for many applications a garbage collected language puts the least mental burden on programmers and produces the fewest memory errors. The burden of arranging all the extra try/finally blocks in Java (because it lacks RAII) has to be weighed against the burden of investigating and fixing memory management errors in C++, and for people using new/delete, Java wins, IMHO.

    C++ programmers should be making very little use of new and delete, though; they should be using smart pointers. I think the article poster misunderstands smart pointers. boost::shared_ptr is a reference counted pointer, but std::auto_ptr and boost::scoped_ptr have nothing to do with garbage collection - they certainly aren't "faked garbage collection" and they certainly aren't unpredictable. They use C++'s object scoping and copying mechanisms to manage memory in a way completely unlike garbage collection. scoped_ptr is the simplest and most predictable memory management tool of all. Taking programmer error into account, it's more predictable than using delete. Even shared_ptr is predictable; when the reference count falls to zero, the object is immediately destroyed, not just marked for destruction.

    Sadly, although C++ is a very powerful language and can be used to write code with few errors, the language as used by beginners is as dangerous as C, perhaps even more dangerous. It takes programmers years to become proficient in all the methods and idioms that make C++ a usable language.

    (I would love to see a language that allows programmers to choose scoped allocation, smart pointer heap allocation, or garbage-collected heap allocation, and uses types to avoid dangerous combinations such as garbage-collected objects pointing to scoped objects or an object pointing to an object in an unrelated scope. Every object would have two types - the object type (int, file, circle, etc.) and the memory management type (scoped with scope S1, scoped with scope S2, garbage-collected, etc.))
    • Why do C++ people use the acronym RAII "resource acquisition is initialization" to talk about when the object is deinitialized? The acronym is just completely wrong, because languages like Java are far more "RAII" than C++ (in C++ you can actually allocate resources without initializing them). It really should be something more like RAIS, "resource acquisition is scope", or LSILS "lexical scope is logical scope", or ODOL "object destroyed on leaving", or RROL "resource released on leaving", or something th
      • Why do C++ people use the acronym RAII "resource acquisition is initialization" to talk about when the object is deinitialized?

        Because the concept is more fundamental than merely "resource release is destruction", although the latter is arguably the most important aspect of it.

        What you're doing in C++ is tying the period between allocation and release of a resource to the lifetime of an object. If you like, the resource-owning class's invariant conditions include the fact that the resource is allocated

    • I've never understood why none of the GC languages (at least none that I know: Java and .NET) allow this pattern to work. It is EXTREMELY useful! Just because the runtime uses GC doesn't mean you couldn't mark an object for immediate disposal when it leaves scope. Both Java and C# have a mechanism for doing this, but it requires explicitly freeing the object (which defeats the purpose of the pattern).
  • usually, I prefer GC (Score:2, Informative)

    by Anonymous Coward

    Most new programming languages are using garbage collection

    You mean like Lisp and Smalltalk? ;-)

    The advantages are obvious: programmers no longer have to worry about forgetting to delete allocated memory, leading to far fewer memory leaks.

    In other words: the computer is perfectly capable of figuring out what to do, so let it! This is almost always the best thing.

    When using a manual memory management language, when do you consider the performance and syntactic overhead of faked garbage collectio

  • GC (Score:5, Informative)

    by Dr. Photo ( 640363 ) on Tuesday November 29, 2005 @12:00AM (#14135054) Journal
    Pros and cons of garbage collection?

    If you don't CONS, you never need to collect garbage. *rimshot*

    More seriously, GC isn't so much about pros and cons, as it is about tradeoffs between the various GC algorithms: time vs. space, low-latency vs. high-throughput, parallelism, etc.

    If you're designing a new language, it should include garbage collection, or nobody will use it (i.e., your target audience can already program in C). You may wish to have multiple GC implementations available for different purposes, perhaps to be selected at compile-time.

    For a good overview of what's available, see http://www.memorymanagement.org/ [memorymanagement.org]

    My personal favorite is the good old Cheney semi-space collector (and Ephemeral/Generational Garbage Collectors, which are more advanced versions designed to generally have low latency), as it is very straightforward (both to understand and to implement), compacting (it defragments memory, and can perhaps improve cache locality by grouping related objects), and it has high throughput (work is proportional to the amount of live data, not total data).

    If memory usage is of more concern than fragmentation and throughput, a mark-sweep collector may be more your style.

    There are also "real-time" (and "soft-real-time", i.e. bounded latency [see Henry Baker's Treadmill]) collectors, parallel collectors [including an interesting case for reference counting, usually considered a dog performance-wise, as a viable parallel/remote GC method], "conservative" collectors for C/C++ (see Hans-J Boehm's libgc), collectors for real and hypothetical computers with special hardware and/or OS support for GC features, and some collectors that are just plain weird.

    Note also that garbage collection algorithms are considered hard to measure for performance, especially with regard to wall-time latency, so even if a paper(*) claims that a certain GC has certain performance characteristics, be sure to benchmark if it really matters.

    (*) Did I mention papers? If you're serious about implementing GC, getting comfortable reading CS research papers is a must. The book "Garbage Collection" [kent.ac.uk] is your best friend here, as it provides a very good overview/survey of said papers and algorithms, and it discusses a lot of pros and cons between various algorithms, and useful variants or adaptations that have been applied to previously-published work.

    Also check out Henry Baker's papers, because he is a memory management demigod: http://home.pipeline.com/~hbaker1/home.html [pipeline.com].

    • VM aware GC (Score:3, Interesting)

      by renoX ( 11677 )
      A paper I've found interesting is on a GC which communicates with the virtual memory system to avoid putting too much load on it.
      It needs a modification of the VM, but IMHO this is better than having to hand-tune the memory used by the GC. (Note: I'm not an expert in GC)

      http://www.cs.umass.edu/~emery/pubs/04-16.pdf [umass.edu]
  • by mccoma ( 64578 ) on Tuesday November 29, 2005 @12:21AM (#14135147)
    Apple / NeXT takes a reference counting approach [apple.com]. It is not automatic, but it works well once you understand the rules [apple.com].
    • As an old Obj-C coder, let me respectfully disagree with this. Reference counting does not handle cycles: that's a huge flaw. It forces Apple to promote a notion of "ownership" of object graphs which only works on a small scale. It does not work well: it merely reduces what you need to keep track of (but doesn't eliminate it) in return for a considerable amount of manual labor.
  • by GileadGreene ( 539584 ) on Tuesday November 29, 2005 @12:28AM (#14135180) Homepage
    All of the reasons given for manual memory management seem to boil down to a desire to have support for the Resource Acquisition Is Initialization (RAII) idiom, which is hard to pull off in GC languages. But, the alternative idiom Resource Acquisition Is Invocation [c2.com] provides the desired capability in GC languages. Same capability, no chance of memory leaks. So tell me again why manual memory management might be a good idea?
  • RAII techniques like mutex locks and the error message stuff can be implemented without deterministic collection using something like:

    void AcquireAndRun(Resource r, Function f) {
        r.Acquire();
        try { f(); } finally { r.Release(); }
    }

    void doSomething(Resource r) {
        AcquireAndRun(r, lambda () {
            // Here, r has been acquired.
            read(r);
            write(r);
        });
        // Once we get here, r has been released.
    }

    • Of course, this will be hard to use unless your language supports closures. Sadly, most imperative languages (e.g. Java) do not. But, hey, I'm a functional language person so I'm not going to try to defend them.

      Java supports closures just fine, as objects. A Java anonymous class is a closure. If that really offends you, try calling them multi-closures, since they combine multiple functions with a lexical scope instead of being limited to just one function.

      In some languages you have a closure of just one f
      • Yes, anonymous inner classes work... but they are awfully cumbersome to use.

        they combine multiple functions with a lexical scope instead of being limited to just one function.

        I prefer to add such complication only when it is actually needed. In a typical functional language you can always group a bunch of closures into an aggregate if that's what you want.
  • by Pseudonym ( 62607 ) on Tuesday November 29, 2005 @12:39AM (#14135233)

    The answer, as always, is "it depends". I'm firmly inside the "right tool for the job" camp.

    Manual memory management is not free. In some circumstances, it can be quite expensive. There is a group of programmers who are best described as "rabidly anti-GC". These people are almost all completely unaware of the costs that manual memory management can impose on your code.

    A multi-threaded program, for example, can allocate memory from any arena, but it MUST return a block to the arena from whence it came, which can cause all sorts of difficult lock contention problems, making free() much more expensive than malloc(). (Ask anyone who has written high-performance memory-intensive multi-threaded programs.)

    In some languages, like C, the situation is even worse. In structure-hungry programs, you can end up structuring your code around data lifetimes, which precludes you from using the most natural, maintainable and efficient algorithms. Garbage collection frees you from this, as the GCC people have discovered.

    I do recommend reading Paul Wilson's excellent survey paper [utexas.edu] on the topic. It answers a lot of your questions, though it's by no means the final word.

  • The original post provided 3 examples of the supposed utility of programmer-controlled memory management and treated them as opposed to automatic garbage collection, but none of them served the original poster's argument: They were all examples of automatic variables being used with RAII, not to manage heap memory, but to manage non-memory resources by lexical scope. Cluestick sez: Languages with GC still have stacks. *Whack*.
  • GC is DRY (Score:2, Interesting)

    by PBPanther ( 47660 )
    Not using GC requires you to write code to free resources repeatedly. That goes against the DRY (Don't Repeat Yourself) principle.

    I wonder how many of the people who use the "C++ model" bother to unit test that they have freed all their resources.
  • C has problems too (Score:3, Informative)

    by countach ( 534280 ) on Tuesday November 29, 2005 @01:40AM (#14135496)
    C memory management is not completely deterministic either, since a fragmented heap will not always take the same amount of time to allocate in. To make it completely deterministic you would have to pre-allocate objects. But if you're going to do that, you could do it in a GC language and turn the GC off.
    • by be-fan ( 61476 ) on Tuesday November 29, 2005 @04:22AM (#14136056)
      It's frightening the illusions programmers have about manual memory management. They seem to think that malloc() and free() are cheap functions, when in reality they can take hundreds of clock cycles. They think that malloc() is deterministic, when in reality, a badly fragmented freelist can cause most malloc() implementations to traipse through the entire heap, just like a GC.

      The weirdest thing is C++ programmers. They freak out about every single cycle, but modern C++ idioms push the use of smart pointers, which are usually quite slow compared to a good generational GC.
  • I had this job interview where I was asked if I started a new project right now what programming language would I write it in.

    I said "That would of course depend heavily on the project"

    This got me the job, because I was the only person who didn't answer "Java" or "PHP" - a clear indicator that the prospective employee was either feeding you the line they'd gotten in CSCI 102 at the local university, or reacting against that line.

    The same thing about garbage collection. Come on, if I'm writing a web applicat
  • by Anonymous Coward
    Having used languages with and without garbage collection, my view is that
    garbage collection is often very nice... but I don't really mind the "lack" of garbage collection in C and especially don't miss it in C++.

    My opinion is that it takes some effort on the programmer's part to learn to use C safely. I'm not sure why, but this answer seems to surprise some people. Do they seriously expect that in the real world-- of software or of anything else-- they should be able to pick up any tool they want and u
  • by g-san ( 93038 ) on Tuesday November 29, 2005 @03:22AM (#14135891)
    I prefer garbage collection. At most, I take the cans to the edge of the driveway and some guy in a noisy truck with a cool robotic arm just hauls it away. Yeah, there is a landfill somewhere that isn't good for the overall environment but I accept that tradeoff. I also don't throw old car batteries into the trash.

    Sure the hell beats me keeping the trash around, remembering where it is, and putting it in my truck and hauling it to the heaping landfill myself. I'm not here to manage trash, I'm here to get something done.

    Is this post about programming?
  • False dichotomies (Score:4, Interesting)

    by Eric Smith ( 4379 ) * on Tuesday November 29, 2005 @04:03AM (#14136010) Homepage Journal
    Some of the cited advantages of not using garbage collection are red herrings. For instance, the "controlling access to files and similar resources" by RAII [wikipedia.org] works fine with garbage collection. In most cases, the compiler can determine by static analysis that a particular object is allocated within a scope and no references are propagated upward out of scope, and can remove the reference so the garbage collector will deallocate it (possibly calling a destructor). Depending on the type of GC and its implementation, the compiler may generate code that forces the object to be deallocated immediately.

    For cases where static analysis can't do this automatically, it isn't that hard to use a design methodology that achieves the same result; it's certainly still much easier than doing manual allocation and deallocation and ensuring that the deallocation is done (or not done) correctly in all cases.

    And if you are using a reference-counting GC, or a hybrid GC that includes reference-counting, you don't have to do anything special at all.

    The same applies to the claimed mutex and error message disadvantages, since those are just specific uses of RAII.

  • by samjam ( 256347 ) on Tuesday November 29, 2005 @04:27AM (#14136069) Homepage Journal
    Right now someone I know is trying to track down a Java memory leak.

    No doubt some reference is left in a persistent collection of some sort (hash, list, array, etc)

    Just as C/C++ programmers must remember to free when done, so Java programmers must remember to undo such "life maintaining" references when they are done.

    Sam
  • Successful programs evolve over time.
    So they get refactored. Classes get reused in unexpected places. References to objects are kept in places where it was not anticipated. Calling delete at the old point is now inappropriate, as it can't take the new references and the changed lifetime into account.

    So the memory management needs to get refactored just because you "reuse" a class?

    Simple example (controversial, because it shows where GC leads to problems also :D ):
    For some reason you implement a cache for a ce
  • For short-lifetime, stateless programs, like queries, scripts, macros, and such, GC is fine because it has a finite end, at which everything can be cleaned up together.

    But longer running programs which launch other programs, like root processes and server processes, might hang around long enough to run out of memory.

    To me the crux is, how does a garbage collector itself allocate memory? Somewhere down the line something has to keep solid track of resources, GC is an option for many subsystems,
