Commenting and Documentation in Free Code?

ckotchey asks: "Being new to the Linux/GPL world over the last few months, I'm amazed at the lack of informative comments that I'd like to find at the beginning of each source file, and within the code itself. At a minimum, I'd like to see summary lists of functions, parameters, return values (and their meanings), etc., that are within the file I'm trying to dissect, along with some descriptive comments within the code to help understand what is happening. Without such comments, it seems counter-productive to the whole open-source concept of allowing others to see the code, understand the code, and fix/enhance the code (at least, in a timely manner!). Should there be, at a minimum, some sort of 'commenting standard' to be (at least voluntarily) followed by developers in the Linux/Open Source community?"

  • by jaa ( 22623 )
    has standards, here [fsf.org] and here [fsf.org]. Most of the Open Source projects have similar directives.
  • Stefan from Berlin [sourceforge.net] has been working on Synopsis [sourceforge.net].
  • A few hints can be found in various code style guides, but the only way to learn to write good code and comments is to read good and bad examples of other people's code, and to practice.
    One such guide, Rob Pike's [bell-labs.com] Notes on Programming in C [lysator.liu.se], includes examples of good and bad styles.
  • by rjh ( 40933 ) <rjh@sixdemonbag.org> on Monday August 28, 2000 @01:38PM (#821108)
    In response to your first question (ought there to be a unified commenting standard), the answer is a loud, resounding "no". There is no single standard for documentation; the best you can say is there ought to be a standard for applications written in C, kernel modules written in C, <type of app> written in <language>, etc., and that's not a very satisfying answer.

    There are some very good guidelines to follow, though. Guidelines are different from a standard in that they only prescribe general rules and principles, without any attempt to enforce strictness--the assumption being that the programmer is smart enough to know when to deviate from the guidelines.

    I'll present my own guidelines for good comment practice here:

    0. ENSURE APPROPRIATENESS
    Writing your own code to descramble CSS is admirable, but that's not the place to include anti-MPAA screeds. Similarly, the documentation for function foo ought to talk about foo, not function bar, unless there are some states of bar which affect foo's behavior.

    1. MAKE IT CONCISE
    When I write code, I generally write two documents; the first is the source file (with comments), and the second is the design file (in SGML). The former has concise comments; the latter is where I go into detail as to what considerations went into the design.

    Nobody wants to read a thirty-line comment in a source file. If a function is so complex as to require that degree of verbosity, it's a sure bet that you've made the function too Byzantine.

    2. VERSION-CONTROL YOUR COMMENTS.
    More times than I can count, I've been misled by comments. The original coder may have commented wonderfully, but subsequent coders made big revisions and never updated the comments or the design document. Something as simple as "Written 06 Jan 2000, last revised 01 Mar 2000" can be a great help... if your code has a datestamp of August 2000 and the last time the comments were revised was in March, that's a strong hint the comments may be out of date!

    3. LEAVE BREADCRUMBS FOR FUTURE DEVELOPMENT.
    If in your design document (if you have one) you say, "it'd be nice if we could include foo", then leave a breadcrumb in your code saying "I think foo could fit best in here". It doesn't have to be about functionality; it can be something as simple as "let's see about cleaning this function up, it's icky and slow".

    Remember that comments are not written for you; they're written for the people who come after you (who, often as not, are you--just a couple of years older). Use comments to give them hints as to where known bugs exist, where things can be fixed, where bottlenecks are and so on.

    4. DON'T OVERCOMMENT
    Nothing annoys me quite so much as bondage-and-discipline documentation styles which require that every variable in a function be described. One-letter identifiers like i, n, j and so on do not need descriptions. If you're using them to do anything more than mere placeholding--keeping track of state for a loop, for instance--then you're using them incorrectly.

    Your code ought to be clear and straightforward enough that you don't need to write reams of comments saying "the variable i stores state for foo which is passed through three levels of pointer indirection to bar".

    5. USE TAGS
    Tags are useful because they make your comments indexable. Try arranging your bugs as BUG_WISHLIST, BUG_NORMAL and BUG_CRITICAL. That makes it simple to search your code for bugs you already know about.

    This sounds worthless, but it's surprisingly handy. The more you program, the more tags you'll find. The important thing is that you keep them reasonably short and consistent.
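
A minimal sketch of what "indexable" could mean in practice, using the three tag names from guideline 5. The function name `index_tags`, the regular expression, and the sample C-style comments are all hypothetical, chosen only for illustration; the original post does not describe any particular tool.

```python
import re
from collections import defaultdict

# Tag names come from the guideline above; the pattern itself is an assumption.
TAG_PATTERN = re.compile(r"\b(BUG_WISHLIST|BUG_NORMAL|BUG_CRITICAL)\b[:\s]*(.*)")


def index_tags(source: str) -> dict:
    """Map each tag to a list of (line number, note) pairs found in `source`."""
    found = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = TAG_PATTERN.search(line)
        if match:
            tag, note = match.groups()
            # Drop a trailing C-style comment terminator, if there is one.
            note = note.strip().removesuffix("*/").strip()
            found[tag].append((lineno, note))
    return dict(found)


# Hypothetical input: a few lines of C with tagged comments.
SAMPLE = """\
/* BUG_CRITICAL: buffer is not bounds-checked */
int copy(char *dst, const char *src);
/* BUG_WISHLIST: support wide characters */
"""

for tag, hits in index_tags(SAMPLE).items():
    for lineno, note in hits:
        print(f"{tag} line {lineno}: {note}")
```

A plain `grep BUG_CRITICAL` does the same job for a quick look; the point of consistent, short tags is that either approach works without guessing at spellings.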
  • Should there be, at a minimum, some sort of 'commenting standard' to be (at least voluntarily) followed by developers in the Linux/Open Source community?

    Perhaps this would help those who are trying to read and understand the code, but the problem is that each different coder has their own styles and practices. This includes commenting -- some don't comment at all, some write long, verbose descriptions of what they are doing, and some are in between. You could ask them to follow a standard such as this, but some coders may think that their style is better and thus refuse.

  • It would be nice to see some of the open/free software packages move to a literate programming environment like web (or better yet noweb [harvard.edu]). The code is the documentation, or alternately, the documentation is the code with these systems. I.e. LaTeX + Code in one file. If you've never coded this way, you should definitely give it a try.

  • Personally, I don't mind documentation so much -- call me a nut. But I KNOW that most people don't enjoy doing lots of documentation; at most, a section of code will be documented just enough that the author knows what's going on.

    Since most OSS is developed in people's "free time" -- not all of us are lucky enough to be paid for developing OSS -- it makes the most sense to maximize the "fun" and minimize the "work". Documentation = "work". Code = "fun"...

    do the math...
  • by Anonymous Coward
    Nobody wants to read a thirty-line comment in a source file.

    Not unless they want to understand the code, no. Your advice is totally against the principle of self-documenting code. 30-line comments are necessary in ALL source files; at least the first 30 lines of a file should be comment.

  • In my coding experience, there is nothing more annoying than excessive comments--with the possible exception of wrong comments. And therein lie the perils of your suggestions.

    Wrong comments are an obvious plague, especially in free software where contributors may not have much stake in long-term maintenance. The saying goes: if the code and comments disagree, both are probably wrong. Excessive comments are a more subtle issue, at the very least because "excessive" is a judgement call. One thing is clear: with more comments, you can fit less code on the screen or page, which is a penalty not to be ignored (I think this is part of the reason for the productivity of high-level languages). More subjectively, I tend to find that people who write copious comments can't express their ideas concisely, or aren't clever enough to express their meaning directly in code. In many cases, I would rather the author work on code quality than commenting.

    When working with other people's code, there is nothing I prefer over code that is so clear, concise, and well organized that I don't notice a lack of comments. Unfortunately, this isn't possible for all problems. I recall an anecdote from Donald Knuth in which he claims he would have been unable to successfully implement an algorithm without literate programming, and I believe it.

    However, I was hacking on a piece of free software (vgetty) this weekend, and when I read this article, I realized I couldn't remember whether the code had many comments. I just checked, and found that (the parts I worked with) did not (although the debug output was a partial substitute). But I never missed them, because the code just made sense internally and fit into the larger system in an obvious way.

    One final point. At advogato [advogato.org], there is a discussion of how to encourage more contribution to existing projects, instead of new projects. What I think was missed there is that an influx of casual novice programmers isn't necessarily a boon to most projects (bug reports and fixes, maybe; new content, less likely). Linux is good largely because it is optimized for sharp programmers who are willing to study the design (and the development process). This doesn't mean eschewing documentation (the in-line documentation system in 2.4 is encouraging); it means placing less value on beginner-level documentation. This is, like all things, a trade-off, and I think a good one. Ability to understand kernel design is a good test of who I want futzing with my kernel!

  • Well, the simple answer is obvious: change the colour of comments to the background colour...
  • More important is writing good code in the first place. I've seen both really good code and really awful spaghetti code. Really good code is so clear and understandable that it almost doesn't need documentation, while no amount of documentation could possibly improve badly written code. I've also seen decent code that has been polluted by useless comments such as:

    x++; // add 1 to x

    Even if it was possible to impose a set of style guidelines on the Open Source community, I wouldn't want to. Good code, and good documentation in the code is a fine art. A few hints can be found in various code style guides, but the only way to learn to write good code and comments is to read good and bad examples of other people's code, and to practice.

  • Not to say that programmers shouldn't comment the code when they write it, but don't forget, this is open source. There's nothing saying you can only contribute code. Next time you are poring over some non-documented code and figure out what it does, why not add some comments about it and send the diff back to the author?
  • In my coding experience, there is nothing more annoying than excessive comments--with the possible exception of wrong comments.

    How about comments full of unfathomable abbreviations, further complicated by being derived from an unknown language (i.e. not English), as spoken natively by one of the four developers of the project? Not knowing which one of the developers was responsible for this comment doesn't help!

    The main reason why it took around a year for UMSDOS support to get fixed during the 2.1.x Linux kernel series was that the code was commented in such a way, and the original developer was no longer maintaining it.

  • Why is a several page proof necessary for a 10 line function? Should the function not be self explanatory if it's that short?
  • Hmm... maybe I just ain't so smart. I do at least a LITTLE commenting in everything I do - fun or not - to help me remember what's going on next time. It doesn't take a lot of time to provide a simple list saying what the parameters are, what they do, and what the return values are - sometimes I can't decipher whether a returned zero is good or bad!
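
The "simple list" of parameters and return values asked for above might look something like the following sketch. The function `find_user` and its behaviour are hypothetical, invented here only to show a case where a returned zero is a valid result, not an error.

```python
def find_user(names: list, wanted: str) -> int:
    """Return the position of `wanted` in `names`.

    Parameters:
        names:  user names to search, in order.
        wanted: the exact name to look for (case-sensitive).

    Returns:
        The zero-based index of the first match. Note that 0 is a
        valid "found" result here, not a failure code.
        Returns -1 when `wanted` is not present.
    """
    for index, name in enumerate(names):
        if name == wanted:
            return index
    return -1
```

Three lines of "Returns" text are enough to settle the zero-is-good-or-bad question without the reader having to trace the loop.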
  • I'd expect comments and documentation to be *better* in free software than anything written for work.

    I derive pleasure from writing quality, beautiful code. Beautiful code comes from the algorithm and the degree to which it solves its problem, right down to its formatting and clarity.

    Code without sufficient documentation is not complete and is not of high quality.

    At work these values are often lost for the sake of something "good enough," but with free software the priorities are different.
  • Why is a several page proof necessary for a 10 line function? Should the function not be self explanatory if it's that short?

    Not necessarily. For example, it's still an unsolved problem as to whether a program like this terminates for all positive integer values of i:

    # Collatz iteration: halve even values, map odd values to 3*i + 1.
    $i = $ARGV[0];
    while ($i != 1) {
        ($i % 2 == 0) ? ($i /= 2) : ($i = 3 * $i + 1);
    }
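
For experimenting with the loop above without risking a run that never ends, here is a hedged Python sketch of the same iteration with a step cap. The function name `collatz_steps` and the `max_steps` parameter are assumptions added for illustration; they are not part of the original Perl.

```python
from typing import Optional


def collatz_steps(start: int, max_steps: int = 10_000) -> Optional[int]:
    """Count iterations until the value reaches 1.

    Returns None if `max_steps` iterations pass without reaching 1.
    That outcome says nothing about termination; it only means the
    cap was too small for this input.
    """
    if start < 1:
        raise ValueError("start must be a positive integer")
    value, steps = start, 0
    while value != 1:
        if steps >= max_steps:
            return None
        # Same rule as the Perl one-liner: halve if even, else 3n + 1.
        value = value // 2 if value % 2 == 0 else 3 * value + 1
        steps += 1
    return steps


print(collatz_steps(6))   # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
```

The cap is exactly the kind of thing a short comment should explain: the code is a few lines long, yet nobody can prove the uncapped version always halts.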
  • But without comments, I must spend an hour or two reviewing the code, just to get a rough understanding of how it all works. Pardon me, but that is time wasted.

    No, it's code review, and most people would agree that most code doesn't get enough review, so it's hardly time wasted. Plus, now you have a deeper understanding of the code. Maybe not your top priority at the time, but a long-term benefit.

    Further, do you know how much it sucks to maintain code that has had many changes made by people who looked at the code just long enough to make their one change?

    Quality comments are a wondrous thing: comments that provide a roadmap to the code, an overview of the data structures and how they fit together, clear explanations of the subtle operations, and warnings of the pitfalls. But quality comments take more thought to write and are more likely to fall out-of-date (since they're high level and don't map directly to the bits of code that might be changed), so they require talented and conscientious programmers. Most people I've known who have been told to comment heavily leave poor comments, which I feel are worse than useless as I argued before.

    I've heard time and time again that good code should document itself. That's purest bull.

    It's bull that all good code should document itself. I claim that a lot of it should. (Maybe not even most; it's hard to measure code space.)

    I'm aghast that anyone would be advocating less comments in code.

    I'm not advocating fewer comments, per se. I'm suggesting that emphasizing comment quantity and uniformity is not the best route to overall better software. The fact that it seems like a good idea at first blush makes it more insidious.

  • would it be considered "rude" to comment the code and send it back to the original developer in hopes that he'd include it in his next release?

    On the contrary! I think that any developer would be tickled pink to know that you had been reading his code. (It's open source for a reason!) So long as you're polite about it, I think that any developer who believes in open source would appreciate comments, suggestions, re-writes, additions, beautifications, and questions about what they've done.

  • A follow-up question I've thought about is... Let's say I download the source to a particular package, finally decipher it myself - would it be considered "rude" to comment the code and send it back to the original developer in hopes that he'd include it in his next release? I suppose in the end it's up to whose code it is - and whether he sees it as helping him or insulting him. ;-)
  • I did this, commenting open-source code, writ large with my book Linux IP Stacks Commentary for the 2.0.34 TCP/IP code. One reason I did it as a book instead of in "open source" is that the work required was huge...and I needed to eat. As it was, it consumed almost a year of my life, and the advance hasn't covered the cost of living during the time the book was being written. I won't talk about royalties to date...

    To be honest, the book format was a mistake -- the Linux code is moving altogether too fast for a book to be useful when published. It was what Coriolis offered, though, and I did it.

    The job of code documentation done right is neither easy nor quick, and as a coder myself I can understand why people hate the job.

    That said, my co-author and I will be starting over with 2.4.0 [fluent-access.com] (as soon as I get permission to republish the code from Linus) and do it on the Web...but not for free. To do it right, I also need to develop a revenue model that makes the project pay for itself...and I don't think a book publisher will be willing to do that.

  • In my college coursework it was not uncommon for me to write a several page proof for a 10 line function. When I code something that should be proven, I have found that putting the proof of an algorithm in front of the function that implements that algorithm is a great practice. In such cases, having them in the same place makes it easy to detect errors in one or the other.
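
As a small illustration of keeping a proof next to the function it justifies, here is a sketch using Euclid's algorithm. The choice of algorithm is an assumption made for this example; the poster does not say which functions were proven.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers (Euclid).

    Proof sketch, kept beside the code so either can be checked
    against the other:

    Invariant: the set of common divisors of (a, b) is unchanged by
      the step (a, b) -> (b, a % b). Write a = q*b + r with r = a % b.
      Any d dividing a and b divides r = a - q*b; any d dividing b
      and r divides a = q*b + r.
    Termination: 0 <= a % b < b, so the second value strictly
      decreases each iteration and must reach 0.
    Result: when b == 0 the common divisors of (a, 0) are the
      divisors of a, so the greatest is a itself.
    """
    if a < 0 or b < 0:
        raise ValueError("arguments must be non-negative")
    while b != 0:
        a, b = b, a % b
    return a
```

If someone later changes the loop body, the invariant line is the first place a mismatch would show up, which is the benefit described above.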
  • Maybe we should start a Government forced "code commenting system", call it "Social Coding Security"

    Nah. Someone will write a DeSCS program which will be the cause of an unending war on slashdot.
  • This is how the open source world flourishes.

    Be sure to post the comments back where we can all get them.

    IMHO, code documentors are just as important to the cause as the folks who come up with the stuff, and need to be lauded about the same.

  • by stubob ( 204064 )
    Is there a plugin (possibly for emacs) that would allow you to hide/unhide comments? Editors already handle coloring and all that stuff for comments, so how hard would it be to show them to people who want to see them, but hide them from those who know/don't care? I'm an anti-commentor, but mostly because I think it clutters up code. I know this doesn't help with people who don't comment in the first place, but maybe more people like me would do it if they didn't have to look at it.

  • ...is "Enough Rope to Shoot Yourself in the Foot" by Allen Holub. He has many things to say about programming style in general and he devotes an entire chapter to formatting and documentation. I find I actually agree with most of what he has to say. To me, the cardinal rule of commenting is:
    In a comment, explain not what the code does, but why it does what it does.
    In other words, assume your reader knows the language the code is written in. Explain to him your thought processes as you were writing the code. I have found many of my own bugs this way before the code was even compiled.

    Another point that Holub makes is that whitespace is a (very effective) form of comment. I like to group several lines together that are logically related and separate them from the rest of the code with a blank line. Think of them as paragraphs. Spaces before and after operators make the code read better as well, e.g., y = a + b;

    Of course you can get too hung up on this stuff too. I'm sure holy wars have been waged over K&R braces vs. indented braces vs. outdented braces, not to mention how many spaces to indent (correct answer: 4 *g*). The important point is to pick a style and go with it; be consistent.
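
The "why, not what" rule and the whitespace-as-paragraphs point can be seen side by side in a short sketch. The function `retry_delay`, its constants, and the scenario are hypothetical, made up only to contrast the two comment styles.

```python
def retry_delay(attempt: int) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # A "what" comment would read: "multiply 0.5 by 2 to the power
    # of attempt". That only repeats the code.
    #
    # The "why": double the wait after each failure so a struggling
    # server gets progressively more room to recover.
    delay = 0.5 * 2 ** attempt

    # Why the cap: during a long outage the caller should never be
    # left waiting for minutes on a single retry.
    return min(delay, 30.0)
```

The blank line before the cap separates the two ideas the way a paragraph break would, and the spaces around the operators follow the y = a + b advice.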
  • Ok, so I've been offline for weeks now. This is an old post and no one will probably read my response... bah! it's hardly worth the effort of typing this. Or this.

    Stefan's software makes commenting code easy. It separates the documentation from the kludges of comment lines. I've been using it and it's quite nice, really.

    I wasn't responding to the root post. I was responding to the post I responded to.
