Commenting and Documentation in Free Code?
ckotchey asks: "Being new to the Linux/GPL world over the last few months, I'm amazed at the lack of informative comments that I'd like to find at the beginning of each source file, and within the code itself. At a minimum, I'd like to see summary lists of functions, parameters, return values (and their meanings), etc., that are within the file I'm trying to dissect, along with some descriptive comments within the code to help understand what is happening. Without such comments, it seems counter-productive to the whole open-source concept of allowing others to see the code, understand the code, and fix/enhance the code (at least, in a timely manner!). Should there be, at a minimum, some sort of 'commenting standard' to be (at least voluntarily) followed by developers in the Linux/Open Source community?"
FSF (Score:1)
Auto document0r (Score:1)
Re:I agree - good documentation is important but.. (Score:2)
HOWTO: Comment (Score:4)
There are some very good guidelines to follow, though. Guidelines are different from a standard in that they only prescribe general rules and principles, without any attempt to enforce strictness--the assumption being that the programmer is smart enough to know when to deviate from the guidelines.
I'll present my own guidelines for good comment practice here:
0. ENSURE APPROPRIATENESS
Writing your own code to descramble CSS is admirable, but that's not the place to include anti-MPAA screeds. Similarly, the documentation for function foo ought to talk about foo, not function bar, unless there are some states of bar which affect foo's behavior.
1. MAKE IT CONCISE
When I write code, I generally write two documents; the first is the source file (with comments), and the second is the design file (in SGML). The former has concise comments; the latter is where I go into detail as to what considerations went into the design.
Nobody wants to read a thirty-line comment in a source file. If a function is so complex as to require that degree of verbosity, it's a sure bet that you've made the function too Byzantine.
2. VERSION-CONTROL YOUR COMMENTS
More times than I can count, I've been misled by comments. The original coder may have commented wonderfully, but subsequent coders made big revisions and never updated the comments or the design document. Something as simple as "Written 06 Jan 2000, last revised 01 Mar 2000" can be a great help... if your code has a datestamp of August 2000 and the last time the comments were revised was in March, that's a strong hint the comments may be out of date!
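As a rough illustration of the datestamp check described above, here is a minimal Python sketch. The function name comments_look_stale and the exact header wording it parses are assumptions made up for this example, not an established convention.

```python
import re
from datetime import date, datetime

# Hypothetical header format, modelled on the example in the post:
#   "Written 06 Jan 2000, last revised 01 Mar 2000"
REVISED = re.compile(r"last revised (\d{2} [A-Za-z]{3} \d{4})")


def comments_look_stale(header: str, code_date: date) -> bool:
    """Return True if the code's datestamp is later than the comment revision date."""
    match = REVISED.search(header)
    if match is None:
        # No revision stamp at all: nothing to compare, so report no finding.
        return False
    revised = datetime.strptime(match.group(1), "%d %b %Y").date()
    return code_date > revised
```

A True result is only a hint that the comments deserve a second look; it says nothing about whether they are actually wrong.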
3. LEAVE BREADCRUMBS FOR FUTURE DEVELOPMENT
If in your design document (if you have one) you say, "it'd be nice if we could include foo", then leave a breadcrumb in your code saying "I think foo could fit best in here". It doesn't have to be about functionality; it can be something as simple as "let's see about cleaning this function up, it's icky and slow".
Remember that comments are not written for you; they're written for the people who come after you (who, often as not, are you--just a couple of years older). Use comments to give them hints as to where known bugs exist, where things can be fixed, where bottlenecks are and so on.
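A breadcrumb can be as small as a line or two. The sketch below is only an illustration in Python; the function and the "weighted mean" idea are hypothetical stand-ins for the "foo" mentioned above.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # Breadcrumb: a weighted mean could fit best in here, as an optional
    # second argument, if the design document ever calls for one.
    # Breadcrumb: known limitation, an empty sequence raises ZeroDivisionError.
    return sum(values) / len(values)
```

Neither comment describes what the line of code does; both tell the next maintainer where the soft spots are.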
4. DON'T OVERCOMMENT
Nothing annoys me quite so much as bondage-and-discipline documentation styles which require that every variable in a function be described. One-letter identifiers like i, n, j and so on do not need descriptions. If you're using them to do anything more than mere placeholding--keeping track of state for a loop, for instance--then you're using them incorrectly.
Your code ought to be clear and straightforward enough that you don't need to write reams of comments saying "the variable i stores state for foo which is passed through three levels of pointer indirection to bar".
5. USE TAGS
Tags are useful because they make your comments indexable. Try arranging your bugs as BUG_WISHLIST, BUG_NORMAL and BUG_CRITICAL. That makes it simple to search your code for the bugs you already know about.
This sounds worthless, but it's surprisingly handy. The more you program, the more tags you'll find. The important thing is that you keep them reasonably short and consistent.
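To show why tags make comments indexable, here is a minimal Python sketch that scans source text for the three tags named above. The helper name find_tags and the sample source are assumptions invented for the example.

```python
import re

# Tag names taken from the post; any short, consistent set would do.
TAG = re.compile(r"\b(BUG_WISHLIST|BUG_NORMAL|BUG_CRITICAL)\b")


def find_tags(source: str):
    """Return (line_number, tag, line_text) for every tagged line in source."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in TAG.finditer(line):
            hits.append((number, match.group(1), line.strip()))
    return hits


# Hypothetical source file used only to exercise the search.
SAMPLE = """\
def parse(buf):
    # BUG_CRITICAL: crashes on empty input
    return buf[0]

# BUG_WISHLIST: accept unicode input
"""
```

Calling find_tags(SAMPLE) reports the tagged lines 2 and 5. In day-to-day use a plain grep -n BUG_CRITICAL over the source tree does the same job; the point is only that a consistent tag turns a comment into something you can search for.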
Different coders = diff. styles (Score:2)
Perhaps this would help those who are trying to read and understand the code, but the problem is that each different coder has their own styles and practices. This includes commenting -- some don't comment at all, some write long, verbose descriptions of what they are doing, and some are in between. You could ask them to follow a standard such as this, but some coders may think that their style is better and thus refuse.
web/noweb would be nice..... (Score:2)
Least Fun (Score:2)
Since most OSS is developed in people's "free time" -- not all of us are lucky enough to be paid for developing OSS -- it makes the most sense to maximize the "fun" and minimize the "work". Documentation = "work". Code = "fun"...
do the math...
Re:HOWTO: Comment (Score:1)
Not unless they want to understand the code, no. Your advice is totally against the principle of self-documenting code. 30-line comments are necessary in ALL source files; at least the first 30 lines of a file should be comment.
dangers of comments (Score:2)
Wrong comments are an obvious plague, especially in free software where contributors may not have much stake in long-term maintenance. The saying goes: if the code and comments disagree, both are probably wrong. Excessive comments are a more subtle issue, at the very least because "excessive" is a judgement call. One thing is clear: with more comments, you can fit less code on the screen or page, which is a penalty not to be ignored (I think this is part of the reason for the productivity of high-level languages). More subjectively, I tend to find that people who write copious comments can't express their ideas concisely, or aren't clever enough to express their meaning directly in code. In many cases, I would rather the author work on code quality than commenting.
When working with other people's code, there is nothing I prefer over code that is so clear, concise, and well organized that I don't notice a lack of comments. Unfortunately, this isn't possible for all problems. I recall an anecdote from Donald Knuth in which he claims he would have been unable to successfully implement an algorithm without literate programming, and I believe it.
However, I was hacking on a piece of free software (vgetty) this weekend, and when I read this article, I realized I couldn't remember whether the code had many comments. I just checked, and found that (the parts I worked with) did not (although the debug output was a partial substitute). But I never missed them, because the code just made sense internally and fit into the larger system in an obvious way.
One final point. At advogato [advogato.org], there is a discussion of how to encourage more contribution to existing projects, instead of new projects. What I think was missed there is that an influx of casual novice programmers isn't necessarily a boon to most projects (bug reports and fixes, maybe, new content, less likely). Linux is good largely because it is optimized for sharp programmers who are willing to study the design (and the development process). This doesn't mean eschewing documentation (the in-line documentation system in 2.4 is encouraging), it means placing less value on beginner-level documentation. This is, like all things, a trade-off, and I think a good one. Ability to understand kernel design is a good test of who I want futzing with my kernel!
Re:also (Score:1)
I agree - good documentation is important but... (Score:2)
More important is writing good code in the first place. I've seen both really good code and really awful spaghetti code. Really good code is so clear and understandable that it almost doesn't need documentation, while no amount of documentation could possibly improve badly written code. I've also seen decent code that has been polluted by useless comments such as:
Even if it was possible to impose a set of style guidelines on the Open Source community, I wouldn't want to. Good code, and good documentation in the code is a fine art. A few hints can be found in various code style guides, but the only way to learn to write good code and comments is to read good and bad examples of other people's code, and to practice.
OSS - contribute back (Score:2)
Re:dangers of comments (Score:1)
The main reason why it took around a year for UMSDOS support to get fixed during the 2.1.x Linux kernel series was that the code was commented in such a way, and the original developer was no longer maintaining it.
Re:HOWTO: Comment (Score:1)
Re:Least Fun (Score:1)
Re:Least Fun (Score:1)
I derive pleasure from writing quality, beautiful code. Beautiful code comes from everything from the algorithm and the degree to which it solves its problem, right down to its formatting and clarity.
Code without sufficient documentation is not complete and is not of high quality.
At work these values are often lost for the sake of something "good enough," but with free software the priorities are different.
Re:HOWTO: Comment (Score:2)
Not necessarily. For example, it's still an unsolved problem as to whether a program like this terminates for all positive integer values of i:
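# Reconstructed as the 3n+1 (Collatz) iteration: halve even values,
# map odd values to 3n+1, and stop on reaching 1.
$i = $ARGV[0];
while ($i != 1) {
    $i = ($i % 2 == 0) ? $i / 2 : 3 * $i + 1;
}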
Re:Cleaning up others' code (Score:1)
No, it's code review, and most people would agree that most code doesn't get enough review, so it's hardly time wasted. Plus, now you have a deeper understanding of the code. Maybe not your top priority at the time, but a long-term benefit.
Further, do you know how much it sucks to maintain code that has had many changes made by people who looked at the code just long enough to make their one change?
Quality comments are a wondrous thing: comments that provide a roadmap to the code, an overview of the data structures and how they fit together, clear explanations of the subtle operations, and warnings of the pitfalls. But quality comments take more thought to write and are more likely to fall out-of-date (since they're high level and don't map directly to the bits of code that might be changed), so they require talented and conscientious programmers. Most people I've known who have been told to comment heavily leave poor comments, which I feel are worse than useless as I argued before.
I've heard time and time again that good code should document itself. That's purest bull.
It's bull that all good code should document itself. I claim that a lot of it should. (Maybe not even most; it's hard to measure code space.)
I'm aghast that anyone would be advocating less comments in code.
I'm not advocating fewer comments, per se. I'm suggesting that emphasizing comment quantity and uniformity is not the best route to overall better software. The fact that it seems like a good idea at first blush makes it more insidious.
Re:Different coders = diff. styles (Score:1)
On the contrary! I think that any developer would be tickled pink to know that you had been reading his code. (It's open source for a reason!) So long as you're polite about it, I think that any developer who believes in open source would appreciate comments, suggestions, re-writes, additions, beautifications, and questions about what they've done.
Re:Different coders = diff. styles (Score:2)
Re:OSS - contribute back (Score:1)
To be honest, the book format was a mistake -- the Linux code is moving altogether too fast for a book to be useful when published. It was what Coriolis offered, though, and I did it.
The job of code documentation done right is neither an easy nor a quick one, and as a coder myself I can understand why people hate the job.
That said, my co-author and I will be starting over with 2.4.0 [fluent-access.com] (as soon as I get permission to republish the code from Linus) and do it on the Web...but not for free. To do it right, I also need to develop a revenue model that makes the project pay for itself...and I don't think a book publisher will be willing to do that.
Re:HOWTO: Comment (Score:2)
Re:3 months later - read your own code? (Score:1)
Nah. Someone will write a DeSCS program which will be the cause of an unending war on slashdot.
If you don't like it, then change it. (Score:1)
Be sure to post the comments back where we can all get them.
IMHO, code documentors are just as important to the cause as the folks who come up with the stuff, and need to be lauded about the same.
also (Score:1)
A good reference.... (Score:2)
In a comment, explain not what the code does, but why it does what it does.
In other words, assume your reader knows the language the code is written in. Explain to him your thought processes as you were writing the code. I have found many of my own bugs this way before the code was even compiled.
Another point that Holub makes is that whitespace is a (very effective) form of comment. I like to group several lines together that are logically related and separate them from the rest of the code with a blank line. Think of them as paragraphs. Spaces before and after operators make the code read better as well, e.g., y = a + b;
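The two points above (comment the why, and use blank lines as paragraphs) can be sketched in a few lines of Python. The function and the scenario in its comment are hypothetical, chosen only to illustrate the style.

```python
def average_word_length(text):
    """Return the mean length of the whitespace-separated words in text."""
    words = text.split()

    # An empty input yields 0.0 rather than an error, because callers
    # (hypothetically) pass in unfiltered user input and should not crash.
    if not words:
        return 0.0

    total = sum(len(word) for word in words)
    return total / len(words)
```

A comment such as "divide total by count" on the last line would only restate the code; the comment that earns its place is the one explaining a decision the reader could not guess from the code alone.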
Of course you can get too hung up on this stuff too. I'm sure holy wars have been waged over K&R braces vs. indented braces vs. outdented braces, not to mention how many spaces to indent (correct answer: 4 *g*). The important point is to pick a style and go with it; be consistent.
Re:Auto document0r (Score:1)
Stefan's software makes commenting code easy. It separates from the kludges of comment lines. I've been using it and it's quite nice, really.
I wasn't responding to the root post. I was responding to the post I responded to.