Tools For Understanding Code?



Posted by kdawson on Friday January 18, 2008 @12:35PM from the getting-it dept.

ewhac writes "Having just recently taken a new job, I find myself confronted with an enormous pile of existing, unfamiliar code written for a (somewhat) unfamiliar platform — and an implicit expectation that I'll grok it all Real Soon Now. Simply firing up an editor and reading through it has proven unequal to the task. I'm familiar with cscope, but it doesn't really seem to analyze program structure; it's just a very fancy 'grep' package with a rudimentary understanding of C syntax. A new-ish tool called ncc looks promising, as it appears to be based on an actual C/C++ parser, but the UI is clunky, and there doesn't appear to be any facility for integrating/communicating with an editor. What sorts of tools do you use for effectively analyzing and understanding a large code base?"

This discussion has been archived. No new comments can be posted.


Wait for cenqua's solution (Score:5, Funny)

by ccguy ( 1116865 ) * writes: on Friday January 18, 2008 @12:37PM (#22094856) Homepage

I hear that the commentator [cenqua.com] guys are finishing a new product that, instead of commenting your code, is able to comment others'.

- Been there... (Score:5, Insightful)
  
  by seanadams.com ( 463190 ) * writes: on Friday January 18, 2008 @02:40PM (#22097548) Homepage
  
  There are two kinds of hard problems in programming: problems that are hard because they require ingenuity and deep thought, and problems that are hard because they require weeks of unraveling someone else's garbage.
  
  There are some horrible programmers out there and I have on many occasions been tasked with cleaning up their messes. In your situation I would suggest either a) try to figure out if it would take less time for you to implement it in a clean and maintainable way or b) find someone else you can hire who knows the code base or at least is more familiar with the specific problem.
  
  If you can't do a or b then you're screwed. In that situation, personally, I would either quit, ask for a different project, or print out the whole source code and sit back with a pen and start studying and commenting - one of the few tasks for which I still prefer dead trees.
  
  - Re:Been there... (Score:5, Insightful)
    
    by skiflyer ( 716312 ) writes: on Friday January 18, 2008 @03:31PM (#22098578)
    
    a) is so often the wrong choice and can really submarine a company because they keep getting a cycle of a)'s ... every 5th release becomes a complete rewrite as the new team says "we need a refactoring of the code, no one here is familiar with it and/or it's spaghetti code, just give us 5 months we'll maintain the behavior 100% and we'll clean up a lot of bugs and we promise in the future maintenance will be a breeze"
    
  - Been there, will never return (Score:3, Interesting)
    
    by owndao ( 1025990 ) writes:
    
    I had a very wise undergrad EE prof who said on the first day of design class that we needn't worry about the many "complicated" things that we would have to design during the course because we had already completed all of our circuit analysis courses. He said it's much harder to figure out the details of someone's design than to design it yourself. Same applies here in software. I've been there working with others' undocumented code and quite frankly it was infrequently that I left the project with more re
  - - Re: (Score:3, Funny)
      
      by seanadams.com ( 463190 ) * writes:
      
      Not everything is a duality.
      
      Ah, so really there are two kinds of things: those which are dualities and those which are not?
      - Re: (Score:3)
        
        by SnowZero ( 92219 ) writes:
        
        Ah, so really there are two kinds of things: those which are dualities and those which are not?
        Finally, someone who understands both sides of the sphere.
Stepping Through (Score:5, Insightful)

by blaster151 ( 874280 ) * writes: on Friday January 18, 2008 @12:38PM (#22094862)

I've always found that stepping through the debugger at runtime is a decent way to start making sense of a large code base. Easier, anyway, than trying to read static code printouts. Just set a breakpoint at a point of interest, fire up the application, and use it as a starting point. You get a sense for program flow and it's a great way to generate questions--lots of them. (What does class SuchAndSuch do? It looks like the application is handling remoting in such-and-such a fashion; is that right?) You can also choose one aspect of the architecture and selectively ignore or step over other aspects, building up your understanding one aspect at a time. In my case, with Visual Studio as a development environment, I can hover the mouse cursor over variable names to see their current values. In the case of variables of a certain type, like datasets or XML structures, I can use realtime visualizers to browse the contents and get a much better feel for what's going on.

If there's no one at your company that can help answer your questions and bring you up to speed, I feel for you - your employers ought to know enough to give you some extra margin. It can be very hard to take over a large code base without some human-to-human handover time.

Also, is it an object-oriented system? I assume that it's not, based on your post, but you don't say either way. If it is, the important aspects of program flow often live in the interactions between classes and objects and the business logic is decentralized. OO is great, but it can be harder to reverse-engineer business logic because it's distributed among various classes. A debugger that lets you step through running code is almost essential in this case.

- Re:Stepping Through (Score:5, Insightful)
  
  by daVinci1980 ( 73174 ) writes: on Friday January 18, 2008 @12:47PM (#22095068) Homepage
  
  This post is dead on.
  
  Place a breakpoint somewhere you think will get hit (e.g. main), and then start stepping over and into functions. I usually attack this problem as follows:
  
  Place breakpoint. Use step-in functionality to drop down a ways into the program, looking at things as I go. What are they doing, how do they work, etc.
  
  Once I feel like I understand how a section of code works, I step over that code on subsequent visits. If I feel like this isn't taking me fast enough, I let the program run for a bit, then randomly break the program and see where I am.
  
  Lather, rinse, repeat.
  
  Also, this should go without saying, but you should ask someone who works with you for a high-level overview of what the code is doing. The two of these combined should get you up to speed as quickly as possible.
  
  - Re: (Score:3, Informative)
    
    by Nethemas the Great ( 909900 ) writes:
    
    Clearly you don't write (or at least read source for) applications of any substance as that would be mildly described as tedious if not impossible.
    
    One of the best ways to understand code is to do so visually with the software equivalent of blueprints. UML is generally considered a very capable way of modeling/communicating both static structures and dynamic behavior of software. There exist any number of tools that are capable of reverse-engineering existing source into UML. Two tools that I consider t
    - Re: (Score:3, Insightful)
      
      by jgarra23 ( 1109651 ) writes:
      
      Clearly you don't write (or at least read source for) applications of any substance as that would be mildly described as tedious if not impossible.
      
      I have no idea how you formulated this from parent based from 2 or 3 sentences.
      
      One of the best ways to understand code is to do so visually with the software equivalent of blueprints. UML is generally considered a very capable way of modeling/communicating both static structures and dynamic behavior of software.
      
      A lot of times a programmer is stuck without those t
- Re: (Score:3, Informative)
  
  by The_reformant ( 777653 ) writes:
  
  Absolutely. Since joining the real world I have found the Visual Studio debugger my most prized tool. Somehow I managed all through my degree to never come into contact with one (probably because all the free ones are rubbish and most schools won't shell out for Visual Studio). I now extol the virtues of debugging to all and sundry!
  - - Re:Stepping Through (Score:5, Informative)
      
      by plover ( 150551 ) * writes: on Saturday January 19, 2008 @03:31AM (#22106278) Homepage Journal
      
      (Warning: you asked!)
      Well, the learning curve is certainly important in the real world, although I expect a professional to know his or her tools before they arrive on the job. But there are a metric crapload of things I like better about Visual Studio that make it a much more effective debugger than gdb, in my opinion. (Note that I am not a big gdb user, so I may be cutting it a bit short in the feature set here. My apologies in advance if I do so.)
      Things I've found I prefer include many tool windows simultaneously showing the states of registers, memory, the call stack, an object or seven (expanded to show a few properties), and automatic resolution of virtually every symbol and name, including the operating system (although you have to download the symbol files for your OS version from Microsoft.) And you still have full navigation through the source.
      Simply hovering the mouse over a symbol will bring up a tool-tip to display the contents. If you highlight an entire expression such as pFoo->pBar->Blah.count+7 and hover, the tooltip will display the calculated result.
      You can set a temporary breakpoint by setting the cursor on a line of code and clicking "run to cursor." You can run, single step, run to the current cursor, or run till function return. That last one is great for re-entering a function multiple times to test different conditions.
      The variables window contains the current call stack as a dropdown list -- changing the stack lets you see the newly-local variables. Watch windows can display data as hex or decimal, just right click and select. Watch entries can even be used as calculators (enter a literal value, such as 0xf0 + 12, and it will display the results.)
      In the watch windows, you can also call arbitrary functions (good for testing without driving your code to that point) or other functions in your memory space, such as the C runtime memory checkers. If you're trying to track an errant pointer, create a debug build, start running and break, type _CrtCheckMemory() into a watch window, and every time the watch window is refreshed, it will check all your fenceposts. You might get lucky and spot your corruption as it happens. The /GZ compiler option will perform a similar task at the function level, but this would let you do it at a line level.
      There are also dozens of possible formats it can display your watch variables in -- suffix a pointer with ,s and it'll display the contents as an ASCII string. Only see one byte because of Unicode? Suffix the pointer with ,su and you'll see the unicode string. A ,wm suffix displays window messages by name. ,hr suffix displays HRESULTs by name.
      The memory windows will highlight in another color any data that's changed since the last time it was refreshed, whether it be a single step or a previous breakpoint. You can have memory displayed as bytes, shorts, or longs. And with the newer visual studios, you can have multiple memory windows, so you can keep track of two, three or four arrays simultaneously. You simply drag and drop them wherever they're convenient, then step through the code and watch for colored variables indicating change.
      Again, all these windows are automatically updated every time the debugger drops from the program to your control. I've got two 17" monitors, and I can fill them both. The problem with debugging is that sometimes you are really starting blind, and the faster you can get more information, the less time you waste debugging.
      There's a cute "magic trick" I like to show people with the memory window and the disassembly window. Let's say you've had a crash, and attached the debugger to the running program. You're looking at a corrupt stack in the call stack window -- just one line of garbage data. What to do? Where did it break? Enter @ESP in the memory window. Change the view to 'long' and it displays the memory as 8-digit numbers. If y
      
      - Re: (Score:3, Informative)
        
        by plover ( 150551 ) * writes:
        
        Thanks for the compliment!
        I used to teach a course in debugging with Visual Studio, and I basically trawled through my syllabus looking for the cool tricks. Using the stack-crash demo to drop into the source code of the crashing module is a real attention-grabber.
        I found debugging in gdb to be a lot like debugging in WinDBG. You have to learn a lot of esoteric commands that you don't use very often, so it takes a lot of practice to learn them. And if you aren't constantly searching for the side effec
- Mod parent up (Score:5, Insightful)
  
  by mccrew ( 62494 ) writes: on Friday January 18, 2008 @01:28PM (#22095944)
  
  Sorry, no points today to mod you up myself.
  I would suggest a slight variation on the theme. Fire up the application, start it on one of its typical tasks, and then interrupt it in the debugger to catch it. While the process is stopped mid-flight, take note of the call stack to see which classes and methods are being used. Maybe step through a few calls, then let the program run some more.
  By doing this repeatedly, you will quickly get a sense for which parts of the code see the most action, and would provide the most obvious places to start studying the code base, and provide the best bang-for-buck return on your time.
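
A minimal sketch of the bookkeeping behind this interrupt-and-look approach, in Python. It assumes you have copied down the call stack at each interruption; the stacks and function names here are hypothetical, not taken from any real code base.

```python
from collections import Counter

# Hypothetical call stacks, outermost frame first, as you might copy them
# from the debugger's call stack window each time you interrupt the program.
samples = [
    ["main", "run_event_loop", "handle_request", "parse_header"],
    ["main", "run_event_loop", "handle_request", "query_database"],
    ["main", "run_event_loop", "handle_request", "query_database"],
    ["main", "run_event_loop", "idle"],
]

def tally_frames(stacks):
    """Count how many sampled stacks each function appears in."""
    counts = Counter()
    for stack in stacks:
        counts.update(set(stack))  # count a function once per sample
    return counts

hot_spots = tally_frames(samples)
for name, hits in hot_spots.most_common():
    print(f"{hits}/{len(samples)}  {name}")
```

Functions that show up in most samples are the ones seeing the most action, which is a reasonable place to start reading. A handful of samples only gives a rough picture, so treat the counts as hints rather than measurements.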
  
  - Re:Mod parent up (Score:5, Informative)
    
    by Lazerf4rt ( 969888 ) writes: on Friday January 18, 2008 @01:53PM (#22096506)
    
    Fire up the application, start it on one of its typical tasks, and then interrupt it in the debugger to catch it. While the process is stopped mid-flight, take note of the call stack.
    
    Good advice -- breaking randomly. However, it works best in CPU-intensive applications. If the app is mostly idle and event-driven, you're best off searching the code and looking for a place to set breakpoints.
    
    Also, when I use the debugger to help understand some new code, often I'll open a text file and build a "trace" as I go. As I explore things in the debugger and find new call stacks, I add more detail to the trace, in a hierarchical (indented) style. Then I save the traces in case I forget something later.
    
    As for the original question, I would recommend staying focused. Don't go all over the program trying to understand every system at once. Pick a specific part you really need to understand (say, based on a task you have to do) and focus on understanding that.
    
    Unfortunately, the best tool for understanding code is experience. Not theory and not some fancy visualization program. Once you've seen a lot of different code, you come to recognize what each person was thinking when they wrote it. Once that kind of thing comes easily, you no longer find it necessary to bitch about each different programmer's coding style (as some do). So in a way, the guy who posts this question is lucky to have such a big pile of code in front of him.
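
The hierarchical text-file trace described in this comment can also be kept as data and rendered on demand. The following is a sketch only, in Python, with made-up function names; it merges call stacks noted in the debugger into one indented outline.

```python
def build_trace(stacks):
    """Merge call stacks (outermost frame first) into a nested dict tree."""
    root = {}
    for stack in stacks:
        node = root
        for frame in stack:
            node = node.setdefault(frame, {})
    return root

def render_trace(tree, depth=0):
    """Return the tree as indented lines, one function per line."""
    lines = []
    for name, children in tree.items():
        lines.append("    " * depth + name)
        lines.extend(render_trace(children, depth + 1))
    return lines

# Hypothetical stacks noted down during a debugging session.
observed = [
    ["main", "load_config", "parse_file"],
    ["main", "run", "dispatch", "on_click"],
    ["main", "run", "dispatch", "on_key"],
]

trace_text = "\n".join(render_trace(build_trace(observed)))
print(trace_text)
```

Each new call stack you come across just gets appended to `observed`; shared prefixes collapse into one branch, so the outline grows without repeating itself.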
    
    - Re:Mod parent up (Score:4, Insightful)
      
      by ckaminski ( 82854 ) writes: <slashdot-nospam@ ... m ['r.c' in gap]> on Friday January 18, 2008 @04:19PM (#22099386) Homepage
      
      The best way to learn the code is to start fixing some low- or medium-severity bugs. Something that's not a sev1 is neither so endemic to the system that changing it breaks everything, nor likely to be some random data corruption issue that will be impossible to find. It will be stupid user-input problems, or interaction issues.
      
      Most of my productive code learning was in the first three months of bug-fixing. I think that's why most new hires end up on bug fixing as a rule - it's the fast path to comprehension.
      
  - Re:Mod parent up (Score:4, Insightful)
    
    by Just Some Guy ( 3352 ) writes: <kirk+slashdot@strauser.com> on Friday January 18, 2008 @06:34PM (#22101844) Homepage Journal
    
    By doing this repeatedly, you will quickly get a sense for which parts of the code see the most action, and would provide the most obvious places to start studying the code base, and provide the best bang-for-buck return on your time.
    
    If only there were some way to automatically generate this information, this "profile" of the running code, if you will.
    
- Not for a "large" codebase... (Score:4, Insightful)
  
  by smitth1276 ( 832902 ) writes: on Friday January 18, 2008 @01:39PM (#22096182)
  
  That doesn't always work for a code base with millions of lines of atrociously written code. I've worked with code where it is absolutely not feasible to step through everything.
  
  It seems like in those cases I end up working from effects... I note some program behavior and then try to find exactly what causes that behavior, which can be surprisingly difficult if you are dealing with the "right" kind of code. After a while, though, the patterns begin to emerge in the system as a whole.
  
  - Re:Not for a "large" codebase... (Score:5, Insightful)
    
    by ChrisA90278 ( 905188 ) writes: on Friday January 18, 2008 @03:45PM (#22098808)
    
    "That doesn't always work for a code base with millions of lines of atrociously written code. I've worked with code where it is absolutely not feasible to step through everything"
    
    You are correct. All these people talking about using a debugger and so on... That does NOT work on larger projects, only on fairly simple ones. "Large" projects might have 250 source code files and thousands of functions or classes and likely a dozen or so interacting executable programs. I've seen printouts of source code that fill five bookcase shelves. No one could ever read that.
    
    I've had to come up to speed on million+ lines of code projects many times. The tool I use is pencil and paper.
    
    The first step is to become an expert user of the software. Just run the thing a lot and learn what it does. Looking at code is pointless until you know it well as a user.
    
  - Re: (Score:3, Insightful)
    
    by no-body ( 127863 ) writes:
    
    Run it through a profiler - giving function names, times called, CPU time used, calling hierarchy/tree
    
    if... there is such an animal around still for the environment in question.
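
As one illustration of what such a profiler run reports, here is a small sketch using Python's standard `cProfile` and `pstats` modules. The thread is mostly about C and C++ code, so this is an analogy rather than the tool for the original poster's platform, and the two profiled functions are hypothetical stand-ins.

```python
import cProfile
import io
import pstats

def parse(n):
    # Hypothetical stand-in for a hot inner routine.
    return sum(i * i for i in range(n))

def handle_request():
    # Hypothetical stand-in for one typical task of the application.
    return [parse(2000) for _ in range(50)]

profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)  # call counts and times
stats.print_callers("parse")                    # which functions call parse()
print(report.getvalue())
```

The first table lists function names, call counts, and time spent; the callers table gives one level of the calling hierarchy.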
- Re:Stepping Through (Score:4, Insightful)
  
  by JesterXXV ( 680142 ) writes: <jtradke.gmail@com> on Friday January 18, 2008 @01:50PM (#22096430)
  
  I don't think there's any replacement for talking to the real-live developers who wrote it. Failing that, any design documentation they left behind. Failing that, just get a task to do, and try to get it to work. Nothing like learning by doing.
  
- Re: (Score:3, Insightful)
  
  by smittyoneeach ( 243267 ) * writes:
  
  I think unit tests are actually better, for code that is suited to being driven externally.
  Pick a tool to wrap something, start writing little bits to exercise the code.
  You can comment and version unit tests, giving a sense of history.
  Debuggers, on the other hand, mostly exist in the present tense.
  Sure, you learn something now, but how about some breadcrumbs for later?
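
A sketch of what such breadcrumb tests might look like, using Python's `unittest`. The function `legacy_price` is a hypothetical stand-in for an undocumented routine; in practice you would import the real code instead of defining it here.

```python
import unittest

def legacy_price(quantity, unit_cents):
    # Hypothetical stand-in for an undocumented routine from the code base.
    total = quantity * unit_cents
    if quantity >= 10:
        total -= total // 10
    return total

class LegacyPriceBehaviour(unittest.TestCase):
    """Each test records one thing learned about legacy_price()."""

    def test_small_orders_pay_full_price(self):
        self.assertEqual(legacy_price(3, 250), 750)

    def test_ten_or_more_items_get_ten_percent_off(self):
        # Learned by experiment: the discount starts at exactly 10 items.
        self.assertEqual(legacy_price(10, 250), 2250)

def run_characterization_tests():
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(LegacyPriceBehaviour)
    return unittest.TextTestRunner(verbosity=0).run(suite)

result = run_characterization_tests()
```

The test names and comments carry the history: each one says what was discovered, and version control records when.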
  - Tests (Score:4, Interesting)
    
    by gerddie ( 173963 ) writes: on Friday January 18, 2008 @03:04PM (#22098084)
    
    Tests are indeed very good to understand a code base. Nearly all the last year I was working on a code base that nobody understood completely, although I had someone to ask about the general code structure. Writing tests helped me to understand what some parts of the code actually do. And where I needed to change things I could make myself sure that I didn't break anything.
    
    Another great tool is valgrind+KCachegrind - it gives you really nice call trees. Vtune can do something similar as well, but IMHO the output is not as good as in KCachegrind. The only problem, of course, is that valgrind makes your program very slow and it is, AFAIK, not available on MS Windows. Vtune, OTOH, runs the program at normal speed, but its call tree output is ugly, at least on Linux.
    
    If these two options are not for you, then you might add a trace output to each function. IMO this is better than using a debugger - especially in C++ with BOOST and STL, where a lot of stepping goes through inline functions. With proper logging levels you can get a very useful output to see what's going on. It helps to understand the code, and it also helps if you hit a bug.
    
- Re: (Score:3, Insightful)
  
  by Assmasher ( 456699 ) writes:
  
  I certainly think that stepping through is by far the most valuable method; however, it can be difficult when dealing with asynchronicity and/or parallelism. In those cases, commenting is the only solution that seems to help me... LOL.
- Re:Stepping Through (Score:4, Insightful)
  
  by superwiz ( 655733 ) writes: on Friday January 18, 2008 @02:57PM (#22097920) Journal
  
  The guy asked about a large code base. I am assuming that means on the order of at least half a million lines. Stepping through the code won't even get you into most modules of something that big. Never mind that it will do nothing to help you understand that a certain chunk of the code is a module that gets used only under certain extraordinary conditions. To be sure, what you suggest is what you do on day 1. The post was essentially asking what do you do three weeks into it after you've understood what the loop in main does and yet you still don't know what's tied to what and how.
  
- - Re:Stepping Through (Score:5, Insightful)
    
    by orclevegam ( 940336 ) writes: on Friday January 18, 2008 @02:10PM (#22096858) Journal
    
    Much as I would love to agree with you, unfortunately the world isn't always so accommodating. Sometimes you have to suck it up and stay with a job till you can find something better, and most employers won't let you toss anything out, let alone a major chunk of their code base. Doesn't matter if it's utter crap, they paid for it, and as far as they're concerned turd polishing is better than starting from scratch even if starting from scratch would be a hell of a lot cheaper. Can't expect MBAs to understand the difference between good code and bad code, to them it's all just code, and as far as they're concerned, the more the better. It's the old idiotic idea that more lines of code means a better product, therefore anything that reduces lines of code must be a bad thing.
    
Doxygen (Score:5, Informative)

by Raedwald ( 567500 ) writes: on Friday January 18, 2008 @12:39PM (#22094886)

For C++ code, Doxygen [stack.nl] can be useful, as it shows the class inheritance. As requested, it uses a (rudimentary) parser. It works with several other languages too, although I can't vouch for its utility for them.

- Re: (Score:2)
  
  by PetriBORG ( 518266 ) writes:
  
  Doxygen I thought did java-doc like parsing for C++? I was thinking he should look for something able to build a UML diagram based on the code... I hate UML, but if there isn't any documentation telling you the structures of the code it might be a place to look.
  
  I would google for that, but I'm under deadline myself... (but yet still reading /. - I think it's an addiction).
  - Re: (Score:3, Informative)
    
    by zeekec ( 795504 ) writes:
    
    Doxygen can produce UML diagrams for undocumented code. (UML_LOOK and EXTRACT_ALL)
  - Re:Doxygen (Score:5, Informative)
    
    by Bill_the_Engineer ( 772575 ) writes: on Friday January 18, 2008 @02:10PM (#22096854)
    
    Doxygen I thought did java-doc like parsing for C++? I was thinking he should look for something able to build a UML diagram based on the code... I hate UML, but if there isn't any documentation telling you the structures of the code it might be a place to look.
    
    Doxygen is more than a javadoc replacement.
    I like Doxygen + Graphviz. Just set Doxygen to document all (instead of just the code with tags) and set it to generate class diagrams, call trees, and dependency graphs and allow it to generate a cross reference document that you can read using your web browser. Set the html generator to frame based, and your browsing of code will be easier. I would also set Doxygen to inline the code within the documentation.
    I've used Doxygen to reverse engineer very large programs and had good luck with it. I will say Doxygen is not going to do all your work for you, but it will make your job easier. Especially if you add comments to the code as you figure each section out.
    Now if you like to see the logical flow of each method then try JGrasp (jgrasp.org). It has a neat feature called CSD that allows you to follow the logic of the code a little better. It's a Java-based IDE so that may be a turn-off for you. I do wholeheartedly recommend Doxygen (w/ Graphviz).
    Good luck.
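
For readers who want a starting point, the Doxygen + Graphviz setup described in this comment corresponds roughly to a Doxyfile fragment like the one below. This is a sketch, not the poster's actual configuration: the option names are standard Doxygen settings, but their availability and defaults vary between versions, and the `INPUT` directory is a placeholder.

```
# Hypothetical source directory; point this at the real code.
INPUT             = src
RECURSIVE         = YES

# Document everything, not just code that carries Doxygen tags.
EXTRACT_ALL       = YES

# Cross-referenced source pages, with function bodies inlined in the docs.
SOURCE_BROWSER    = YES
INLINE_SOURCES    = YES

# Use Graphviz (dot) for class diagrams, call trees and dependency graphs.
HAVE_DOT          = YES
UML_LOOK          = YES
CLASS_GRAPH       = YES
CALL_GRAPH        = YES
CALLER_GRAPH      = YES
INCLUDE_GRAPH     = YES

# Side-panel navigation in the generated HTML.
GENERATE_TREEVIEW = YES
```

Call and caller graphs can make a run on a large code base noticeably slower, so it may be worth enabling them for one subsystem at a time.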
    
- Re:Doxygen, and Extracting Software Architectures (Score:5, Informative)
  
  by Mr.Bananas ( 851193 ) writes: on Friday January 18, 2008 @12:50PM (#22095130)
  
  I use Doxygen for C code, and it is really helpful. One of its most useful features is that it generates caller and callee graphs for all functions. You can also browse the code itself in the generated HTML pages, and the function calls are turned into links to the implementation. Data structures and file includes are also pictorially graphed for easy browsing.
  
  If the system you need to understand has a really big undocumented architecture, then this presentation [uwaterloo.ca] might be useful to you (there is a research paper, but it's not free yet). In it, the authors present a systematic method of extracting the underlying architecture of the Linux kernel.
  
- - Re:Doxygen (Score:4, Informative)
    
    by mhall119 ( 1035984 ) writes: on Friday January 18, 2008 @02:28PM (#22097296) Homepage Journal
    
    Only problem is, it is a pain to configure. Also, windows versions don't look very stable.
    Windows version has been very stable for me, I've not had any problems with either Doxygen or Graphviz. It also includes a configuration wizard that is both easy to understand and powerful. There is also an Eclipse plugin that lets you configure and run Doxygen.
    
When I was your age... (Score:2, Interesting)

by russotto ( 537200 ) writes:

I use the Mark I eyeball, grep, emacs, and of course, the little gray cells.

(and GET OFF MY LAWN).
- Re: (Score:3, Funny)
  
  by Mr. Underbridge ( 666784 ) writes:
  
  When I was your age...I use the Mark I eyeball, grep, emacs, and of course, the little gray cells.
  (and GET OFF MY LAWN).
  They have lawns at the old folks' homes these days?
Paper (Score:2, Insightful)

by raddan ( 519638 ) writes:

You should really be sitting down and attempting to understand the code, ASAP. Asking Slashdot for fancy tools isn't really going to help you. The real barrier here is your own brain.
- Re: (Score:2)
  
  by plopez ( 54068 ) writes:
  
  Damn. You beat me to it. I would also suggest developing domain knowledge. Reading code is useless unless you understand the what and why of the problems being solved.
- Re: (Score:3, Interesting)
  
  by bunratty ( 545641 ) writes:
  
  I don't think I've ever been able to understand a large body of code by simply looking at it. I've always found that attempting to make modifications (fixing bugs, adding features) to the code gets me to understand it fairly quickly. Often, I'll find myself adding comments or cleaning the code up as I go. There have been times when I've just thrown all the code away and reimplemented the same functionality from scratch. That may not be an option here, but perhaps writing an implementation of part of the cod
  - Re: (Score:3, Funny)
    
    by sdpuppy ( 898535 ) writes:
    
    Well when all else fails, look at the variable/function/structure names.
    Obviously a program with labels such as "Frodo" "Sam" "Gondor" must be doing something Lordly with rings
    and if you have labels such as "string1" "string2", then the program must be solving some particle physics problem involving string theory.
    ... and when that fails, you go back to your old college, find the smartest CS geek and slip him/her a few dollars to figure it out.
    Need I add :-) :-) :-) ?
- Re: (Score:3, Insightful)
  
  by cjonslashdot ( 904508 ) writes:
  
  I agree. I have found that it is fairly easy to uncover program structure. But UNDERSTANDING the intention of each line or function is another matter. This is where one wishes that there were documentation of design decisions. This is why whenever I build something I simultaneously maintain a design document in which I record each decision that I make and each pattern that I devise and use. As I revisit decisions, I do it in the design, and only when I have worked out the design do I try to code it. This
- Absolute tosh ! (Score:5, Insightful)
  
  by golodh ( 893453 ) writes: on Friday January 18, 2008 @01:43PM (#22096258)
  
  An interesting post, even if it's absolute tosh. No-one in his right mind tackles a new code-base of any size or complexity with nothing but a printout. Not if he's expected to understand how it works and/or maintain it in a responsible way.
  In fact, it nicely highlights the difference between "software engineers" and "code monkeys". Code monkeys just dive in; they never pause to think. In fact ... they tend to avoid thinking. It's not their strong point. After all ... they're paid to code, right? Not to think. Software engineers on the other hand, look before they leap and spot the places where they need to pay attention first. And they're systematic about it.
  In fact, a software engineer will happily spend a day or two putting the right tools in place, *including* a full backup and a proper version management system for when he's going to have to touch anything.
  The first thing you want to know about a new code base (after you find out what it's supposed to be doing) is its structure. Tools like Doxygen (see previous posts) show you that structure *far* quicker and *far* more reliably than any amount of dumb code-browsing can. And besides ... once you do it, you've got that documentation stashed away securely instead of milling around incoherently in your head (you'll have completely forgotten most of what you read by next month) or on disorganised pieces of note paper.
  The second thing is to figure out if it calls any "large" functionalities like subroutine libraries or even stand-alone programs like databases, let alone if it makes operating system calls. The call-tree will give you an excellent view, and the linker files can complete the picture. You wouldn't be the first maintenance programmer who found out after months that his application critically depends on some other application he wasn't told about.
  The third thing is to see where your code does dirty things. Let the compiler help you. Just compile your application with warnings on and have a look at what the compiler comes up with. You might be surprised (and horrified). Then compile with the settings used by your predecessor and check that your executable is bit-for-bit identical to what's running (you wouldn't be the first sucker who's given a slightly-off code base).
  If performance is at all important, then running the whole thing for a night on a standard case under a good profiler will also tell you lots of important things. Starting with where your code spends its time, where it allocates memory and how much, and where the heavily-used bits of code are. All neatly written down in the profiler logs.
  Finally, run your application with a tool to detect memory management errors the first chance you get. Useful tools are Valgrind (in a Linux environment), Purify (expensive, but probably worth it) under Windows, and sundry proprietary utilities under Unix. Just about 90% of the errors made in C programs come from memory management problems, and half of them don't show up except through memory leakage and overwritten variables (or stacks .. or buffers .. or whatever). You'll need all the help you can get here, and as far as these errors are concerned, dumb code browsing is useless. Just keep your head when looking at reports from such tools ... they can throw up false positives. Ask around on a forum with specific questions if you're allowed, or ask your supervisor. After all ... you showed due diligence.
  When you know all that (if you have the tools in place, all of this can be done within 1 day + 1 overnight run + 1 hour reading the profiler output), go ahead and trace through the code in a debugger. You'll be in a *far* better position to judge what you should be reading.
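
The bit-for-bit check from the third step above is easy to script. A minimal sketch in Python, assuming you have both the freshly built executable and a copy of the deployed one on disk; the function names and the paths in the usage note are hypothetical:

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(built, deployed):
    """True if the two files have identical size and identical contents."""
    built, deployed = Path(built), Path(deployed)
    if built.stat().st_size != deployed.stat().st_size:
        return False
    return file_digest(built) == file_digest(deployed)
```

Usage would look like `same_binary("build/app", "/opt/prod/app")`. A mismatch does not always mean the source differs: some toolchains embed build timestamps or paths in the output, so a difference is a reason to look closer rather than proof of a bad code base.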
  
  - Re: (Score:3, Informative)
    
    by mabhatter654 ( 561290 ) writes:
    
    I'd agree. He's being considered a "code monkey" and not a software engineer. Typical situation is that they'll drop some random user problem on his desk after a week "to familiarize himself" then expect him to figure out what program it is and why it broke and suggest a process improvement. Then tell him he's all wrong because "they already tried that 5 years ago."
    
    The question he's trying to answer is what does the code "do"? why does it exist? what problem does it solve? When you inherit some homegrown E
  - - About "new tools" (Score:3, Interesting)
      
      by golodh ( 893453 ) writes:
      
      @KZigurs,
      Well ... some good points, and some I'd say are too detailed at this point.
      I totally agree with point (1). I forgot to mention it since I assumed (always a bad thing) that the author actually could compile and run the thing. An important point to keep in mind. Thanks for bringing it up.
      Points (2)-(5) however all come after you've understood the basic structure of your code base.
      Next, I'd say that a fairly junior software engineer trying to tackle a large unknown code-base without proper too
doxygen (Score:3, Informative)

by greywar ( 640908 ) writes: on Friday January 18, 2008 @12:41PM (#22094922) Journal

If it's in a language that doxygen can understand, that's the tool I would HIGHLY recommend.

Ctags (Score:3, Insightful)

by pahoran ( 893196 ) * writes: on Friday January 18, 2008 @12:42PM (#22094948)

google exuberant ctags and learn how to use the resulting tags file(s) with vim or your editor of choice

Old School (Score:5, Funny)

by geekoid ( 135745 ) writes: <dadinportlandNO@SPAMyahoo.com> on Friday January 18, 2008 @12:42PM (#22094958) Homepage Journal

Printouts and colored markers.

Understand C++ (Score:5, Informative)

by SparkleMotion88 ( 1013083 ) writes: on Friday January 18, 2008 @12:43PM (#22094978)

Sorry I don't have an open source tool for you, but I've used Understand for C++ [scitools.com] in the past and it was pretty helpful. To me, the most useful piece of information for understanding a large codebase is a browseable call graph. I'm sure there are simpler tools out there that generate a call graph, but this is the only one I've used with C++.

- Re:Understand C++ scitools.com (Score:2)
  
  by John Sokol ( 109591 ) writes:
  
  Understand for C++ : from (Scientific Toolworks, Inc ) is the best I have ever seen.
  
  I highly recommend it. Well worth the $500 for it.
RR & EA (Score:3, Informative)

by Anonymous Coward writes: on Friday January 18, 2008 @12:44PM (#22094988)

Sometimes tools like Rational Rose [ibm.com] or Enterprise Architect [sparxsystems.com.au] are successful at reading in the code and building a UML model that you can then attempt to parse through. I'm not familiar with the use of either, but I know it can be done, with mixed results depending on the size and complexity of the code being analyzed. Both tools are fairly expensive though, I believe.

- Re: (Score:2)
  
  by wfeick ( 591200 ) writes:
  
  I've used Borland's Together in the past and found it really helpful for C++/Java code. It can be really helpful for coming up to speed on a code base's class hierarchy. Unfortunately, when I tried it on a large C++ code base where I'm currently working, after loading the code base in it seemed to go into some sort of analysis phase and then eventually crashed.
  
  I'm not sure what the problem was. A sales droid called to check in on my download, passed the crash info on to a techie, and said I'd get a call ba
Reverse Engineer? (Score:2)

by dotpavan ( 829804 ) writes:

For Java, would reverse engineering the code to UML diagrams help? Any good open source tools one could recommend to understand a large code base?
- Re: (Score:2)
  
  by samkass ( 174571 ) writes:
  
  For Java, he probably wouldn't be having this problem as acutely in the first place. The reduced syntax compared to C++ makes many of the hacker types hate Java, which makes Java twice as good in my book. It also makes everything a lot clearer. In addition, the dynamic nature of the language combined with the compact syntax means even the free tools like Eclipse have excellent analysis capability, and tools like IntelliJ offer phenomenal ability to introspect the code.
  
  But yes, C is a few percent faster a
You must have inherited my old project (Score:5, Funny)

by theophilosophilus ( 606876 ) writes: on Friday January 18, 2008 @12:47PM (#22095062) Homepage Journal

Sorry about that.

When I am particularly frustrated (Score:2)

by antifoidulus ( 807088 ) writes:

I find that a hammer works well. Not so much for understanding the code, but it CAN help relieve computer-created stress!
What I do (Score:5, Informative)

by laughing_badger ( 628416 ) writes: on Friday January 18, 2008 @12:48PM (#22095078) Homepage

SourceNavigator : A good visualisation package http://sourcenav.sourceforge.net/ [sourceforge.net]
ETrace : Run-time tracing http://freshmeat.net/projects/etrace/ [freshmeat.net]
This book is worth a read http://www.spinellis.gr/codereading/ [spinellis.gr]
Draw some static graphs of functions of interest using CodeViz http://freshmeat.net/projects/codeviz/ [freshmeat.net]
Write lots of notes, preferably on paper with a pen rather than electronically.

Non-sequitur time (Score:2)

by 14erCleaner ( 745600 ) writes:

I'm not exactly answering your question, but in my experience nothing helps you learn about somebody else's code like having to find and fix bugs in it. Just diving in with a specific goal in mind. The next best thing is having somebody who's familiar with it draw you a diagram of the overall structure. Comments in the program, or external documentation, are usually too much to hope for.
- Re: (Score:2)
  
  by TigerNut ( 718742 ) writes:
  
  The next best thing is having somebody who's familiar with it draw you a diagram of the overall structure.
  I find the best thing is to do the drawing myself. It might take a couple of attempts, but in the process you have to dig into the details and discover the structure. The extra interaction with the code gives a more in-depth understanding. If I can draw it, then I can create it, fix it, and explain it to someone else. If I can't draw a particular thing to a desired level of detail (whether it's a piece
Answer (Score:5, Funny)

by hey! ( 33014 ) writes: on Friday January 18, 2008 @12:49PM (#22095116) Homepage Journal

Yes. Understanding code is one of the things you hire tools for.
...

Wait, were you talking about software?

- Re: (Score:2)
  
  by WK2 ( 1072560 ) writes:
  
  Yes. Understanding code is one of the things you hire tools for.
  
  Yes, but what happens when the tool asks Slashdot how to understand the code?
  - Re: (Score:2)
    
    by hey! ( 33014 ) writes:
    
    He gets gratuitously mocked for a cheap laugh by people with a pathetic need to be perceived as clever.
doxygen - with full source option (Score:3, Interesting)

by mhackarbie ( 593426 ) writes: on Friday January 18, 2008 @12:50PM (#22095122) Homepage Journal

I agree with the previous recommendations for Doxygen. A while back I wanted to become familiar with the source code for a game engine and tried various tools to help with the 'grok' factor. I found the doxygen docs, with full source code generation in html, to be the fastest and most convenient way to walk around the code. After a while, it just clicked.

Creating small demo apps that use the code can also help.
mhack

GNU Global (Score:4, Informative)

by Masa ( 74401 ) writes: on Friday January 18, 2008 @12:50PM (#22095134) Journal

GNU Global is able to generate a set of HTML pages from C/C++ source code. This tool has helped me several times. All member variables, functions, classes and class instances are hyperlinks. It provides an easy way to examine source code. It also provides tags for several text editors (for Vim and Emacs especially). http://www.gnu.org/software/global/ [gnu.org]

Umm.. documentation? (Score:5, Insightful)

by Anonymous Coward writes: on Friday January 18, 2008 @12:51PM (#22095144)

Seriously folks, having spent large chunks of my working life having to decipher the mess of those who came before me I cannot stress enough the importance of clear comments, variable/function names, and consistent and readable syntax. AND WRITE F@#$%ing HUMAN READABLE DOCUMENTS DESCRIBING FUNCTIONAL REQUIREMENTS, ALGORITHMS USED, LESSONS LEARNED, ETC.
Calling all your variables "pook" or the like may be very cute, but does not help me figure out what the heck the function is supposed to do or why I would ever want to call it. Yes it's a pain. Yes we're all under time deadlines and want to get it working first and go back and document it later. And yes, it WILL bite you in the ass (ever heard of karma? your own memory can go and then you have to decipher your OWN code!).

That said, if you have inherited a code base from someone who ignored the above, go through and generate the documentation yourself. Write flow charts and software diagrams showing what gets called where and why. Derive the equations and algorithms used in each piece and figure out why the constant values are what they are. Finally, start at the main function or reset vector (I do a lot of microcontroller development) and trace the execution path.

- Re: (Score:3, Funny)
  
  by Skewray ( 896393 ) writes:
  
  Why? I can write crap and you can clean it up. This is Division of Labor, which is the basis of our civilization.
Osmosis (Score:3, Insightful)

by Greyfox ( 87712 ) writes: on Friday January 18, 2008 @12:51PM (#22095150) Homepage Journal

If the original developer made useful comments that will help immensely. If there's a design document showing how the program fits together that helps a lot. If there's a process document explaining the business logic the application implements, that helps a lot. On average you'll start with a marginal code base with no comments, no design documents and no explanation of what the application is attempting to accomplish.
Get the guys who use it to explain what they're trying to do, read the code for a couple of days and then have them show you how they use the application. Then plan on six months to a year to get to the point where you can look at buggy output and know immediately where the failure is occurring. In the mean time just work in it as much as you can and don't try to redesign major parts of it until you know what it's doing.

Last time I had to do something similar... (Score:2)

by ByOhTek ( 1181381 ) writes:

I had to do something similar a while ago with a poorly documented piece of software. I pulled out Visio (it's what we have here, I'm sure there are better tools for the job, but it worked well enough), and made a diagram of what-called-what. Even without the why/conditionals, that helped me a lot (the names made more sense). On parts where I had trouble, I'd go to the lower levels, figure out what they did, and document those functions in the Visio diagram.

That is what I would do in your situation, except:
- - Re: (Score:2)
    
    by ByOhTek ( 1181381 ) writes:
    
    To each his/her own. I had the source editor on one desktop, and Visio on the other. A key command switched desktops, so I could read something, edit the Visio diagram, and go back fairly quickly. Code more resembles that kind of diagram in my head anyway, so I didn't have trouble.
    
    I don't see dead trees as being any easier for me personally. Too much erasing makes them hard to read, and there was a lot of moving/erasing.
perl and graphviz (Score:2)

by Speare ( 84249 ) writes:

I had to do this sort of "unfamiliar code analysis" with an ancient FORTRAN application written by non-software guys in the 1980s. It was some of the worst spaghetti I'd seen in some time.

To make any sense of it, I asked the compiler for a call tree report, and then I fed this through Perl to make a GraphViz "dot" file of it. After a few shuffles, I could start to determine some architecturally related areas and refactor slightly to decouple them into a more clear arrangement of modules. It was still
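
The same call-tree-to-GraphViz pipeline can be sketched in a few lines. This version is in Python rather than Perl, and it assumes the call tree report has already been reduced to lines of the form `CALLER -> CALLEE`; that input format, the function names, and the sample routine names are all made up for illustration:

```python
def parse_call_report(text):
    """Parse lines of the form 'CALLER -> CALLEE' (a hypothetical format)."""
    edges = []
    for line in text.splitlines():
        if "->" not in line:
            continue  # skip headers, blank lines, and other noise
        caller, callee = (part.strip() for part in line.split("->", 1))
        if caller and callee:
            edges.append((caller, callee))
    return edges


def edges_to_dot(edges, graph_name="calls"):
    """Render (caller, callee) pairs as a GraphViz digraph in DOT syntax."""
    lines = [f"digraph {graph_name} {{"]
    for caller, callee in sorted(set(edges)):  # drop duplicate edges
        lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


# Invented sample report; real compiler output will need its own parser.
report = """\
MAIN -> READIN
MAIN -> SOLVE
SOLVE -> MATINV
SOLVE -> READIN
"""
print(edges_to_dot(parse_call_report(report)))
```

Saving the output as `calls.dot` and running `dot -Tpng calls.dot -o calls.png` should produce a picture of the graph.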
Don't attempt the impossible... (Score:4, Insightful)

by namgge ( 777284 ) writes: on Friday January 18, 2008 @12:52PM (#22095174)

and an implicit expectation that I'll grok it all Real Soon Now

It is unlikely that your job is really to 'grok it all'. Most likely there are specific issues that need to be solved - stop panicking and pick the simplest one on the list and start working on it.

In a similar position to you, I followed Brooks's advice to study the data structures and found it good. Also just running the application under a debugger, inserting breaks in important looking code and then having a look at the call stack when that code was used also proved enlightening. A good debugger also lets you explore the data structures.

When smart-asses tell you "Bill would have fixed that in ten minutes." I recommend replying "I never met Bill, why do you think he left?"

Namgge

- Re: (Score:2)
  
  by Lumpy ( 12016 ) writes:
  
  When smart-asses tell you "Bill would have fixed that in ten minutes." I recommend replying "I never met Bill, why do you think he left?"
  
  That one works great. Problem is, most of the time smart-asses are not the ones doing that, but incredibly stupid managers.
  
  "manager of Marketing, XXX did it in 15 minutes... why are you taking so long?"
  
  The parent's response is perfect for these situations... it shuts them up instantly.
Etags (Score:3, Interesting)

by __david__ ( 45671 ) * writes: on Friday January 18, 2008 @12:52PM (#22095176) Homepage

Emacs and etags are your friend. Meta-. zips to the function under the cursor. C-s for incremental search. Meta-x grep-find for any other search.

Also, run the program with a debugger and step through it. Or put some print statements in key places and see what it produces.

I find that's all I ever need.

Muhahaha (Score:2)

by roman_mir ( 125474 ) writes:

I hate people who refuse to write requirements / design documentation, stating that good code must be self-explanatory.

Now you can hate them too!
hmm. (Score:2)

by apodyopsis ( 1048476 ) writes:

That's like a carpenter asking for a nail gun because the hammer is too complicated to use. As with all trades, get to grips with the basics first. If you really cannot make a dent in your code mountain, then are you sure you should be doing the job? No disrespect intended.

I find, when in similar situations, start in main() and stroll down the call tree. I also make a beeline for interrupt handlers and pointers - but then I specialize in embedded software so bear in mind that my advice might be as useful as
Understand the design first, then the code (Score:5, Informative)

by Anonymous Brave Guy ( 457657 ) writes: on Friday January 18, 2008 @01:01PM (#22095364)

I'm afraid you've set yourself an almost impossible task. IME, there are no shortcuts here, and it's going to take anywhere from a few months to a couple of years for a new developer to really get their head around a large, unfamiliar code base.

That said, I recommend against just diving in to some random bit of code. You'll probably never need most of it. Heck, I've never read the majority of the code of the project I work on, and that's after several years, with approx 1M lines to consider.

You need to get the big picture instead. Identify the entry point(s), and look for the major functions they call, and so on down until you start to get a feel for how the work is broken down. Look for the major data structures and code operating on them as well, because if you can establish the important data flows in the program you'll be well on your way. Hopefully the design is fairly modular, and if you're in OO world or you're working in a language with packages, looking at how the modules fit together can help a lot too. Any good IDE will have some basic tools to plot things like call graphs and inheritance/containment diagrams, if not there are tools like Doxygen that can do some of it independently.

If you're working on a large code base without a decent overall design that you can grok within a few days, then I'm afraid you're doomed and no amount of tools or documentation or reading files full of code will help you. Projects in that state invariably die, usually slowly and painfully, IME.
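
The "start at the entry point and work down" approach can be sketched as a depth-limited walk over a call graph. A minimal illustration in Python, assuming you have already extracted a mapping from each function to the functions it calls (for example from a Doxygen or IDE call graph); the graph below and the helper name `calls_by_depth` are invented:

```python
from collections import deque


def calls_by_depth(call_graph, entry, max_depth=2):
    """Breadth-first walk from `entry`; returns {function: depth} up to max_depth."""
    depth = {entry: 0}
    queue = deque([entry])
    while queue:
        current = queue.popleft()
        if depth[current] == max_depth:
            continue  # do not descend past the requested depth
        for callee in call_graph.get(current, ()):
            if callee not in depth:
                depth[callee] = depth[current] + 1
                queue.append(callee)
    return depth


# Hypothetical call graph: function name -> functions it calls directly.
call_graph = {
    "main": ["parse_args", "load_config", "run"],
    "run": ["read_input", "process", "write_output"],
    "process": ["validate", "transform"],
    "transform": ["lookup_table"],
}

levels = calls_by_depth(call_graph, "main")
for name, level in sorted(levels.items(), key=lambda item: (item[1], item[0])):
    print("  " * level + name)
```

Keeping `max_depth` small is the point: the first pass should show how the work is broken down, not every leaf function.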

- Re: (Score:2)
  
  by pavera ( 320634 ) writes:
  
  As someone who has inherited quite a few code bases in the state you describe in your last paragraph (and trades in turning around large projects which have gone off the tracks), I can completely agree with you. If there isn't a decent design behind the system, something that can be explained in a few days or a week, that details the major modules of the system, the major code/data paths in the system, and the overall design philosophy, then it gets very difficult.
  
  In general projects in that state have/had
kcachegrind (Score:2)

by Akatosh ( 80189 ) writes:

kcachegrind [sourceforge.net] is very nice for a lot of languages. It makes an easy to read function call map, among other things.
Look at doxygen/umbrello (Score:3, Informative)

by Yiliar ( 603536 ) writes: on Friday January 18, 2008 @01:04PM (#22095458)

See:
http://www.stack.nl/~dimitri/doxygen/ [stack.nl]
and:
http://uml.sourceforge.net/index.php [sourceforge.net]

These tools allow you to 'visualize' a codebase in several very helpful ways.
One important way is to generate connection graphs of all functions.
These images can look like a mess, or a huge rail yard with hundreds of connections.
The modules, libraries, or source files that are a real jumble of crossconnected lines are a clear indication of where to start clean up activities. :)
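
If the picture is too dense to read, the "jumble of crossconnected lines" can also be approximated with a count. A rough sketch in Python, assuming you can export which file defines each function and which function calls which; the data and the name `cross_module_coupling` are made up for illustration and are not part of Doxygen or Umbrello:

```python
from collections import Counter


def cross_module_coupling(calls, module_of):
    """Count, per module, the calls that cross a module boundary (in or out)."""
    coupling = Counter()
    for caller, callee in calls:
        src, dst = module_of[caller], module_of[callee]
        if src != dst:
            coupling[src] += 1
            coupling[dst] += 1
    return coupling


# Hypothetical data: which file defines each function, and who calls whom.
module_of = {
    "main": "main.c",
    "parse": "parser.c",
    "emit": "output.c",
    "log_msg": "util.c",
    "fmt": "util.c",
}
calls = [
    ("main", "parse"),
    ("main", "emit"),
    ("parse", "log_msg"),
    ("emit", "log_msg"),
    ("emit", "fmt"),
    ("log_msg", "fmt"),  # same module, so not counted
]

coupling = cross_module_coupling(calls, module_of)
for module, count in coupling.most_common():
    print(f"{module}: {count} cross-module calls")
```

A high count is only a hint. A utility module that everything calls will score high without being a mess, so it is worth looking at the graph before deciding where to clean up.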

Good luck!

Wait 'till you get to reading the specs... (Score:3, Interesting)

by crovira ( 10242 ) writes: on Friday January 18, 2008 @01:08PM (#22095546) Homepage

That should be good for a laugh or three.

They'll be out of date, full of inconsistencies and incomplete.

Then you'll be reading the code only to discover that people's idiosyncrasies and personalities definitely affect their coding styles. (There's even some gender bias where women tend to set a lot of flags [sometimes quite needlessly] and decide what to do later in the execution while men code as if they knew where they were going all the time, just that when they get there, they're missing some piece of information or other.)

If you read code developed by a whole team of people, you'll get to know them, intimately.

Good luck. You'll be at the bar in no time... I kept the stool warm for you.

- If there's an online component (Score:2)
  
  by crovira ( 10242 ) writes:
  
  http://media.libsyn.com/media/msb/msb-0195_Rovira_Diagrams_PDF_Test.pdf [libsyn.com]
  
  might help.
  
  It's a technique I used successfully, wherever the client was, whatever the client was up to and with whatever staff was on hand. It's domain independent too.
  
  Enjoy.
The Classics (Score:2)

by Dunx ( 23729 ) writes:

In your situation, the thing you need to use most is your voice: talk to people who already understand the code.

The last time I had to do this (with no documentation, meaningful code comments, or engineering support - no voice option!) it was in a mixed-language code base too.

My tools of choice were:

* etags - like ctags, but supporting pretty much any block-structured language. So navigating from Delphi code into C# code actually worked.

* vim - reads etags files, and of course it is my editor of choice.

* gre
The Slashdot attitude (Score:3, Insightful)

by gaspyy ( 514539 ) writes: on Friday January 18, 2008 @01:19PM (#22095734)

I'm appalled by some of the comments that imply that the poster may not be fit for the job.

A few years back I had to maintain a large module written in C#. I had about 200K lines of code, 50 classes, zero documentation, zero comments, zero error logging support, and I was expected to find and fix bugs and add functionality the day after the module was handed over.

So if you were never in this position, just STFU. Yeah, the code is there, but what is this flag for? Is this part really used, or is it obsolete? What are the side-effects of using that method? And so on...

Eventually, I learned it, especially after some intensive debugging sessions, but it was frustrating to say the least. I would have loved to have some aiding tools.

Where be dragons? (Score:2, Informative)

by mm4 ( 1089615 ) writes:

Apart from Understand for C++, I'd also suggest SourceMonitor - http://www.campwoodsw.com/sm20.html [campwoodsw.com] It will at least quickly point you to potentially problematic parts (long functions, deep nesting, etc.).
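
For a feel of what a "deep nesting" metric measures, here is a deliberately crude sketch in Python. It is not how SourceMonitor computes its numbers; it only counts curly braces in C-like source, and the function name and sample code are invented:

```python
def max_brace_depth(source):
    """Return the deepest level of `{ ... }` nesting in C-like source text.

    Crude on purpose: braces inside strings, character literals, and
    comments are counted too, so treat the result as a rough signal.
    """
    depth = deepest = 0
    for char in source:
        if char == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif char == "}":
            depth = max(depth - 1, 0)  # tolerate unbalanced input
    return deepest


# Invented C fragment used only to exercise the function.
sample = """
int f(int n) {
    for (int i = 0; i < n; i++) {
        if (i % 2) {
            while (n > 0) { n--; }
        }
    }
    return n;
}
"""
print(max_brace_depth(sample))
```

Running something like this over every file and sorting by the result gives a quick, if rough, list of places to read first.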
Reverse Engineering Tools (Score:2)

by kaladorn ( 514293 ) writes:

Rational Rose and Enterprise Architect both allow you to reverse engineer OO projects to produce a model. Of course, the product depends a lot on the complexity of the architecture. I've tried with EA and found that it didn't like (at least the version we had) STL. And the COM stuff threw it for a bit of a loop too. But it did show some interesting (and correct) relationships. I've seen MFC reverse engineered in Rational Rose and, with some tweaking, provided some useful insights.

I also second the recomme
Explain the code to someone else. (Score:2)

by Organic Brain Damage ( 863655 ) writes:

You're going to have to read the code. Most programmers love to write code and hate to read code. If you cannot read code, you cannot do maintenance programming.

One technique I've found helpful when confronted with something too big, ugly and important to rewrite....
Find someone, anyone, who will sit in a room with a PC and projector and you explain what the code does to them, in detail.

If you need to diagram, use a whiteboard, Rose is useless. You'll wind up with a huge pile of ineffable UML if you try t
HTML based cross reference (Score:3, Interesting)

by NullProg ( 70833 ) writes: on Friday January 18, 2008 @01:32PM (#22096034) Homepage Journal

Run these commands (or put them in a script):

ctags *
gtags
htags -Fan

It will create an HTML folder in the current directory with all the functions/variables cross-referenced. Open the file index.html or mains.html in your browser. If you're not running Linux, I think these utilities are included in Cygwin http://www.cygwin.com/ [cygwin.com]

Enjoy,

Browse-by-Query (Score:3, Informative)

by mmacdona86 ( 524915 ) writes: on Friday January 18, 2008 @01:32PM (#22096040)

I'll plug my own open-source project for this:
Browse-by-Query [sourceforge.net]-- it won't help with C/C++(sorry for the original questioner), but it will handle Java or C#.
It dumps the code into a database and lets you query it to find the relationships.
I'm biased, of course, but I've found it's just the thing to understand how a particular piece of functionality in an unfamiliar code base fits into the big picture.
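
To illustrate the general idea of putting code relationships in a database and querying them, here is a small sketch using Python's built-in `sqlite3` module. It does not show Browse-by-Query's own schema or query language; the table layout, helper names, and sample method names are all assumptions made for the example:

```python
import sqlite3


def build_index(calls):
    """Load (caller, callee) pairs into an in-memory SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
    db.executemany("INSERT INTO calls VALUES (?, ?)", calls)
    return db


def callers_of(db, function):
    """Return the sorted names of functions that call `function` directly."""
    rows = db.execute(
        "SELECT DISTINCT caller FROM calls WHERE callee = ? ORDER BY caller",
        (function,),
    )
    return [row[0] for row in rows]


# Hypothetical relationships extracted from a code base.
db = build_index([
    ("OrderService.submit", "Inventory.reserve"),
    ("OrderService.cancel", "Inventory.release"),
    ("BatchJob.run", "Inventory.reserve"),
])
print(callers_of(db, "Inventory.reserve"))
```

Once the relationships are in a table, questions like "who calls this?" or "what does this touch?" become one-line queries instead of a grep session.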

Headers (Score:2)

by 12357bd ( 686909 ) writes:

If it's a C/C++ project, start by trying to understand the headers; after the docs/comments they are the most descriptive part.
Use UML, and focus on the interfaces (Score:3, Informative)

by davide marney ( 231845 ) writes: on Friday January 18, 2008 @01:53PM (#22096512) Journal

If your project is object oriented, you may be able to get your UML modeling tool to import the code and visualize the classes. When you do this, you'll probably get a HUGE diagram that seems just as unwieldy as looking at the code. The trick is to apply a filter to the model, so you're not overwhelmed with detail. Your UML tool should be able to do that for you.

I recommend focusing on all interface classes first. This can give you a remarkably sane picture of a system, and will help you divide up the code into more conceptually meaningful chunks.

The tool I use is Enterprise Architect [sparxsystems.com], which does quite a lot of heavy lifting yet is still inexpensive enough for me to own a personal copy.

Solution (Score:5, Funny)

by Chapter80 ( 926879 ) writes: on Friday January 18, 2008 @01:54PM (#22096530)

I've always found that the most effective method of learning code is to inject a random line of code somewhere, and see what breaks. Two techniques: 1) print some official-looking error message, and 2) add a large value (a million or greater) to a number somewhere. Keep a nice chart of what you added, where:
Error 'Format Conversion Error, converting from Y2K to Z2L' added to module x1
Error 'Out of Memory Banks' added to module x2
Error 'Object Expected; found adjective instead' added to module x3
Error 'bitbucket 95% full; please empty' added to module x4
Added 1,000,042 to some random value in module x5
Added 5,555,555 to some random value in module x6
Not only will you learn about the code, you'll make a great impression on your boss, when, within minutes, you are able to resolve some mysterious problem that has never happened before.

More than tools (Score:5, Informative)

by sohp ( 22984 ) writes: <.moc.oi. .ta. .notwens.> on Friday January 18, 2008 @02:31PM (#22097350) Homepage

The best tool is your brain, applied liberally. Here are some thoughts to put in it:

Feathers, Michael. Working Effectively with Legacy Code [amazon.com], Chapter 16 especially.

Spinellis, Diomidis. Code Reading: The Open Source Perspective [amazon.com], Chapter 10 lists some tools for you.

My own thoughts now. First, don't trust the comments, they are probably outdated. Second, if it's a big code base, forget the debugger. Write some little unit test cases that exercise the sections of code you need to understand, and assert what you think the code is supposed to do.

Finally, unless you are cursed with a codebase which is not kept in version control (in which case, ugh, time to start the jobhunt up again maybe), then take a look at the revision history. See what changes have been made to the area you are working on. With luck, someone will have put in a revision message that points you towards greater understanding of why a change was made, which will in turn nudge you towards knowing the purpose of the section of code that was changed.
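
The "little unit test cases" idea might look like the sketch below. It is in Python with the standard `unittest` module purely for brevity; the same pattern works with any test framework for C or C++. `legacy_discount` is an invented stand-in for whatever inherited function you are trying to understand:

```python
import unittest


def legacy_discount(quantity, unit_price):
    """Stand-in for an inherited function whose intent is undocumented."""
    total = quantity * unit_price
    if quantity >= 10:
        total = total * 9 // 10
    return total


class LegacyDiscountBehaviour(unittest.TestCase):
    """Each test records one belief about what the code currently does."""

    def test_small_orders_pay_full_price(self):
        self.assertEqual(legacy_discount(9, 100), 900)

    def test_ten_or_more_items_get_ten_percent_off(self):
        self.assertEqual(legacy_discount(10, 100), 900)

    def test_discount_rounds_down_to_whole_units(self):
        self.assertEqual(legacy_discount(11, 7), 69)
```

Saved as a file, this runs with `python -m unittest <file>`. A failing assertion is useful too: it means your mental model of the code was wrong, which is exactly what you wanted to find out.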

I had a pile of C++ dropped in my lap 2 years ago. (Score:3, Informative)

by Richard Steiner ( 1585 ) writes: <rsteiner@visi.com> on Friday January 18, 2008 @03:02PM (#22098046) Homepage Journal

My main tool for figuring it all out was to use exuberant ctags [sourceforge.net] to create a tags file, and Nedit [nedit.org] to navigate through the source under Solaris, with a little grep thrown in. I also used gdb with the DDD [gnu.org] front-end to do a little real-time snooping.
I've since added both cscope [sourceforge.net] and freescope [sourceforge.net], as well as the old Red Hat Source Navigator [sourceforge.net] for good measure.

Source Insight (Score:3, Informative)

by Effugas ( 2378 ) * writes: on Friday January 18, 2008 @06:22PM (#22101634) Homepage

It's inexpensive, and scales astonishingly. I've spent the last two years in it, and it's just how I audit code nowadays.

Shameless plug for CodeSurfer (Score:3, Interesting)

by mmcdouga ( 459816 ) writes: <mmcdouga@@@saul...cis...upenn...edu> on Friday January 18, 2008 @06:50PM (#22102062) Homepage
My company makes a code understanding tool called CodeSurfer [grammatech.com]. It's not open source, and it's not free (though it is free for academic use).

You can browse your code, following dependences and definitions. You can also construct queries to isolate what statements can affect a particular variable, and a bunch of other tricks based on static analysis. There's a programming interface too.

Other good ways to get your head around code (speaking as a software engineer, rather than a guy promoting his company):
- I agree with whoever suggested breaking in a random spot and stepping through the code.
- Talk to the other developers, if they are around. Don't suffer in silence for the sake of doing it on your own.
- Pick a minor throwaway feature (eg every button should be blue) and modify the code to add that feature. This forces you to really learn the code, but without the pressure of making a real product-worthy feature.
- Re: (Score:3, Insightful)
  
  by wampus ( 1932 ) writes:
  
  Sometimes it's hard to follow execution, especially in a large codebase. It's made even more difficult when a smug jackass wrote it to be as terse as possible.
- Re:How / why did you get the job... (Score:5, Insightful)
  
  by Jeremi ( 14640 ) writes: on Friday January 18, 2008 @12:44PM (#22094992) Homepage
  
  One might as well ask, why are you posting smarmy retorts when you clearly didn't understand the question? The question was about understanding the program, not the underlying language.
  
- Re: (Score:2)
  
  by geekoid ( 135745 ) writes:
  
  I have seen some pretty awfully used languages.
  I started at one company, and they had functions that were 1600 lines long, with gotos.
  
  Not easy to understand, and very complex.
  - Re:How / why did you get the job... (Score:5, Funny)
    
    by PetriBORG ( 518266 ) writes: on Friday January 18, 2008 @01:32PM (#22096024) Homepage
    
    Only 1600 lines?
    I used to work at a company with a lot of Pascal and C code... It was extremely common (as in, all but a few) for programs to be written entirely in one code file. These files would go on for 20,000 lines or more. So many lines in fact that after the compiler had imported the header files at the top of the file that they would be over 65,000 lines long and the debugger would crap out because it had exceeded the int that it used for line number counting.
    Sadly this isn't a joke.
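
A minimal sketch of the failure mode described above, assuming the debugger kept its line count in an unsigned 16-bit integer. The comment only says "the int", so the width, the wraparound behaviour, and the names `next_line_number` and `count_lines` are all hypothetical, chosen for illustration.

```python
# Hypothetical model of a debugger that stores line numbers in an
# unsigned 16-bit integer (an assumption; the comment only says "the int").
UINT16_MAX = 0xFFFF  # 65,535, the largest value 16 bits can hold


def next_line_number(current: int) -> int:
    """Advance a line counter the way a 16-bit unsigned field would."""
    return (current + 1) & UINT16_MAX


def count_lines(total_lines: int) -> int:
    """Return what the 16-bit counter reads after numbering total_lines lines."""
    counter = 0
    for _ in range(total_lines):
        counter = next_line_number(counter)
    return counter


if __name__ == "__main__":
    for total in (20_000, 65_535, 65_536, 70_000):
        print(f"{total} real lines -> counter reads {count_lines(total)}")
```

Under that assumption, a 20,000-line source file is numbered correctly on its own, but once the included headers push the expanded file past 65,535 lines the counter wraps around to zero and the reported line numbers stop matching the source.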
    
- Re: (Score:3, Insightful)
  
  by swillden ( 191260 ) writes:
  
  ...if you don't understand the language?
  
  Yes, it's hard to understand questions when you don't understand the language.
  I'm sure you can find some remedial English classes if you look.
- Re: (Score:2)
  
  by AmaDaden ( 794446 ) writes:
  
  I just started working at a company with a horrifying code base and was using Eclipse. Eclipse did a fantastic job of helping me jump around the code (oh how I love you CTRL + left click) but the code itself was still hard to read. I figured out that in Eclipse you can do a LOT more color coding than is used by default. This seems trivial, but now with a glance I can get a good deal more information on the scope and type of a variable or function than before. I highly recommend looking into it. I have to note
- Re: (Score:2, Insightful)
  
  by Anonymous Coward writes:
  
  The best programmers I've ever worked with didn't have degrees. But some of the worst ones did.
- Re: (Score:2)
  
  by Schraegstrichpunkt ( 931443 ) writes:
  
  4- Talk to them about getting things replaced with proper solutions. Maintaining that MS Access nightmare that some guy in Marketing created 5 years ago is not a real solution; it needs to be replaced with a real solution. Let them know.
  
  Here is a useful bit of vocabulary for explaining why this is so: technical debt [wikipedia.org].
