
Reverse Engineering Large Software Projects?

stalebread queries: "Me and a team of other students have been tasked with reverse engineering a massive C/C++ (mostly C) computer game of about half a million lines. We have most of the source, but no clue of how to approach a task of this magnitude. Anyone have suggestions of programs, or techniques we could use to understand the structure of the game?"

  • Legal? (Score:1, Interesting)

    by TheCarlMau ( 850437 ) on Monday October 10, 2005 @10:13PM (#13761263) Homepage
    Just curious... is this something legal? For example, isn't it illegal to reverse engineer Windows?
  • by linuxtelephony ( 141049 ) on Monday October 10, 2005 @10:15PM (#13761283) Homepage
    It sounds like you are wanting to refactor the code, or port it to another platform. If you are missing some of the code, then you'll have to reverse engineer that portion of it.

    As for how to approach it - I think it depends on the size of your team, and what goals you set for the effort. Are you just wanting to learn? Or do you want to improve performance? Or make it work on another platform? What are the goals for this project?

    Once you know those details, they might give you an idea where to begin.
  • Re:Legal? (Score:2, Interesting)

    by redelm ( 54142 ) on Monday October 10, 2005 @10:22PM (#13761317) Homepage
    I would presume that the code came from a liquidation/auction/takeover and the human capital that produced it is no longer available. First, I would try to hire one of the original software architects to do some consulting. Who knows? They might have some email files that could be considered "part of the software".

  • by oopsdude ( 906146 ) on Monday October 10, 2005 @10:52PM (#13761421)
    If he already has the source, then this problem may be easy enough to make asking Slashdot unnecessary. However, there are instances in which asking Slashdot is necessary. If they didn't have most of the source, for example. Or, for example, in this article [slashdot.org], where an IT guy was asked to make an infrastructure for over one million email accounts that must scale perfectly and have 99.9% uptime. Show me a university that trains students for that.
  • Use our tool :) (Score:2, Interesting)

    by mr_tenor ( 310787 ) on Monday October 10, 2005 @11:22PM (#13761669)
    www.cse.unsw.edu.au/~drt

    Not that I'm biased or anything. The idea is to monitor the program while it's running and use the call graph to generate sequence diagrams and such. Feedback and ideas for further research welcome :)
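For anyone wondering what "monitor the program and use the call graph" could look like in practice, here is a minimal sketch in Python. It is not how the tool above works (that is not described in this thread). It assumes you already have a flat trace of function enter/exit events, for example emitted by hooks compiled in with GCC's `-finstrument-functions`. The event format and the function names (`game_loop`, `update_ai`, `render`) are made up for illustration.

```python
from collections import Counter


def call_edges(events):
    """Turn ("enter" | "exit", function_name) events into a Counter
    of (caller, callee) edges. The event format is an assumption."""
    stack = []          # functions currently on the call stack
    edges = Counter()   # (caller, callee) -> number of calls observed
    for kind, name in events:
        if kind == "enter":
            caller = stack[-1] if stack else "<root>"
            edges[(caller, name)] += 1
            stack.append(name)
        elif kind == "exit":
            # Pop frames until the matching one has been removed,
            # so a trace with a missing exit does not wedge the stack.
            while stack and stack.pop() != name:
                pass
    return edges


# Hypothetical trace of one frame of a game loop.
trace = [
    ("enter", "main"),
    ("enter", "game_loop"),
    ("enter", "update_ai"),
    ("exit", "update_ai"),
    ("enter", "render"),
    ("exit", "render"),
    ("enter", "update_ai"),
    ("exit", "update_ai"),
    ("exit", "game_loop"),
    ("exit", "main"),
]

edges = call_edges(trace)
for (caller, callee), count in sorted(edges.items()):
    print(f"{caller} -> {callee} (x{count})")
```

The edge counts are enough to feed a graph drawing tool or to rank which subsystems talk to each other most. Keeping the event order as well would be the starting point for a sequence diagram.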
  • by rgbe ( 310525 ) on Monday October 10, 2005 @11:53PM (#13761921)
    There are some automatic UML generators that will give you an overview of the code, or parts of the code:
    http://droogs.org/autodia/ [droogs.org]
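If a UML generator is more than you need for a first pass over a mostly-C code base, a rough static overview can be had from the `#include` lines alone. The sketch below is not AutoDia and makes no claim about how it works. It is a minimal Python example that assumes project-local headers are included with double quotes, and the file names are invented. It does not skip includes inside comments or `#if 0` blocks.

```python
import re
from collections import defaultdict

# Matches project-local includes such as:  #include "engine.h"
# System headers in angle brackets are ignored on purpose.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def include_graph(sources):
    """Map each file path to the sorted list of headers it includes.
    `sources` is a dict of {path: file contents}."""
    return {
        path: sorted(set(INCLUDE_RE.findall(text)))
        for path, text in sources.items()
    }


def fan_in(graph):
    """Count how many files include each header. Headers with a high
    count are likely to belong to a central subsystem."""
    counts = defaultdict(int)
    for headers in graph.values():
        for header in headers:
            counts[header] += 1
    return dict(counts)


# Hypothetical file contents standing in for the real source tree.
sample = {
    "main.c": '#include <stdio.h>\n#include "engine.h"\n#include "ai.h"\n',
    "ai.c": '#include "ai.h"\n#include "engine.h"\n',
    "render.c": '  #  include "engine.h"\n',
}

graph = include_graph(sample)
usage = fan_in(graph)
for header, count in sorted(usage.items(), key=lambda kv: -kv[1]):
    print(f"{header}: included by {count} file(s)")
```

On a real tree you would fill `sources` by reading the `.c` and `.h` files from disk. The headers with the highest counts are a reasonable place to start reading.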
  • Massive? (Score:2, Interesting)

    by idries ( 174087 ) on Tuesday October 11, 2005 @02:56PM (#13767063) Homepage
    First of all, this is not a massive code base for a commercial computer game; it's about average. Many games get into the 1-2 million line range. Having said that, most games also have teams that are probably much larger than your group of students.

    I'm not exactly sure what you're trying to do here. As many ppl have said reverse engineering something that you already have the source for is not really reverse engineering at all. However if I make the (somewhat suspect) assumption that your objective is to examine the code and extract some kind of high-level understanding of the entire engine which you can then demonstrate in some way, I would advise you to think again. Most games (again, I am assuming that you have a commercially developed code base of some kind) are a giant mess with no overall design or direction in the code.

    Generally you'll find that a few sub-systems have been implemented with some kind of clean design (although not necessarily in a coordinated manner) and then the rest of the game is just a mass of glue code that holds these pieces together. During the original implementation no-one will have had the kind of general overview that you're looking for, each member of the team will know their specific area or areas, and how that part interfaces to the next, but no-one will know how all of them work together. Trying to summarize how all the systems work together will either give you something very high-level (and essentially meaningless) or something so complex that it's almost as hard to understand as the source (and not suitable to give to your professor as 'proof of understanding').

    My advice would be to choose one or more parts of the game and try to gain an understanding (in whatever manner you choose) of those areas. One of the best ways to choose these areas is to look at the USPs (unique selling points) of the game itself. Some areas of the game will have been very important to the final product, while others will have been done just because they had to be. For example, if the game is an RTS with a focus on the tactical aspect of the single player experience, then the scripting and AI systems will have been very important (and made as good as possible) while the sound engine will not have been very important (and made just good enough). The parts of the code which are important to the actual gameplay will have had much more time and attention spent on them and will probably be far more interesting. Having said that, the most important parts of the game will also have had more ppl working on them and they may well contain much less readable code.

    Perhaps you should give us some more info on what exactly you want to do, so that we can give you more relevant advice?
