Programming Technology

Ultra-Stable Software Design in C++? 690

null_functor asks: "I need to create an ultra-stable, crash-free application in C++. Sadly, the programming language cannot be changed due to reasons of efficiency and availability of core libraries. The application can be naturally divided into several modules, such as GUI, core data structures, a persistent object storage mechanism, a distributed communication module and several core algorithms. Basically, it allows users to crunch a god-awful amount of data over several computing nodes. The application is meant to primarily run on Linux, but should be portable to Windows without much difficulty." While there's more to this, what strategies should a developer take to ensure that the resulting program is as crash-free as possible?
"I'm thinking of decoupling the modules physically so that, even if one crashes/becomes unstable (say, the distributed communication module encounters a segmentation fault, has a memory leak or a deadlock), the others remain alive, detect the error, and silently re-start the offending 'module'. Sure, there is no guarantee that the bug won't resurface in the module's new incarnation, but (I'm guessing!) it at least reduces the number of absolute system failures.

How can I actually implement such a decoupling? What tools (System V IPC/custom socket-based message-queue system/DCE/CORBA? my knowledge of options is embarrassingly trivial :-( ) would you suggest should be used? Ideally, I'd want the function call abstraction to be available just like in, say, Java RMI.

And while we are at it, are there any software _design patterns_ that specifically tackle the stability issue?"
This discussion has been archived. No new comments can be posted.
  • by neo ( 4625 ) on Saturday February 04, 2006 @11:42PM (#14644222)
    1. Write the whole thing in Python.
    2. Once it's bullet-proof, replace each function and object with C++ code.
    3. Profit.
  • by Pentclass ( 710578 ) on Saturday February 04, 2006 @11:43PM (#14644226)
    Follow NASA's advice... http://www.fastcompany.com/online/06/writestuff.html [fastcompany.com]
  • by merlin_jim ( 302773 ) <.James.McCracken. .at. .stratapult.com.> on Sunday February 05, 2006 @12:06AM (#14644304)
    I was going to post pretty much the same thing - managed code approaches C++ efficiency close enough that it shouldn't matter (I've seen figures of 80-95%)

    And, in Visual Studio .NET 2005, there are built-in high-performance computing primitives - all the management of internode communication and logical data semaphore locking is handled by the runtime - presumably debugged and stable code...
  • Try the following (Score:1, Interesting)

    by Anonymous Coward on Sunday February 05, 2006 @12:18AM (#14644333)
    If you really want stable programs using C++, be sure of the following basics -
    1. Hire good programmers.
    2. Make sure that EVERY function is defined with a specification, describing everything within the function. This makes debugging much easier.
    3. Make sure that you've got all requirements written.
    4. Try not to use fancy stuff such as function pointers, pointer dereferencing, etc. Not all programmers are geniuses.
    5. One good practice: if you allocate memory in an object, make sure that the same object is responsible for de-allocating it. This is common practice.
    6. For IPC, try not to use shared memory. Using a message queue makes your work easier because of its guaranteed delivery. Try MQ Series or something similar; they provide a robust mechanism for transferring and retrying data, and are worth the money. They are also compatible with both Windows and Linux.
    7. Stick to ANSI C++ functions to ensure compatibility.
    8. Use a portable UI toolkit such as Qt.
    9. Test, test and test. Peer review codes.
    10. Establish a naming convention for variables and classes.
  • Good call (Score:2, Interesting)

    by lifeisgreat ( 947143 ) on Sunday February 05, 2006 @12:19AM (#14644342) Homepage
    Good call. I'm not sure why C++ is being mandated for something that has stability as a top priority. Though there are some language-independent things that should be taken into account:

    Executive summary of this post: Keep it simple. As simple as it can be while getting the job done. The more buzzwords you think about implementing, the more you need to reconsider whether you really need that whiz-bang feature.

    You need to abstract your design into really independent layers, such that the backend processing can be done across linux, windows and even beos slaves simultaneously, and the frontend is viewable via a web interface, fed into excel or whatever. You can't look at this as one big project, but many independent (and more easily verifiable!!) applications cooperating with each other.

    My impression from the description is that you want a system like folding@home for corporate customers - they have a whole heap of data they want analyzed (parallel workload across many clients) and a small subset of results they're interested in. Don't make things any more complicated than they have to be - the data sets could simply be files that are partitioned by a master, sent out when requested to client workhorse computers, getting there by http, nfs or whatever, processed, and the results returned into an incoming directory for a simple frontend to tabulate.

    The biggest mistake you could make is having one gargantuan application in charge of everything. The race conditions will drive you mad, be they in data access, allocation, retrieval, dispatch or anything else you're trying to manage that the OS could do for you.

    Just look at Froogle. Their millions upon millions of store/price listings are fed by people ftp'ing a feed of tab-separated text values.

  • test with valgrind! (Score:5, Interesting)

    by graveyhead ( 210996 ) <fletchNO@SPAMfletchtronics.net> on Sunday February 05, 2006 @12:25AM (#14644364)
    valgrind -v ./myapp [args]

    It gives you massive amounts of great information about the memory usage of your program.

    The other day I spent nearly 3 hours trying to decode what was happening from walking the backtrace in gdb. Couldn't for the life of me figure out what was happening. Valgrind figured out the problem on the first run and after that, I had a solution in a few minutes.

    Highly recommended software, and installed by default on several distributions, AFAIK.

    Enjoy!
  • Re:Forget it. (Score:3, Interesting)

    by yamla ( 136560 ) <chris@@@hypocrite...org> on Sunday February 05, 2006 @02:04AM (#14644652)
    There's no excuse for buffer overflows and memory leaks in C++, not with TR1's smart pointers and not with the standard library's containers. That's not even considering garbage collectors which have been available in C++ for years.
  • robust software (Score:5, Interesting)

    by avitzur ( 105884 ) on Sunday February 05, 2006 @02:26AM (#14644709) Homepage
    Way back in 1993, thanks to a three-month schedule delay in shipping the original Apple PowerPC hardware, Graphing Calculator 1.0 had the luxury of four months of QA, during which a colleague and I added no features and did an exhaustive code review. Add to that being the only substantial PowerPC-native application, so everyone with prototype hardware played with it a lot, and the product got more thorough QA than anything I had ever worked on before or since. It also helped that we started with a mature ten-year-old code base which had been heavily tested while shipping for years, and that a complete lack of management or marketing pressure on features allowed us to focus solely on stability for months.

    As a result, for ten years Apple technical support would tell customers experiencing unexplained system problems to run the Graphing Calculator Demo mode overnight, and if it crashed, they classified that as a *hardware* failure. I like to think of that as the theoretical limit of software robustness.

    Sadly, it was a unique and irreproducible combination of circumstance which allowed so much effort to be focused on quality. Releases after 1.0 were not nearly so robust.
  • by The_Wilschon ( 782534 ) on Sunday February 05, 2006 @03:57AM (#14644909) Homepage
    Some literature (well, anecdote) on why you should use lisp: http://www.paulgraham.com/avg.html [paulgraham.com]
  • by SanityInAnarchy ( 655584 ) <ninja@slaphack.com> on Sunday February 05, 2006 @04:11AM (#14644936) Journal
    We use C# for its nice GUIs; C++ for cross-platform portability.

    *cough* *choke* WHAT?

    C++ takes a lot of platform-specific work to become portable, unless you're using portable libraries to do it for you. C#, if done well, is already portable -- and what's more, has Java's "Compile once, run anywhere", to some extent -- look at Mono.

    C++ has lots of nice GUIs, and so does C. C# has some nice GUIs, but is lacking a few major things, like OpenGL. You can write bindings for them, but in C++, nobody has to write bindings, because almost all GUIs are natively C or C++, and must have bindings written for anything else.

    And a well designed C# program is easier to write than a well designed C++ program. If your algorithm is broken, sure, nothing will work, but if it's your implementation that needs work, C# is a big help.

    Now, if only someone could tell me why C# is better/worse than Java?
  • Re:inline code (Score:3, Interesting)

    by jadavis ( 473492 ) on Sunday February 05, 2006 @06:34AM (#14645206)
    Once I did a hybrid python/C project because C by itself was impossible for me to maintain in the given time constraints.

    It worked amazingly well. There's a little bit of interfacing work that needs to be done, but I found that, in that project at least, the C code didn't need to be modified very often.

    Using two programming languages very often DOES simplify things.
  • by Handyman ( 97520 ) * on Sunday February 05, 2006 @07:27AM (#14645275) Homepage Journal
    In my experience decoupling and automatic restarting is a recipe for failure. You set yourself up for all sorts of race conditions. For instance, if a module is unresponsive for a while but not crashing, do you restart it? And if you do, what if the original module finishes its grand execution plan and comes back up after a minute?

    No, I'd go for:
    * A "monolithic" application with module separation provided by OO design. At least you know that either your whole application is there, or it isn't. No inconsistencies between modules because of individual module re-starts, and if the app breaks, restart the whole thing. Starting the app is the code path you've tested, restarting separate modules usually isn't (and even if it were, there's usually 2^27324 different situations to test, i.e., all possible combinations of modules failing in any sort of way).
    * Use smart pointers exclusively, preferably Boost's shared_ptr. Use weak pointers (Boost provides an implementation for that as well) to automatically break reference cycles.
    * For error handling, use exception handling exclusively. Incredibly many bugs are caused by ignored return codes.
    * Use "auto" objects for all resources that you acquire and that need to be released at the end of a code section. Cleanup that doesn't happen when a code path encounters an exception can cause resource leakage, instability and hangups (locks, anyone?). In my programming practice, when I allocate a resource (memory, refcount, open/close a recordset, etc.), I always wrap it in an auto object immediately, so that I can forget about managing it through all the code paths that follow.
    * Use the correctness features that the language provides: write const-correct code from the start.
    * Use automated testing right from the start, both unit testing and integration testing. If you don't do this, you will be forever tied to whatever bad design decisions you make in the first months of the project. Automated testing allows you to always make large implementation changes, giving you confidence that it will not break existing behaviour.
  • by Anonymous Coward on Sunday February 05, 2006 @11:52AM (#14645913)
    Purely functional languages have two big advantages applicable in this case:

    * No (or very, very limited) side-effects. In other words the result of a function is not dependent on the current program state. Once it is exhaustively verified in testing, that function will forever more return the correct results because the run-time state won't affect it.

    * The language itself can often be treated as a specification of correctness, and even formally proved through static analysis. As a trivial example if you write an implementation of factorial in Haskell, it strongly resembles the mathematical definition of factorial -- the code is more of a description of what the correct result is, rather than a set of low-level steps for carrying out the computation as in C.

    Haskell is nice, however I think the original questioner is better off with something like Erlang, which was designed for just this kind of situation. If it's good enough for telephone switches...
  • by EsbenMoseHansen ( 731150 ) on Sunday February 05, 2006 @12:01PM (#14645949) Homepage
    C++ takes a lot of platform-specific work to become portable,

    Where do people get this idea? I have ported quite a few applications, and usually the porting is done by locating the libraries you need on the new platform and fixing a few oddities in the current platform (like closing sockets on z/OS, or switching to unsafe multitasking (p-threads) on Windows). Porting to Linux is so trivial that I often do it just to get access to the superior tools available there, especially Valgrind. GUI is the exception, of course, unless you use a cross-platform kit from the beginning.

    Which leads me to my recommendations, in no particular order

    • Use Valgrind. A lot
    • Use a good toolkit. If GPL is acceptable, consider QT.
    • Consider a "packet" or "transaction" based approach: design every application to take in a packet, process it and return/store the result. This sort of application is easier to test automatically
    • Avoid huge applications anywhere... no more than 100 classes per application, no more than 1000 lines per class
    • Use automatic, integrated unit tests
    • Use automatic, daily-run integration/function tests
    • Do not accept complicated designs
    • Avoid closed-source libraries. Be aware of and fix library issues.
    • Avoid incompetent developers on critical components. Let them make the GUI/statistics/other fringe components.

    The above approach works for me. Your mileage may vary.

  • by The_Wilschon ( 782534 ) on Sunday February 05, 2006 @12:18PM (#14646006) Homepage
    More: http://www.cs.indiana.edu/~jsobel/c455-c511.updated.txt [indiana.edu] about a guy who wrote the "Fast Multiplication" algorithm very simply in scheme, and then transformed it (using correctness preserving transformations, which are much much easier to do in "Haskell or one of the other functional languages" than in C/C++ and friends) into scheme code that was as optimized as he could come up with, and which furthermore had a pretty much 1-1 correspondence with C statements. He then rewrote it in C (including perfect "goto"s!), and beat all but one person in his class on the speed of the algorithm. Furthermore, he spent significantly less time working on (read debugging) his code than anyone else in the class.
  • Re:Bullshit (Score:3, Interesting)

    by TheNetAvenger ( 624455 ) on Sunday February 05, 2006 @12:53PM (#14646122)
    Bullshit. C++ written well is portable by default (between windows and linux). There are a few minor issues between linux and sgi.

    I agree.

    This is by nature one of the biggest strengths of C and C++; how someone could conclude that using C++ adds some sort of complexity to cross-platform development actually amazes me.

    If it adds complexity, in comparison to what? I would like to see the poster above you explain what is actually easier to use for diverse application development, and actually better at cross-platform work.

    And if they start with Java, la la, then they need to get a life and see what JAVA is built upon itself.

    C and C++ are a great solution for cross-platform development; look at the nature of Linux, BSD, and even NT, and then ask why they are as portable as they are. Do people think these OSes would be more portable in another language?

    Take Care.
  • by Nataku564 ( 668188 ) on Sunday February 05, 2006 @01:08PM (#14646198)
    I figured people would get all wound up when anyone says anything contrary to their mantra.

    Do some googling, and see that Microsoft says that "managed code" is something that is executed by the .NET framework. Therefore, if you compile it down, it's no longer "managed code". My issue isn't with non-compiled languages - heck, my primary language is Perl of all things - my issue is with Microsoft coming up with a new term for an old idea so they can brainwash people into forgetting what it means.
  • by Wolfier ( 94144 ) on Sunday February 05, 2006 @03:24PM (#14646712)

    I'm sorry, but I just can't agree. It might appeal to a mathematician who wants to see everything use functional notation and hates every language except lisp, but to a non-abstract-elite-ivory-tower-mathematician this is absurd. cin is not an array of integers and the use of the adapter obfuscates the fact that you are using a conversion from a char array to an int. The back_inserter also makes it harder to see where the data is going by losing "v" in it. Many would also frown at it for taking a non-const reference, although since it is a standard adaptor it is probably ok


    Not understanding something is one thing, but not understanding something and therefore rejecting it as "elite-ivory-tower" and "academic" is another. I've seen a lot of buggy C++ code rewritten in this style in the obvious places - many defects were automatically fixed.


    Reality is algorithmic, not functional, and so are user specifications for the things they want done. Trying to cram them into an abstract mathematical functional model is insanity.


    I disagree. Reality is reality. Algorithmic or Functional are just ways people look at it. Aren't "algorithmic" also abstract? Isn't "object-oriented" abstract as well?

    By the way, using your vocabulary, I view the world as a mixture of "algorithmic" and "functional". No pure anything can describe the world, in my opinion.


    Reality is algorithmic, not functional, and so are user specifications for the things they want done. Trying to cram them into an abstract mathematical functional model is insanity.


    Being functional or algorithmic has *NOTHING* to do with one being "more mathematical" and the other "less mathematical". I advise you, that your use of the common peoples' fear for mathematics in your arguments is not going to help.


    C++ programmers are often unnaturally attached to efficiency and have to be watchful for template bloat. Your copy generates 88 instructions, whereas an equivalent iterative solution is only 33 instructions long.


    Templates, being code generators, naturally differ from hand-tuned code. So your code generates only 33 instructions vs. the template's 88. Great; now tell me: which architecture? What compiler? What version of that compiler, whose STL are you using, and which version of THAT?
    And before you count the instructions, did you realize that this code is waiting for keyboard input, so what you're doing is unnecessary (and obviously premature) optimization?


    While I would not deny the utility of auto_ptr in localized situations manipulating the object state during reallocation, its constant use indicates lack of understanding of object lifecycle in the program.


    How does the constant use of auto_ptr relate to understanding (or lack thereof) of object lifecycle? Sorry, but understanding object lifecycle and the liberal use of smart pointers are not mutually exclusive.


    It is fashionable in Java to create objects left and right, without consideration of who is supposed to own them. Hey, just let the garbage collector take care of it! Who cares how long the object lives? Obviously, such immature mentality produces plenty of memory leaks for which Java is so infamous.


    It is fashionable *among incompetent* Java developers to create objects left and right, which makes their programs memory hogs. It is also fashionable for *incompetent* C++ programmers to forget deallocations, leaking memory. What's your point? This mentality, immature or not, is not unique to managed languages.

  • Sorry, *not* in C++ (Score:4, Interesting)

    by HermanAB ( 661181 ) on Sunday February 05, 2006 @08:55PM (#14647680)
    You cannot write highly stable code in C++, due to design flaws in the language. For this reason, the FAA doesn't allow C++ for use in aircraft systems. You can improve the situation with a garbage collector, but if stability and safety are critical, then you should use ANSI C. See this: http://www.hpl.hp.com/personal/Hans_Boehm/gc/issues.html [hp.com]
  • by Chemisor ( 97276 ) on Monday February 06, 2006 @11:35AM (#14650824)
    > I guarantee you that I rather encounter a
    > for_each(components.begin(), components.end(), _1.disable())

    It is never that simple. The fact that you can't do what you've typed is one of the reasons I dislike it so much. What you really need is:

    for_each (components.begin(), components.end(), mem_fun_ref (&CComponent::disable));

    Things suddenly got uglier, didn't they? But wait, what if you need to call a function with an argument? Gotta use a bind2nd adaptor to wrap it, and then it becomes:

    for_each (components.begin(), components.end(), bind2nd (mem_fun_ref (&CComponent::SetParameter), value));

    Wait 'till you try to explain to some maintaining programmer how to untangle that! Oh, and just for laughs, try to debug this thing. Put an assert in SetParameter, and you get a lovely callstack from gdb:

    (gdb) run
    Starting program: /home/user/tmp/tes
    tes: tes.cc:18: void CComponent::SetParameter(int): Assertion `!"Check out the callstack!"' failed.

    Program received signal SIGABRT, Aborted.
    0xffffe410 in __kernel_vsyscall ()
    Current language: auto; currently c
    (gdb) where
    #0 0xffffe410 in __kernel_vsyscall ()
    #1 0xb7d36126 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:67
    #2 0xb7d37b40 in *__GI_abort () at ../sysdeps/generic/abort.c:88
    #3 0xb7d2f610 in *__GI___assert_fail (assertion=0x6 <Address 0x6 out of bounds>, file=0x6 <Address 0x6 out of bounds>,
    line=6, function=0x80495a0 "void CComponent::SetParameter(int)") at assert.c:83
    #4 0x080485f6 in CComponent::SetParameter (this=0x804b008, arg=42) at tes.cc:18
    #5 0x08048ac3 in std::mem_fun1_ref_t<void, CComponent, int>::operator() (this=0xbfe2cacc, __r=@0x804b008, __x=42)
    at stl_function.h:826
    #6 0x08048ae8 in std::binder2nd<std::mem_fun1_ref_t<void, CComponent, int> >::operator() (this=0xbfe2cacc, __x=@0x804b008)
    at stl_function.h:446
    #7 0x08048b0c in std::for_each<__gnu_cxx::__normal_iterator<CComponent*, std::vector<CComponent, std::allocator<CComponent> > >, std::binder2nd<std::mem_fun1_ref_t<void, CComponent, int> > > (__first={_M_current = 0x804b008}, __last=
    {_M_current = 0x804b00c}, __f=
    {<> = {<No data fields>}, op = {<> = {<No data fields>}, _M_f = {__pfn = 0x80485c4 <CComponent::SetParameter(int)>, __delta = 0}}, value = 42}) at stl_algo.h:158
    #8 0x08048740 in main () at tes.cc:26
    (gdb)

    Now that's something to scare newbie programmers with! Oh, and forget about putting a breakpoint inside the loop; templated functions aren't targetable until executed.

    > in some code I need to maintain then to encounter
    > for(i = 0; i < components.count(); ++i) components[i].disable()

    So why not just use an iterator loop? for_each does not have a monopoly on it:

    foreach (compvec_t::iterator, i, components)
    i->disable();

    (foreach is a macro I wrote because I use this construct so often)

    > first form permits, for instance, components to be a linked list or even a hash.
    > The second is implementation-dependent and if you change the underlying data
    > structure, you'll have extra work to refactor.

    If you use iterator loops, this wouldn't happen to you.

    > I once worked, changing all instances of SomeObject* to auto_ptr
    > eliminated altogether 35 bugs we had lurking in the BTS for a long, long time,
    > with less than one day of work (strange, delayed, errors were suddently
    > transformed in EARLY null-pointer dereferences

    Why were you using SomeObject* in the first place? When I was advocating moderation in the use of auto_ptr, I wa

  • How I do it... (Score:1, Interesting)

    by Anonymous Coward on Tuesday February 07, 2006 @09:31AM (#14659141)
    I write this kind of software for a living, although perhaps less complex than yours, since the critical part of mine doesn't have a GUI. On the other hand, I have to survive power-cycling with no problems. But basically there are two parts:
    1. Pepper your code with assert() calls to make sure that if anything goes wrong, it crashes quickly.
    2. Make sure that it can restart seamlessly.
    What's important is that the crash/resume cycle is faster than the necessary response time. If you have hard real-time constraints, life gets a whole lot nastier, but if an occasional half-second glitch is okay, things can go well.

    You have to think really hard about state, and do everything as a two-phase commit. Network connections to applications that don't resume cleanly are particularly tricky; you have to save and restore the sequence numbers across the crash/reboot. This requires NOT using the OS TCP implementation, or hacking it heavily to not send the ack until you've committed the state produced by the packet reception.

    I have crashing bugs happily running in production, because it gets back up and keeps going with no problems. The bad problems are reboot loops, when the "resume from crash" code crashes. You have to be very paranoid there.

    But it really does work remarkably well. Oh, one more tip: add a version number to your state files. Any time you don't change it, you can perform a software upgrade in-place by crashing the old version and letting the new one resume. Otherwise, you have to have scheduled downtime for every upgrade.
