Why Do Computers Still Crash?



Posted by Cliff on Tuesday May 20, 2003 @08:55PM from the shouldn't-this-have-been-fixed-by-now dept.

geoff lane asks: "I've used computers for about 30 years and over that time their hardware reliability has improved (but not that much), but their software reliability has remained largely unchanged. Sometimes a company gets it right -- my Psion 3a has never crashed despite being switched on and in use for over five years, but my shiny new Zaurus crashed within a month of purchase (a hard reset losing all data was required to get it running again). Of course, there's no need to mention Microsoft's inability to create a stable system. So, why are modern operating systems still unable to deal with and recover from problems? Is the need for speed preventing the use of reliable software design techniques? Or is modern software just so complex that there is always another unexpected interaction that's not understood and not planned for? Are we using the wrong tools (such as C) which do not provide the facilities necessary to write safe software?" If we were to make computer crashes a thing of the past, what would we have to do, both in our software and in our operating systems, to make this come to pass?

This discussion has been archived. No new comments can be posted.


Simple ... (Score:4, Insightful)

by Vilim ( 615798 ) writes: <ryanNO@SPAMjabberwock.ca> on Tuesday May 20, 2003 @08:57PM (#6003281) Homepage

Well, basically as software systems get more complex there is more things to go wrong. That is why I like the roll-your-own-kernel of linux. Don't compile the stuff you don't need and fewer things can break.

- Re:Simple ... (Score:5, Insightful)
  
  by Transient0 ( 175617 ) writes: on Tuesday May 20, 2003 @09:01PM (#6003316) Homepage
  
  More specifically... As hardware gets more complex, software gets more complex to fill the available space. More complex software not only means more things to go wrong but also means that the hardware never really gets a chance to outpace the needs of the software.
  
  Also, as I'm sure someone else will point out, it is very hard to right code that will not crash under any circumstances. Even if you are running a super-stripped down linux kernel in console mode on an Itanium, you can still get out of memory errors if someone behaves rudely with malloc().
  
  - - Re:Simple ... (Score:4, Funny)
      
      by Zach Garner ( 74342 ) writes: on Tuesday May 20, 2003 @09:09PM (#6003396)
      
      I find that is really easy to wrong code. I do it all the time...
      
- Re:Simple ... (Score:5, Funny)
  
  by cscx ( 541332 ) writes: on Tuesday May 20, 2003 @09:01PM (#6003319) Homepage
  
  Actually the Zaurus he mentions crashing in the article runs a roll-your-own Linux kernel... [linuxdevices.com] ;)
  
- Re:Simple ... (Score:5, Interesting)
  
  by The Analog Kid ( 565327 ) writes: on Tuesday May 20, 2003 @09:07PM (#6003377)
  
  Yes, on my parents' computer, which has 2000 on it (tried Linux, it didn't work for them), I set most of the services that aren't needed to manual. Disabled Auto-update. Put it behind a router, of course. The only problem that remained was Internet Exploder, so I just installed Mozilla with an IE theme (they haven't noticed a difference). I think killing most of the services keeps it up. Haven't had a problem with it. This was done before KDE 3.1.x so who knows, Linux might work after all.
  
- Simple, yes, for other reasons (Score:5, Insightful)
  
  by jabber01 ( 225154 ) writes: on Tuesday May 20, 2003 @09:32PM (#6003614)
  
  Software crashes because it's complex, yes, but that's just part of it.
  
  Jets are complex too. So is the Space Shuttle. Cruise ships. CARS are pretty complex.
  
  While all these things do suffer catastrophic failure from time to time, it is far from the norm. Defective cars get recalled. Space shuttles ALL get grounded at the mere possibility of defect.
  
  If Q/A as stringent as this was applied to software, Microsoft - and in fact most of the software industry - would be out of business. Can you imagine a Windows recall?
  
  There is software out there that does not fail. Mind-bendingly complex software of the sort that "drives mere mortals mad" to boot. It is tested and retested, through all possible situations - not just the "likely 80%" of them. It is proved correct, and then verified again.
  
  COTS software is crap because neither the market nor the regulatory forces (such as they are, but that's a separate discussion) require it to be anything better. Nor could they.
  
  A 747 Jumbo costs a whole lot, and while much of that cost is in the manufacture of the "big and complex thing" that it is, a significant chunk of that cost is also due to the design process, the testing, the modeling and simulation of it.
  
  Software is easy to scale, everyone can have a copy of the product once one is built. Cake. But spread out the cost of an error free design - tested to exhaustion, passed through V&V and so on, and you have a completely different market landscape with which to contend.
  
  Consumers, in the COTS context, don't mind "planned obsolescence" in their software. The current state of things proves this. People would rather have pretty features on a flaky system, than a solid system.
  
  - Re:Simple, yes, for other reasons (Score:4, Insightful)
    
    by Surazal ( 729 ) writes: on Tuesday May 20, 2003 @09:44PM (#6003690) Homepage Journal
    
    Consumers, in the COTS context, don't mind "planned obsolescence" in their software. The current state of things proves this. People would rather have pretty features on a flaky system, than a solid system.
    
    This is not necessarily true... it's a bad generalization besides. Most people I work with in the IT industry would give their arm, leg, spleen, right lung, part of their left lung, lower intestine, and maybe even their occipital lobes for a reliable system that WORKS. Features are secondary.
    
    The "features over stability" myth is just that: a myth. Show me an admin that prefers only the latest and greatest in "features" and I'll show you an admin that will lose all her/his hair within six months (a little after all their hair turns white).
    
    Well, ok, I work primarily with IT people admittedly. Perhaps the folks in management are a little different. But I've noticed that IT people have ways of making management's lives miserable (in ways that are downright creative) when a bad decision is made with software purchases. I've done it, myself. ;^)
    
  - Re:Simple, yes, for other reasons (Score:4, Interesting)
    
    by Chris Carollo ( 251937 ) writes: on Tuesday May 20, 2003 @10:25PM (#6004001)
    
    Jets are complex too. So is the Space Shuttle. Cruise ships. CARS are pretty complex.
    
    Then again, if one of the overhead bin latches gets stuck, or my overhead light burns out, or my seatbelt gets stuck, the entire plane or car doesn't instantly explode. The issue isn't complexity, it's fragility.
    
    Software is incomprehensibly fragile -- any single thing can cause a crash, taking the whole system or application down. And even those critical parts of things like airplanes have multiple redundancies, something that's hard to build into software. You can do things like catching exceptions, but you typically can't recover as gracefully as if there was never a problem at all.
    
    The shuttle is actually not a bad analogy -- it's also very fragile due to the stresses it endures. And we've effectively had two crashes in 100 runs. Most software is more stable than that.
    
  - Re:Simple, yes, for other reasons (Score:5, Funny)
    
    by drinkypoo ( 153816 ) writes: <drink@hyperlogos.org> on Tuesday May 20, 2003 @10:39PM (#6004097) Homepage Journal
    
    Can you imagine a Windows recall?
    
    I must be able to, I'm feeling flushed and my nipples are hard.
    
- - Re:Simple ... (Score:5, Funny)
    
    by orbbro ( 467373 ) writes: on Tuesday May 20, 2003 @09:13PM (#6003434) Homepage
    
    And, when the cocaine that lets YOU do all these things wears off, you'll crash!
    
  - Re:Simple ... (Score:5, Insightful)
    
    by fishbowl ( 7759 ) writes: on Tuesday May 20, 2003 @09:13PM (#6003436)
    
    "However, if I'm trying to download a huge file while opening and closing lots of windows, programming some web pages, uploading them to the web, listening to some tunes, talk to 80 different people on AIM, and enjoying a flash animation at the same time, the computer might crash."
    
    Was it, or was it not, designed to be used in this way? If it was not, why does the system let you try it?
    
    - Re:Simple ... (Score:5, Insightful)
      
      by DarkZero ( 516460 ) writes: on Tuesday May 20, 2003 @11:03PM (#6004230)
      
      Was it, or was it not, designed to be used in this way? If it was not, why does the system let you try it?
      
      Your microwave isn't designed to let you put an AOL CD or a piece of tinfoil in it and turn it into a box-shaped firecracker, but it still lets you try it. So the simple answer would be that it lets you do it because it can't control absolutely everything that it interacts with. A download manager isn't designed to be run at the same time as an MP3 player, AIM, ten browser windows, an IRC client, and downloads in other programs at the same time, but it still lets you try it because it has no control over those programs, no different than the microwave's lack of control over your hand and your AOL CD.
      
- - it DOES cause an error (Score:5, Interesting)
    
    by ChrisCampbell47 ( 181542 ) writes: on Tuesday May 20, 2003 @11:17PM (#6004308)
    
    Interesting that the first two posts in the thread had English syntax errors in their first sentences. We can still understand it, but compilers/CPUs would have problems. Seems that the real problem is the difference in the natures of wetware and hardware.
    Actually, "syntax errors" like this DO cause a problem for wetware systems -- they cause the brain (well, mine at least) to kind of glaze over and take the remainder of the sentence/thought much less seriously. Kind of like aborting/returning out of a subroutine.
    Here in the Slashdot world of "definately" and "righting", I've learned that any posted comment that makes high-school-level grammatical or spelling errors is not worth my time and I immediately skip the post. I've been doing this quite rigorously lately -- blah blah blah "seperate" PAGE DOWN.
    OK now, everybody nod and think I'm talking about someone else's posts ...
    
    - Re:it DOES cause an error (Score:5, Funny)
      
      by fucksl4shd0t ( 630000 ) writes: on Wednesday May 21, 2003 @02:50AM (#6005238) Homepage Journal
      
      Here in the Slashdot world of "definately" and "righting", I've learned that any posted comment that makes high-school-level grammatical or spelling errors is not worth my time and I immediately skip the post. I've been doing this quite rigorously lately -- blah blah blah "seperate" PAGE DOWN.
      You are just asking for it. :) Yes, you are. So here it is:
      "high-school-level" should not be hyphenated. That is a High School level grammatical error.
      That sound you hear is the toilet flushing your shit away.
      
Easy (Score:4, Funny)

by PerlGuru ( 115222 ) * writes: <michael@thegrebs.com> on Tuesday May 20, 2003 @08:57PM (#6003283) Homepage

Same reason cars crash.... people ;-)

Whose computers still crash? (Score:5, Funny)

by fishbowl ( 7759 ) writes: on Tuesday May 20, 2003 @08:59PM (#6003296)

Crash? What crash?

radagast% uptime
8:56pm up 582 day(s), 12:45, 22 users, load average: 0.00, 0.00, 0.01

- Re:Whose computers still crash? (Score:5, Funny)
  
  by Anonymous Coward writes: on Tuesday May 20, 2003 @09:04PM (#6003358)
  
  load average: 0.00, 0.00, 0.01
  
  easy to keep a computer up if you never use it ;)
  
  - Re:Whose computers still crash? (Score:5, Informative)
    
    by toddestan ( 632714 ) writes: on Tuesday May 20, 2003 @09:49PM (#6003721)
    
    Even my uptime experiments, which consisted of taking old but reliable hardware, installing Windows 95/OSR2/98/98SE/ME, and then letting the computer idle and do nothing, never resulted in more than about 25 days of uptime before I came over and Windows was fubar'ed or the computer was simply locked hard.
    
    Windows 3.1 actually did quite well if I remember right, as it seemed perfectly content sitting idle doing nothing seemingly forever. Windows 9x always seemed to randomly thrash the HDD, even after a clean install, which led me to believe that Windows 9x is never truly idle, it's always up to something (virtual memory?), and that something eventually will bring it down.
    
    Windows 9x actually has a bug in it that would lock the computer after 49.7 days of uptime, but it took years to catch it because no one ever got close to that mark.
    
    - - Re:Whoops, bullshit alert. (Score:5, Informative)
        
        by rabidcow ( 209019 ) writes: on Wednesday May 21, 2003 @12:40AM (#6004761) Homepage
        
        Microsoft says so. [microsoft.com]
        
        Actually it's in some driver, not the core OS, so it's not surprising that it doesn't happen to everyone. (There's a few other things [google.com] with similar problems.)
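
For reference, the hang Microsoft documents occurs after 49.7 days, which is what you get if a 32-bit counter of milliseconds since boot wraps around. A quick sketch of that arithmetic, assuming an unsigned 32-bit millisecond tick count (the constant names are made up for illustration):

```python
# Assumption: uptime is tracked as an unsigned 32-bit count of milliseconds.
COUNTER_BITS = 32
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 ms in a day

# The counter rolls over to 0 once it has counted this many milliseconds.
wrap_after_ms = 2 ** COUNTER_BITS
wrap_after_days = wrap_after_ms / MS_PER_DAY

print(f"counter wraps after {wrap_after_days:.1f} days")
```

Any code that compares two readings of such a counter without allowing for the wrap can misbehave at that point, which would explain why only very long-running machines ever saw it.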
        
    - - Re:Whose computers still crash? (Score:5, Funny)
        
        by fucksl4shd0t ( 630000 ) writes: on Wednesday May 21, 2003 @03:17AM (#6005350) Homepage Journal
        
        Kind of like how my 2 year old daughter carrying dishes to the sink. She's trying to be helpful, but occasionally she drops one.
        HEh. My daughter's 4 and she's never accidentally dropped a dish. That doesn't mean she's never broken one, though....
        My son's two, and it's impossible to tell if he drops dishes on purpose or on accident, because he does it so much.
        Should've named my daughter Linux and my son Windows. Now we're having another one, what should I name him? BSD? What's he gonna do? Sit there and whine about how nobody loves him 'cause he's the only true eunich left? Or is he gonna spend his time crying because right after he's born they're gonna cut him into three pieces and each person will claim their piece is better than the whole?
        Wow, first time I've ever trolled BSD. I feel strangely liberated...
        
- Re:Whose computers still crash? (Score:4, Insightful)
  
  by stefanlasiewski ( 63134 ) * writes: <slashdot AT stefanco DOT com> on Tuesday May 20, 2003 @09:19PM (#6003501) Homepage Journal
  
  Crash? What crash?
  
  up 582 days
  
  Reboot? What reboot?
  
  Now, when was the last time you tested those init scripts? :)
  
  -= Stefan
  
  - Re:Whose computers still crash? (Score:5, Funny)
    
    by Dr. Photo ( 640363 ) writes: on Tuesday May 20, 2003 @09:57PM (#6003774) Journal
    
    Reboot? What reboot?
    
    Now, when was the last time you tested those init scripts? :)
    
    Init scripts? You heathen!!
    
    Rebooting is a special occasion, signalling the coming of the harvest season, or the installation of a new kernel. Accordingly, the High Priest shall bring the system up by hand, typing in the ancient incantations from the sacred scrolls.
    
    Init scripts are for the weak of faith. Let ye not be tempted by the daemons of rc-dot-d!
    
    - Re:Whose computers still crash? (Score:5, Funny)
      
      by Guppy06 ( 410832 ) writes: on Tuesday May 20, 2003 @10:58PM (#6004199)
      
      "Accordingly, the High Priest shall bring the system up by hand, typing in the ancient incantations from the sacred scrolls."
      
      Would those sacred scrolls, perchance, be small, yellow, and stuck all around the monitor screen?
      
    - - OT: Electric overconsumption (Score:5, Insightful)
        
        by maynard ( 3337 ) writes: on Tuesday May 20, 2003 @11:08PM (#6004267) Journal
        
        I used to leave all sorts of machines running 24/7 in my apartment. Several Suns, a couple PCs running Linux and BSD, an SGI, blah blah blah. I did take care to turn monitors off though. I kept this up until I turned off all my systems (except the mail server) for a two week vacation: I was shocked to discover the next electric bill arrived a good $80 cheaper. I've since cut back to a single machine which I turn off at night. No more crazy uptimes, but honestly - I'll take the money. I wish there was consumer demand for low power desktop computing. I guess I'll just have to migrate to a good laptop for the low power option. But you're absolutely right: a few computers can suck up a lot of power, with damaging results to one's electric bill. --M
        
        
        Re:OT: Electric overconsumption (Score:5, Interesting)
        
        by doorbot.com ( 184378 ) writes: on Wednesday May 21, 2003 @12:54AM (#6004811) Journal
        
        I wish there was consumer demand for low power desktop computing.
        
        My mail/web server would run fine off of something ridiculously small, like a Sharp Zaurus. Here are my requirements, and I will pay for one if it is available.
        
        Non-x86 hardware designed for lower power -- extra speed is nice, but not required; Pentium 200 speeds or better
        
        Low power, with 9V or AA-based battery backup (changeable while system is running)
        
        3" - 4" LCD (with manual switch to turn off) at 640 x 480, or some sort of LED array/VFD, because all I really need is a low power terminal supporting 80 x 24 characters.
        
        USB port for keyboard
        
        Serial port
        
        Two or three 10/100 NICs
        
        Full (Debian) Linux support of all hardware
        
        Some sort of expansion (PCMCIA maybe, or via USB)
        
        Support for CompactFlash for backups
        
        Hardware encryption would be a nice goodie but not required
        
        Yes, I could probably build this with PC104 components, but I want a pre-built product, and I'm willing to pay for it (maybe $300 - $400).
        
- Re:Whose computers still crash? (Score:4, Insightful)
  
  by EvilTwinSkippy ( 112490 ) writes: <yodaNO@SPAMetoyoc.com> on Tuesday May 20, 2003 @09:29PM (#6003593) Homepage Journal
  
  So what Kernel is that you are running? Hmmm. If it's a linux box that would barely be 2.4. More likely 2.2.
  (Digging through my pile of vulnerabilities...)
  Say, could we get an address on that box? Muhuahahahaha
  My uptime is largely limited by kernel upgrades and the fact I cycle the power once per month to prevent the drive head from sticking.
  
  - - Re:Whose computers still crash? (Score:5, Funny)
      
      by UserGoogol ( 623581 ) writes: on Tuesday May 20, 2003 @09:50PM (#6003727)
      
      Well... in my day I had to write games with just seven transistors and a piece of cheese! And I thought I was lucky. Kids today. Geez.
      
      Granted, I'm 16, but that's not the point.
      
- Check this out -- lets talk some SERIOUS UPTIME. (Score:4, Interesting)
  
  by deathcow ( 455995 ) writes: on Tuesday May 20, 2003 @10:55PM (#6004189)
  
  We had a Cisco router wigging out the other week. Our Network Admin decided to reset it, and it offered this up:
  
  Kodiak_Rtr uptime is 6 years, 9 weeks, 3 days, 10 hours, 43 minutes
  
  System restarted by power-on
  
AS LONG AS YOU CAN TEST EVERY STATE... (Score:5, Insightful)

by drink85cent ( 558029 ) writes: on Tuesday May 20, 2003 @08:59PM (#6003301)

As I've always heard with computers, you can't prove something works, you can only prove it doesn't work. As long as there are an almost astronomical number of states a computer can be in, you can never test for every possible case.

- Re:AS LONG AS YOU CAN TEST EVERY STATE... (Score:5, Insightful)
  
  by innosent ( 618233 ) writes: <jmdority&gmail,com> on Tuesday May 20, 2003 @09:54PM (#6003752)
  
  Not exactly. Assuming that the hardware is ok, you can prove that a system is reliable for any given finite input (including, most importantly, all possible finite substrings of inputs, however it is not possible to test all possible inputs, since a portion of those are infinite), it's just that doing so in large systems takes enormous amounts of time, and of course, time = money. Take Microsoft, for example. It takes a team years to develop a product like Windows XP, run a few test cases, and fix the major bugs. But just think how long it would take to go through every possible input substring of a given length (and by substring/string I am including non-character inputs [mouse, network, etc]).
  
  Consider a simple program that inputs 10 short strings of text and does some computations on those strings. Say, for example, that the system has only a keyboard as input, that all input functions are guaranteed only to input A-Z (caps only), the space bar, and 0-9 (regex ((A-Z)*(0-9)*)*( )*), not to overflow, and that there are 10 inputs with exactly 10 characters for each input (spaces fill end of string). This means that there are 37 possibilities for each character, totaling 37^100 unique possible inputs, about 6.61E156 possibilities, each 100 characters. Typing a million characters per second, entering them all would take 2.094E145 years! Keep in mind that this is an extremely simple system.
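
The figures in that estimate are easy to sanity-check. A minimal sketch, assuming a 365.25-day year (the comment doesn't say which year length it used) and with constant names invented for illustration:

```python
# Rough check of the exhaustive-testing estimate; all names are illustrative.
ALPHABET_SIZE = 37            # A-Z, 0-9 and the space bar
NUM_INPUTS = 10
CHARS_PER_INPUT = 10
CHARS_PER_SECOND = 1_000_000  # typing rate assumed in the comment
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: Julian year

chars_per_case = NUM_INPUTS * CHARS_PER_INPUT        # 100 characters per test case
possible_inputs = ALPHABET_SIZE ** chars_per_case    # exact integer, 37^100

# Every case has to be typed in full, so total keystrokes = cases * 100.
total_chars = possible_inputs * chars_per_case
years = total_chars / CHARS_PER_SECOND / SECONDS_PER_YEAR

print(f"possible inputs: {possible_inputs:.3e}")
print(f"years to type them all: {years:.3e}")
```

Both numbers come out in line with the 6.61E156 and 2.094E145 quoted above, so the arithmetic holds up even if the scenario is deliberately contrived.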
  
  Therefore, it is not possible to test ALL input cases of any nontrivial program, only a few selected cases, which most will agree is far from proving a program correct. Instead, developers should have detailed mathematical descriptions of how a program is to behave at each incremental step, and verify that the program follows those descriptions accurately. Programs can only be proven correct in the same manner that any discrete mathematic concept can be proven correct, with one of the most common methods of a functionality proof being mathematical induction. Based on a few basic assumptions (like that the functions you call work as documented), the rest of the system can be proven by proving the trivial parts and cases first, and then constructing a complete proof based on the trivial parts.
  
  The problem with this is that a small change can have a big impact on the proof, and nobody actually takes the time to verify that everything still works. Companies don't often spend money on making their software 100% correct, they just need to add the nifty new features that their customers want before their competitors do. I'd be willing to bet that 90% of the bugs found in XP can be traced to a "nifty new feature" that broke code that may have been proven correct at some point.
  
  In other words, the short answer is yes, if you can test every state, you can prove a program correct, but since that's usually impossible, it becomes the developers' responsibility to incrementally prove the system, which is far easier if all functionality is planned ahead of time, but still too time/money consuming for most software companies to bother with. Microsoft doesn't care if your computer crashes, you'll probably still pay them, and as much as I'd like to think otherwise, OSS isn't much different (although it's usually more time than money there).
  
- Debian. (Score:5, Funny)
  
  by twitter ( 104583 ) writes: on Tuesday May 20, 2003 @10:14PM (#6003909) Homepage Journal
  
  Debian [debian.org] tested in every state [debian.org], works good everywhere. I have yet to prove that it does not work anywhere in any way. I can not say the same thing for any other software I've ever run on a PC.
  
- - Re:AS LONG AS YOU CAN TEST EVERY STATE... (Score:4, Insightful)
    
    by TheOnlyCoolTim ( 264997 ) writes: <tim.bolbrock@veriz[ ]net ['on.' in gap]> on Tuesday May 20, 2003 @10:11PM (#6003882)
    
    "We know that i+1 > i"
    
    Are you so sure? Depending on various circumstances, you might find that a little while after you get to 127 or 32767 (or thereabouts) i+1 has become i...
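
To make that concrete, here is a small simulation, a sketch only: Python integers never overflow, so the helpers `wrap_signed` and `increment` (both hypothetical names) imitate fixed-width two's-complement storage. In C, signed overflow is formally undefined, so wraparound is what typical hardware does rather than a guarantee. The last lines show the floating-point case, where i+1 really does equal i.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Reduce value to a two's-complement signed integer of the given width."""
    value &= (1 << bits) - 1          # keep only the low `bits` bits
    if value & (1 << (bits - 1)):     # sign bit set: value is negative
        value -= 1 << bits
    return value


def increment(i: int, bits: int) -> int:
    """Return i + 1 as a fixed-width signed integer would store it."""
    return wrap_signed(i + 1, bits)


print(increment(127, 8))       # 8-bit signed: wraps to -128
print(increment(32767, 16))    # 16-bit signed: wraps to -32768

# Doubles run out of integer precision at 2**53, so adding 1 changes nothing.
big = float(2 ** 53)
print(big + 1.0 == big)        # True
```

Either way, "i+1 > i" is an assumption about the number representation, not a law.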
    
    Tim
    
Human Error (Score:5, Insightful)

by Obscenity ( 661594 ) writes: on Tuesday May 20, 2003 @09:00PM (#6003302) Homepage

All programs (for the most part) must be written by people. People crash, they're buggy and they don't have a development team working on them. Computers crash because people can't catch that one little fatal error in 10,000 lines of code. Smaller programs are less susceptible to errors and big scary warning messages that make even the most world-hardened geek worried about his files. Yes, it's getting better with more and more people working on something at once. Mozilla (www.mozilla.org) has a feedback option to help them debug, many software companies are including this. But even with that in place, there is always that small human error, that will screw something up.

- Re:Human Error (Score:5, Insightful)
  
  by Malcontent ( 40834 ) writes: on Tuesday May 20, 2003 @09:45PM (#6003700)
  
  "People crash, they're buggy and they don't have a development team working on them. Computers crash because people can't catch that one little fatal error in 10,000 lines of code."
  
  While this statement is true, it's also a cop out. In the last twenty years there have been a tremendous number of advances in computer science and languages, and yet everybody still programs in C.
  
  That is the reason why programs crash. Why don't people use languages that make programs more failsafe and make programmers more productive?
  
  It would be interesting to do a study of the "bugginess" of programs written in python, java, scheme, smalltalk, lisp etc. My guess is that programs written in C crash the most.
  
  Where are all the programs written in scheme or smalltalk or ML?
  
  Use better languages and crash less.
  
  - Re:Human Error (Score:5, Insightful)
    
    by Uller-RM ( 65231 ) writes: on Tuesday May 20, 2003 @10:15PM (#6003918) Homepage
    
    Java programs can still crash -- and believe me, grade homework for undergrad CS students for a few years and you'll see plenty of it. The only difference is that Java tosses an exception that isn't handled, and C either asserts and calls exit(-1) or segfaults.
    
    I don't think it's fair to say that any one language is "safer" than another -- once you reach a certain level of expertise, one can write a stable and robust program in C or C++ or Java or Haskell (my preference) with equal effort. The effort is mental: being persistent enough to define solid logical definitions for each part of the program, failure conditions, etc. and then execute them to the letter in the language of choice. If the program behaves logically, you can prove that it works using logical principles -- induction and so on. (And if you ever do govt contracting or any other project that calls for requirement traceability, you'll need to.)
    
    The difference between languages is merely the way the code is expressed. Java and C++ have exceptions; C does not. For some situations, return codes are better than exceptions, and for some situations the opposite is true. Java has robust runtime safety -- C and C++ do not. C and C++ have templated containers -- Java's just now getting such genericity. All languages and all approaches to problems have tradeoffs: the mark of a good programmer is knowing those tradeoffs and picking which is best for the situation.
    
    - Re:Human Error (Score:5, Insightful)
      
      by ojQj ( 657924 ) writes: on Wednesday May 21, 2003 @03:35AM (#6005420)
      
      Disclaimer: I haven't programmed in Java since my undergrad, but I learned it before C++. I've been programming in C++ professionally for 3 years straight now, not counting internships and class assignments before that.
      I'd rather have an exception than a crash. It gives me more information about what I did wrong. A crash that's not reliably repeatable and only happens in your release version under Windows OT systems with IE 4 installed, is next to impossible to find and fix -- in C++ it's only worse.
      Not only that, but memory management is more than just a nuisance. Just yesterday, I wanted to move some code from one class to another to improve the object-oriented structure of some code which I've taken over from another developer. In that code were a couple of news, and I couldn't find the deletes which matched them. So I asked the original developer. Turns out the deletes were in a base class of the class that I was moving the code to. If I had been programming in Java, this would have been a cut and paste job finished in 30 seconds, plus 15 minutes for testing the change before checking in. In C++, it was 15 minutes trying to find the deletes myself, 15 minutes waiting for the other developer to get to a break point in his work and another 15 minutes assuring myself that the deletes really were called for all cases, and another 15 minutes for testing the change before checking in. That's a factor of 3-4 (depending on if I have something else I can do while waiting) for the C++ program.
      Memory management and other unnecessary tasks which C++ saddles the developer with do make an impact on either development time, program stability, or both. And that is also true for experienced C++ programmers.
      They also make an impact on language learning time, which is not to be underestimated with the number of newbies today, and people moving up from still worse languages like Cobol. In addition, even for an experienced C++ programmer, they make a difference in the time it takes to understand code which was programmed by another programmer.
      I agree with you that there are situations where every language, including C++, is the most appropriate for the problem in question. I just think that C++ is over-used, thus reducing the average stability of modern programs and the average productivity of modern programmers.
      
  - Re:Human Error (Score:5, Insightful)
    
    by GlassHeart ( 579618 ) writes: on Wednesday May 21, 2003 @12:16AM (#6004629) Journal
    
    It would be interesting to do a study of the "bugginess" of programs written in python, java, scheme, smalltalk, lisp etc. My guess is that programs written in C crash the most.
    Even that is a worthless statistic. Assuming that bad programs are written by bad programmers, the language that more bad programmers choose will appear the highest in your study as the buggiest language. Bad programmers choose the language du jour, thinking it will land them a cushy job.
    You'll have to disprove the assumption to correctly blame the language.
    Use better languages and crash less.
    Try dividing by zero in your better language of choice.
    
- Re:Human Error (Score:5, Interesting)
  
  by JohnsonWax ( 195390 ) writes: on Wednesday May 21, 2003 @12:00AM (#6004548)
  
  "All programs (for the most part) must be written by people. ... Computers crash because people cant catch that one little fatal error in 10,000 lines of code."
  
  All bridges (for the most part) must be built by people. Bridges collapse because people can't catch that one little fatal error in one or two million components.
  
  The shit coders put out there, I swear... The reason software crashes is that by-and-large it's hacked together, not engineered. You hack a bridge together, and yes, it'll fail. You engineer software, and yes, it will run reliably. It's not fun to do - no easter eggs, no cool tricks, no cramming features in weeks before ship.
  
  I'm stunned at the amount of code that goes out that was written by interns, by inexperienced coders, by people that just don't have a clue. The software industry really has no concept of best practices, no leadership, no authority body. The fact that buffer overflows still happen is stunning.
  
  It's not small projects that work well because out of dumb luck they happen to not fail, or larger projects that work okay because we have 34,000 people looking at the code. If that's 'best practices', then we're doomed.
  
  "Mozilla (www.mozilla.org) has a feedback option to help them debug, many software companies are including this."
  
  Uh huh. Let's translate that to my car: "Hi. Yeah, I'd like to report a bug. I have a Saturn Ion, version 1.1v4. Yeah, when I turn on the left turn signal and then turn on the lights, the car catches on fire. You might want to fix that in the next version. Just thought you might want to know. Bye."
  
It's bugs! (Score:5, Insightful)

by madprof ( 4723 ) writes: on Tuesday May 20, 2003 @09:01PM (#6003317)

Applications are getting bigger. Code is growing in size. As computing power grows so does the complexity of the code that is written. This means there is a greater chance of bugs.
You can write in any language you like, but bugs will still get through. Lack of proper planning, non-anticipation of working conditions etc. all combine.
If you can make all programmers perfect then you may eliminate this problem. Otherwise I'm afraid we're going to be stuck with bugs.

Speed (Score:5, Insightful)

by holophrastic ( 221104 ) writes: on Tuesday May 20, 2003 @09:01PM (#6003318)

Why spend time testing and debugging and designing for situations which are too rare to be profitable?

I program my applications as properly as I can. But when the client wants to save money by testing it themselves and ignoring non-fatal bugs, they save money and are happy doing it.

So in other words, economy.

- Re:Speed (Score:5, Insightful)
  
  by mooman ( 9434 ) writes: on Tuesday May 20, 2003 @09:48PM (#6003715) Homepage
  
  I think this is one of the few responses to hit upon the crux of the issue.
  
  Most of the other comments revolve around code and computers, or perhaps even the human nature of the programmers themselves, but I think they all neglect the "one ring that rules them all": Management.
  
  I'm not quite an old fogie yet in the software world, but can at least claim to have been around (professionally) for about a dozen years and worked for about half a dozen Fortune 500 companies (plus a few startups).
  
  In every single case I could go on and on about corners that were cut, testing schedules that were compressed, last minute features that were added without proper design, and an alarming number of times when programmers expressed concern about the quality of some module/functionality only to have it ignored by management.
  
  It's soured me so much that I'm actually considering ditching the IT world outright. In a land where deadlines control everything, you are going to have stability or quality issues - I guarantee it.
  
  Now, I've known a few apathetic programmers that I personally wouldn't trust with a pencil, much less the code that my company is selling, but by and large, I think programmers all have the potential to make almost entirely stable code, or at least as stable as the tools/libraries that they have to work with. But this relies on a rather more liberal amount of testing than most companies seem willing to invest in.
  
  Most companies seem to use a sort of 80/20 rule. If they get 80% of the bugs out, that's good enough. They can get their product to market and ship patches later. Having a more robust application just isn't as strong a selling point, and is even harder to prove. How do you show, objectively, that your app is more stable than your competition's? Most companies would flinch from even opening this can of worms because then you have to fess up to a certain non-zero percent failure rate on your own product.
  
  This whole issue touches on the "featuritis" problem as a whole. In order to maintain a revenue stream, software companies have to do one of two things:
  1) Keep making new versions of your app(s) *and* convince everyone they need the newest one [often when they really don't] -or-
  2) Try to make revenue through support contracts.
  The latter mostly goes away if your code is as good as you claim it is, making it a tough sell, or your customer base just doesn't have the budget for an ongoing contract where they really aren't getting anything new over time but still have to pay for it!
  
  Thus, the push for "features", and with the push for features you have deadlines, and with deadlines you have management doing whatever it takes to meet them, even if those choices are detrimental to the product.
  
  Ugh. This is why I hold on to the hope that open source apps, frequently hobbyist in nature, will continue to gain strength. In general, the contributors to those projects are chasing a vision and not market share. These guys and gals seem more willing to keep plugging away at something before they call it "done" and foist it on users. Of course, most of these projects are never "done" by their own admission, and will spend their lives in perpetual development. But at least I know that some of the effort is toward bug fixing and not just new features.
  
  But when Microsoft could have tried to make Word 95 better and more stable, they instead came out with 97.. and instead of making it more stable, they decided the market needed Word 2000, and then... Yawn. You get my drift. I would have been happy with a bulletproof Word 97 myself...
  
  [disclaimer: In the above I've made an outright embarrassing number of generalizations and I hope they are identified as such and that I don't get slammed with a litany of examples supposedly to the contrary. Thank you.]
  
  - Re:Speed (Score:4, Interesting)
    
    by mackstann ( 586043 ) writes: on Tuesday May 20, 2003 @10:16PM (#6003921) Homepage
    
    Well said, I would have to agree with the majority of your post. The only thing I have an issue with is:
    I'm not quite an old fogie yet in the software world, but can at least claim to have been around (professionally) for about a dozen years and worked for about half a dozen Fortune 500 companies (plus a few startups).
    
    [..]
    
    It's soured me so much that I'm actually considering ditching the IT world outright.
    
    It took you over a decade? I've been working in "light" IT for about 4 months, and I have already come to this unfortunate conclusion. Writing commercial software just isn't fun: not only do you have to write software that you may not find all that interesting, but you also are denied the opportunity to use your skills to the fullest and create something that you are truly proud of. Corners are cut, and in the end you realize that it's just a "product", or an in-house "app", and it only needs to work "good enough", never mind if the code needs cleaning up or whatever other issues there are (they don't (seem to) exist if you're not staring at the code!).
    
In my CompSci class.. (Score:4, Insightful)

by ziggy_zero ( 462010 ) writes: on Tuesday May 20, 2003 @09:01PM (#6003320)

...I remember my teacher saying "Computers do exactly what they're told, not necessarily what you want them to do."

I think the root of the problem is time. Microsoft doesn't have the time to spend going through every possible software scenario and interaction, or every possible hardware configuration. If they did do that, it would probably take a decade to pump out an operating system, and by that time hardware's changed, and it's a neverending cycle.....

We just have to accept the fact that the freedom of using the hardware components we want and the software we want, all made by different people, will result in unexpected errors. I, for one, have come to grips with it.

- Re:In my CompSci class.. (Score:4, Insightful)
  
  by nick_davison ( 217681 ) writes: on Tuesday May 20, 2003 @09:43PM (#6003683)
  
  ...I remember my teacher saying "Computers do exactly what they're told, not necessarily what you want them to do."
  
  D&D summed it up for me, years ago, with the wish spell: At its purest, it's too powerful to give to players - they'll unbalance and destroy the game. However, it can be balanced by giving them exactly what they ask for.
  
  "A demon lord approaches you out of the shadows."
  "I cast 'wish' - I wish for a +100 sword of almighty vorpal type slayingness."
  "The sword appears in the demon's hand. He thanks you for it, then hits you."
  
  Writing good code is like making a good wish. All you can do is try to cover as many eventualities as possible. The problem is, code gets really slow to run and even slower to write when you have to add out of bounds checks on every argument, error handling and reporting, garbage collection and all the rest. Even then, there'll always be some twisted scenario that you didn't know could exist so didn't plan for. So most people just give up, wish for the damn sword and hope the PC/Dungeon Master doesn't have too evil an imagination this time.
  
  - Re:In my CompSci class.. (Score:5, Funny)
    
    by KrispyKringle ( 672903 ) writes: on Wednesday May 21, 2003 @12:02AM (#6004555)
    
    And just when being into computers was starting to get "cool" (think The Matrix, Hackers, or Swordfish) someone like you comes along and start talking about Dungeons and Dragons. There go my chances of getting laid. There go all our chances of getting laid.
    
- Time is Money. (Score:5, Interesting)
  
  by Rimbo ( 139781 ) writes: <rimbosity@sbcgloba l . net> on Tuesday May 20, 2003 @09:53PM (#6003744) Homepage Journal
  
  I think this is basically the right answer.
  
  A couple of months ago, the company I worked for spent a lot of time and effort developing a robust testing methodology. We had a software product that through blood sweat and tears would not crash unless you basically blasted the hardware in some way.
  
  But that led to two problems. First, we only had so many people working, and resources spent testing and bugfixing were not being used to add new features. Second, the time it took to get it that robust delayed the product's release beyond the point where we could recover the investment. [Time developing] * [Cost of operating] was greater than [expected number of units sold] * [price per unit].
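  
  Written as an inequality (the symbols are shorthand introduced here, not the poster's own notation), the product stops paying for itself when:
  
  ```latex
  T_{\text{dev}} \cdot C_{\text{op}} > N_{\text{units}} \cdot P_{\text{unit}}
  ```
  
  Here T_dev is the time spent developing, C_op the cost of operating per unit of time, N_units the expected number of units sold, and P_unit the price per unit. Extra testing time raises the left side directly, while the right side only grows if customers pay more, or buy more, for robustness.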
  
  What ended up happening was that we lacked the features to justify the price and number of units we needed to sell to cover the cost of developing it. We had no bugs -- and we could be certain of it -- that would crash the machine.
  
  As of last month, the company could no longer afford to pay me. I'm not there any more.
  
  The moral of the story is that trying to make a bug-free product will bankrupt your company, especially a startup. Software tools have improved, but the benefit largely goes towards adding new whiz-bang features that sell the product for more money, not to being able to fix more bugs.
  
  What we should do as engineers and managers of software products is to not be afraid of getting the product out the door with a few bugs in it if we want our company to do well; this business reality is ultimately why bugs will be a big part of software for the foreseeable future.
  
because someone was very curious and decided to... (Score:4, Funny)

by null-sRc ( 593143 ) writes: on Tuesday May 20, 2003 @09:01PM (#6003324)

*0;

never follow the null pointer they said... what are they hiding there????

Reliability and complexity (Score:4, Insightful)

by woodhouse ( 625329 ) writes: on Tuesday May 20, 2003 @09:02PM (#6003327) Homepage

Because reliability is inversely proportional to complexity. Systems these days are generally a lot more complex than those of 10 years ago, and in complex systems, bugs are much harder to find. The fact that you say stability hasn't changed is in fact a pretty impressive achievement if you consider how much more complex hardware and software is nowadays.

It's not the need for speed (Score:5, Insightful)

by Jeremi ( 14640 ) writes: on Tuesday May 20, 2003 @09:02PM (#6003331) Homepage

It's the need for new features. Every feature that gets added to a piece of software is a chance for a bug to creep in.

Worse, as the number of features (and hence the amount of code and number of possible execution paths) increases, the ability of the programmer(s) to completely understand how the code works decreases -- so the chance of bugs being introduced doesn't just rise with each feature, it accelerates.
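
One rough way to see the acceleration is to count how many ways features can combine. The sketch below assumes, purely for illustration, that any two features may interact and that each feature can be switched on or off independently; `interaction_counts` is a made-up helper, not a measured model of bug rates.

```python
from math import comb


def interaction_counts(n_features: int) -> tuple[int, int]:
    """Return (pairwise interactions, on/off combinations) for n features."""
    pairs = comb(n_features, 2)       # every unordered pair of features
    combinations = 2 ** n_features    # every subset of features switched on
    return pairs, combinations


for n in (5, 10, 20, 40):
    pairs, combos = interaction_counts(n)
    print(f"{n:>3} features: {pairs:>4} pairs, {combos:,} combinations")
```

Doubling the feature count roughly quadruples the pairs and squares the number of combinations, which is one reading of "it accelerates".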

The moral is: You can have a powerful system, a bug-free system, or an on-time system -- pick any two (at best).

- Re:It's not the need for speed (Score:5, Insightful)
  
  by WasterDave ( 20047 ) writes: <[moc.pekdez] [ta] [pevad]> on Tuesday May 20, 2003 @09:28PM (#6003580)
  
  Thank you, at least somebody got it fucking right.
  
  Software doesn't have to crash, but for a given quantity of development resources there's a fairly simple tradeoff between feature-richness and stability.
  
  You want reliable? Strip back features left right and centre, design an elegant architecture, then unit test properly.
  
  Dave (in a ranty mood)
  
Don't forget the hardware... (Score:5, Insightful)

by MightyTribble ( 126109 ) writes: on Tuesday May 20, 2003 @09:03PM (#6003335)

Some crashes aren't the fault of the OS. Bad RAM, flaky disk controllers, CPU with floating-point errors (Intel, I'm looking at *you*. Again. *cough* Itanium *cough*)... all can take down an OS despite flawless code.

That said, some Enterprise-class *NIX (I'm specifically thinking of Solaris, but maybe AIX does this, too) can work around pretty much any hardware failure, given enough hardware to work with and attentive maintenance.

crashes? (Score:4, Interesting)

by Moridineas ( 213502 ) writes: on Tuesday May 20, 2003 @09:03PM (#6003338) Journal

Well, among the computers that I manage we've got an OpenBSD server that never crashes (max uptime is around 6 months, which is when a new release comes out) and a FreeBSD server that has never crashed; its max uptime has been around 140-150 days, and the reboots were for system upgrades/hardware additions.

On the workstation side they are definitely not THAT stable, but since we've switched to XP/2K on the PC side, those pc's regularly get 60+ days of uptime. Just as a note--I had a XP computer the other day that would crash about two or three times a day. The guy that was using it kept yelling about microsoft, etc etc etc. Turned out to be bad ram. After switching in new ram it's currently at 40 days uptime (not a single crash).

For some reason the macs we have get turned off every night so their uptime isn't an issue, but from what I hear OSX is quite stable.

Touchy subject (Score:5, Interesting)

by aarondyck ( 415387 ) writes: <aaron AT ufie DOT org> on Tuesday May 20, 2003 @09:04PM (#6003346) Homepage Journal

I remember years ago having a conversation with an IT manager at IBM. We were talking about the inability of computer programmers to make their code foolproof. His point was that we don't see problems like this with proprietary hardware. When was the last time someone crashed their Super Nintendo? Of course, with a PC platform (or even Mac, or whatever else) there are problems of unreliability. His idea is that this is because of sloppy programming. The reason we were having this conversation is that I had a piece of software (brand new, I might add) that would not install on my computer. You would think that a reputable software company (and this [sierra.com] was a reputable company) would test their product on at least a few systems to make sure that it would at least install! The end result was that I ended up never playing the game (not even to this day), nor have I purchased another title from that company since that time. Perhaps that is the solution to the root problem?

Scientific American... (Score:5, Interesting)

by Hanji ( 626246 ) writes: on Tuesday May 20, 2003 @09:04PM (#6003347)

Scientific American actually had an article [sciam.com] on a similar topic. Basically, they seem to be accepting crashes as inevitable, and were focusing on systems to help computers recover from crashes faster and more reliably...

They also propose that all computer systems should have an "undo" feature built in to allow harmful changes (either due to mistakes or malice) to be easily undone...

It's expected. (Score:4, Insightful)

by echucker ( 570962 ) writes: on Tuesday May 20, 2003 @09:05PM (#6003361) Homepage

We've lived with bugs for so long, they're a fact of life. They're accepted as part of the daily dealings with computers.

Complexity, my dear Watson (Score:5, Insightful)

by T5 ( 308759 ) writes: on Tuesday May 20, 2003 @09:05PM (#6003364)

It's all about the bits. There are just so many more of them now, and a great deal more pressure in the marketplace to bring ever newer software and hardware to market. Back in the day of the IBM 360 and the VAX, even though we were mesmerized by the capabilities of these machines, they were years and years in the making, debugged much more thoroughly than we can hope for today, and much, much simpler.

And let's not forget that this was the exclusive realm of the highly trained engineer, not some wannabe type that pervades the current service market. These guys knew these machines inside and out.

- Complexity, standards, peer review, sanity. (Score:4, Insightful)
  
  by twitter ( 104583 ) writes: on Tuesday May 20, 2003 @09:54PM (#6003755) Homepage Journal
  
  this was the exclusive realm of the highly trained engineer, not some wannabe type that pervades the current service market.
  
  Let's hear it for the "wannabes". I'm not a highly trained engineer by a long shot, but I've got computers that don't go down except for power outages. Then they come right back up. As ESR is so fond of pointing out, complexity kills traditional software. Closed source can't keep up.
  
  Free software has the answer. Debian [debian.org] has 8,710 packages available to do anything commercial software does, mostly better. Not just one or two pieces of it, every piece. My systems never crash under their stable release and I run all sorts of services. How is this? It's easy. Free code gets used, fixed, improved and reviewed all the time. The pace of improvement is astounding. I could go on and on about things free software does that common commercial code does not. Code that never sees the light of day is dead.
  
Essence of Software Engineering (Score:5, Insightful)

by Zach Garner ( 74342 ) writes: on Tuesday May 20, 2003 @09:05PM (#6003366)

Read "No Silver Bullet: Essence and Accident of Software Engineering" by Brooks. A copy can be found here [berkeley.edu].

Software is extremely complex. Developing it to handle all possible states is an enormous task. That, combined with market forces for commercial software and constraints on developer time and interest for free software, causes buggy, unreliable software.

Microsoft (Score:5, Insightful)

by eht ( 8912 ) writes: on Tuesday May 20, 2003 @09:08PM (#6003388)

Microsoft has made an extremely stable OS: it's called Windows 2000. As long as you use MS certified drivers the OS should never crash. Individual programs may crash under Windows, but you can hardly blame Microsoft for that. I have had Windows machines with months of uptime and no problems. They went down 8 days ago due to a power failure too long for my UPSes to handle, which also took down my FreeBSD machines; uptime is matched for all of them, and will one day again be measured in months.

Yes I should probably patch some of my Windows machines, but I have my network configured in such a way that for the most part I don't need to worry and you don't have to worry about my network spewing forth slammer or other nasty junk.

- Re:Microsoft (Score:5, Interesting)
  
  by VTS ( 673706 ) writes: on Tuesday May 20, 2003 @09:20PM (#6003520)
  
  Some time ago I would have agreed with you, but not anymore. If Media Player crashes playing some video then the whole system becomes unstable, and then even doing something like sending a file to the Recycle Bin freezes the UI...
  
- Re:Microsoft (Score:4, Interesting)
  
  by CognitivelyDistorted ( 669160 ) writes: on Tuesday May 20, 2003 @10:47PM (#6004140)
  
  Yes, NT5+ is very stable. MS is working on the driver problem. SLAM [microsoft.com] is a tool for verifying drivers. Given a requirement, e.g., after acquiring a kernel lock the driver must release it exactly once on all control paths, and some driver source code, SLAM can find all the ways the driver can fail the requirement. They have specifications for various driver types and are using them to test some drivers. It's a research project by the Software Development Tools group in MSR, but they're working on getting it stable and powerful enough to verify more drivers. If they can get it to work well enough, they'll supply it to hardware vendors.
  
- Nope! Case in point. (Score:4, Informative)
  
  by fireboy1919 ( 257783 ) writes: <rustyp@freeshe l l .org> on Tuesday May 20, 2003 @11:35PM (#6004401) Homepage Journal
  
  I have a Microsoft reference driver for my soundcard (i.e. Microsoft made the driver and approved it themselves). I use it on my computer.
  
  Unfortunately, two things cause it to fail.
  1) It doesn't play nice with other drivers on the same IRQ.
  2) Microsoft's advanced power management driver assigns it to the same IRQ as my USB port and my network card, and that can't be changed without a reinstall of Windows.
  
  So basically, what happens is that the sound card will eventually crap out completely and never work again (until reboot) if it attempts to work at the same time either of the other two devices on that IRQ are working.
  
  Keep in mind:
  1) Microsoft knows about this bug
  2) It causes system instability for lots of drivers - even certified ones
  
  I should also mention that there is nowhere that this bug is reported by the OS; I had to find it through trial, error, and lots of research. Win2K is not as stable as you think.
  
Economics? (Score:5, Insightful)

by iso ( 87585 ) writes: <slash@warpzero[ ]fo ['.in' in gap]> on Tuesday May 20, 2003 @09:09PM (#6003395) Homepage

While it's not the whole story, something definitely has to be said about the fact that while people are willing to pay for features, they're rarely willing to pay more for stability. Quite frankly there's little economic incentive to make software that doesn't crash.

If your market will put up with the occasional crash, and never expects software to be bulletproof, why bother putting the effort into stability? Until people start putting their money into the more stable platforms, that's not going to change.

The ultimate solution (Score:5, Interesting)

by dsanfte ( 443781 ) writes: on Tuesday May 20, 2003 @09:09PM (#6003401) Journal

The ultimate solution to the problem is to let computers write the software themselves. Give them a goal, set up evolutionary and genetic algorithms, and let them go at it on a supercomputer cluster for a few months.

Of course, you'd need to make sure the algorithms that humans wrote aren't flawed themselves, but once you got that pinned down, you would be more or less home-free.
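
A toy sketch of what "give them a goal" looks like in code. Everything in it is hypothetical and chosen for illustration: the goal is a fixed target string, `fitness` counts matching characters, and `evolve` keeps the fittest fifth of each generation and refills the rest with mutated copies.

```python
import random

TARGET = "hello world"                     # hypothetical goal for the toy
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def fitness(candidate: str) -> int:
    """Count positions that already match the target (higher is better)."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(candidate: str, rng: random.Random) -> str:
    """Replace one randomly chosen character with a random one."""
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(ALPHABET) + candidate[i + 1:]


def evolve(seed: int = 0, population_size: int = 50,
           max_generations: int = 5000) -> tuple[str, int]:
    """Return the best candidate found and the generation it was found in."""
    rng = random.Random(seed)
    population = ["".join(rng.choice(ALPHABET) for _ in TARGET)
                  for _ in range(population_size)]
    for generation in range(max_generations):
        population.sort(key=fitness, reverse=True)
        if population[0] == TARGET:
            return population[0], generation
        survivors = population[: population_size // 5]   # fittest 20% breed
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    return population[0], max_generations


best, generations = evolve()
print(best, "after", generations, "generations")
```

The loop itself is short; the whole scheme rests on `fitness` being able to score a candidate, which is easy for a string and much less obvious for a full program.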

Even if you didn't take this drastic a step, another solution would be computer-aided software burn-in. Let the computer test the software for bugs. A super-QA Analysis if you will. Log complete program traces for every trial run, and let the machine put the software through every input/output possibility.
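
For a sense of scale, here is a back-of-the-envelope sketch of how long "every input" takes for a single function whose input is a fixed number of bits. The test rate of one billion runs per second is an assumed figure, and `years_to_test_exhaustively` is a made-up helper.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_test_exhaustively(input_bits: int, tests_per_second: float) -> float:
    """Years needed to try every possible value of an input of this width."""
    return (2 ** input_bits) / tests_per_second / SECONDS_PER_YEAR


# Assumed rate: one billion test runs per second.
for bits in (32, 64, 128):
    years = years_to_test_exhaustively(bits, 1e9)
    print(f"{bits:>3} input bits: {years:.3g} years")
```

A function taking two 32-bit integers already has 64 bits of input, which at this rate works out to several centuries; programs with state or interactive input are larger still.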

- Re:The ultimate solution (Score:5, Interesting)
  
  by Jeremi ( 14640 ) writes: on Tuesday May 20, 2003 @09:24PM (#6003545) Homepage
  
  The ultimate solution to the problem is to let computers write the software themselves. Give them a goal, set up evolutionary and genetic algorithms, and let them go at it on a supercomputer cluster for a few months.
  
  That only works if you can write a fitness algorithm that can tell whether the program did the correct thing or not -- otherwise, you have no way to decide what to "breed" and what to throw away. And for many types of program, that fitness algorithm would be more difficult to write than the program you are trying to auto-generate...
  
  Of course, you'd need to make sure the algorithms that humans wrote aren't flawed themselves, but once you got that pinned down, you would be more or less home-free.
  
  All you've done is replace a hard problem ("write a program that does X") with a harder problem ("write a program that teaches a computer to write a program that does X"). No dice.
  
  Even if you didn't take this drastic a step, another solution would be computer-aided software burn-in. Let the computer test the software for bugs. A super-QA Analysis if you will. Log complete program traces for every trial run, and let the machine put the software through every input/output possibility.
  
  For most modern programs, there isn't nearly enough time left before the heat-death of the universe to do this. Hell, for programs other than simple batch-processors, the number of possible inputs and outputs is infinite (since the program can do an arbitrary number of actions before the user quits it).
  
For those who are willing to pay... (Score:5, Insightful)

by PseudononymousCoward ( 592417 ) writes: on Tuesday May 20, 2003 @09:14PM (#6003445)

The number of bugs is smaller. Think of the systems used by the telcos, or NASA. Are they perfect? No, but they are much, much more stable than Win32, or Mac, or Linux. The reason is simple, the owners demand them to be.

There are costs associated with fixing bugs and reducing crashes. The more stable an operating system is to be, the more time and money that must be devoted to its design and implementation. PC users are not willing to pay this amount for stability, either in explicit cost, or in hardware restrictions or in trade-offs for other features.

As Linux evolves over time, its stability will always improve, but it may still never reach the stability of, say, VMS. Why? Because even with the open source model of development, there are still tradeoffs to be made, tradeoffs between new features and stability, mostly. And successive bugs are harder and harder to fix, requiring greater and greater amounts of time. At some point, the community/individual decides that they would rather spend their time going after some lower-hanging fruit.

Just my $0.02

Actually, IAAE.

- Re:For those who are willing to pay... (Score:5, Interesting)
  
  by dghcasp ( 459766 ) writes: on Tuesday May 20, 2003 @10:22PM (#6003973)
  
  Think of the systems used by the telcos, or NASA. Are they perfect? No, but they are much, much more stable than Win32, or Mac, or Linux. The reason is simple, the owners demand them to be.
  
  This reminds me of a story I read in the internal magazine of a telecommunications equipment supplier that I used to work for. It was about an international toll switch somewhere in the U.K. that had been up for 17 years (or something extreme like that.) Furthermore, this included having all of its hardware upgraded and replaced. Twice.
  Just stop and think about that for a while in PC terms... "I replaced my motherboard with the power on without rebooting my system, while it was serving 10,000 web pages a second."
  Granted, this is a higher level of hardware with full redundancy, but it still boggles my mind.
  
- Actually (Score:4, Insightful)
  
  by Sycraft-fu ( 314770 ) writes: on Tuesday May 20, 2003 @10:50PM (#6004164)
  
  One of the biggest barriers to stability for something like Linux (or Windows) is the fact that it must accommodate new software and hardware configurations all the time. If you take a Lucent 7R/E phone switch, it will run on given hardware (the 7R/E hardware). It will run Lucent's OS, and it will do only what it was designed to do (switch phone circuits). There is no putting new hardware in it unless it is Lucent approved, there is no loading of new apps to make it do things unless they are Lucent approved, and so on.
  
  If you want an open OS that will run with hardware by whoever happens to want to make it and software by whoever happens to want to write it, you cannot have a verified design that is 100% reliable. Unforeseen interactions WILL happen and crashes or other malfunctions will result.
  
Mandate memory checking tools (Score:5, Interesting)

by hawkstone ( 233083 ) writes: on Tuesday May 20, 2003 @09:15PM (#6003454)

I'm sure it's harder to accomplish this for kernel level code (it's primarily OSes being pointed at right here) but you can think everything is working hunky-dory and not realize something is going wrong under the covers.

Most errors of this kind can be found with testing under tools like valgrind [kde.org] or Rational's purify [rational.com]. I'm sure there are others (I've heard of ParaSoft Insure++, ATOM Third Degree, CodeGuard, and ZeroFault), but the quality of these tools really matters.

The issue is that tiny errors can cause crashes intermittently, and not immediately. For example:
uninitialized memory reads -- usually not a problem, but if this value is ever actually used, it will be.
array bounds reads -- never acceptable, but depending on the structure of memory, may not always cause an immediate crash.
array bounds writes -- like ABRs, may not be immediately fatal, but these are going to crash your code sooner or later.

Since they don't always cause an immediate crash, these errors are likely to creep in to released code without use of one of these tools. And if you want to know why we shouldn't always run programs in an environment that checks these kinds of things, try it once; you'll notice a speed hit of usually an order of magnitude. C/C++ is a perfectly acceptable language -- not all debugging has to be done by the compiler/interpreter or only after you notice a problem.
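
To make the three error classes concrete, here is a toy sketch of a buffer that checks every access. It is written in Python only for brevity, `CheckedBuffer` is a made-up class, and this is not how valgrind or Purify work internally; it only shows the kind of bookkeeping a checking environment has to do on each read and write.

```python
class CheckedBuffer:
    """Toy fixed-size buffer that reports the three error classes above."""

    def __init__(self, size: int) -> None:
        self._cells = [0] * size
        self._initialized = [False] * size   # one shadow flag per cell

    def _check_bounds(self, index: int, kind: str) -> None:
        if not 0 <= index < len(self._cells):
            raise IndexError(f"array bounds {kind} at index {index}")

    def write(self, index: int, value: int) -> None:
        self._check_bounds(index, "write")
        self._cells[index] = value
        self._initialized[index] = True

    def read(self, index: int) -> int:
        self._check_bounds(index, "read")
        if not self._initialized[index]:
            raise ValueError(f"uninitialized memory read at index {index}")
        return self._cells[index]


buf = CheckedBuffer(4)
buf.write(0, 42)
print(buf.read(0))

# Each of these would go unnoticed, or fail much later, without the checks.
for bad_access in (lambda: buf.read(1),       # uninitialized memory read
                   lambda: buf.read(4),       # array bounds read
                   lambda: buf.write(-1, 7)): # array bounds write
    try:
        bad_access()
    except (IndexError, ValueError) as err:
        print("caught:", err)
```

Every access now pays for a bounds test and a shadow-flag lookup, which gives a feel for why running everything under such checks costs so much speed.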

Anyway, hope that wasn't too pedantic....

Obligatory anti-MS (Score:5, Insightful)

by cptgrudge ( 177113 ) writes: on Tuesday May 20, 2003 @09:18PM (#6003500) Journal

Of course, there's no need to mention Microsoft's inability to create a stable system.
What exactly is the purpose behind this? Why was it put in here? People are going to need to grow up if people in "our" circle want to be taken seriously. I've used Windows 2000 and Windows XP both. They crash as much as my Red Hat and Debian boxes do. Never. They are all rock solid.
I work for a public school system. We have a class at the High School that teaches and certifies for A+ (I know, I know). They have all sorts of problems getting stuff to work and to get a system stable. In Windows and Linux.
It isn't because they are high schoolers.
It isn't because they are "just learning".
It's because they buy really shitty hardware. They look for the best cost, and they get their hardware from some loser manufacturer that has fucked up drivers and horrible quality control.
Properly maintained boxes with quality hardware in them just don't crash anymore. Programs maybe, but not systems.
Christ, people, this has been beaten to death! Microsoft has a great product for an OS now! Get back to making something better than them instead of trying to convince yourself that Microsoft is delusional.
Mod me Flamebait, I don't care.

- - Re:and (Score:4, Informative)
    
    by Temporal ( 96070 ) writes: on Tuesday May 20, 2003 @10:31PM (#6004041) Journal
    
    My Win2k box plays games reliably and maintains more than a few months of uptime.
    
    Please refer to this post [slashdot.org] for more information.
    
    Thank you.
    
A lesson from history (Score:5, Insightful)

by Dr. Bent ( 533421 ) writes: <ben@@@int...com> on Tuesday May 20, 2003 @09:26PM (#6003564) Homepage

Back in the Middle Ages, when the Catholic Church wanted a Cathedral built, they would pay a bunch of Freemasons to do it. The Freemasons viewed themselves as creative artisans, and they closely guarded the secrets they used to construct these impressive houses of worship.

The method they used, however, was less than impressive. Typically, they would start with a general design, and piece together stone and mortar until something collapsed, which happened quite [thinkquest.org] often [heritage.me.uk]. Then they would patch the section that collapsed and keep on going until something else fell down, or they finished. Given the level of understanding with regard to Physics and Material Science, those Freemasons had no other choice than to build them this way.

Now fast forward to the 21st century. The engineering disasters on par with those medieval collapses can be counted on one hand (Tacoma Narrows Bridge and the Hyatt Regency walkway collapse are the only two I can think of). This is directly due to the fact that a civil engineer can determine if a design is structurally sound before they build it.

Contrast this with modern day software development. We can't even tell if a system is flawed after we build it, let alone before. So software gets written, deployed, and put into the marketplace that has no assurances whatsoever of actually doing what it's supposed to do (hence the 10,000 page EULA).

You can't have Civil Engineers until you have Physics. And you won't have 100% bulletproof software until you have Software Engineering. And you won't have that until someone can figure out a way to prove that a given piece of software will perform as it's supposed to. JUnit [junit.org] is a step in the right direction, but there's still a long way to go. It's going to take a breakthrough on the order of Newton to make Software Engineering as reliable a discipline as Civil Engineering.
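
As a sketch of what a unit test does and does not establish, here is the idea in Python's unittest module, standing in for JUnit. The function days_in_february, its deliberate bug, and the helper run_suite are invented for illustration. The suite passes, yet the function is wrong for an input the tests never try (the year 1900), so a green run is evidence rather than proof.

```python
import unittest


def days_in_february(year):
    # Deliberately incomplete: ignores the century rule, so 1900
    # (not a leap year) is wrongly treated as one.
    return 29 if year % 4 == 0 else 28


class FebruaryTest(unittest.TestCase):
    def test_common_year(self):
        self.assertEqual(days_in_february(2003), 28)

    def test_leap_year(self):
        self.assertEqual(days_in_february(2004), 29)


def run_suite():
    """Run the two tests above and hand back the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FebruaryTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Adding a test for 1900 would catch this particular bug, but only because someone thought of that case; the tests can never cover more inputs than their authors imagined.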

- Re:A lesson from history (Score:4, Insightful)
  
  by Minna Kirai ( 624281 ) writes: on Tuesday May 20, 2003 @11:23PM (#6004337)
  
  It's going to take a breakthrough on the order of Newton to make Software Engineering as reliable a discipline as Civil Engineering.
  
  The reliability of today's Civil Engineering comes not from deep theoretical understanding à la Newton; it's really just the same "build, crash, repeat" method those Freemasons have been using for 1000 years.
  
  Now that we've had centuries of experience at building similar kinds of structures, most of the kinks have been worked out. Those rare CivEng projects that break new ground still have a high risk of unexpected failures. (A 4000% cost overrun is a failure [boston.com])
  
  Civil Engineering still uses empirical testing to decide if a new technique is reliable, as does "Software Engineering". You just notice it more in SE because that field has more opportunities for innovation and much, much fewer penalties when an experiment fails.
  
  JUnit is a step in the right direction, but there's still a long way to go.
  
  JUnit is a step down a curving road to a dead end. It won't take us to an ultimate solution (but it will provide benefit in the near-term future). That's because it's not a system to help formally prove code is correct (which some unpopular languages support to small degrees). Instead, Unit Testing is just a way to automate "build, crash, repeat" empirical testing.
  
- Re:Try the UML (Score:4, Interesting)
  
  by Billly Gates ( 198444 ) writes: on Wednesday May 21, 2003 @12:18AM (#6004645) Journal
  
  Architects and engineers use extremely detailed drawings. Have you ever taken any drafting courses in high school or college? Every piece, even the size of every screw, is detailed as accurately as possible. It takes forever to get anything done because the precision is more important. It drives some people like myself crazy.
  
  The blueprint is the actual prototype of the product being designed.
  
  The problem is if you document every step and algorithm in exact detail you will spend weeks, months, and yes years without a single line of code!
  
  This is unacceptable in today's business world where all the projects are due yesterday and your bosses demand to know what percentage of the code has been written. If you spend a month planning and not a single line of code is developed, you're canned.
  
  My father took over a project that a clueless IT manager got because she slept with the CIO. Anyway, she went to a seminar which claimed that flowcharting everything would be the wave of the future. She then had all the programmers draft every single algorithm, down to the very if statements themselves, on paper. After 4 months and not a single line of code my old man took over. From there he finished the project within 3 weeks!
  
  My point is that drafting programs is too time consuming. In a way your drawing is the program and changes can be made as you go. It's essential to have good flowcharts and notes but they need to be generalized. If there is an error in it you can delete the line and fix it. In engineering you would have to disassemble the actual product and redesign it. Because that would cost time and money, it is not accepted. In software that limitation is not there, or not as severe.
  
  UML tries to be the blueprint of all software programs but instead is only used to explain certain subsystems and algorithms. Mostly flowcharts are used so all the developers have a sense on how the program will work and how to invoke different pieces of the program.
  
  I do not think this is going to change unless there is a quick and easy way to debug UML charts. Logic errors are killers, and if the chart is perfect I suppose you can compile the UML directly into the language of choice.
  
  Hmmm, in fact this might be the way to do it in the future.
  
Software, complexity, and human nature. (Score:4, Insightful)

by Christopher Thomas ( 11717 ) writes: on Tuesday May 20, 2003 @09:31PM (#6003611)
There are several reasons why software keeps crashing, and they aren't going away any time soon. These reasons are:
- You can't prove that most software works.
  
  Except for a restricted set of cases, you can't prove that a given piece of code works or doesn't work. A truly exhaustive set of tests would be impractical to perform, and formal proofs of correctness place strong limits on the type of code you can write and the environment in which you can write it.
  
  The result is that code is assumed correct when no bugs are found. This only means that there probably aren't _many_ bugs left. Thus, it may still crash (or have a security hole, or what-have-you).
- Software is very complex.
  
  Software has been complex for a long time. It just tends to be bigger now. A larger system has more opportunities for unexpected high-level interactions between components, but even a smaller system will have enough twists and turns that formulating a really good test suite, or checking the code by inspection, is very difficult. Bugs will be missed. As was discussed above, many of these missed bugs will slip through testing and reach the world.
  
- Nobody wants to pay for perfect software.
  
  As more effort is applied, you can get asymptotically closer to a bug-free system. However, this is far past the point of diminishing returns on the cost/benefit curve. For sufficiently constrained systems, you can even try proving it correct, but this tends to lead to cutting out a lot of functionality, speed, or both.
  
  In situations where reliability must be had at any cost - aerospace control systems, vehicle control systems, medical equipment - the money will exist to produce near-perfect code, but even then there are bugs that occasionally bite. With commercial software, the buyer would rather have an application that crashes now and then than an application that costs ten times as much and comes out several years later.
  
  Free and/or open software avoids some of this by staying in development longer, which allows more of the bugs to be caught, but even free and/or open software evolves. Every change brings new bugs to be squashed. As long as there are new types of software that we want, it isn't going to end.
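
A back-of-the-envelope version of the "exhaustive tests are impractical" point above. The numbers are assumptions chosen for illustration, not measurements: a routine that takes two 32-bit integers, and a test rig that checks a billion input pairs per second.

```latex
\begin{align*}
\text{input pairs} &= 2^{32} \times 2^{32} = 2^{64} \approx 1.8 \times 10^{19} \\
\text{time to test all of them} &\approx \frac{1.8 \times 10^{19}}{10^{9}\ \text{tests/s}}
  = 1.8 \times 10^{10}\ \text{s} \approx 585\ \text{years}
\end{align*}
```

And that is for a routine with no internal state and no interaction with anything else, which is about as favourable a case as real code ever offers.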
STFP (Score:5, Insightful)

by rice_burners_suck ( 243660 ) writes: on Tuesday May 20, 2003 @09:36PM (#6003651)

Software crashes because: Software is an immature field. Good software takes time. Software is unobvious to business managers who want the job done yesterday.
Businessmen generally do not understand the internal workings of software. They are in a "big-picture" sort of world where software is but one pesky detail that will be taken care of. A computer crash that causes so many thousands of dollars in damages is no different than a truck crash. There is simply a risk to every element of business. If the risk is relatively low, the big shots don't care about it. Grocery stores in earthquake prone areas continue to place glass jars on the edges of shelves. Sure, there will be an earthquake one day, but it's a calculated part of business risk, and the risk is relatively low (the Earth doesn't shake every five minutes).
Software bugs are a similar risk. It needs to look like it works. It needs to crash (and lose data) infrequently enough that the software will still sell. The business is not concerned with stamping out software bugs. It is concerned with releasing the software and making money. If the need arises, the business will improve the software and make more money. More often than not, this means adding features and shiny graphics. Fixing bugs is not very important to companies because customers do not pay for bug fixes. By the consumer, bugs are viewed as defects and their fixes should be free. By the company, bugs are viewed as a minor risk and fixing them would cost too much to justify. So you'll reboot once in a while or lose an hour's work once in a while. If it fries your hard disk, well, you should have backed up your data.
Software is also one of the newest fields of human endeavor. Buildings have been built, ships have sailed and farms were farmed, all for thousands of years. No matter how much progress happens in these fields now, they have come so close to "perfection" that continued improvement serves to lower cost, improve safety and increase convenience. It's not a matter of, "Gee, how can we make buildings that actually stand without falling down three times a week?" It's just a matter of, "How wide, how deep, how tall and what color glass do you want on the outside?" You pay X dollars, wait Y months and voila, there is a building. But programming has been around for how long, 50 years? It's an increasingly important but very immature field.
Buildings, bridges, ships... they're obvious. Everyone knows that if enough lifeboats aren't put on an unsinkable ship, it'll sink on purpose, just to piss you off. Everyone knows that if a 100-story building is going to stand, it has to take 10 years to build it. Everyone knows that a dam has to be pretty damn strong or it'll break and flood half the countryside. The building, shipyard and dam businesses aren't progressing at light speed. It is easy to justify 10 years for an outrageous building design because people KNOW what is involved. But software... Now that's totally unobvious. Software is an idea. It's abstract. It's a bunch of curse words that look like gobbledygook to the uninitiated. A bunch of "noise" characters on a broken terminal. Something done by a bunch of skinny, pimply faced geeks who got beat up in high school, took the ugly girl to prom and didn't have any friends. Why should a manager bother to care that fst_jejcl_reduce() can dereference a NULL pointer in the outer loop if case 32 is activated, which happens if the previous re-sort encountered two items with similar Amount fields, all of which will take a whole day to find and fix and will only happen, say, 2% of the times this particular feature is invoked by the user, which isn't that often? Why should anybody justify spending 2 years to develop some bulletproof program that can be banged out in 3 months, with bugs? What's the problem? Construction workers are risking their lives, moving heavy things, sweating all day in the hot sun... While geeks are sitting in offices just punching crap on a keyboard. How difficult could it possibly be? To
Read the rest of this comment...

Turing showed this (Score:4, Interesting)

by martin-boundary ( 547041 ) writes: on Tuesday May 20, 2003 @09:41PM (#6003671)

A crashed computer is a computer that's stopped. Alan Turing proved in 1936 that the halting problem is undecidable: no general procedure can tell, for every program and input, whether the program will eventually stop. So, it's impossible to know when and how a computer is going to crash or not under all possible circumstances (inputs).
Accept it. It's a fact of nature.

all systems crash, not just MS (Score:5, Interesting)

by dirk ( 87083 ) writes: <dirk@one.net> on Tuesday May 20, 2003 @09:44PM (#6003696) Homepage

When can we finally give up the FUD of "MS crashes all the time"? Anyone who has used a later MS OS (Win2k or XP) can easily see they crash very rarely. I have had my Red Hat install have more problems than my Windows install in the past 6 months, and on the MS system most of the problems have been 3rd party software while on the Linux system most of the problems have been the OS itself. The reason systems crash is that there are many pieces, written by many different people, interacting with each other. This is the same whether the OS is Linux or Windows. The harping on the instability of Windows does nothing but hurt the Linux cause, since anyone who actually uses a newer version of Windows knows that the claim has no basis in reality.

Why do computers crash? Because we let them. (Score:4, Insightful)

by dschuetz ( 10924 ) * writes: <davidNO@SPAMdasnet.org> on Tuesday May 20, 2003 @09:51PM (#6003735)

Face it -- if our cars broke down as frequently as Windows (or Linux or whatever), we'd be suing the auto industry out of business.

If our VCRs ate every tenth tape and only played tapes from the same manufacturer as the VCR with any quality, they'd all be returned to Circuit City.

But for software, we grit our teeth and say, well, I just don't understand computers, and reach for the power switch.

Until we, as consumers, start fighting for software that works without crashing, we'll continue to get the lowest possible quality -- just as we have for years. Once the customer starts demanding a quality product, the quality (and whatever software development practices, languages, testing procedures, etc., are needed) will follow.

Bottom line -- there's no real incentive. Microsoft makes billions with buggy software; the increase in profit from selling non-buggy software would be pretty small.

I'm surprised nobody has pointed out yet... (Score:4, Informative)

by Frobnicator ( 565869 ) writes: on Tuesday May 20, 2003 @09:56PM (#6003769) Journal

That beyond all the hyperbole and other reasons, there is something that could be done but usually isn't.
In C++, which a great deal of software is written in, an exception block [or the language or system equivalent] placed around the entire application will catch just about any recoverable error. This is how most of the Windows blue-screens or 'your application has performed an illegal operation and will be terminated' messages are brought up. This is how Linux and other Unixes generate a core dump.
The actual handling may be in a signal handler, try/catch block, or abend, but the functionality is present in every actively developed language I have ever worked with, from COBOL and Fortran to C, C++, Java, and Object Pascal.
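
A minimal sketch of that outermost handler, written in Python rather than C++ purely to keep it short. The names run_guarded and buggy_main are hypothetical, and the exit codes are just common conventions. The idea is that whatever main does, control comes back to one place that can log the failure and shut down in an orderly way.

```python
import sys
import traceback


def run_guarded(main, log=sys.stderr):
    """Run main() inside one outermost handler and return an exit code."""
    try:
        main()
        return 0
    except KeyboardInterrupt:
        return 130  # conventional shell exit code for Ctrl-C
    except Exception:
        # Last line of defence: record what happened instead of dying silently.
        print("fatal: unhandled error, shutting down cleanly", file=log)
        traceback.print_exc(file=log)
        return 1


def buggy_main():
    settings = {}
    return settings["missing"]  # KeyError, standing in for any unexpected bug


if __name__ == "__main__":
    sys.exit(run_guarded(buggy_main))
```

This only covers errors the language runtime reports as exceptions. A hard fault such as a segmentation violation in native code never reaches the except clause, which is where the signal handlers mentioned above come in.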
The main reason for applications actually crashing is programmer laziness.
The main reason for applications getting into a state where they can crash is improper complexity management.
When it comes to drivers, I'm much more forgiving, since it is quite difficult to manage both the hardware and software, and the communication between different programs.
Finally, as for the operating system itself, which is the layer between the drivers and the applications, I haven't seen one in the last 5 years that has been unstable. Even Windows ME, for all its faults, was very stable in the actual 'operating system'.
But that's just my 2 pesos.
frob

Containing the Damage (Score:5, Informative)

by Salamander ( 33735 ) writes: <jeffNO@SPAMpl.atyp.us> on Tuesday May 20, 2003 @11:24PM (#6004342) Homepage Journal

A lot of people are answering the question of why there are bugs at all, and it's an important question, but I'd like to take a different angle and consider why there are so many visible bugs. Why does a bug in a driver, or even an application, bring down a whole system? In addition to reducing the incidence of actual bugs, IMO, we should also do a better job of containing the bugs that will inevitably exist even if we all use the latest whiz-bang code analysis tools (which rarely work for kernel code anyway). Some of the semi-informed members of the audience are probably thinking that's the job of the operating system; I'd argue that our entire current notion of operating systems is flawed. There are way too many components in a typical computer system that "trust each other with their lives" in the sense that if one dies all die. Memory protection between user processes is great, but there should be memory protection between kernel entities, and other kinds of protection, as well. One of the basic services that operating systems need to provide going forward is greater fault isolation and graceful instead of catastrophic degradation.

The Recovery Oriented Computing [berkeley.edu] project at Berkeley has gotten some press recently for trying to address this issue. Many here on Slashdot don't seem to "get it" because they've never worked on systems in which a component failure was survivable; they don't realize that rebooting a single component - perhaps even preemptively - is better than having the whole system crash. "Software rot" is a real problem, no matter how hard we try to wish it away. ROC isn't about saying bugs are OK; it's about saying that bugs happen even though they're not OK, and let's do the best we can about that. Another project in the same space, with more of a hardware/security orientation, is Self Securing Devices [cmu.edu] at CMU. There, the idea is to find ways that parts of a system can work together without having to share each others' fate. While the focus of the work is on security, it shouldn't be hard to see how much of the same technology could be applied to protect a system from outright failure as well as compromise. There are plenty of other projects out there trying to address this problem, but those are two with which I happen to have personal experience.

The key idea in all cases is that current OS design forces us to put all of our eggs in one basket, and that's really not necessary. Designing fault-resilient systems is tough - few know that better than I do - but that's only a reason why we should do it once instead of devising ad-hoc clustering solutions for each specific application. Lots of people use various forms of clustering as a way to achieve fault containment and survive failures, but the solutions tend to be very ad-hoc and application-specific. Do you think Google's solution works for anything but Google, or that a database transaction monitor is useful for anything that's not a database? Fault containment needs to be a fundamental part of the OS, not something we layer on top of it.
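
For a feel of what "reboot the one failed component" looks like in the small, here is a sketch in Python that uses ordinary OS processes as the isolation boundary. It is not the design of the ROC or Self Securing Devices projects; supervise, FLAKY_WORKER and demo are names invented for this example, and the marker file exists only so the worker can crash exactly once.

```python
import os
import subprocess
import sys
import tempfile

# A worker that "crashes" on its first run and succeeds on the next one.
FLAKY_WORKER = """
import os, sys
marker = sys.argv[1]
if not os.path.exists(marker):
    open(marker, "w").close()
    os._exit(1)  # simulate a crash on the first run
"""


def supervise(command, max_restarts=3):
    """Run command in a child process, restarting it whenever it dies.

    Returns how many restarts were needed, or None if the limit was hit.
    """
    for restarts in range(max_restarts + 1):
        # The child has its own address space, so its crash cannot
        # take the supervisor down with it.
        finished = subprocess.run(command)
        if finished.returncode == 0:
            return restarts
    return None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "crashed-once")
        return supervise([sys.executable, "-c", FLAKY_WORKER, marker])
```

The supervisor here is exactly the kind of ad-hoc, bolted-on solution complained about above; the argument is that this containment should come from the OS for every component, kernel entities included, instead of being rebuilt per application.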

- Re:Not always the softwares fault: (Score:5, Insightful)
  
  by Jeremi ( 14640 ) writes: on Tuesday May 20, 2003 @09:14PM (#6003442) Homepage
  
  I've found in my years of repairing pc's that the majority of software problems have their root cause in hardware.
  
  Wow, your experiences are much different from mine, then. I'd say 95%+ of my computer problems are caused by software bugs.
  
  Software errors are repeatable. The exact same situation should produce the exact same error.
  
  For a significant percentage of software errors, that statement is false (at least misleading), because it's nearly impossible to reproduce "the exact same situation". For example, take any multithreaded program with a race condition bug -- the chances of the two threads getting the exact same time-slices on two different executions of the program are approximately zero. The result: a crash that happens only sometimes, at random, even given the exact same starting conditions.
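
The race described above can be sketched in Python. The function run_counter is hypothetical, and the time.sleep(0) call is there only to invite a thread switch in the middle of the read-modify-write so that lost updates show up readily. Without a lock the final count varies from run to run; with one it is always exact.

```python
import contextlib
import threading
import time


def run_counter(workers, increments, lock=None):
    """Have several threads bump one shared counter; return the final value."""
    state = {"count": 0}
    # With no lock supplied, use a do-nothing context manager instead.
    guard = lock if lock is not None else contextlib.nullcontext()

    def work():
        for _ in range(increments):
            with guard:
                current = state["count"]      # read
                time.sleep(0)                 # invite a thread switch mid-update
                state["count"] = current + 1  # write back, possibly a stale value

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]
```

Calling run_counter(4, 200) several times will typically give a different answer each time, all at or below 800, while run_counter(4, 200, lock=threading.Lock()) gives 800 every time. Same code, same inputs, different results: the scheduler is part of "the situation" and you don't get to replay it.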
  
- Re:Because it doesn't matter to you! (Score:5, Insightful)
  
  by Anonymous Coward writes: on Tuesday May 20, 2003 @09:28PM (#6003582)
  
  The difference between a system that crashes and one that doesn't is the development and testing. When you buy something from M$ or a Zaurus PDA you are getting a consumer product (i.e. cheap). It takes significantly more skilled developers and more testing (i.e. expensive) to make systems that don't crash, and consumers (including you) won't pay for them. You pay for features, not stability. If you had asked why your Solaris, OpenVMS, engine management system, air traffic control system, life support system etc. crashes, you might have had a point, but you are talking about consumer products that emphasise features over stability, so you got what you paid for.
  
  - Re:Because it doesn't matter to you! (Score:5, Interesting)
    
    by Blkdeath ( 530393 ) writes: on Tuesday May 20, 2003 @09:58PM (#6003787) Homepage
    
    It takes significantly more skilled developers and more testing (i.e. expensive) to make systems that don't crash, and consumers (including you) won't pay for them.
    
    Is there a time where the development methods and quality control learned from these large, mission-critical projects will find their way to the consumer product market? If not, why?
    
    - Re:Because it doesn't matter to you! (Score:5, Interesting)
      
      by shaitand ( 626655 ) writes: on Tuesday May 20, 2003 @11:35PM (#6004399) Journal
      
      Intending this as a genuine comment and not a shameless chance to bash Microsoft or vote pro-Linux. This is where an open source system such as Linux excels... it does so because a lot of the same code that goes into making those critical platforms goes into the mainstream releases, thus carrying over to the average user at home. This is a big part of why Linux is so stable even on desktops.
      
      The desktop applications for Linux are less stable but benefit from similar development models and sometimes having the same coders involved, so they tend to be more stable than the competition. After all, after a hard day of coding stable server code, that programmer goes home and listens to MP3s. He runs the same platform at home that he uses at work (Linux)... but at home he's running the GUI and playing MP3s. One day he decides to scratch an itch because a feature he would like is missing. This gets him looking at the code, and like many other coders he can't stand to see instability... especially since that is what he does. He invariably fixes things and adds a patch for whatever feature he wanted to add.
      
      ok now here comes the shameless plug:
      
      This is why I believe Linux will continue to grow and be accepted as the dominant platform. Current software in other areas is stagnating, and has for a while; some applications cannot significantly improve without major revamps in technology (IMs come to mind). A slow, steady approach to development (and yes, it is slow considering the number of man-hours spent on open source... there are just so many more men to spend hours that it amounts to rapid development) leads to fewer bugs in the final code that faces the test of time... more code faces the test of time because it was done right (or closer to right) the first time and thus gets the bugs ironed out of it. Open source development is free... it has no pressure to release final versions, no pressure to release features until they are stable... In the course of time (maybe 5 yrs, maybe 50) it's an eventuality that this will win because it cannot be killed; there is nothing to fight after all, no business to put out.
      
    - - Re:Because it doesn't matter to you! (Score:4, Insightful)
        
        by Blkdeath ( 530393 ) writes: on Tuesday May 20, 2003 @10:13PM (#6003899) Homepage
        
        No, cost.
        
        Cop-out. BMW designs safe cars that are expensive. However, the Value cars employ many of those safety features because the research has been done, and the knowledge is now available.
        I could come up with dozens of analogies from countless industries, but it all comes back to this; why do poor coding methods continue to be employed, and where's the QC?
        
        
        Re:Because it doesn't matter to you! (Score:5, Informative)
        
        by kwiqsilver ( 585008 ) writes: on Tuesday May 20, 2003 @10:35PM (#6004064)
        
        Actually... BMW has had some problems [baselinemag.com] with car-puters crashing, causing serious problems with the car's functionality.
        Guess whose OS they used.
        
        
        Re:Because it doesn't matter to you! (Score:5, Interesting)
        
        by Blkdeath ( 530393 ) writes: on Tuesday May 20, 2003 @11:31PM (#6004377) Homepage
        
        Actually... BMW has had some problems with car-puters crashing, causing serious problems with the car's functionality.
        Guess whose OS they used.
        
        Call it reciprocity. ;)
        Otherwise, the functionality of cars and their safety mechanisms have evolved, and that evolution has made it from the $79,990 cars to the $13,990 cars that are being mass produced. Otherwise, who'd feel comfortable driving 160KPH in something that costs a mere 3-4 months' salary?
        Probably one of the sources of problem in the software development industry, I'd say, is duplication of effort. Rather than take existing code and improve upon it, people seem either egotistically or somehow legally (copyright++) bound to constantly re-invent the wheel.
        The GPL development model is great in theory, however in practise it tends to lead to "My camp is better than your camp" rather than "Our camp is approaching perfection".
        
    - - Re:Because it doesn't matter to you, how much time (Score:5, Insightful)
        
        by Anonymous Coward writes: on Wednesday May 21, 2003 @12:04AM (#6004573)
        
        To give some idea of how much time testing really takes:
        
        I can write a simple code change in .5 hours.
        Step-through debugging, where I watch every line of code execute: .5 hours.
        Informal testing: I do unit testing, change a database table, run the code, change the database table back to its original value: 2 hours.
        Extreme Programming testing: Write code to do the same thing automatically for the life of the product: 8 hours.
        Extreme Programming System Testing: Run the program with sample files that will break in expected ways, and run in expected ways: 32 hours.
        
        Unless you are using Sun or IBM mainframe software, you aren't going to get that kind of commitment for the product from upper management. Unless you're lucky.
        
  - Re:Because it doesn't matter to you! (Score:5, Insightful)
    
    by ergo98 ( 9391 ) writes: on Tuesday May 20, 2003 @10:38PM (#6004093) Homepage Journal
    
    It takes significantly more skilled developers and more testing (i.e. expensive) to make systems that don't crash, and consumers (including you) won't pay for them
    
    I beg to differ. While there are variances based upon the quality standards and initiatives at an organization, a large correlation can be made between the complexity of software and the incidence of bugs (i.e. Bugs = (1.0 / Quality_Standards) × (Lines_of_Code / Years_In_Active_Maintenance)). There is _no_ comparison between a piece of life support equipment whose lines of code can often be measured in the hundreds, and something like Windows XP where there are tens of millions of lines of code. Features come with a cost.
    
    The number 1 way of assuring quality code is by removing everything until you're left with the absolutely essential functions.
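    
    To put rough numbers on that rule of thumb: write Q for Quality_Standards, L for Lines_of_Code and Y for Years_In_Active_Maintenance. The figures below are assumptions picked only to match "hundreds" and "tens of millions" (500 lines maintained for 10 years versus 40 million lines maintained for 2 years), with the same Q on both sides so that it cancels.

    ```latex
    \begin{align*}
    \text{Bugs} &= \frac{1}{Q} \cdot \frac{L}{Y} \\
    \frac{\text{Bugs}_{\text{XP}}}{\text{Bugs}_{\text{life support}}}
      &= \frac{(4 \times 10^{7}) / 2}{500 / 10}
       = \frac{2 \times 10^{7}}{50}
       = 4 \times 10^{5}
    \end{align*}
    ```

    Even if the formula is only good to an order of magnitude or two, a gap of that size is not something better-paid developers can close on their own.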
    
  - - - Re:Not all consumer devices crash (Score:5, Insightful)
        
        by dspeyer ( 531333 ) writes: <dspeyer.wam@umd@edu> on Wednesday May 21, 2003 @12:26AM (#6004682) Homepage Journal
        
        The "obvious truth" is that most bugs occur at boundaries. It's actually not very obvious, but it is very well established at this point. That's why intelligent modularity and clean APIs are so important.
        It's also why a single system that doesn't interact with anything tends to be easy to debug. A VCR just does its thing, and doesn't worry about what anyone else may be doing -- not even the hardware it runs on!
        Single tasking systems seldom crash. Single tasking systems that maintain no state between programs crash even less. Ones which run on only their own hardware crash still less.
        But they also do less.
        
- Don't get me started. (Score:5, Interesting)
  
  by bgalehouse ( 182357 ) writes: on Tuesday May 20, 2003 @10:15PM (#6003914)
  
  Yeah, ok, code crashes most often because it is written incorrectly. And cars and planes crash most often because people drive incorrectly. Entirely true, but not at all usefull.
  My pet explanation is that computer code is in many ways like legal code, with computers playing the part of honest criminals. They follow the law exactly, and walk through loopholes without even thinking about it.
  So you patch the loophole. This, oddly enough, seems to make the code bigger. To further contribute to the code bloat, at different times different legislators/developers have different opinions about what the goal of the code is, and different areas which they own. Small patches are thought of as being safer, but interactions with other bits lead to... more loopholes. This is also why both Windows 2000 and the US tax code seem to take up a lot of storage space.
  This then makes clear the value in refactoring, not that I really expect the tax code to be replaced with something sane anytime soon. Following this line of reasoning we also see why careful encapsulation is so important - so that one can rewrite one module of the system without affecting others.
  Advanced language features such as garbage collection and strong typing don't eliminate bugs. However, they do eliminate certain classes of bugs (segmentation faults) and so reduce the bookkeeping required to produce bug-free code. Since this class of bugs is one of the most expedient for hackers to take advantage of, there are also disproportionate security benefits to using safe languages.
  On a final note, testing is a defense against bugs, but I believe that testing, especially black box integration testing, should be a final defense that rarely sees action. If the developers can't find all but a few of the bugs with their own testing, the developers have lost their perspective on how the code works. If a separate QA team finds 1000 bugs, in my experience, the development process has failed and the system will always seem buggy.
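As a small illustration of the "certain classes of bugs" point, here is a sketch in Python, used only as one example of a bounds-checked language. The function and variable names are hypothetical. Reading past the end of a list raises a well-defined exception instead of touching whatever memory happens to sit next to the buffer.

```python
def read_slot(buffer, index):
    """Return buffer[index]; the runtime checks the index against the list length."""
    return buffer[index]


buffer = [10, 20, 30]

try:
    read_slot(buffer, 7)  # past the end of a three-element list
    outcome = "read succeeded"
except IndexError:
    outcome = "out-of-bounds access trapped"

print(outcome)  # out-of-bounds access trapped
```

The bug is still a bug, but it shows up as a predictable error at the faulty access rather than as silent corruption that someone could exploit.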
  
- - Re:Computers don't crash (Score:5, Interesting)
    
    by Anonymous Coward writes: on Tuesday May 20, 2003 @10:06PM (#6003843)
    
    The current issue of Scientific American states that 51% of crashes are due to user error. 15%=software error. 34%=hardware error. Refer to article for further info.
    
    You made a little "user error" there yourself-- the article says that 34%=software error and 15%=hardware error.
    
    Oh, and those figures are just for Web applications, not software applications in general.
    
    It's an interesting article. Unfortunately, they're not very clear about what constitutes a "user error." I've filled out Web forms that gave me an "error" when I included hyphens in my phone number or credit card number. That's far from an error, it's just poor user interface design.
    
    In my opinion, something the user does should never cause a program or operating system crash. If this can occur, it is the developer who is at fault, not the user.
    
    Apple's Human Interface Guidelines are a nice introduction to user-fault tolerance, even if you're developing for other platforms.
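The hyphen complaint above is easy to address in code. A minimal sketch in Python, assuming a form field that should hold only digits; the function name and the set of tolerated separators are illustrative assumptions, not anything taken from the article or from Apple's guidelines:

```python
import re


def normalize_digits(raw):
    """Drop common separators (spaces, hyphens, dots, parentheses), then require digits only."""
    cleaned = re.sub(r"[\s\-.()]", "", raw)
    if not re.fullmatch(r"[0-9]+", cleaned):
        # Anything left over is a genuine input problem worth reporting to the user.
        raise ValueError(f"unexpected characters in {raw!r}")
    return cleaned


print(normalize_digits("555-867-5309"))    # 5558675309
print(normalize_digits("(555) 867 5309"))  # 5558675309
```

The program does the tidying instead of asking the user to retype, and it only reports an error when the input really cannot be interpreted.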
    
    - - Re:Computers don't crash (Score:5, Insightful)
        
        by NetCurl ( 54699 ) writes: on Wednesday May 21, 2003 @01:08AM (#6004862)
        
        Personally, I don't think that denying the user the option of defining any settings which could cause a malfunction is the answer. The reason? Well, it's pretty simple: when set properly, those same settings give flexibility, added functionality, and performance (at least one, sometimes two, often all three of the above).
        
        See, that's the thing. I like Apple's OS because at surface level, you can't get access to those features that could really break things if you screwed with them too much. If you really want to muck around with those settings, they are there and ready to be played with through various means (Terminal -- it's a freaking BSD system, Third-Party, and power-user know-how). I would like to respectfully disagree with your statement and say that by default they don't offer the option of defining settings that may cause malfunction, but in OS X they have left almost complete wiggle-room to in fact screw EVERYTHING up, if you know what you're doing. I think it's more genius than anything...
        
  - Re:Computers don't crash (Score:5, Insightful)
    
    by abirdman ( 557790 ) * writes: <abirdman@maine.[ ]com ['rr.' in gap]> on Tuesday May 20, 2003 @10:16PM (#6003925) Homepage Journal
    
    I'm afraid if a user error causes the program to crash, I've got to call it a software error. It's not that hard to write the error handling routines, it's just never in the budget. And the users are invariably able to discover new frontiers of errors the programmer(s) never dreamed of. No matter. If clicking the wrong box, entering the wrong data, plugging in the wrong mouse, or installing the wrong screensaver causes a program to crash, it's not the user's fault (bless them, for they know not), it's the programmers' and design engineers' fault.
    
    Hardware errors are another problem altogether. Luckily, they're usually quick to diagnose, and it's usually cheaper to replace hardware than software. It's great how I've been using Microsoft error reporting for about 6 months now, and it's never been their fault. They must be getting better. <snicker>
    
    - Re:Computers don't crash (Score:5, Insightful)
      
      by digital photo ( 635872 ) writes: on Wednesday May 21, 2003 @03:26AM (#6005387) Homepage Journal
      
      I would agree. Properly and well written code will gracefully handle runtime errors.
      Translation: Short of the user fubar'ing the program or data files themselves, the program should handle all user input in a graceful way.
      The problem though is that to do this would require quite a bit of extra work.
      Programmers are caught between getting something ready for market on a date dictated to them by a department which doesn't understand the underlying issues, and saying "Screw it" and making the code solid.
      That only describes one way in which the problem is caused.
      The bigger problem is the attitude people have about computers which allows for this kind of shoddy programming. People are, for the most part, okay with their computers crashing at some point in time, and even expect it.
      This in turn makes it okay to release bad code which will be "fixed later".
      I say that whenever we get a crash or a problem, we report it to the company and we post it to our websites and to review sites.
      I say that the users should make it a big fat noticeable problem to the companies whenever their software breaks.
      Why? Because it means that whenever someone who's never used the software before searches on Google for that software or software company's name, they will find page after page of complaints, dissuading them from using the software.
      The flip side is, if the software works, post to your sites and review sites. Give the people and companies who produce good software credit when it is due.
      As users and consumers, we should find ways to encourage the producers and companies to produce solid code.
      Solid stable code shouldn't be the exception to the rule.
      
- - - - Re:And (Score:5, Funny)
        
        by jdray ( 645332 ) writes: on Tuesday May 20, 2003 @10:06PM (#6003849) Homepage Journal
        
        Isn't that "restore him from backup?"
        
- Re:C and C++ are the problem (Score:5, Insightful)
  
  by Anonymous Coward writes: on Tuesday May 20, 2003 @10:00PM (#6003807)
  
  A commonly held notion, but not really well thought through.
  
  Sloppy programmer accesses through bad pointer in C. OS traps task.
  
  Sloppy programmer accesses beyond array bounds in MySafeLanguage. Runtime system traps task.
  
  In either case, your program "crashes", and the user isn't going to be any happier if you tell them that it's the "MSL virtual run time environment" that painted the blue screen of death than if it's the "operating system". The crappy program still ate my data.
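Whether a runtime trap is any kinder to the user's data depends on what the program does with it. A minimal sketch in Python, standing in here for "MySafeLanguage"; every name below is hypothetical. If nothing catches the error, the program dies just as the parent describes. The difference is that a trapped error is an ordinary exception, so a top-level handler can at least save the work first.

```python
def faulty_step(document):
    """A sloppy edit: appends text, then reads past the end of the list."""
    document.append("new paragraph")
    _ = document[99]  # raises IndexError


def run_with_autosave(document, step, save):
    """Run one step; on any trapped error, save a copy of the document and report failure."""
    try:
        step(document)
    except Exception:
        save(list(document))  # persist what we have before giving up
        return False
    return True


saved = []
ok = run_with_autosave(["intro"], faulty_step, saved.append)
print(ok)     # False
print(saved)  # [['intro', 'new paragraph']]
```

This does not make the sloppy code correct, and a program without such a handler still eats the data. It only shows that the trap gives the developer a chance to do better, if the budget allows.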
  
  The two actual causes, IMO:
  
  1) People always code on the bounds of manageable complexity. Think about the programs people wrote 25 years ago. Nice as they were at the time, and they were on the bounds of manageable complexity, they have what would now be considered a laughable number of features and capabilities. As tools and processes and programmers get better, you don't get a better version of the same old thing you always had. You get something new and different that's just now become possible.
  
  2) Users (customers) get what they deserve. I have yet to meet a real customer who will actually wait longer and pay more for a higher quality system. Instead, they'll pay less to the guy that gets there cheaper or sooner. Everyone rants about quality, but they turn around and reward time-to-market and corner-cutting on development. If any significant proportion of users really insisted on quality, they'd get it, and probably at a much higher price. (Some, but not all, embedded development falls into this category.) Instead, they want it now and cheap, and the company that takes longer and costs more simply goes out of business.
  
- Re:C and C++ are the problem (Score:4, Insightful)
  
  by El Cubano ( 631386 ) writes: on Tuesday May 20, 2003 @10:53PM (#6004178)
  
  Don't allow people to use languages that allow you to access memory not assigned to you or to access array positions that don't exist.
  
  It always bugs me how quick people are to blame crappy coding on the language. This would be tantamount to a carpenter saying, "if my hammers weren't so damned versatile I could build a higher quality product and not break my thumb open." People would look at him like he was crazy. Or better yet, an inexperienced apprentice saying, "That hammer is just too powerful for me to use."
  
  That being said, C and C++ are the hammer that was designed by carpenters (OS experts) for use by carpenters (OS experts). Don't blame the problems on a bunch of kids who are never properly educated on the use of the tool.
  
- What are you smoking? (Score:5, Interesting)
  
  by Jerk City Troll ( 661616 ) writes: on Tuesday May 20, 2003 @10:02PM (#6003820) Homepage
  
  My OS X box, which I use for web browsing and word processing, crashes about once every three days.
  
  The Ti PowerBook G4 I am writing this post on is running Mac OS X 10.2.x. It goes in and out of sleep on an irregular basis, and not always when it is idle. I swap PCMCIA cards in and out. It hops from network to network. I do a lot more than browsing and word processing.
  
  According to my Konfabulator uptime widget, I have 83 days, 23 hours, 20 minutes. My load average at the moment is 1.7. It has not been rebooted since I installed OS X (I did it myself after buying it just for messing around purposes).
  
  You, sir, are either lying, have bad hardware, or have severely corrupted your installation. This operating system (which is BSD) is solid as a rock.
  

