How Do You Know Your Code is Secure?
bvc writes "Marcus Ranum notes that 'It's really hard to tell the difference between a program that works and one that just appears to work.' He explains that he just recently found a buffer overflow in Firewall Toolkit (FWTK), code that he wrote back in 1994. How do you go about making sure your code is secure? Especially if you have to write in a language like C or C++?"
Secure? (Score:2, Insightful)
Don't use C++ as if it was only "C with classes" (Score:5, Insightful)
Actually, the best thing would be not to use C or C++ at all, but that's where reality comes into play. Most developers don't even get to choose which language they use; that is predetermined by the employer and/or supervisor.
Avoid direct memory access (Score:5, Insightful)
However, you don't have to do it like this, especially not in C++, which has a safe string class (for example) as part of its standard library. Unfortunately, C++'s vector type still doesn't do bounds checking with the usual [] indexing; you have to call the at() method if you want to be safe. But the general principle is: don't do memory management yourself; use some higher-level library (these exist for C too) and let someone else do the memory management for you.
You can write a C++ program and be pretty confident it doesn't have buffer overruns simply because it doesn't use pointers or fixed-size buffers, but relies on the resizable standard library containers.
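A minimal sketch of the at() versus [] point, assuming nothing beyond the standard library. The helper names read_checked and join_with_commas are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns the element at `index`, or `fallback` if the
// index is out of range. vector::at() throws std::out_of_range on a bad
// index, whereas operator[] with a bad index is undefined behaviour.
int read_checked(const std::vector<int>& values, std::size_t index, int fallback) {
    try {
        return values.at(index);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}

// Hypothetical helper: std::string grows as needed, so there is no
// fixed-size buffer to overrun while concatenating.
std::string join_with_commas(const std::vector<std::string>& parts) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) out += ',';
        out += parts[i];  // i < parts.size() is guaranteed by the loop bound
    }
    return out;
}
```

Calling read_checked(v, 3, -1) on a three-element vector returns the fallback instead of reading past the end.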
Security (Score:2, Insightful)
Writing in C/C++ doesn't make the whole thing any better. A strongly typed language like Delphi or a managed language like C# is less likely to have buffer-overflow-type bugs, etc., but you never know. Writing code is not pizza baking.
The answer is simple - you never know (Score:4, Insightful)
You can only minimize the risk that security issues will be found with any software. The best way to do this is to perform a rigorous code audit, preferably by security professionals. And if you can, make the software open source. You get a lot more eyes staring at it for free that way.
What's the matter with C/C++? (Score:5, Insightful)
It's not that C/C++ is so insecure by itself; the problem is that programmers may not have used the best programming practices. There are plenty of libraries for handling strings and memory allocation in C; in C++ there are string and storage classes that do as much or as little checking as you need.
When you are an expert programmer there are places where you need more efficiency than the super-safe string routines can give you. It's the job of the expert to determine exactly how to balance efficiency against security, and only C/C++ can give you this balance.
Grammar (Score:2, Insightful)
Some possibilities (Score:3, Insightful)
Assume failure (Score:5, Insightful)
Combine this mentality with the usage of safe classes as datatypes whenever possible, so that you can wrap your input verification into the functionality of the classes. If prudent, wrap external library routines in classes which manage the interaction with them, and which verify the data content being passed.
Use test suites to test every component of your program, and be sure to include invalid and pathologically insane input in your test suites.
Do not trade security for efficiency. And don't forget to cross your fingers.
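One way the "safe class plus pathological test input" idea might look, as a rough sketch. The UserName class, its 32-character limit and the alphanumeric-only rule are all assumptions chosen for illustration, not anything from a real codebase:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

const std::size_t kMaxUserNameLength = 32;  // assumed limit, for illustration

// Hypothetical "safe class": validation lives in the constructor, so any
// UserName object that exists has already passed the checks.
class UserName {
public:
    explicit UserName(const std::string& raw) : value_(raw) {
        if (raw.empty() || raw.size() > kMaxUserNameLength) {
            throw std::invalid_argument("user name has invalid length");
        }
        for (unsigned char c : raw) {
            if (!std::isalnum(c)) {
                throw std::invalid_argument("user name has invalid character");
            }
        }
    }
    const std::string& value() const { return value_; }

private:
    std::string value_;
};

// Hypothetical test helper: true if `raw` survives validation.
bool is_accepted(const std::string& raw) {
    try {
        UserName name(raw);
        (void)name;
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

A test suite would then feed is_accepted() the ugly cases too: the empty string, a 10000-character string, shell metacharacters, and embedded NUL bytes.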
Re:Don't use C++ as if it was only "C with classes (Score:3, Insightful)
Re:Avoid direct memory access (Score:5, Insightful)
Err, how much C++ have you written? I've yet to see any complex C++ *without* pointers, since you can't reference or use dynamically created objects using the new operator without them. Not to mention the 101 other instances where they're useful.
Re:What's the matter with C/C++? (Score:5, Insightful)
Re:Don't use C++ as if it was only "C with classes (Score:3, Insightful)
Proving the Unprovable (Score:3, Insightful)
Traditionally it has been important to "specify and validate" requirements meticulously, in the belief that this was the way to write good code. This is partly true, but that way can quickly turn your process into a dinosaur - stifling change and preventing improvement because of non-compliance with "The Requirements".
You can try "defensive coding", which really treats all messages with great suspicion, messages being an old term for parameters. This is a cool technique, but can lead to slower code than necessary, and can lead to some bug being buried if code attempts to heuristically correct for "bad" messages (there is rarely any way to formally specify what is "bad"). You can use lint tools (and there are very many, very sophisticated tools) which will catch a whole lot of stuff before it leaves the developer's screen. You can try practices such as pair programming and independent code inspection. On the coding side, you can even try (gasp) such methods as test driven development and contract based development.
On the testing side, there is nothing quite like having an experienced, qualified, motivated and _empowered_ testing team. A testing team which knows how to find bugs, knows how to communicate with coders and has the power to stop defects going into production. A technique I particularly like is defect insertion - secretly insert 10 bugs into the code base and see how many get squashed; this will give you an estimate of how many defects your process doesn't find. There are other cool techniques too, some based on mathematical analysis of the code's attributes - the more complex the code, the costlier it is to maintain.
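The arithmetic behind defect insertion can be sketched roughly as follows. This assumes, and it is a big assumption, that seeded bugs are about as easy to find as real ones; the function name and parameters are made up for illustration:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper. If the team found `seeded_found` of `seeded` planted
// bugs, assume the same fraction of the real bugs was found, and estimate
// how many real bugs are still hiding.
double estimate_remaining_defects(int seeded, int seeded_found, int real_found) {
    if (seeded <= 0 || seeded_found <= 0 || seeded_found > seeded || real_found < 0) {
        throw std::invalid_argument("counts out of range");
    }
    // estimated_total = real_found / (seeded_found / seeded)
    const double estimated_total =
        static_cast<double>(real_found) * seeded / seeded_found;
    return estimated_total - real_found;
}
```

For example, if 5 of 10 planted bugs were caught along with 20 real ones, the estimate is 40 real bugs in total, so about 20 still unfound.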
Opening up the codebase to many people might well increase the chance that someone will find the line which causes an error - but IMHO no one goes around looking for bugs unless they are looking for weaknesses. And there we have another (unethical) method - pay some hacker doodz to 'sploit your code. Hopefully they will not find a higher bidder LOL.
All of these methods are likely to increase development effort and cost, decrease the number of defects, increase user satisfaction, decrease maintenance costs and increase well-being and harmony. So it is a trade-off: perfect code is incredibly difficult to create - the question is what level of perfection you (and your customers) are willing to pay for. Problems mostly arise when expectation does not meet reality - some flakiness in an F/oss application suite is more acceptable to me than random crashes in software which cost me hundreds - or tens of thousands - or millions - of dollars.
In order to increase some quality aspect of code (security, performance, robustness, correctness...) one can therefore focus on one or several categories - the people, the process, the culture, the tools, the technique, the time&cost etc. The choice of what to focus on is dictated by reality: no one has unlimited resources (except, almost, Google).
There is no silver bullet - but there are golden rules. Finding people who know the difference is crucial I believe.
(Full disclosure: Yeah, I'm looking for heavy duty PM work
Re:Just Say No (Score:2, Insightful)
OK, what language is your Ada compiler written in? There are very few self-hosted languages that do not rely on "C" at some level. Also, the OS and the system libraries were written in C. At some level you need to deal with the stated problem. All that being said, many people are probably better off with Ada unless they actually "study" software security on a daily basis.
Re:Don't use C++ as if it was only "C with classes (Score:3, Insightful)
Security? What's that? (Score:2, Insightful)
I think the question itself makes little sense without a deeper investigation of the system!
Re:The only sure way I know of: Lambda calculus (Score:2, Insightful)
Coding 101 (Score:5, Insightful)
It's specifications, pre- and post-conditions, all that "theoretical bullshit" we learned in university. It's just that writing code that way is very un-exciting, and that's a vast understatement.
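For anyone who skipped that lecture, a tiny sketch of what pre- and post-conditions look like when written as plain assertions. The function integer_sqrt is just an illustrative example, not taken from any particular codebase:

```cpp
#include <cassert>

// Precondition:  n >= 0
// Postcondition: r*r <= n < (r+1)*(r+1)
int integer_sqrt(int n) {
    assert(n >= 0);  // precondition: caller must not pass a negative value

    int r = 0;
    // Widen to long long so (r+1)*(r+1) cannot overflow int.
    while (static_cast<long long>(r + 1) * (r + 1) <= n) {
        ++r;
    }

    // Postconditions: r is the largest integer whose square does not exceed n.
    assert(static_cast<long long>(r) * r <= n);
    assert(static_cast<long long>(r + 1) * (r + 1) > n);
    return r;
}
```

The assertions are the specification written down next to the code; a tool or a reviewer can check the body against them.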
SPARK (Score:5, Insightful)
The verification system implements Hoare logic and is supported by a theorem prover. Buffer overflow is only one of many basic correctness properties that can be verified. The properties that can be verified are limited only by what can be expressed as an assertion in first-order logic.
SPARK is a small language (compared to C++ or Java...) but the depth and soundness of verification is unmatched by anything like FindBugs, SPLINT, ESC/Java or any of the other tools for the "popular" languages.
(If you don't know or care what soundness is in the context of static analysis, then you've probably missed the point of this post...
- Rod Chapman, Praxis
Re:The only sure way I know of: Lambda calculus (Score:5, Insightful)
There is also the question of what the proof actually says. You can't prove, for example, whether a lambda program will terminate (Halting Problem), and in fact you can prove that you can't prove this. If you have a sufficiently well expressed specification for your program, you can verify that the program and the specification match. Unfortunately, if you have a specification that concrete, you can just compile it and run it.
By the way, Scheme is not a functional language. It has a number of properties that make it possible to write functional code, but saying Scheme is a functional language is like saying C++ is an object oriented language.
Re:Avoid direct memory access (Score:1, Insightful)
There are a couple of solutions to this problem:
3.) Don't use different versions of the C++ standard library* in the first place.
*It's not called STL. That was the original name of a container library developed at HP that was later included in the C++ ISO spec. It also never even included a string class.
Re:You don't (Score:4, Insightful)
Re:Avoid direct memory access (Score:5, Insightful)
What does that is rushed code, poor design and inadequate testing. These feature heavily in the vast majority of commercially produced code I've seen. Frankly most of what I've seen is horrifically bad. With code of such low quality, C should be avoided, but that's not C's fault; it's the fault of crap house coding rules. C is elegant, minimal, and mind-bendingly fast. This does not mark it as ideal for enterprise tools, but it does have a place there, for time-intensive operations.
It is extremely easy to ensure that buffers in C accept strictly limited input and do not overflow. It's also easy to not do this, and skipping the check is faster. That, I suspect, is where most of the problems come from.
Open source code used in the enterprise seems nowadays to be starting to suffer from similar problems to the commercial code I've seen, although commenting schemes are better. The problem seems to me to be a feeling that things must be pushed forward to compete. That isn't a good plan. Slower development, more testing before actual deployment, and less feature creep are what is needed.
Re:What's the matter with C/C++? (Score:5, Insightful)
C and C++ have a larger domain that can suffer from buffer overflows than languages with automatic memory management. In C, a buffer overflow can potentially occur at any point in your source code. In a language which automatically manages memory and checks bounds, the possible points at which buffer overflows can occur are reduced. This does not necessarily make the application more secure, but it does mean that there are fewer points at which it can be compromised.
I'm not sure that the efficiency increase from dropping boundary checks is often necessary, except possibly in high-end 3D games. Also, many languages allow for binary libraries written in C, so it would be possible to write an application in C#, Python, Ruby or whatever, and farm out any efficiency-critical routines to a library.
Re:Don't use C++ as if it was only "C with classes (Score:2, Insightful)
Why would you HAVE to use C or C+ or C*+**+++? I don't mean to be a troll, but if you are writing in an inherently insecure language (i.e., any compiled language) you aren't going to get secure code.
OTOH if you write in, say, assembly, you are setting yourself up for the complexity. You have to make sure your buffers won't overflow, as opposed to leaving it to the compiler writers.
As to overflows, if you KNOW your language is prone to overflowed buffers, it seems wise to check for overflows with your own code. After this long, there really is no excuse for buffers that overflow. It isn't hard to check for the length of a string, after all.
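Checking the length before copying might look something like this sketch. The helper name copy_if_fits and its reject-rather-than-truncate policy are assumptions for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies `src` into `dst` only if the whole string,
// including the terminating NUL, fits in `dst_size` bytes. On failure the
// destination is left as an empty string and false is returned.
bool copy_if_fits(char* dst, std::size_t dst_size, const char* src) {
    if (dst == nullptr || src == nullptr || dst_size == 0) {
        return false;
    }
    const std::size_t len = std::strlen(src);
    if (len >= dst_size) {  // no room for the terminating NUL
        dst[0] = '\0';
        return false;
    }
    std::memcpy(dst, src, len + 1);  // len + 1 copies the NUL as well
    return true;
}
```

With an 8-byte buffer, a 7-character string is accepted and an 8-character one is rejected instead of overflowing.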
If bridge engineers were as lazy as programmers, bridges would be falling down by the hundreds. My 1992 car is full of hundreds of thousands of little bitty moving parts and fluids, but as long as I keep clean oil and filters in it, it doesn't break. My last car was a 1988; it lasted until last year. But I have to replace my 2002 Microsoft operating system because it's not secure? Somebody is making a lot of money off of poorly designed and poorly built software. There is no reason why I should have to replace an OS.
There are reasons for program errors, but no excuses. If your code is shit, it's shit because you wrote shit. Either you're incompetent or lazy. "You can have cheap, secure, or fast. Pick two."
Re:What's the matter with C/C++? (Score:3, Insightful)
Re:What's the matter with C/C++? (Score:3, Insightful)
Yeah, a gun by itself is not insecure either... try giving it to a baby.
There is the crux of the C/C++ problem: we give an Uzi to 3-year-olds without the training and knowledge. 9 out of 10 C/C++ programmers I ever interviewed failed to properly explain how pointers work. Those that did answer pointer questions correctly tended to have programmed more securely than those that put */&/** by rote.
It also comes down to money: a good C/C++ programmer isn't cheap.
You cannot know. You can only engineer. (Score:3, Insightful)
I think it was Knuth who said, "In theory, theory and practice are the same. In practice, they are not."
In theory, for any nontrivial program, you cannot know absolutely that it is secure. You cannot even know that it will terminate. Turing showed that there is no algorithm which will decide if a program will halt. Most other problems of program behavior can be reduced to halting. (Just place a call to exit() immediately after the code that outputs the behavior in question.) In general, there is no way to prove that a program has any particular property that can be reduced to a termination property.
The choice of language does not matter, either. Turing used a language that was very primitive, even compared with the simplest assembly languages. But Turing's language is equivalent in computing power to every modern general-purpose programming language. The Church-Turing thesis is widely accepted as valid, though a proof in the strict sense cannot be written. So, Turing's mathematical proof of the halting theorem is valid for every modern programming language.
There are some programs for which we do know that the program is correct. Such programs are all very small, solve well-defined mathematical problems, and are written in well defined functional programming languages. These proofs depend on very careful, mathematical definitions of the programming language, and of the function to be computed. The programming language is, strictly, an algebra. The proofs simply show that the algebraic formula (the program) transforms the algebraic input to the correct algebraic output. In every case, such proofs are quite difficult and tedious. And, as noted above, they are not possible in the general case.
In practice, we can apply methods that are known as "engineering". That is, we can apply logic, design, inspection, review, and testing to develop some amount of confidence that a program will behave as expected. But engineering methods do not provide certainty. They only provide high confidence. The choice of language and tools has some effect on the ease or difficulty of doing the engineering work, but does not change the boundaries of what is possible.
How do we "know" that a bridge will not fall down? There are no proofs of bridges. There is only engineering. Engineers apply logic, experience, design, inspection, reviews, and tests, so that they can have confidence in the design. The confidence is based on statistics. For a given shape of steel or concrete, we can measure loads that cause the steel to fail, and we can measure the variance in those loads due to the manufacturing tolerances of the material. When we use that shape and material to build the bridge, we can have statistics about how much load the bridge can support without failing. But even with all that engineering, sometimes bridges do fall down. The load measurements are only statistics, not proofs. There is always a confidence interval around every measurement, and the confidence can never be 100 percent.
We can never have absolute proof of any property of any real, nontrivial program. We can have confidence as close to 100 percent as we want, if we spend enough effort on the engineering.
Re:You don't (Score:3, Insightful)
Yes, that is funny, but there is truth to it as well (which is why it's funny).
Security, software development, and everything else is a process, not an event. It gets better over time, and basically, the way that issues come out is for them to be found "in the wild". And as these issues are found, better tools and techniques make the process better over time, but I don't envision a world where people just think of bugfree, usable, featureful software that just appears, but all in all it keeps getting better.
The error in question:
    while(lp != (struct listelem *)0) {
        free(lp);
        lp = lp->next;  /* bug: lp has already been freed here */
    }
is pretty silly, and I don't know how it took over a decade to find that. In my experience, code like that crashes pretty regularly, and debugging it will point to the error.
Today, what some programmers do is to call FREE(lp); where FREE() is a macro or something that does if (a) { free(a); a=NULL; }. This prevents double frees, and ensures that future use of the pointer will predictably die on a null pointer dereference. In 2006, bugs like this should not find their way into C code. We now check our stuff, use languages or tools that check for crap like this for us, or whatever. In 1994, I guess it was OK for such a bug to be introduced into code, but not in 2006.
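A rough sketch of how the loop could be written without touching freed memory, plus a function version of the FREE() idea. Only the next field of struct listelem appears in the quoted code; the value field and the push_front helper are assumptions added so the example is self-contained:

```cpp
#include <cassert>
#include <cstdlib>

// Assumed layout: only `next` is known from the quoted snippet.
struct listelem {
    int value;
    listelem* next;
};

// Frees every node. The next pointer is saved *before* free(), so freed
// memory is never read. Returns the number of nodes released.
int free_list(listelem* lp) {
    int freed = 0;
    while (lp != nullptr) {
        listelem* next = lp->next;
        std::free(lp);
        lp = next;
        ++freed;
    }
    return freed;
}

// Function form of the FREE() idea: release, then reset the caller's pointer
// so a second call is harmless (free of a null pointer does nothing).
void free_and_null(listelem*& p) {
    std::free(p);
    p = nullptr;
}

// Hypothetical helper used only to build a list for demonstration.
listelem* push_front(listelem* head, int value) {
    listelem* node = static_cast<listelem*>(std::malloc(sizeof(listelem)));
    if (node == nullptr) {
        return head;  // allocation failed: leave the list unchanged
    }
    node->value = value;
    node->next = head;
    return node;
}
```

Note that nulling one pointer does nothing for other copies of the same address, so this reduces the problem rather than eliminating it.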
Re:Don't use C++ as if it was only "C with classes (Score:3, Insightful)
One more thing about OpenBSD (Score:3, Insightful)
Let's not forget their wonderful documentation! Complete and accurate API documentation is absolutely necessary for writing secure and reliable software. And of course the programmers should actually read the documentation and check all the details of the API calls they are using (return values, etc...)!
Why bash C/C++? (Score:3, Insightful)
As a C/C++ developer I am a little offended by the article summary. Certainly C/C++ has a lot of flexibilities that allow bad developers to write bad code. However, many other languages, e.g. Java, allow bad programmers to write code that looks good because of stronger type checking, reduced use of pointers and the like. However, nothing stops a bad developer from writing insecure code in any language. Maybe you don't manage your resources correctly. Maybe you do a bad job of implementing encryption/protected storage. Maybe your authentication scheme is weak, your site is vulnerable to cross-site scripting vulnerabilities, or your session data can be easily spoofed.
Secure code is not a product of language, it's a product of developers who take the time to fully understand the tools that they are using to build the product, including the ins and outs of their language of choice and its key risk elements, and who research risk elements for all other parts of the system.
Re:Avoid direct memory access (Score:3, Insightful)
*Using a file
*Using a semaphore
*Using a database connection
*Any other resource that is unique
If you can't get this simple concept, you shouldn't be programming. Ever.
The only difficulty is if you have ownership of pointers bouncing all over. Of course, I've never seen a good design that did that. Good modular design solves the ownership problem for you in 99.9% of cases. In fact, if I ever find a place in code where it isn't clear, it's a warning to me that I need to redesign that part of the codebase.
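One way to make single ownership explicit in C++ is a move-only wrapper, sketched below. The Resource class stands in for a file, semaphore or database connection, and open_count is a made-up counter used only to observe when the release happens:

```cpp
#include <cassert>
#include <utility>

// Hypothetical counter of currently held resources, for demonstration only.
int open_count = 0;

// Stand-in for a unique resource. Copying is disabled, so there is always
// exactly one owner; moving transfers ownership to the new object.
class Resource {
public:
    Resource() { ++open_count; }  // acquire
    ~Resource() {
        if (owns_) --open_count;  // release exactly once
    }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
    Resource(Resource&& other) noexcept : owns_(other.owns_) {
        other.owns_ = false;      // the moved-from object no longer owns it
    }
    Resource& operator=(Resource&&) = delete;

private:
    bool owns_ = true;
};

// Ownership moves into the callee; the resource is released when the
// parameter is destroyed.
void consume(Resource r) { (void)r; }
```

With this shape the compiler rejects accidental copies, so "who releases this?" has one answer at every point in the code.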
Re:Don't use C++ as if it was only "C with classes (Score:3, Insightful)
Re:Coding 101 (Score:3, Insightful)
That depends really. As a math geek I find a certain amount of pedantry and formalisation natural. I mean, many people are happy to spend the extra time writing annotations to define type signatures for functions (and even types for variables in some languages), which is, really, just a light form of specification. Using Eiffel, or Java with JML, or Spec#, or D, really isn't as bad as you might think - you mostly end up writing things you'll have to get around to writing anyway (when it comes time to properly document APIs etc.); you just get to write it along with the code in a slightly more formal way...
Re:What's the matter with C/C++? (Score:3, Insightful)
There are plenty of us who are perfectly capable of functioning in that environment but choose not to, preferring to focus mental energy on algorithms rather than silly implementation details like whether that pointer I've got points to something stack-allocated or heap-allocated. Besides that, I do mind the risk, because I have the mental capacity and maturity to understand that, even though most of the code I write that compiles is also correct, I'm not perfect.
Re:Just Say No (Score:2, Insightful)
Laws of construction (Score:3, Insightful)
1 - measure with a micrometer
2 - mark with chalk
3 - cut with an axe
4 - if it doesn't fit, use a larger hammer