How Do You Know Your Code is Secure?
bvc writes "Marcus Ranum notes that 'It's really hard to tell the difference between a program that works and one that just appears to work.' He explains that he just recently found a buffer overflow in Firewall Toolkit (FWTK), code that he wrote back in 1994. How do you go about making sure your code is secure? Especially if you have to write in a language like C or C++?"
Does language matter? (Score:3, Interesting)
The only sure way I know of: Lambda calculus (Score:4, Interesting)
Here's why, and why just about any computational problem can be solved using FP (functional programming):
Functional languages conform to lambda calculus, which has been shown to be Turing equivalent, which means that anything that can be computed on a Turing machine can also be expressed in lambda calculus. So long as you program strictly with functions, your program can be verified according to the rules of lambda calculus, and the verification would be as sure as a mathematical proof. This is the only sure way I know of really knowing with mathematical certainty that your application is secure.
Pure functional programming has no assignment statements; there are no state changes for you to keep track of in your program, and in many cases abuses resulting from unintended changes of state are the root of security problems. This is not to say that there is no state in functional programming; the state is maintained through function call parameters. (For example, in an imperative programming language, iteration loops keep track of a state variable that guides the running of the loop, whereas a functional program never actually keeps track of state with a variable that changes value; a functional program would carry out iteration by recursion, and the state is simply kept as a parameter passed to each call of the function. No variable with changing state is ever coded.)
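To make the loop-versus-recursion point concrete, here is a minimal sketch in C++ (not a functional language, so this only imitates the style). The function names sum_to_loop and sum_to_rec are made up for illustration.

```cpp
#include <cstdint>

// Imperative version: the running total and the counter are variables
// whose values change on every pass through the loop.
std::uint64_t sum_to_loop(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Functional-style version: no value is ever reassigned. The "state"
// (what is left to add, and the total so far) travels in the parameters
// of each recursive call.
std::uint64_t sum_to_rec(const std::uint64_t n, const std::uint64_t acc = 0) {
    return n == 0 ? acc : sum_to_rec(n - 1, acc + n);
}
```

Both return the same result for the same n. One caveat: C++ does not promise tail-call elimination, so the recursive form can exhaust the stack for very large n, which a real functional language would normally avoid.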
Since functional programs lack assignment statements, and assignment statements make up a large fraction of the code in imperative programs, functional programs tend to be a lot shorter for the same problem. (I can't give you a hard ratio, but depending on the problem, your code can be up to 90% shorter when described functionally.) Shorter code is easier to debug, which helps in securing code. The reason functional code is so much shorter is that functional programming describes the problem in terms of functions and composition of functions, whereas imperative code describes a step-by-step solution to the problem. Descriptions of problems in terms of functions tend to be far shorter than algorithmic descriptions of solving them, which is what imperative code requires.
Here's the biggest benefit of managing complexity with functional programming: as a coder, you NEVER have to worry about state being messed with. The outcome of each function is always the same so long as the function is called with the same parameters. In imperative programming as done in OOP, you can't depend on that. Unit testing each part doesn't guarantee that your code is bug free and secure because bugs can arise from the interaction of the parts even if every part is tested and passed. In functional programming, however, you never have to deal with that kind of problem because if you test that the range of each function is correct given the proper domain, and pre-screen the parameters being passed to each function to reject any out-of-domain parameters, you can know with certainty where your bugs come from by unit testing each function.
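A rough sketch of what "pre-screen the parameters and test the range" can look like, written in C++17 for illustration. The function percent_of and its domain rules are invented for this example; the point is only that the result depends on nothing but the arguments, and out-of-domain arguments are rejected before any computation.

```cpp
#include <optional>

// Hypothetical example function: the percentage that `part` is of `whole`.
// Domain (an assumption for this sketch): whole > 0 and 0 <= part <= whole.
// Anything else is rejected up front instead of being computed on.
std::optional<int> percent_of(const int part, const int whole) {
    if (whole <= 0 || part < 0 || part > whole) {
        return std::nullopt;  // out-of-domain input
    }
    // Widen before multiplying so part * 100 cannot overflow int.
    return static_cast<int>((static_cast<long long>(part) * 100) / whole);
}
```

Because the function reads and writes no outside state, a unit test that covers its domain says something about every call site, not just the one being tested.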
If you need to guarantee the order of evaluation (something that critics sometimes use to dismiss FP advocacy), you can still use FP and benefit: in functional programming, order of evaluation can be enforced using monads. Explaining how is beyond the scope of a mere comment, but in any case, if you need really reliable code, consider using a functional programming style.
I can't do justice to the matter here; for more information, see th
By the way, I meant to say this also (Score:3, Interesting)
To summarize, here's how you verify with mathematical certainty that a functional program is secure:
That's the gist of it. Anything more on this topic, such as automatic code auditing with the certainty of mathematical proofs (by means of lambda calculus proofs) is beyond my expertise. I just know that it's possible to truly secure functional code with mathematical certainty, whereas with imperative code, you can only be sure that your code has not yet failed or exposed a rare bug or failure condition.
Re:Same way you hunt bugs (Score:1, Interesting)
This is correct for crypto, wrong for everything else. I've often replaced third party authentication code that was above and beyond what was required in terms of complexity.
In principle I agree; in the real world it's about finding a balance. You can't design away security bugs and you can't spend 60% of your runtime stuck in input validation and security procedures.
Re:What's the matter with C/C++? (Score:1, Interesting)
3 Things (Score:1, Interesting)
b) formal peer reviews for pre-design, design, code, test specifications, and test results
c) Purify!!! http://www-306.ibm.com/software/awdtools/purify/ [ibm.com] A license for every developer AND tester!
I haven't written any code since 1999, but that was how I set up the development team for that company. The reviews are also a form of cross-training and team building. Nobody is perfect, and showing our individual errors helps everyone fit in. OTOH, there was one guy who clearly didn't understand header files and was labeled "Mr. Header" for almost a year. After the first week of taunts by his peers, he quickly learned when and how to avoid putting too many header files into his code.
Other code issues were discovered and learned by the entire team. We didn't hide errors, we published them within the team and never told management anything about who caused what to happen.
old versions of purify (Score:3, Interesting)
a long way but it's still useful. Electric Fence helps too.
Then a lot of old-fashioned software engineering: use raw arrays as little as possible, add bounds checking to std::vector [] if you feel inclined, use gprof to identify any code not being exercised by your unit tests (you do have unit tests, right?). Lastly, actually read the darn code and make sure that any time you use raw arrays you check the size.
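For anyone who wants to see the two habits side by side, a small sketch. std::vector::at() is the standard bounds-checked accessor; checked_get and copy_ints are hypothetical helper names made up for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// std::vector::at() is the bounds-checked counterpart of operator[]:
// it throws std::out_of_range instead of reading past the end.
int checked_get(const std::vector<int>& v, const std::size_t i) {
    return v.at(i);
}

// Hypothetical helper for raw arrays: the caller must pass the
// destination capacity, and the copy is refused if it would not fit.
bool copy_ints(int* dst, const std::size_t dst_len,
               const int* src, const std::size_t src_len) {
    if (dst == nullptr || src == nullptr || src_len > dst_len) {
        return false;
    }
    for (std::size_t i = 0; i < src_len; ++i) {
        dst[i] = src[i];
    }
    return true;
}
```

The raw-array helper is only as good as the sizes passed to it, which is the reason to prefer containers that carry their own size.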
Re:The answer is simple - you never know (Score:5, Interesting)
The best advice I read was from the Erlang documentation. It suggested that you program defensively on a system level, but not on a module level. If a module receives input it can't understand, or thinks it is in an invalid state, the correct behaviour is for it to crash. A system of monitors should deal with failures of components, because they can determine how the failure will affect other components. There has only been one remote root hole in OpenBSD in the last ten years, and it would have been avoided if the OpenSSH developers had used this principle.
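A very loose C++ analogy of that split, for illustration only: Erlang processes are isolated from each other and a C++ exception is not, so this shows the division of responsibility rather than the real mechanism. parse_port and supervise are hypothetical names, and the "use a fallback" policy is an assumption of the sketch.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical module: it does not try to repair input it cannot
// understand. It throws, which is the closest thing to "crashing"
// that a caller in the same program can still observe.
int parse_port(const std::string& text) {
    if (text.empty() || text.size() > 5) {
        throw std::invalid_argument("bad port");
    }
    int value = 0;
    for (const char c : text) {
        if (c < '0' || c > '9') {
            throw std::invalid_argument("bad port");
        }
        value = value * 10 + (c - '0');
    }
    if (value == 0 || value > 65535) {
        throw std::invalid_argument("bad port");
    }
    return value;
}

// Hypothetical monitor: it runs a component and decides at the system
// level what a failure means. Here the policy is "use a fallback".
int supervise(const std::function<int()>& component, const int fallback) {
    try {
        return component();
    } catch (const std::exception&) {
        return fallback;
    }
}
```

The module contains no recovery logic at all; every decision about what a failure means lives in one place, the monitor.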
Re:What's the matter with C/C++? (Score:3, Interesting)
Re:Don't use C++ as if it was only "C with classes (Score:1, Interesting)
C and C++ coders have jammed allocated memory onto the stack for years to make it "garbage collected" when the frame pops off the stack. That also opens the door to stack overflows that are very easy to exploit: if you put memory on the stack and then do any unbounded or improperly bounded I/O to it, you're done. The STL provides smart pointers that let you manage heap memory as if it were on the stack, but the problem remains that you have to manage it.
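A sketch of both halves of that, assuming a C++14 compiler. The first function does bounded I/O into a stack buffer; the second keeps a larger buffer on the heap but ties its lifetime to the scope with std::unique_ptr (the comment does not say which smart pointer was meant, so that choice is mine). The names read_line_bounded, Packet, and handle_packet are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <istream>
#include <memory>
#include <sstream>  // only needed to try the function on an in-memory stream
#include <string>

// Bounded read into a stack buffer: istream::getline stores at most
// buf.size() - 1 characters and always null-terminates.
std::string read_line_bounded(std::istream& in) {
    std::array<char, 16> buf{};
    in.getline(buf.data(), static_cast<std::streamsize>(buf.size()));
    return std::string(buf.data());
}

// Hypothetical message type with a fixed-size payload.
struct Packet {
    std::array<char, 1024> payload{};
    std::size_t length = 0;
};

// The Packet lives on the heap but is freed when `packet` goes out of
// scope, the same lifetime a stack variable would have. The copy is
// clamped to the payload size, so oversized input is truncated.
std::size_t handle_packet(const std::string& data) {
    const auto packet = std::make_unique<Packet>();
    packet->length = std::min(data.size(), packet->payload.size());
    std::copy_n(data.begin(), packet->length, packet->payload.begin());
    return packet->length;
}
```

The smart pointer takes care of the freeing; the clamp on the copy length is still something you have to write yourself.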
The word on the street among a lot of the hacker types, and I discount it, is that a C++ program with any overflow is easier to exploit than a normal C application, because the virtual function lookup table basically provides an easy one-stop shop where you can get a pointer to almost any other subroutine in the program. It makes sense logically, but I don't know of any exploits that make use of it.
Re:What's the matter with C/C++? (Score:3, Interesting)
Because bugs don't belong to programmers, they belong to code.
Imagine the difference between "I fixed a Linux kernel bug", which earns you much respect from the community, and "I fixed one of Linus Torvalds' bugs" - which is a rather offensive thing to say.
So while the insecurity is the programmer, not the language, we can't blame the programmer. It's simply not acceptable.
Re:Don't use C++ as if it was only "C with classes (Score:4, Interesting)
The difference between pure C/C++ and the STL is that something like strcmp can create a rather subtle sort of buffer overflow error, whereas buffer overflows involving STL containers are generally easier to avoid and detect. For that matter, if you use the STL algorithms library to its full potential, you may find that you hardly ever need to use explicit indexing or iterators other than begin() and end().
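A few small examples of that style, as a sketch. The helper names are made up; the standard pieces are std::string's operator==, std::count_if, std::accumulate, and std::find.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// std::string carries its own length, so comparison cannot run past
// an unterminated buffer the way strcmp can.
bool same_name(const std::string& a, const std::string& b) {
    return a == b;
}

// No index variable anywhere: only begin() and end().
int count_negative(const std::vector<int>& v) {
    return static_cast<int>(
        std::count_if(v.begin(), v.end(), [](const int x) { return x < 0; }));
}

int total(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

bool contains(const std::vector<int>& v, const int x) {
    return std::find(v.begin(), v.end(), x) != v.end();
}
```

With no hand-written index, there is no off-by-one to get wrong in the first place.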
Why would I want to? (Score:3, Interesting)
I say go with The Market, and write the most insecure software you can. Securing your software will only waste your time and decrease your sales.
For every string function (Score:3, Interesting)
The real problems come into play when you're using a third-party library. You can always police your own code, but it's hard to police or fix others' code. Open source libraries are great for this in general, but there's not always an open source solution for connecting to proprietary buses, services, etc.
In the end, solutions that require policing are only as good as policing. Policing is designed to only be effective after some atrocity has been committed, and so policing will likely only be effective after the exploit. A much better solution would prevent use of unbounded string functions by not having them defined. Perhaps there's some compiler magic that could be employed, but I doubt such techniques will gain much traction. It's like asking a guild of master carpenters to switch building materials. Once you know the materials and weaknesses, usually it's better to design around the weaknesses than to change materials.
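On the "compiler magic" idea, one possible sketch. copy_cstr is a hypothetical replacement that forces the caller to state the destination capacity. As far as I know, GCC and Clang also accept a poison pragma that rejects the listed names outright; it is left commented out below because it has to come after every header that mentions those names.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical project-wide replacement for strcpy: the destination
// capacity is a required argument, and the result is always terminated.
// Returns false if the source was missing or had to be truncated.
bool copy_cstr(char* dst, const std::size_t dst_cap, const char* src) {
    if (dst == nullptr || dst_cap == 0) {
        return false;
    }
    if (src == nullptr) {
        dst[0] = '\0';
        return false;
    }
    const std::size_t len = std::strlen(src);
    const std::size_t n = len < dst_cap - 1 ? len : dst_cap - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return n == len;
}

// Assumption: a GCC-compatible compiler. After this directive, any later
// use of the listed identifiers is a compile error.
// #pragma GCC poison strcpy strcat sprintf gets
```

This still counts as policing, just done by the compiler at build time instead of by a reviewer after the fact.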
As a practical real-world example near me, our school system just replaced over a million bricks in a nearby school. The reason was that the new-fangled iodized metal bolts were used (way back when) to bind the bricks to the sub wall. Iodized was new and "hot" and it didn't rust, so the wall should have lasted forever. However, it corrodes when exposed to salt water, and the school was close enough to the sea to be exposed to salt water vapor. The problem was discovered when a worker leaned against a brick wall and it toppled over.
In the end, education will bring the current coders around, but don't expect the problem to go away. There will be many years of people reading antiquated "how to program" books that teach older, less safe practices. There will be people reentering the marketplace who will have missed the newer techniques. There will be users installing from the copy of WinZip (or whatever) that they downloaded in 2000.
Only with time, and a whole lot of patience, will this problem die. It won't be fixed; it will decline until it's barely noticeable.
Re:Correct but irrelevant (Score:2, Interesting)
As for how functional languages handle state: state is held in the parameters a function is called with. The simplest example is recursion. In an imperative program using a for loop or a while loop or something like that, state is stored in a counter variable that gets incremented or decremented or somehow changed each time the loop is run. In recursion, if the ending condition is not met, the function calls itself with slightly differing parameters. The parameters keep track of the state, but unlike in imperative programming, the parameter is not a variable that can be changed once a call is made, so it is impossible to have bugs caused by unexpected or unintentional changes to a variable in the scope of other operations that might change it. FP doesn't permit any declared values to change, so there are no "variables", just constants.
If this makes no sense at all, you'll just have to program a few loops in an imperative language, and a few in a functional language using recursion, and see the difference. It's a lot easier to show interactively than to explain.
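If you'd rather see it than try it, here is a small sketch of the "just constants" point in C++11. The names gcd_loop and gcd_rec are made up. A C++11 constexpr function is restricted to roughly a single return expression, so it ends up written the way a functional program would be, and the compiler can evaluate it before the program ever runs.

```cpp
// Imperative version: a and b are overwritten on every pass.
unsigned gcd_loop(unsigned a, unsigned b) {
    while (b != 0) {
        const unsigned r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// Recursive version: every name is a constant. Each call gets its own
// a and b, and nothing can change them after the call is made.
constexpr unsigned gcd_rec(const unsigned a, const unsigned b) {
    return b == 0 ? a : gcd_rec(b, a % b);
}

// Checked by the compiler, not at run time.
static_assert(gcd_rec(48, 18) == 6, "gcd of 48 and 18 should be 6");
```

In the loop version, a typo that swaps the two assignments compiles fine and gives wrong answers; in the recursive version there is nothing to assign.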