Security

Codebreaking - Taking the First Step? (83 comments)

Master Spy asks: "Here's something that the Slashdot community might be able to help with. If you receive a message in code, how do you take the first step? Back in the days of WWII it was easier. The codebreakers at Bletchley Park already knew that the messages were encoded using an Enigma machine, so all they had to do was work out the positions of the rotors using brain power, the Bombe or later the Colossus machine. American codebreakers also knew the basic details of the methods the Japanese used. Now, however, things are more complicated. Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' How do you know how the message has been encrypted? It could be an Enigma machine, it could have been XOR'd with a second message or a one-time pad, or it could use some form of software encryption such as Blowfish or DES. Before you start ripping the message apart for decoding, how do codebreakers find out what method has been used to encode the message?"
This discussion has been archived. No new comments can be posted.

  • Medium (Score:4, Insightful)

    by peripatetic_bum ( 211859 ) on Thursday February 20, 2003 @06:22PM (#5347652) Homepage Journal
    Let the medium decide how to approach this.
    I.e., if it came over the net, look at the wrappers.
    If over the radio, look at the spectrum.
    It's what's around the message that will break the message.
    • Look at it. (Score:5, Insightful)

      by Glonoinha ( 587375 ) on Thursday February 20, 2003 @06:52PM (#5347869) Journal
      Take a bunch of encoded stuff and simply look at it, watch for patterns over the course of the data as a whole. For a small sample set

      'sdjek dYqkP 1Nt$% GGl9) MHrYD +++'

      this is not going to do you much good, but if you have reams of encoded / encrypted data just stare at it for a while, look at it in a way that you look through it (like those hidden picture things) and after a while you will recognize patterns and have something with which to work.

      There is a fine line between the high quality software engineer and mild autism. Ever watch 'Rain Man' or 'A Beautiful Mind' and think - hey that guy would be a BAD ASS developer ...

      Helps to be able to think in 6+ dimensions when you are cracking codes, and a photographic memory helps too.

      I should probably post this as an AC - last thing I need is the CIA / NSA figuring out what I am capable of :p
  • how (Score:5, Informative)

    by igs ( 260162 ) on Thursday February 20, 2003 @06:28PM (#5347694)
    Well, if this was easy, codebreaking wouldn't be any fun. Don't forget that both the Germans and the Japanese had a variety (tens if not hundreds) of different cyphers in circulation, so it wasn't exactly as simple as assuming it's Enigma or Code Purple.
    As to how it's done, that has to do with analysing the text: frequency analysis of 1-grams, 2-grams, etc. Simple substitution will exhibit one fingerprint (though different languages will obviously be different), something like a Playfair or Vigenère square will have another, and DES-encrypted text a completely different structure. Obviously on a small enough sample there may not be enough information to latch onto...
    But with a larger sample, it's mostly a combination of good tools, experience, and guesswork :)
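The n-gram fingerprinting described above can be sketched roughly as follows. This is a minimal illustration, not anyone's production tooling: the function names and the toy Caesar cipher are made up for the example. It counts n-grams and computes the index of coincidence, the chance that two letters picked at random from the text are the same.

```python
from collections import Counter


def only_letters(text):
    """Return the A-Z letters of text, upper-cased, dropping everything else."""
    return [c for c in text.upper() if "A" <= c <= "Z"]


def ngram_counts(text, n):
    """Count overlapping n-grams over the letters of text."""
    letters = only_letters(text)
    return Counter("".join(letters[i:i + n]) for i in range(len(letters) - n + 1))


def index_of_coincidence(text):
    """Probability that two letters drawn at random from text are equal."""
    letters = only_letters(text)
    total = len(letters)
    if total < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (total * (total - 1))


def caesar(text, shift):
    """Toy monoalphabetic cipher, here only to demonstrate the fingerprint."""
    out = []
    for c in text.upper():
        if "A" <= c <= "Z":
            out.append(chr((ord(c) - 65 + shift) % 26 + 65))
        else:
            out.append(c)
    return "".join(out)
```

A simple substitution only relabels letters, so the index of coincidence of the ciphertext equals that of the plaintext (around 0.067 for English), while uniformly random letters give about 1/26, or roughly 0.038. Polyalphabetic ciphers land somewhere in between, which is one way the fingerprints differ.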
    • I gotta say that a non-halfassed encryption mechanism is going to have a fingerprint that can only be perceived as "random". To do otherwise would mean that the encryption system is really, really weak.
      • Maybe you do "gotta say" that, but you'd be wrong. As we discussed above, if the key length is less than the message length there will be some difference between pure noise and your signal.
        • Maybe you do "gotta say" that, but you'd be wrong.

          Nope.

          As we discussed above, if the key length is less than the message length there will be some difference between pure noise and your signal.

          Yup. But that doesn't matter. Not being random is entirely different from not being random in a manner that we can detect. An encryption algorithm that is *detectably* nonrandom is pretty poor.

          I assure you, you cannot tell the difference between, say, DES-encrypted data and random data.

          That's pretty much definitionally what a good encryption algorithm is.

          Sure, given a computer with infinite resources, any crypto algorithm except one-time pads can be broken. The point is that we don't *have* computers with infinite resources. That's why non-one-time-pad crypto is used.
    • Re:how (Score:3, Informative)

      by ChadN ( 21033 )
      Simply put, any new ad-hoc cipher that was not designed by experts will probably be subject to flaws that could be revealed by statistical tests, assuming a sufficient amount of ciphertext.

      Additionally, any ciphertext that was encrypted by a well-designed cipher (and I'll include DES in this example, despite its relatively small keysize by modern standards) will NOT be much harder to decrypt simply because the cipher is unknown. Even if you had LOTS of ciphertext and tried against every known published cipher, along with billions of variants (i.e., additional rounds for each one, etc.), the extra workload would be modest to minute compared to the work of actually searching the keyspace (and, in the case where there is no known plaintext, analyzing the deciphered text for probable plaintext).

      Most modern protocols go well out of their way to advertise the ciphers used to encode messages, precisely because that bit of information adds no real extra security as long as the key is kept secret (and is well chosen). To not do so would make deploying things a nightmare.

      So I think, in the end, few experts would argue that using the most commonly known good cipher, with a well-chosen key, is any less difficult to 'break' than using an obscure and secret cipher. Especially if that cipher is not one that is also widely deployed as a secure cipher. The real hard work is in finding the key. And the fun work is in finding ad-hoc ciphers that people think are secure because the method is secret.
      • If your new ad-hoc cipher is used few enough times (and only once per key), it may effectively be a one-time pad and share the same level of uncrackability. It's like Paul Revere's "one if by land, and two if by sea". That was a one-time pad and was used once. If they had kept using the system it would have been weak, but the algorithm is secure if the key generation was done properly (like a coin toss). They were only transmitting one bit of data and had a one-bit cipher. Assume a modern counterpart was going to send "land" or "sea" with something like 3DES. If the attacker knows the encrypted stream, the cipher method and the two valid plaintexts, they could attempt to find keys that give those two results and then look at the keys to figure out that one looks like line noise and the other is ASCII "nukem". It would be fairly clear which one was the key. This is a weakness of many modern crypto systems. If you're only sending 10 bits of data, use a 10-bit cipher. Some very well funded e-commerce systems like to use hashes outside of encrypted packets. It turns out that guessing the data and verifying it against the hash is much faster than brute-forcing the crypto.
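The last point, guessing the data and checking it against a hash that travels outside the encrypted packet, can be sketched like this. It is a hedged illustration only: the use of SHA-256, the function name `recover_by_guessing` and the two-word candidate list are assumptions for the example, not details of any real system mentioned in the thread.

```python
import hashlib


def recover_by_guessing(observed_digest, candidates):
    """Return the candidate whose SHA-256 hex digest matches, else None.

    observed_digest stands in for a hash seen in the clear next to the
    ciphertext; candidates is the attacker's list of likely plaintexts.
    """
    for guess in candidates:
        if hashlib.sha256(guess.encode("utf-8")).hexdigest() == observed_digest:
            return guess
    return None


# The attacker never touches the cipher: two hash computations settle it.
leaked = hashlib.sha256(b"sea").hexdigest()
message = recover_by_guessing(leaked, ["land", "sea"])
```

With only two plausible messages the search takes two hash computations, however strong the cipher around the payload is. Real systems blunt this by keying or salting the hash so that a guess cannot be verified without a secret.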
  • The better you know what's out there to use, the better the chance of recognizing what you're up against.
    • by Anonymous Coward
      I agree totally.

      I've been working years on this message sent to me in 1983.

      Please help if you can.

      Here is the message with quotes:

      'T'

      Thanks
  • by orthogonal ( 588627 ) on Thursday February 20, 2003 @06:33PM (#5347723) Journal
    Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++'

    Yeah, 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' showed up on my SETI@home screen too.

    This is clearly the signature of the Grays from Cygnus Prime. You don't want to communicate with them.

    The Grays of Cygnus Prime are evil. They will put chips in your head.

    They will use the chips to make you do bad things. Like posting to Slashdot.
  • Do the statistical analysis on the encrypted data, in several ways. If all you get is a seemingly random stream of data with an equal distribution of all values, then you've got a raw stream encrypted by a modern, quite strong cipher.

    Good luck ;)
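One way to check for "equal distribution of all values" is to measure the Shannon entropy of the byte stream. The sketch below is a minimal example with a made-up function name; it is one of several statistical tests you would want, not a complete randomness test.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum -p * log2(p) over the observed byte values.
    return -sum((k / total) * math.log2(k / total) for k in Counter(data).values())
```

Output from a strong cipher should sit very close to 8 bits per byte on a large sample, while plain text sits well below that. Sample size matters: a sample of N bytes can never score above log2(N), so the 28-character message from the question tops out at about 4.8 bits per byte whatever produced it.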
  • Step One: (Score:4, Informative)

    by Jerf ( 17166 ) on Thursday February 20, 2003 @06:36PM (#5347749) Journal
    Step One: "Acquire more samples."

    When you have less data than a smallish key (and that message has no more than 28 * 8 = 224 bits, probably much less), the data can (most likely) decrypt to anything at all with the proper key. If that's all you really have, then you need to pursue non-code-breaking methods of finding out what that is.

    And of course what to do next depends on the characteristics of that additional data. A lot of cryptanalysis assumes you have knowledge of the encryption method; this is because it's "easy" to obtain by reading code, but "easy" is a relative term. It's easier than just guessing, but still hard. Without knowledge of an algorithm, you need to luck out and hope they used one with a distinct signature. If they didn't, you're probably basically out of luck on a single person's resources, because all of the "good" algorithms should be effectively indistinguishable from noise after encryption.
  • by Dot.Sig ( 167512 ) on Thursday February 20, 2003 @06:36PM (#5347754)
    "Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++'"

    I transmit a polite reply saying, "No, I am NOT interested in being your love monkey no matter how much you lust after me." Gee whiz! The things people say when they think nobody is listening!

  • by j.e.hahn ( 1014 ) on Thursday February 20, 2003 @06:37PM (#5347763)
    Code breaking is hard by its very nature. You're trying to find an unknown message by inverting (or short-circuiting) an unknown process.

    If you think of things mathematically, you're looking to find a plaintext p in the set of all possible plaintexts P and some function f from the set of all ciphertexts to the set of all plaintexts where f(c)=p. This means both f and p are unknown, and while multiple solutions may exist they are likely of "measure zero" in 2 very large spaces. (Let's assume we have a suitable measure for such things, and not worry about the real details.)

    To a mathematician, finding a general solution to the above would be a Fields Medal-winning sort of thing. The reality is that you need more information. If you got a large message you should start checking letter/symbol counts, followed by the counts of various character pairings, etc. The goal is often to come up with a statistical model to see if you can build a plausible f. Another thing is to try common functions (xor with various values, etc.) on the stream and see what happens. Sometimes that'll give you a clue. But most of the time it involves a little luck, a little intuition and a lot of perseverance.
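"Try xor with various values and see what happens" can be sketched as follows. This is a toy under loud assumptions: the plaintext is English-like ASCII, the key is a single repeated byte, and the scoring function (counting common letters and spaces) is a crude stand-in for a real statistical model. All names here are invented for the example.

```python
# Bytes that dominate ordinary English text; used as a crude scoring model.
COMMON = b"etaoinshrdlu ETAOINSHRDLU"


def xor_bytes(data, key):
    """XOR every byte of data with the single-byte key."""
    return bytes(b ^ key for b in data)


def english_score(candidate):
    """Count how many bytes of candidate are common English characters."""
    return sum(1 for b in candidate if b in COMMON)


def crack_single_byte_xor(ciphertext):
    """Try all 256 keys and return (key, plaintext) with the best score."""
    best = max(range(256), key=lambda k: english_score(xor_bytes(ciphertext, k)))
    return best, xor_bytes(ciphertext, best)


secret = b"the rain in spain stays mainly in the plain"
key, recovered = crack_single_byte_xor(xor_bytes(secret, 0x5A))
```

The same loop structure carries over to other "common functions": generate every candidate transformation, score each result against a model of what plaintext looks like, and keep the best. It only gives a clue when the guess about the plaintext's language and encoding is right.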

  • by Anonymous Coward on Thursday February 20, 2003 @06:41PM (#5347789)
    1. Get some books. Schneier's book is the best starting point.

    2. Learn statistics (and basic number theory). You can discover a lot about a message by its statistical properties.

    3. BREAK LOTS OF CODES. Without experience, you are lost. Start by breaking substitution and Caesar ciphers (easy with statistics), then Vigenere/Gronsfeld ciphers (harder but still "crypto for dummies"), then try XOR ciphers (they can be solved easily in an interesting way)... then try to understand how WEP is broken... DeCSS .. move up the scale until you can understand the way more sophisticated codes are broken (for instance differential cryptanalysis). It gets harder at this point and well outside the realm of practicality but if you get this far, you will be able to break any cryptographically weak cipher (which includes the products of many companies, unfortunately).

    4. If you become advanced enough, you can start reading papers on cryptanalysis. Many of them are surprisingly easy to understand once you understand number theory. However, it is much more difficult to *discover* some of the stuff these guys come up with, it's pretty amazing.

    Anyway, to summarize, understand the statistics involved and PRACTICE until you can just look at a substitution cipher and understand what it says... just by the letter frequencies! If you are trying to break a simple code you need lots of ciphertext to analyze.

    And don't forget: sometimes you don't need to break a code at all. As a poster above wrote, sometimes context is enough. Sometimes an external clue will give the code away. How do you know what to look for? Experience!
    • by Garin ( 26873 ) on Thursday February 20, 2003 @09:19PM (#5348900)
      Actually, I disagree. I don't think Schneier's book is the best place to start. It's a fine book, no doubt, but it says very little about real cryptology from a theoretical standpoint, or from the point of view of teaching you to develop or break codes.

      If you're a math god, start with the Handbook of Applied Cryptography [uwaterloo.ca] by Menezes, van Oorschot, and Vanstone.

      If your math isn't quite as godly, start with Thomas Barr's "Invitation to Cryptology". It's an excellent starter book for anyone with even a little bit of mathematical skill. You really don't need much but some high school math, maybe a bit of first-year algebra and stuff, and a willingness to do the chapter problems.
      • I don't think Schneier's book is the best place to start. It's a fine book, no doubt, but it says very little about real cryptology from a theoretical standpoint, or from the point of view of teaching you to develop or break codes.

        Uh, are we all talking about the same "Schneier's Book"? Applied Cryptography is exactly about real cryptology, etc. Are you referring to Secrets and Lies, perhaps?

        • No, "Applied Cryptography" is what I'm talking about. It is, as its title suggests, about -applying- cryptography. It teaches you exactly how to implement the algorithms in code, and it teaches you how to use those algorithms in every day situations.

          He gives some insight into the workings of the algorithms, by eg. explaining how the S-Boxes and stuff work. But it doesn't really teach you how to do cryptanalysis. And it doesn't really teach you how to make your own algorithms.

          The books I've suggested are less practically-oriented. The Handbook goes into the mathematics of it all quite nicely. It's a good introduction to the field, and those who can get through it will have a very solid background for more advanced study. The Invitation book is aimed at maybe second-year students or first year students with some knowledge of linear algebra. It gives you a taste of what cryptology is like, and starts you off on the right path early in your education.
  • Enlist in the NSA and enroll in the National Cryptologic School.
  • This is a question that probably takes a CS PhD to be able to answer. Different encryption schemes have different susceptibilities to different attacks. For example, if you are using a one-time pad stream cipher with a pad that has never been used before, you are totally SOL as an attacker. It isn't breakable without that pad. Period. If you are using some of the more sophisticated ciphers that have short keys (block ciphers), then there are sophisticated statistical analyses that can be performed to determine the likely method being used.

    What you are referring to, however, is a situation where you don't know the encryption method. This is extra security through obscurity, which we know doesn't work very well. Many encryption schemes are very, very good, and you won't be able to attack them easily even with knowledge of what they are. Usually, for example, you need to know a bit of the message, in addition to the cipher, to be able to break it. For example, a bunch of emails may start with "From: xxxxx." If you have a lot of emails, encrypted similarly, you may be able to mount a reasonable attack, depending on the method used.

    -Sean

  • Reading Material (Score:4, Informative)

    by dhwang ( 93406 ) on Thursday February 20, 2003 @06:51PM (#5347855)

    If you are interested, I would suggest that you start by reading The Code Book [simonsingh.net] by Simon Singh [simonsingh.net]. It gives a good overview of the history of the battle between cryptography and cryptanalysis, and how ciphers have evolved to defeat methods of codebreaking. It's an interesting and entertaining read and you might gain some insight on how you would approach this particular cipher.

    BTW, I have a truly marvellous solution to your cipher which this textarea is too small to contain.

  • It also helps if you have a basic idea of what's encrypted, i.e. what kind of plaintext message you're dealing with. A .doc has a different signature than a jpeg or flat ascii or html, etc. some encryption software relies on headers or footers attached to the encrypted data in order to sanity check for decryption. again, look also at the medium that the message is transmitted through -- tcp/ip traffic to port 443 speaks volumes about what algorithms are being used. transmissions received in the 2.4 GHz band also speak volumes about what algorithms you may be dealing with. finally, never trust the developer to do the 'right thing' with algorithmic selection -- look at adobe's algorithm selection for its ebooks. look for a pattern in what you're dealing with. it can't hurt to generate a dictionary of known ciphertext file patterns a la the *nix 'file' command. lacking a certain amount of information about what you're dealing with (message length, source of the ciphertext, etc), though, you're SOL.

    anyway, I haven't had to deal with much of the kind of encryption that protects data from a government, mostly just the kind that delays your kid sister, so ymmvg...
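A dictionary of known file patterns "a la the *nix 'file' command" might start out like the sketch below. The table is a small hand-picked sample of well-known leading byte signatures, nowhere near what `file` actually ships, and the `identify` helper is a made-up name for the example.

```python
# (leading bytes, description) pairs; a tiny sample, not an exhaustive list.
MAGIC_SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x1f\x8b", "gzip data"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy .doc/.xls)"),
    (b"Salted__", "OpenSSL 'enc' output with salt"),
    (b"-----BEGIN PGP MESSAGE-----", "ASCII-armored OpenPGP message"),
]


def identify(data):
    """Return a description if data starts with a known signature, else None."""
    for magic, label in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return label
    return None
```

The last two entries are the interesting ones for this thread: some encryption tools announce themselves in a header, which tells you the tool (and so the likely algorithms) before you have done any cryptanalysis. A bare ciphertext with no wrapper matches nothing.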

  • Simple (Score:5, Funny)

    by splattertrousers ( 35245 ) on Thursday February 20, 2003 @06:53PM (#5347874) Homepage
    Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' How do you know how the message has been encrypted?

    First you djc,s dk%33R +++ (110), then you sD##N KDL:: Ds03k -332+. From there, it's a trivial matter of just 3!Wop mclDI a002g a!22# with the sklj3 V3iia aq@@1 +1867 -5309.

    Duh.

    • Re:Simple (Score:2, Interesting)

      by burns210 ( 572621 )
      heh, I found the perfect article [jefraskin.com] that deals with this topic. Suppose aliens find the probe we sent out in the 70's. They see the message we put on the probe and they try to decipher it, but how? It is a very interesting read.

      "what does this say?"

      qv7qrc77qrrx777qrrrs7777qrrrrg77777qrrrrrbv7bqrbck bqqbvkbqrgkbqhskbqpckbqbqvmr rcmhhgmjjbgmppyctbqbivayrpga7bbhjbqbxmawhhbqawwqx7 77kbqbrjhrbvaprkatrkhrca aamwhmwhwhwhwhwhrapkqrmpkqc7bhbhwgawiiqwiqbv

  • You start with known encryption methods (simplest first) and by process of elimination you keep going until you get a clue. A good cryptologist has information about everything from Bacon's method to the most recent ciphers and their algorithms.

    The folks who cracked the Enigma started the same way. The Polish started the process, sent info to England where it was completed.

    A code fragment that short, though, would be darned impossible to crack unless you get more.

    But you can already see patterns: word length, multiple "+" characters (maybe an indicator of end-of-phrase or something?).

    But that's -- basically -- how you do it. Educated guesses and grunt work (either by you or computer). Unless it's Quantum encryption which is spoiled as soon as you intercept, so you can't decode it.

    Check out The Code Book [amazon.com] for some great -- albeit basic -- information about methodology and history.
    • Actually the Brits (or anyone else) did not CRACK the code. They captured several pieces of equipment and KEY books.
      They just used the monster computers to EMULATE the 12-rotor Enigmas that they did not have, and some other higher-level machines.

      As far as I know they only had 3- and 4-rotor machines, and they used the computer to emulate the bigger, badder Enigmas....

  • Homeland Security (Score:3, Insightful)

    by lostindenver ( 53192 ) on Thursday February 20, 2003 @06:57PM (#5347908)
    If You are an american even posting this is probably a violation of some 4 letter acronym or terrorist prevention law. I mean why not just ask how to make a model rocket why dont ya.

  • ... but I think a time-honored tradition of inside knowledge would be applicable here. In short: spies. I know it's not as technical an answer as most were hoping for, but I would conclude that more information has been provided (including information on how to break encryption and decode messages, and what the "enemy" is using to encrypt messages) through secret operations and spying than anything else.

    Think about the machines they used to decode messages in WWII... the patterns came from somewhere. I think the same applies today to a degree.
  • by rjh ( 40933 ) <rjh@sixdemonbag.org> on Thursday February 20, 2003 @07:03PM (#5347952)
    Stop thinking about the encrypted bits. Start thinking about who sent these bits and who these bits were sent to. Think about the application which created the data. Think about what purpose the data is going to be used.

    Once you have this information, you'll be much better equipped to figure out what the basic structure underpinning the cipher is. For instance, if the data is part of a realtime encrypted stream, I'd think "stream cipher" and look at RC4 or SEAL. If the data's part of a pen-and-paper arrangement with all values mod 26, I'd think "Solitaire". If the data's a pen-and-paper arrangement meant for communicating between two deep-cover espionage agents, I'd think "one-time pad". If the data's something pulled off a disk drive, I'd think of Matt Blaze's ECB+OFB algorithm. Etc.

    What it boils down to is, this question is pretty arbitrary. Very rarely will you have no metainformation about the plaintext. Seek out as much metainformation as you can, and use the metainformation to make educated guesses, cribs, etc.
  • Code breaking (Score:3, Interesting)

    by crmartin ( 98227 ) on Thursday February 20, 2003 @07:08PM (#5347989)
    Here's a partial answer:

    (1) there is always the possibility that you simply won't. In fact, a properly used one time pad cipher is indistinguishable from noise. It's also a major pain in the ass to use, because you must somehow transmit as many bits of key as you want to send bits of message, and your one-time pad is only as good as your method of transmitting the key.

    (2) If there is some kind of message in the signal and a cipher is involved other than a one-time pad or something isomorphic to one, then there will be some degree of redundancy in it. This is a theorem of information theory. Statistical measures will eventually reveal that the redundancy exists.

    (3) At that point, there are lots of approaches. A good, readable and interesting introduction to these, along with the history of such things, is David Kahn's The Codebreakers. [amazon.com] Bruce Schneier's Applied Cryptography [amazon.com] is a good, more technical introduction for the computer geek. I've also heard good things about the Handbook of Applied Cryptography [amazon.com], but I don't actually know the book.

    But as someone notes above, it's an inherently hard problem to simply identify the cipher, and modern ciphers like RSA are, as far as we know, computationally intractable because the only known attack requires factoring a very large prime number.

    (4) You give up and hire a pretty young woman to talk the marine guards into letting you at the code room. (Details of this approach are left as an exercise for the interested reader.) Sometimes the old fashioned ways are best.
    • > ciphers like RSA are, as far as we know,
      > computationally intractable because the only known
      > attack requires factoring a very large prime
      > number.
      I believe you meant to say breaking RSA requires factoring a very large *composite* number.
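To make the composite-versus-prime point concrete, here is a toy sketch with a textbook-sized modulus (n = 61 * 53 = 3233, e = 17). The numbers and helper names are illustrative assumptions only; real RSA moduli are far too large for trial division, which is the whole point.

```python
def factor_semiprime(n):
    """Find the two prime factors of a small semiprime by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    raise ValueError("n has no nontrivial factor (it is prime or 1)")


def private_exponent(e, p, q):
    """Recover the RSA private exponent once the factors p and q are known."""
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse; needs Python 3.8 or later


n, e = 3233, 17                     # toy public key
p, q = factor_semiprime(n)          # the step that is infeasible at real sizes
d = private_exponent(e, p, q)
ciphertext = pow(65, e, n)          # encrypt the message 65
plaintext = pow(ciphertext, d, n)   # decrypt with the recovered key
```

A prime modulus would have nothing to factor, and its totient would be trivially n - 1, so anyone could compute the private exponent. The security rests on n being a product of two large primes that nobody else can separate.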

    • you mentioned information theory-- that sparks a thought: don't data compression algorithms level out some of the statistical irregularities in a message? maybe that might help in making it harder to crack. (transmitting the code table obviously introduces some security problems, but lets forget that for a second :-) )

      illuminate me. thanks.
      • Correct. Since compression algorithms effectively decrease the correlation of the original, the result resembles 'noise', or a random bit stream. You see, if there is still correlation in the original, it can be compressed further.
        Note though that compressed files often have 'standard' headers or other sections, and that is very vulnerable to attack.
      • Absolutely -- in fact It Is A Theorem that "perfect compression" would result in something that looked absolutely "white", because a "perfectly compressed" signal would have no redundancy. (Easily proven by contradiction: if there were any redundancy, the signal could be compressed further by eliminating it.)

        Similarly, an encrypted signal should appear to have very little redundancy, because the closer to "white" it looks, the less statistical information there is to attack.

        Your point about the code table is also good. We tend to forget as computer geeks that there is another kind of encryption besides a cipher, which is a good old fashioned code. That is, the construction of an ARBITRARY table of correspondences between some random code group and some desired message. This would be like ...
        ...

        28451 Run for your life, they found us!
        28452 My cat really likes our new down comforter.
        28453 When in the course of human events, it becomes necessary for one people ...
        ...

        (It used to be in Europe that you could tune across the shortwave and find a voice reading five-digit code groups out in German for just hours. I don't know that I ever heard authoritatively what that was, but it's odds-on that it was traffic from the East German Stasi or from the Soviets to "moles".)

        Obviously, a real code has the possibility of very great compression ... but at the cost of not being able to transmit just any message. Real code books solve this by having codes for letters and digits, like ...
        ...

        38451 Alpha
        38452 Bravo
        38453 Charlie
        ...
        ... which makes it possible to laboriously spell out any message, but means that an unplanned message is very definitely NOT compressed. It is also a theorem that there exists a shortest compressed version of any message (the exact length depends on some assumptions, but the assumptions account at most for a constant factor.)

        If you were to, say, Huffman-compress a message and could transmit the code table separately (and securely) the result would be pretty hard to read directly. (That still wouldn't make a good code, because it would be too easy to use statistical information about letter distribution in English to work backwards.)
  • by GuyMannDude ( 574364 ) on Thursday February 20, 2003 @07:09PM (#5348011) Journal

    If you receive a message in code how do you take the first step?

    You do what everyone else here does when they come across a problem that may or may not fall under the category "News for Nerds. Stuff that Matters": you submit it to Ask Slashdot, of course! Don't worry: they'll print it. They'll print anything and it doesn't even have to be in the form of a question!

    GMD

  • by 3-State Bit ( 225583 ) on Thursday February 20, 2003 @07:10PM (#5348015)
    The only, only thing you can expect to learn is who's communicating with whom [and when / how much information is exchanged] ( you probably know this already ) , and what protocol they're using ( it's probably unbreakable ).

    Chances are, if you are intercepting an encrypted stream, you are intercepting an unbreakably encrypted stream.

    Perhaps you are thinking that if only you knew what protocol the stream is using, you might look online and see if that protocol has been cracked.

    Don't waste your time.

    The chances are approximately 0 that the stream you are intercepting is using a protocol that has been cracked, or that it is using a keyspace you can brute-force for under a few hundred thousand dollars, or in under a matter of years.

    Sorry -- you have a higher chance (almost infinitely higher -- as I said, the chance you will succeed in what you are asking to do is approximately 0) of port-scanning the machine at the source or the destination and 0wning it than you do of breaking the stream.

    I don't say this to mean you should give up -- just that you're phrasing your question wrong. Don't discount the 0wning venue of attack.

    For every million desktop machines communicating over TCP/IP, only a matter of a few dozen will have 0 exploitable security weaknesses. (However, most security weaknesses are unknown.)

    Find out what kind of machine is at the source and the destination, then 0wn one of them. Chances are almost overwhelming that it's possible, if not with a remote exploit, then through social engineering. (Send an attachment that will be opened on either end of the communication, or induce either end to visit a web page in a browser that is exploitable (=, basically, every browser except Lynx).)

    If they browse with Netscape or Internet Explorer, chances are almost overwhelming that they can be owned.

    It's not that hard to get someone to browse to a certain page, if you know anything at all about who that person is.

    Back to your original question: gone are the days that protocols were breakable by any hotshot think tank. Today only implementations are, and rarely at the level you're trying to address. Don't break the code -- break into the system.

    Hope this helps.
  • by mbstone ( 457308 ) on Thursday February 20, 2003 @07:41PM (#5348274)
    You social-engineer the NSA or other TLA with teraflop codebreaking computer-capability into helping you crack the message. For example, consider the following method used by an Idahoan to get his potato field plowed:

    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>& gt ;
    An old man lived alone in southern Idaho. It was early spring, and he wanted to spade a garden plot to prepare it for planting potatoes. But it was very hard work and he just didn't have the energy.

    You see, his only son, who would have helped him, was in prison.

    The old man wrote a letter to his son and mentioned his predicament.

    A week later, he received a note back, which said, "For heaven's sake, Dad, whatever you do, don't dig up that section of the garden! That's where I buried the GUNS!"

    The next morning, bright and early, a dozen police showed up and dug up the entire garden, without finding any guns.

    Looking out the kitchen window, the old man thought "Now, what in the world is going on here?" Confused, he wrote another letter to his son telling him what happened, and asked him for advice.

    Another week passed and his son's reply arrived in the mailbox. The old man carried the letter up to the house, sat down at the kitchen table and read, "Now plant your potatoes, Dad. It's the best I could do for you under the circumstances."

  • khfs hskdf woeiyr ngusdt [mediagab.com] lsdfhuyttr *^+hgf 1khh^ [mediagab.com] jshdf +++
  • by Emrikol ( 21551 ) <emrikol&decarbonated,org> on Thursday February 20, 2003 @07:45PM (#5348300) Homepage
    sdjek dYqkP 1Nt$% GGl9) MHrYD +++


    NO CARRIER


    Damn line noise...


    Good old memories!

  • by stefanlasiewski ( 63134 ) <slashdotNO@SPAMstefanco.com> on Thursday February 20, 2003 @07:47PM (#5348321) Homepage Journal
    Back in the days of WWII it was easier. The codebreakers at Bletchley Park already knew that the messages were encoded using an Enigma machine so all they had to do was work out the positions of the rotors using brain power, the Bombe or later the Colossus machine.

    I think you're simplifying these first steps too much.

    When the UK first intercepted a message like 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' , they had little idea what it was. Encryption? Only part of a transmission?

    It took them months (years?) to work through the encryption system. In the beginning, they didn't even know there was an Enigma machine. They broke the code by brute force: many people trying many different methods. If the Nazis changed the key, you had to start all over from the beginning.

    It took a long time for the codebreakers to figure out that there even was an Enigma machine, find a machine and figure out how it worked. It took time and effort, and people died retrieving the machine.

    Once the codebreakers had worked with Enigma for a time, it sometimes became really simple to determine whether a transmission was Enigma (or other encryption) code.

    If it wasn't cleartext, it was code. You knew certain things about the code: The key was transmitted first, the key was 6 characters long.

    Some code had fingerprints: One guy always transmitted 'HIT#@!', where #@! = 'LER', and so you used "HITLER" to break the rest of the code. Someone else always used a German woman's name (maybe his girlfriend's) and sent "GRE$$!" where $$! = "TTA", so "GRETTA".

    So take a step back: if you can't determine the nature of the code you are seeing, it will be very hard to crack.
    • Actually they knew about Enigma right from the start of the war. The Enigma wasn't much of a secret: a commercial version was available to civilians long before the Nazis adopted it. Plus the Poles had been breaking Enigma for a few years before WWII, and they turned over all their intelligence to the Brits shortly before Germany attacked. The Poles figured out the details of the military Enigma through good old-fashioned espionage.

      And they didn't break the codes by brute force. They exploited both misuse of the Enigma and flaws in the machine's design. If they had to brute force it, the war would've been long over by the time they were done.

      The Code Book, by Simon Singh has more details on Enigma and the code breakers at Bletchley Park.

      The point is this: you are much better off getting the information from another source rather than analyzing the data. And, your best chance of cracking a code lies in human error: flawed use of crypto.
      • Actually they knew about Enigma right from the start of the war.

        But the UK was trying to break German code long before war broke out, before the Poles provided their intelligence. I think it was clear to some within the UK military that war with Germany was inevitable.

        But I'll have to read that book by Singh, I keep hearing good reviews.
  • The look of the encrypted message gives a good idea of the method it was encoded with. Examples include:

    * Character range (alphanumeric or other)
    * Length
    * Special characters that show up often in the ciphertext
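
    A minimal sketch of collecting those surface features, applied to the sample from the question. The function name profile and the choice of character classes are my own assumptions for illustration, not a standard tool.

```python
from collections import Counter
import string


def profile(ciphertext: str) -> dict:
    """Summarise surface features: length, character classes, grouping, specials."""
    classes = Counter()
    for ch in ciphertext:
        if ch in string.ascii_lowercase:
            classes["lower"] += 1
        elif ch in string.ascii_uppercase:
            classes["upper"] += 1
        elif ch in string.digits:
            classes["digit"] += 1
        elif ch.isspace():
            classes["space"] += 1
        else:
            classes["other"] += 1
    return {
        "length": len(ciphertext),
        "classes": dict(classes),
        # Lengths of the whitespace-separated groups (five-letter groups stand out).
        "group_lengths": [len(group) for group in ciphertext.split()],
        # Distinct characters that are neither alphanumeric nor whitespace.
        "specials": sorted({c for c in ciphertext if not c.isalnum() and not c.isspace()}),
    }


sample = "sdjek dYqkP 1Nt$% GGl9) MHrYD +++"
report = profile(sample)
for name, value in report.items():
    print(name, value)
```

    For this sample the mix of upper case, lower case, digits and punctuation already rules out a plain A-Z letter cipher such as Enigma, which is about as much as 33 characters can tell you.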

    It takes background: if you have seen the type before, you can distinguish it a bit.

    And then there is the context the code is brought in.

    Anyway, code breaking is usually used for malicious stuff nowadays I guess.
  • As others have pointed out, the way this is done in practice is by looking at who sent the message to whom and the circumstances around it.

    That said, it's worth pointing out that cryptographers take a keen interest in the more academic form of this question, and their defined criterion is that a cipher is only good if it's not feasible to distinguish ciphertext from uniformly distributed random data. Stated slightly more formally: if you can find a way to distinguish ciphertext from random data with probability p > .5 (keeping in mind that guessing at random will make you right half of the time, assuming half of the messages you're presented are ciphertext and half are random) then the cipher is considered broken. This means that even if you can only correctly pick out every one-millionth ciphertext, and you have no clue what that message is, or what key was used, the cipher is still "broken".
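
    To make the guessing game concrete, here is a toy sketch of it. Everything in it is an assumption for illustration: the "cipher" is a deliberately weak single-byte XOR, and the names toy_encrypt, distinguisher and play are made up. The distinguisher never recovers the key or the message, it only notices that every ciphertext byte shares the same top bit, yet it wins far more than half the time, which is exactly what "broken" means here.

```python
import secrets


def toy_encrypt(plaintext: bytes, key: int) -> bytes:
    """Deliberately weak toy cipher: XOR every byte with one key byte."""
    return bytes(b ^ key for b in plaintext)


def distinguisher(data: bytes) -> bool:
    """Guess 'ciphertext' when every byte has the same top bit.

    ASCII plaintext has a zero top bit, so XOR with a single key byte gives
    every ciphertext byte the key's top bit. Random bytes rarely all agree.
    """
    return len({b >> 7 for b in data}) == 1


def play(rounds: int = 2000) -> float:
    """Run the game: half ciphertext, half random. Return the fraction guessed right."""
    message = b"attack at noon, bring the potatoes"
    correct = 0
    for _ in range(rounds):
        is_cipher = secrets.randbelow(2) == 1
        if is_cipher:
            data = toy_encrypt(message, secrets.randbelow(256))
        else:
            data = secrets.token_bytes(len(message))
        if distinguisher(data) == is_cipher:
            correct += 1
    return correct / rounds


success_rate = play()
print("fraction of correct guesses:", success_rate)
```

    A good cipher is one where nobody knows a feasible distinguisher that does measurably better than a coin flip.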

    • Um, huh? Can you reference this for the rest of us? I'm admittedly a little behind the curve on this stuff, but if it's true that MOST ciphertext is that "white" it would let me get at another problem I'm interested in.

      ... on the other hand, it sounds kind of unlikely for information-theoretic reasons, unless there are some additional constraints on key size and message lengths before the key is changed.

      • Can you reference this for the rest of us?

        I could probably dig something up, but let's see if this doesn't address your confusion first: In my "informal" statement I mentioned that the distinguisher must be "feasible", rather than "possible", but in my "slightly more formal" statement I neglected to make this clear.

        it sounds kind of unlikely for information-theoretic reasons

        It's impossible for information-theoretic reasons. Once you get past the unicity distance, there's always a way to distinguish random data from ciphertext: try decrypting with every possible key and see if one of them gives you recognizable plaintext, right?

        However, this brute force attack is infeasible if the keys are sufficiently large and if there are no other weaknesses. A publication of a more efficient algorithm that can distinguish ciphertext from random data with p > .5 would be considered a break.

        If this still doesn't make sense, respond and I'll try to dig up some references to papers that have used this concept.
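
        A rough sketch of that "try every key" distinguisher, under loud assumptions: the cipher is a toy single-byte XOR with only 256 keys, and "recognizable plaintext" is approximated by "all bytes are printable ASCII". The names xor_with, looks_like_text and exhaustive_search are hypothetical.

```python
import secrets
import string

# Byte values counted as "recognizable": printable ASCII including whitespace.
PRINTABLE = set(string.printable.encode("ascii"))


def xor_with(data: bytes, key: int) -> bytes:
    """Toy cipher: XOR every byte with one key byte (encrypt and decrypt are the same)."""
    return bytes(b ^ key for b in data)


def looks_like_text(candidate: bytes) -> bool:
    """Crude plaintext test: every byte is printable ASCII."""
    return all(b in PRINTABLE for b in candidate)


def exhaustive_search(data: bytes) -> list:
    """Return every key in the 8-bit keyspace that yields a plausible plaintext."""
    return [key for key in range(256) if looks_like_text(xor_with(data, key))]


secret_key = 0x5A
ciphertext = xor_with(b"Now plant your potatoes, Dad.", secret_key)

keys_for_cipher = exhaustive_search(ciphertext)
keys_for_random = exhaustive_search(secrets.token_bytes(len(ciphertext)))

print("candidate keys for ciphertext:", keys_for_cipher)
print("candidate keys for random data:", keys_for_random)
```

        The printable check is crude enough that a few wrong keys can also pass on a message this short, which is the unicity distance showing up in miniature. With a 128-bit key the same loop would need 2^128 iterations, and that is the whole difference between "possible" and "feasible".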

  • Hrmph.. (Score:4, Funny)

    by penguin_punk ( 66721 ) on Thursday February 20, 2003 @08:22PM (#5348529) Journal
    Thanks Cliff, now everyone knows Osama's slashdot nick.
  • by pizza_milkshake ( 580452 ) on Thursday February 20, 2003 @08:50PM (#5348700)
    sdjek dYqkP 1Nt$% GGl9) MHrYD +++

    this is obviously perl code

  • 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' decrypts to,

    'I send you this file in order to have your advice'
  • sdjek dYqkP 1Nt$% GGl9) MHrYD +++

    translates to

    "Enlarge your penis in 5 easy steps!"

    How did I get this? I think it's because that's the e-mail I get most nowadays. So it's most likely to be that. QED
  • ...visit your local library. :) No really, I recommend checking out the National Cryptologic Museum just off of 295 in Maryland - http://www.nsa.gov/museum/index.html

    - RR
  • It's garbage (Score:5, Insightful)

    by lkaos ( 187507 ) <anthony@codemonk ... s minus math_god> on Friday February 21, 2003 @03:25AM (#5350685) Homepage Journal
    sdjek dYqkP 1Nt$% GGl9) MHrYD +++

    Two things give it away:

    The spaces are too regular. You'd be quite hard pressed to form a coherent sentence with any character occurring at every 5n-th position.

    So then perhaps the spaces are irrelevant. Then the next questionable aspect is the last three +++'s. Now, if your code didn't at least work in groups of three, the mathematical likelihood of three +++ occurring would be small.

    So then, what would make most sense is some kind of consistent bit manipulation, at least in cycles of three characters. But then the double GGs and unique characters (%$) make that unlikely too.

    So what makes the most sense? Just random typing.

    Look at the first set of characters:

    sdjek

    Just type it a few times... It's quite natural. You might as well have used asdf (I bet your typing style isn't perfect... you probably favor your right hand).

    If you examine each other character grouping, you'll see that none of them are very hard to reach.

    Also, it gets the KISS approval, which in most circumstances is the winning vote.
    • Now, Pinky, with this new encryption scheme that deliberately resembles random typing, we shall take over the world!
    • Beg to differ.

      > The spaces are too regular. You'd be quite hard pressed to form a coherent sentence with any character occurring at every 5n-th position.

      Dividing into groups of five characters is a classic technique for simple substitution ciphers - it prevents someone guessing that 'buubdl bu oppo' is 'attack at noon'.
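
      A small sketch of why the regrouping helps, using the example above (a shift of one letter). The helper names caesar and group_by_five are mine, and a real substitution cipher would use a scrambled alphabet rather than a plain shift.

```python
import string


def caesar(text: str, shift: int) -> str:
    """Shift each letter forward by `shift` places; leave spaces and punctuation alone."""
    alphabet = string.ascii_lowercase
    shifted = alphabet[shift % 26:] + alphabet[:shift % 26]
    return text.lower().translate(str.maketrans(alphabet, shifted))


def group_by_five(text: str) -> str:
    """Drop everything but letters, then re-space into five-letter groups."""
    letters = "".join(ch for ch in text if ch.isalpha())
    return " ".join(letters[i:i + 5] for i in range(0, len(letters), 5))


plain = "attack at noon"
with_word_breaks = caesar(plain, 1)
grouped = group_by_five(with_word_breaks)

print(with_word_breaks)  # buubdl bu oppo
print(grouped)           # buubd lbuop po
```

      With the original word breaks, the two-letter word and the o-pp-o pattern are free hints. After regrouping, the word lengths are gone.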

      > Then the next questionable aspect is the last three +++'s.

      Could mean end of line, end of message, anything. Just because it's unlikely to come up at random doesn't mean that it couldn't have been put there.

      Here, here's some practice for you in recognizing 'random' typing vs. ciphertext:

      * wked ik sir ewjsk ao e dkso slo rjdic s akkdo

      * narfs lrcqy athba qabnk opnuu irbpw enfui xlrip pesji ilouh +++

      One of these means something. The other... doesn't.
  • by fiffilinus ( 45513 ) on Friday February 21, 2003 @06:15AM (#5351156) Homepage
    A book titled 'System Identification And Key-Clustering', by Dr. I. J. Kumar is available from Aegean Park Press [aegeanparkpress.com]. It deals with defining a methodology for identifying cryptosystems and narrowing the key space applicable for a given message. This is quite what you want, but be warned - it is not for the faint of heart...
  • How to start. (Score:3, Informative)

    by Eivind ( 15695 ) <eivindorama@gmail.com> on Friday February 21, 2003 @07:37AM (#5351344) Homepage
    1. Try to get more text coded with the same cryptosystem (and preferably the same key). Cracking anything based on 25 bytes of ciphertext is going to be hard.
    2. Look for statistics. Run character-statistics. Do they look like normal text, only with different symbols ? If so you have a monoalphabetic substitution-cipher, crackable in 5 seconds by a computer or 5 minutes by hand. Repeat for digraphs or trigraphs. Any result different from "all combinations equally likely" (or close) gives you a hint.
    3. Try to xor the text with a copy of itself shifted various places left and right. Observe how many nulls you get with various displacements. If you get a jump in nulls for a certain shift, you're likely dealing with a periodic substitution-cipher. Again easily crackable if the period is not too long and you have enough ciphertext. (Enough here is something like 20 times the period. So if the period is 50 you'd need a kilobyte of ciphertext to easily attack it, more or less.)
    If the text looks completely random under all statistical analysis you can think of, and stays that way even when xored with itself shifted various ways, odds are you're dealing with something a bit more serious, and you'll need more expertise than you can gain from an "Ask Slashdot" article to crack it.
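
    Steps 2 and 3 can be sketched in a few lines. This is an illustration under assumptions, not a cracking tool: the periodic cipher is a repeating-key XOR with the made-up key b"KEY", the test text is borrowed from the potato story above, and the function names are hypothetical. The index of coincidence stands in for "do the character statistics look like normal text".

```python
from collections import Counter
from itertools import cycle


def index_of_coincidence(data: bytes) -> float:
    """Chance that two bytes picked at random (without replacement) are equal."""
    n = len(data)
    if n < 2:
        return 0.0
    counts = Counter(data)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def shift_coincidences(data: bytes, shift: int) -> int:
    """Count the nulls in data XOR (data shifted by `shift` positions)."""
    return sum(1 for a, b in zip(data, data[shift:]) if a ^ b == 0)


def repeating_xor(data: bytes, key: bytes) -> bytes:
    """Toy periodic cipher: XOR with the key repeated as often as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


plaintext = (b"An old man lived alone in southern Idaho. It was early spring, "
             b"and he wanted to spade a garden plot to prepare it for planting potatoes.")
ciphertext = repeating_xor(plaintext, b"KEY")

print("IC plaintext :", index_of_coincidence(plaintext))
print("IC ciphertext:", index_of_coincidence(ciphertext))
for shift in range(1, 10):
    print(shift, shift_coincidences(ciphertext, shift))
```

    The reason the trick works: when the shift is a multiple of the period, both copies were encrypted with the same key byte at every position, so the key cancels and you get exactly as many nulls as the plaintext would give at that shift. At other shifts the key bytes differ and matches become mostly accidental. On a sample this short the jump may not be obvious, which is why step 1 comes first.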

    Good luck !

  • "The attachment foo.doc was garbled. Please re-send in .txt format"
  • The real reason the British were able to break the Enigma codes was that a Polish underground agent had stolen an Enigma. Every day the German weather forecast had the same first line, which gave them the exact settings for the machine... Don't overestimate the British... Hey, they invented the concentration camps during WW I.... The Germans "only" took it one step further...

    Anyway, there are a number of ways to go around your problem... The problem is that you need 4 things to decipher the code:

    1. Message Format (has the message been split into multiple parts and rearranged?)...
    2. Used algorithm (DES etc.).
    3. Key length.
    4. The language of the message.

    If these are not known you have the following option: Acquire a number of PCs and either code breaking software (if you can get hold of it) or write it yourself. Using these machines, set each of them to try breaking the message using a different algorithm. The best software for this has access to a number of dictionaries in order to check whether it is on the wrong track. This will take some time, depending on the machines... Have fun!
  • CrypTool [cryptool.org] is a free (win32, linux w/WINE) tool with a lot of cryptography / cryptanalysis functionality.

    I would start like this:

    • Try to compress (zip/gzip) - compressibility is a sign for bad crypto.
    • Have a look at the auto-correlation - if you see a comb pattern then it is probably something like XOR, Vigenère, addition mod 256 or similar. CrypTool can break those algorithms automatically.
    • Have a look at character frequency, 2-grams, n-grams
    • Apply some tests for random data - good crypto should produce data indistinguishable from random data
    • If the data looks random you might need some hints on the algorithm.
    All the tests suggested can be performed with CrypTool. If the crypto is strong you will need some more insight, but in many practical cases bad crypto is used, e.g. in Psion Word.
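
    The compression test from the first bullet is easy to try without any special tool. A minimal sketch using Python's zlib, with assumptions stated plainly: the "bad crypto" here is a single-byte XOR I made up for the demo, and os.urandom stands in for the output of a good cipher.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; near or above 1.0 means incompressible."""
    return len(zlib.compress(data, 9)) / len(data)


text = b"the quick brown fox jumps over the lazy dog " * 50

# Single-byte XOR hides the letters but keeps every repetition in place.
weak_ciphertext = bytes(b ^ 0x2A for b in text)

# Stand-in for strong ciphertext: bytes from the OS random source.
random_like = os.urandom(len(text))

print("plaintext      :", compression_ratio(text))
print("weak ciphertext:", compression_ratio(weak_ciphertext))
print("random-like    :", compression_ratio(random_like))
```

    If the intercepted data shrinks noticeably, there is structure left to attack. If it doesn't, that only tells you the easy statistical doors are closed.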
  • 1) Check for similar codes from the past.
    2) Check against every known code for similarity.
    3) Finalize.
    4) If that failed, repeat step 2.
    5) Decode.

    If everything fails:
    1) Ask around if anyone knows about a new coding system.
    2) Verify the match.
    3) Decode.

    Or:
    1) Get a longer code.
    2) Try to crack open the code yourself.
    3) Get shot by the government because you found their secret =P
