Distinguishing Encrypted Data From Random Data?

Posted by timothy on Sunday September 19, 2010 @03:50PM from the is-this-a-solved-problem? dept.

gust5av writes "I'm working on a little script to provide very simple and easy to use steganography. I'm using bash together with cryptsetup (without LUKS), and the plausible deniability lies in writing to different parts of a container file. On decryption you specify the offset of the hidden data. Together with a dynamically expanding filesystem, this makes it possible to have an arbitrary number of hidden volumes in a file. It is implausible to reveal the encrypted data without the password, but is it possible to prove there is encrypted data where you claim there's not? If I give someone one file containing random data and another containing data encrypted with AES, will he be able to tell which is which?"

Ignore the person holding the phone book. (Score:2, Insightful)

by Suki I ( 1546431 ) writes: on Sunday September 19, 2010 @03:55PM (#33629498) Homepage Journal

After a few whacks on the head with the NYC Yellow Pages (old school, print edition) I think someone could find out which file is encrypted and which is garbage.

It's all about entropy (Score:5, Insightful)

by cpghost ( 719344 ) writes: on Sunday September 19, 2010 @03:58PM (#33629538) Homepage

Encrypted files have maximum entropy, just like absolutely random files. Basically, you can't tell which one is which. However, absolute random noise on a disk isn't all that usual, so any encrypted file (or pure random file) will stand out like a sore thumb: it will be highly visible. But, again, you can't tell the difference.
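
To make the entropy claim concrete, here is a minimal Python sketch that estimates Shannon entropy from byte frequencies. The function name `byte_entropy` and the sample text are illustrative assumptions, not anything from the thread. Note that this only measures how evenly byte values are distributed: good ciphertext and true random data both score close to 8 bits per byte, which is exactly why this measure cannot tell them apart.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical sample inputs: repetitive text versus OS-provided random bytes.
    text = b"the quick brown fox jumps over the lazy dog " * 1000
    print(f"repetitive text: {byte_entropy(text):.3f} bits/byte")
    print(f"os.urandom:      {byte_entropy(os.urandom(len(text))):.3f} bits/byte")
```

A file that scores near 8 is "visible" in the sense described above, but the score says nothing about whether it is ciphertext or noise.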

Re:Well (Score:5, Insightful)

by Kjella ( 173770 ) writes: on Sunday September 19, 2010 @04:00PM (#33629558) Homepage

As far as I know finding patterns in the output is tightly linked to reducing the number of possible keys, so good encryption algorithms should not create patterns. Of course if your encryption software writes some kind of header - which wouldn't affect the security of the encrypted contents - then it will be obvious to anyone looking that you have an encrypted container. So this is 99% about implementation and 1% about encryption algorithms.

It depends.... (Score:5, Insightful)

by TrumpetPower! ( 190615 ) writes: <ben@trumpetpower.com> on Sunday September 19, 2010 @04:00PM (#33629562) Homepage

If I give someone one file containing random data and another containing data encrypted with AES, will he be able to tell which is which?
Does the person to whom you give these two files have a rubber hose? Is he a member of the “extraordinary rendition” team?
The point of steganography is to not get caught in the first place. If you need plausible deniability, you’ve already lost.
Cheers,
b&

Re:Ignore the person holding the phone book. (Score:1, Insightful)

by acnicklas ( 1740146 ) writes: on Sunday September 19, 2010 @04:05PM (#33629608)

Obligatory XKCD: http://www.xkcd.com/538/ [xkcd.com]

Re:No (Score:3, Insightful)

by sjames ( 1099 ) writes: on Sunday September 19, 2010 @04:06PM (#33629620) Homepage Journal

It would be best to precondition the media by writing random data over the entire thing. For added fun, encrypt the text of various children's books and write the result to the drive.

Re:Ignore the person holding the phone book. (Score:5, Insightful)

by parlancex ( 1322105 ) writes: on Sunday September 19, 2010 @04:14PM (#33629670)

I think you're missing the point. Of course after they know that you have some encrypted data on your disk the strength of the encryption becomes moot because they can just drug / beat you until you tell them the key, but what this question is about is hiding encrypted data in unencrypted data so prying eyes can't tell if anything is even there at all.

For example, there may come a day when airport security could demand you disclose your passwords when they find you are carrying storage with encrypted content using the aforementioned techniques, but they aren't going to drug / beat every single person coming onto an airplane or going across a border. If your jpgs look like everybody else's jpgs both visually and under close analytical scrutiny they aren't going to bother you. Another example is there may come a day when any traffic on the Internet that cannot be positively identified as a common protocol with statistically "normal" contents is simply rejected. Maybe not here, maybe not right now, but this kind of idea is still very useful.

Re:Well (Score:5, Insightful)

by bytesex ( 112972 ) writes: on Sunday September 19, 2010 @04:16PM (#33629678) Homepage

It depends what you call an 'encryption algorithm'. If you mean 'DES', then no - DES is nowadays considered a weaker algorithm. If you mean 'AES-256', then still no - you need to *apply* AES-256 before it's any good, because AES is a block-cipher and will re-encrypt identical blocks of plain-text with the same key to identical blocks of ciphertext. If you mean 'AES-256 in CBC mode with random IV and SHA-256 HMAC authentication', then that's an algorithm that can be safely used. Under certain real-world circumstances.
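
The "identical blocks of plaintext give identical blocks of ciphertext" behaviour is what happens when a block cipher is applied block by block with no chaining (usually called ECB mode; that name is my label, the comment does not use it). A rough Python sketch of how an observer could exploit it: count aligned 16-byte blocks (the AES block size) that repeat. The helper name `repeated_blocks` is hypothetical. Truly random data, or AES used in a chained mode with a random IV, should essentially never produce a repeat.

```python
from collections import Counter


def repeated_blocks(data: bytes, block_size: int = 16) -> int:
    """Count aligned blocks that duplicate another aligned block in the data."""
    blocks = [
        data[i:i + block_size]
        for i in range(0, len(data) - block_size + 1, block_size)
    ]
    counts = Counter(blocks)
    # Each value seen n times contributes n - 1 duplicates.
    return sum(n - 1 for n in counts.values())


if __name__ == "__main__":
    # Stand-in for unchained block encryption: equal input blocks map to
    # equal output blocks, so the repetition survives into the "ciphertext".
    suspicious = b"A" * 16 + b"B" * 16 + b"A" * 16
    print(repeated_blocks(suspicious))
```

Any nonzero count on a large high-entropy file is a strong hint that it is not random noise.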

Re:It's all about entropy (Score:5, Insightful)

by mlyle ( 148697 ) writes: on Sunday September 19, 2010 @04:17PM (#33629684)

Not exactly.
The problem with steg'ing inside known container formats, compressed container formats, is this:
Each implementation of the compression algorithm has its nuances. If the majority of an MP3 looks like it was compressed by the iTunes implementation, but then there's a range of output iTunes would not generate (particularly if the input file is known), that's very suspect. Ditto if things like PSNR change, even subtly, for the portion where steganography is in play. Even though compressed data has a great deal of entropy, it IS significantly constrained over random data in that A) known decompression programs must return specified output from it, and B) known compression programs generated this data as output from possibly-known input data.
If your adversary is the local police or one of your buddies, this stuff doesn't matter. If it's intelligence agencies or research organizations, good luck. Steganography is hard.

Re:Ignore the person holding the phone book. (Score:4, Insightful)

by John Hasler ( 414242 ) writes: on Sunday September 19, 2010 @04:45PM (#33629848) Homepage

Try to get your head around the idea that they might have possession of your hard disk but not have possession of you. Or they don't even know who you are. Or they are honest cops, trying to determine if you have violated the rules. They've asked you if there is encrypted data on the laptop, you said no, and they are doing a routine check to verify that. Contrary to popular opinion, "The Man" is not always ready, willing, and able to administer a beating.
Then there is the possibility that your opponent is not "the Man" but some sort of furtive criminal...

Re:Ignore the person holding the phone book. (Score:4, Insightful)

by dcollins ( 135727 ) writes: on Sunday September 19, 2010 @04:50PM (#33629870) Homepage

"Did I miss the point or do we need the drugs and wrench?"
You missed the point. The primary question of the OP is this: "...is it possible to prove there is encrypted data where you claim there's not?"
Hint: Include the likelihood of false-positives and false-negatives in your "wrench-based" analysis.

Re:It's all about entropy (Score:5, Insightful)

by v1 ( 525388 ) writes: on Sunday September 19, 2010 @05:20PM (#33630008) Homepage Journal

However, absolute random noise on a disk isn't all that usual,
Actually, nowadays, it's extremely unusual. Blocks are all zero'd from the factory, and anything you save over them that's later marked free will almost certainly be far from random. (like pieces of pictures, documents, applications, etc)
Really, statistically speaking, if you wanted to look on a hard drive for encrypted data, your best bet would be to go looking for blocks of high entropy data.
The only defense against this would be if you did a random wipe of your hard drive when you bought it, and then reinstalled, and patched your OS to automatically random-wipe files before deleting or updating/moving them. But then you get into the area of "this person is obviously going to a lot of work to make it easy to hide something from us", which by itself raises an eyebrow.
And on that note, I'm a little surprised now that I think about it, that I can't come up with a single example anywhere of a native or add-on OS feature for any OS, that does random-wipe-on-delete. OS X has "erase free space" built into disk utility, and you can find an app to do this for other OSs, but obviously zero'd blocks are not what we need to be creating. And the fact that you have to do this step manually, and it takes HOURS to run usually, is also surprising. I don't know offhand if OS X's "secure empty trash" zeros or randoms, but you're not likely to do that for EVERYTHING you throw away since it takes time, and since a lot of files get moved/deleted by the OS automatically without doing this. (end problem: anyone with a clue knows you can't hide anything in a bunch of zero'd blocks)
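
As a sketch of "go looking for blocks of high entropy data", the Python below walks a disk image in fixed-size blocks and reports the ones zlib cannot shrink. Incompressibility is used here as a cheap proxy for high entropy; the 4096-byte block size, the 0.95 threshold, and the helper names are all assumptions chosen for illustration, and `pseudo_random` merely stands in for encrypted or random sectors.

```python
import hashlib
import zlib


def pseudo_random(n: int) -> bytes:
    """Deterministic incompressible bytes (SHA-256 of a counter), for the demo only."""
    out = b"".join(
        hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range((n + 31) // 32)
    )
    return out[:n]


def incompressible_blocks(image: bytes, block_size: int = 4096, threshold: float = 0.95):
    """Yield offsets of full blocks that zlib cannot shrink below threshold * block_size."""
    for offset in range(0, len(image), block_size):
        block = image[offset:offset + block_size]
        if len(block) < block_size:
            break  # ignore a trailing partial block
        if len(zlib.compress(block, 9)) >= threshold * block_size:
            yield offset


if __name__ == "__main__":
    zeros = bytes(4096)                                            # factory-fresh sector
    text = (b"family photos and tax documents " * 200)[:4096]      # ordinary leftovers
    image = zeros + pseudo_random(4096) + text
    print(list(incompressible_blocks(image)))
```

Compressed media will also be flagged by a scan like this, so a hit means "high entropy", not "encrypted".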

Re:It's all about entropy (Score:5, Insightful)

by Kjella ( 173770 ) writes: on Sunday September 19, 2010 @05:44PM (#33630160) Homepage

Well, the problem is that it doesn't really apply to compressed data. Compression schemes try packing things as efficiently as possible, so there's relatively little you can add without making it obvious the compression is tampered with. You could try embedding it as some sort of watermark into the photo/video before compression, but that too is difficult and won't hide very much. And most people don't carry tons of BMPs, WAVs and uncompressed AVIs..
So far it seems most people agree the best way to hide encrypted data is within other encrypted data. You don't have to be super-paranoid to use encryption, my last workplace used full disk encryption and I don't think anyone can seriously accuse you of anything if you just say that "I feared my computer would get stolen, and I could be exposed to identity theft or have my family photos posted online" or something like that.
The best solutions I have seen work like this:
1) If you enter both your "normal" password and your "secret password" => access to the normal disk and it'll seamlessly move around any secret data as long as there is room.
2) If you enter only your "secret" password => access to your secret data.
3) If you're under duress, you give just the "normal" password and you get just the normal disk. Your hidden data can get overwritten since the encryption software doesn't know about it, but there's no way to prove that there is a secret container or a secret password.

Re:Shouldn't (Score:3, Insightful)

by linuxrocks123 ( 905424 ) writes: on Sunday September 19, 2010 @06:13PM (#33630362) Homepage Journal

Yes. People do.
We know you can brute-force AES. We also know that if you had a computer the size of the Earth where every piece of matter the size of a grain of sand was an ALU, you wouldn't be able to do it in thousands of years. The only hope attackers have is more sophisticated cryptanalysis techniques. This may or may not happen within 30 years.
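
For a feel of the numbers, here is a back-of-the-envelope Python calculation. The rate of 10^18 keys per second is an arbitrary assumption picked for illustration; it is not a model of the Earth-sized computer described above, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(key_bits: int, keys_per_second: float) -> float:
    """Years needed to try every key of the given size at a fixed rate."""
    return (2 ** key_bits) / keys_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    assumed_rate = 1e18  # keys per second; an assumption, not a measured figure
    for bits in (56, 128, 256):
        print(f"{bits}-bit keyspace: {years_to_exhaust(bits, assumed_rate):.3e} years")
```

On average an attacker finds the key after searching half the keyspace, which halves these figures and changes nothing in practice for 128-bit keys.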
---linuxrocks123

Re:It's all about entropy (Score:3, Insightful)

by AK Marc ( 707885 ) writes: on Sunday September 19, 2010 @06:17PM (#33630386)

It is both irrelevant and directly addresses the point. The answer is: "There is no such thing as random data that isn't encrypted, so the question need never be asked. If it's 'random' it is encrypted, even if indistinguishable from truly random data." That may not be true in all cases, but true enough for law enforcement to make your life a living hell for having wiped your HDD with a randomizer.

Re:It's all about entropy (Score:3, Insightful)

by moonbender ( 547943 ) writes: <moonbender AT gmail DOT com> on Sunday September 19, 2010 @06:56PM (#33630650)

I'm not so sure high entropy data is all that rare. While the container format makes them distinguishable from completely random data, compressed audio and video files do have very high entropy, I think. And much of the space of a drive will probably be used for movies and music.

Re:It's all about entropy (Score:3, Insightful)

by melikamp ( 631205 ) writes: on Sunday September 19, 2010 @07:00PM (#33630672) Homepage Journal

I won't make a prediction about a proportion, but it seems to me that orphaned blocks of compressed files would seem pretty darn random, and almost everyone has those.
Also, in GNU/Linux at least, there is the shred utility that does what it sounds like: overwrites files with patterns (optionally, with zeroes) before erasing them. Maybe it works on OS X too?
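
For readers who want to see the idea rather than the tool, this is a minimal Python sketch of random-wipe-before-delete. It is not how shred is implemented, and `random_wipe` is a made-up name. The important caveat: on journaling or copy-on-write filesystems, and on SSDs with wear levelling, overwriting in place is not guaranteed to touch the blocks that held the old data.

```python
import os
import tempfile


def random_wipe(path: str, passes: int = 1, chunk: int = 1 << 20) -> None:
    """Overwrite a file in place with random bytes, then remove it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk, remaining)
                f.write(os.urandom(n))  # random fill, so freed blocks do not look zeroed
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the overwrite to the device
    os.remove(path)


if __name__ == "__main__":
    fd, demo_path = tempfile.mkstemp()
    os.write(fd, b"old tax documents" * 1000)
    os.close(fd)
    random_wipe(demo_path)
    print(os.path.exists(demo_path))
```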

Re:It's all about entropy (Score:1, Insightful)

by Anonymous Coward writes: on Sunday September 19, 2010 @07:54PM (#33631010)

Nobody's trying to deny that the data looks suspicious. The question is, if someone tells you there's nothing but random numbers on the drive, can you prove that it's a lie? If I tell you that I wiped the disk with random data to delete my video diary for good, can you prove me wrong? This question has serious implications in countries where you have to disclose encryption keys. It's impossible to prove that random-looking data is actually random data, because there's always the possibility of it being the result of one-time-pad encryption. You'd have to prove the non-existence of something. It is therefore unreasonable (and therefore ultimately unlawful) to demand such proof from anyone. So here we are. I claim that it is random data and you can't ask me to prove it. What do you do? Arrest me for carrying random data on a hard disk?
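
The one-time-pad point can be shown in a few lines of Python: for any blob of bytes and any plaintext of the same length, there is a pad that "decrypts" the blob to that plaintext. The function names and the example strings are illustrative assumptions.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def pad_for(blob: bytes, claimed_plaintext: bytes) -> bytes:
    """Return the pad under which `blob` decrypts to `claimed_plaintext`."""
    return xor_bytes(blob, claimed_plaintext)


if __name__ == "__main__":
    blob = os.urandom(16)             # whatever happens to be on the disk
    claim = b"nothing to see!!"       # any 16-byte story you like
    pad = pad_for(blob, claim)
    print(xor_bytes(blob, pad) == claim)
```

Because every plaintext of that length is equally consistent with the blob, the blob alone proves nothing about what, if anything, it encodes.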

On The Practical Side (Score:3, Insightful)

by BoRegardless ( 721219 ) writes: on Sunday September 19, 2010 @07:57PM (#33631030)

What happens if you use the old "torn sheet of paper" routine?
Each drive or device moving from A to B goes with a different courier/ISP/method and no "piece" contains enough information to be identifiable or usable.
All the pieces need to arrive at the destination to be able to be re-constructed back into usable form.
Any time you send a complete message in one burp, one hard drive or one CD or one image, there is a chance for decryption by any number of accidents or threat of death to all your family members one person at a time while you watch.
No encryption was used in the creation of this message...thus I have deniability.

Re:No, you ALL miss the point. (Score:5, Insightful)

by cetialphav ( 246516 ) writes: on Sunday September 19, 2010 @09:00PM (#33631368)

You tell them you just visited your cousin Jim, who had an old hard drive he didn't want anymore, and you needed a spare so he gave it to you, but not before he ran "dd if=/dev/urandom of=/dev/sda1" because he didn't want you having his old tax documents.
And now you have just fallen victim to a classic interrogation technique. They have just gotten you to tell a story that they can then investigate to determine its credibility. They will talk to your cousin Jim; they will look for signs of an OS installation at the date and time you said. They then ask more follow up questions (for which they already know the true answer) to get you to dig a bigger grave for yourself. Then they show you that they know you are lying and inform you of the penalty for that crime and offer you a "deal" to tell the truth.
The fact is that when you are dealing with good interrogators, you cannot lie your way out of it. If you have a huge file full of random data, that is suspicious and there is nothing you can say to change that. The whole point of steganography is to hide the data in something innocent so that no one ever asks you anything. The goal is to blend in and give them no reason to give you a second thought.

Re:It's all about entropy (Score:1, Insightful)

by Anonymous Coward writes: on Sunday September 19, 2010 @10:11PM (#33631752)

Fix forever B random binary strings of length M each, call them N = {n_1, n_2, ... , n_B}.
1) Suppose that there is some bit position in 1..M for which all but one of the B strings has a 0 in that position. The attacker can then determine whether that one string is in the one time pad based merely on whether the bit in that position is flipped between the plaintext and the ciphertext.
2) Gaussian elimination [wolfram.com] = game over. Your strings form a B by M matrix, the plaintext XOR the ciphertext forms a vector of length M, and the "key" is the vector of length B that Gaussian elimination would solve for.
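
Point 2 can be sketched in Python. Assuming the pad is the XOR of some subset of the B fixed strings, and the attacker knows the plaintext XOR the ciphertext, elimination over GF(2) recovers which strings were used. Strings are represented as Python integers (one bit per position); the function names and the tiny example values are hypothetical.

```python
def _reduce(v: int, mask: int, basis: dict):
    """Cancel leading bits of v using the basis; mask tracks which inputs were XORed in."""
    while v:
        top = v.bit_length() - 1
        if top not in basis:
            return v, mask
        bv, bm = basis[top]
        v ^= bv
        mask ^= bm
    return 0, mask


def find_subset(strings, target):
    """Return indices of strings whose XOR equals target, or None if no subset works."""
    basis = {}  # leading bit position -> (reduced vector, bitmask of contributing indices)
    for i, s in enumerate(strings):
        v, mask = _reduce(s, 1 << i, basis)
        if v:
            basis[v.bit_length() - 1] = (v, mask)
    v, mask = _reduce(target, 0, basis)
    if v:
        return None  # target is outside the span of the strings
    return [i for i in range(len(strings)) if (mask >> i) & 1]


if __name__ == "__main__":
    strings = [0b1010, 0b0110, 0b0011]   # the B fixed strings (B = 3, M = 4)
    pad = strings[0] ^ strings[2]        # secret choice: strings 0 and 2
    # The attacker sees plaintext XOR ciphertext, which equals the pad.
    print(find_subset(strings, pad))
```

The work is polynomial in B and M, so choosing a subset of fixed strings adds essentially no secrecy once one plaintext/ciphertext pair is known.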

Re:Shouldn't (Score:4, Insightful)

by ras ( 84108 ) writes: <russell+slashdot ... rt DOT id DOT au> on Sunday September 19, 2010 @10:15PM (#33631778) Homepage

AES encrypted data should be indistinguishable from random data
Nope. This assertion has been made here over and over again, and it is out-and-out wrong. See: http://opensource.dyc.edu/random-vs-encrypted [dyc.edu]
In essence, encrypted data sticks out like dog's balls because of its high entropy, yet there are enough patterns in it to make it obvious to an expert it isn't just random data. Even if it did look like random data who in the hell is going to believe you are carrying around gigs of data you can trivially generate as needed from /dev/urandom? Nobody.
So, the problem you have to solve is how you are going to plausibly explain away gigs of what is clearly encrypted crap. Forget TrueCrypt, or any special tools that don't normally come with your Operating System. Their very presence screams "liar!". Forget large encrypted files that don't have any conceivable use, even if they aren't named "my-porn-collection.zip.gpg". After all, it's your laptop so a program you use must have put them there, so some program should break if you move them out of the way.
And finally, once you come up with a way of hiding your encrypted crap, don't go blasting it over the internet. If it became common knowledge the men with rubber hoses may hear of it, rendering your lovely invention useless.
Some evidently don't agree with this last piece of advice because they have posted their solutions to the problem right here, on one of the largest megaphones on the 'net. Fortunately for them, Slashdot has in typical Slashdot fashion come to their rescue. Unlike the piece of misinformation I am responding to which is rated "5, informative", these insightful and informative posts are rated 1. Probably because they necessarily involve long complex commands which are utterly beyond your average slashdotter, which probably means they will rarely be used, which probably means they are right - my last piece of advice is alarmist.

Re:It's all about entropy (Score:3, Insightful)

by Anonymous Coward writes: on Sunday September 19, 2010 @11:40PM (#33632192)

As another poster pointed out, what you're talking about is the entire point of encryption. And yes, with good cryptography, it's pretty much impossible to decrypt except by brute forcing with all possible key combinations.
Good cryptography is hard, though, which is why it's generally best to leave it up to the experts. Not saying you can't have fun thinking about it, but realize that you're probably not going to come up with anything really new in the field.
I'd suggest starting with Applied Cryptography for a primer.

Re:NSA? Bah. (Score:2, Insightful)

by Shakrai ( 717556 ) * writes: on Monday September 20, 2010 @02:24AM (#33632912) Journal

See, the "drug him" part I have an issue with. I have personal experience with both mind altering drugs [erowid.org] and truecrypt. Let me assure you that drugs do not help you to remember a complex pass phrase.... ;)

Re:NSA? Bah. (Score:3, Insightful)

by denzacar ( 181829 ) writes: on Monday September 20, 2010 @06:24AM (#33633756) Journal

Not that kind of drug. [wikipedia.org]
Also, drugs are to be used in sequence with the beatings - not simultaneously.
No point in beating up someone who can't feel anything just for the sake of beating him up.
Leave the personal enjoyment for later.

You're solving the wrong problem (Score:1, Insightful)

by Anonymous Coward writes: on Monday September 20, 2010 @10:21AM (#33635326)

Don't worry about hiding the data - there are many ways of doing that. Worry about hiding the software that accesses it. The thing that gets most folks using steganography caught is the `investigator' finding steg. software on their machine. After that it's just a question of searching through each of the formats it does or threatening them with obstruction of justice / other crimes until they tell you what they used it for. Or, at least, that's what I learnt from the high tech crime squad...
