Choosing Better-Quality JPEG Images With Software?
kpoole55 writes "I've been googling for an answer to a question and I'm not making much progress. The problem is image collections, and finding the better of near-duplicate images. There are many programs, free and costly, CLI or GUI oriented, for finding visually similar images — but I'm looking for a next step in the process. It's known that saving the same source image in JPEG format at different quality levels produces different images, the one at the lower quality having more JPEG artifacts. I've been trying to find a method to compare two visually similar JPEG images and select the one with the fewest JPEG artifacts (or the one with the most JPEG artifacts; either will serve). I also suspect that this is going to be one of those 'Well, of course, how else would you do it? It's so simple.' moments."
File size (Score:2, Insightful)
it is lossy compression, after all . . .
Try compressing both further (Score:2, Insightful)
I suppose you could recompress both images as JPEG with various quality settings, then do a pixel-by-pixel comparison computing a difference measure between each of the two source images and its recompressed version. Presumably, the one with more JPEG artefacts to start with will be more similar to its compressed version, at a certain key level of compression. This relies on your compression program generating the same kind of artefacts as the one used to make the images, but I suppose that cjpeg with the default settings has a good chance of working.
Failing that, just take the larger (in bytes) of the two JPEG files...
Translation: Please help me with my porn... (Score:5, Insightful)
use the JPEG underlying details (Score:2, Insightful)
Re:File size (Score:4, Insightful)
File size may not be accurate if it has been converted multiple times at different quality, or if the source is actually lower quality.
The only way to properly compare is if you have the original as the control.
If you compare two images saved at different JPEG quality levels, the program won't know which parts are the artifacts. You still have to decide yourself...
It's easy (Score:5, Insightful)
Run the DCT and check how much it's been quantized: quantization forces each coefficient to a multiple of its quantizer step, so the larger the greatest common factor of the coefficients at a given frequency, the more heavily the image has been compressed.
Alternatively, check the raw data file size.
quantization tables (Score:3, Insightful)
Others have mentioned file size, but another good approach is to look at the quantization tables in the image as an overall quality factor. E.g., JPEG over RTP (RFC 2435) uses a single quantization factor to stand in for the actual tables, and the value of 'Q' generally maps to the quality of the image. Wikipedia's article on JPEG has a less technical discussion of the topic, although the Q it uses is probably different from the one in the RFC's example.
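As a sketch of this approach, assuming Python with Pillow is available (Pillow exposes a JPEG's quantization tables via the `quantization` attribute), the magnitude of the table entries can serve as a coarse quality score: larger entries mean coarser quantization, i.e. lower quality.

```python
from io import BytesIO

from PIL import Image  # Pillow; assumed available


def quant_score(jpeg_bytes):
    """Sum of all quantization table entries.

    A higher sum means coarser quantization, i.e. a lower-quality save.
    """
    im = Image.open(BytesIO(jpeg_bytes))
    if not hasattr(im, "quantization"):
        raise ValueError("not a JPEG: no quantization tables found")
    return sum(sum(table) for table in im.quantization.values())
```

Given two near-duplicate JPEGs, the one with the lower score was probably saved at the higher quality setting. Note this says nothing about generation loss from earlier re-encodes.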
Re:File size (Score:3, Insightful)
File size doesn't tell you anything. If I take a picture with a bunch of noise in it (e.g. from poor lighting), it will not compress as well. If I take the same picture with perfect lighting, it might be higher quality but have a smaller file size.
Why this is modded up, I don't know. Too many morons out there.
Re:AI problem? (Score:3, Insightful)
You're right, it needs to be done by humans to be sure.
Amazon's Mechanical Turk should do the trick.
https://www.mturk.com/mturk/welcome [mturk.com]
Re:Measure sharpness? (Score:4, Insightful)
Even faster is look at the DCT coefficients in the file itself. Doesn't even require decoding - JPEG compression works by quantizing the coefficients more heavily for higher compression rates, and particularly for the high frequency coefficients. If more high frequency coefficients are zero, it's been quantized more heavily, and is lower quality.
Now, it's not foolproof. If one copy went through some intermediate processing (color dithering or something) before the final JPEG version was saved, it may have lost quality in places not accounted for by this method. Comparing the quality of two differently sized images is not straightforward, either.
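A rough sketch of that check, assuming Python with NumPy and Pillow (the DCT is recomputed from the decoded pixels here, since most decoders don't hand you the coefficients directly): block-DCT the image and measure what fraction of high-frequency coefficients are near zero. More zeros suggests heavier quantization.

```python
from io import BytesIO

import numpy as np
from PIL import Image  # Pillow; assumed available


def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix, as used by JPEG's 8x8 blocks."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d


def high_freq_zero_fraction(jpeg_bytes, thresh=2.0):
    """Fraction of high-frequency block-DCT coefficients that are near zero."""
    img = np.asarray(Image.open(BytesIO(jpeg_bytes)).convert("L"), dtype=float)
    h, w = (img.shape[0] // 8) * 8, (img.shape[1] // 8) * 8
    d = dct_matrix()
    zeros = total = 0
    for y in range(0, h, 8):
        for x in range(0, w, 8):
            coeffs = d @ (img[y:y + 8, x:x + 8] - 128.0) @ d.T
            hf = np.abs(coeffs[4:, 4:])  # high-frequency quadrant of the block
            zeros += int((hf < thresh).sum())
            total += hf.size
    return zeros / total
```

The copy with the larger fraction has (probably) been quantized more heavily. The threshold absorbs the small error introduced by rounding pixels back to integers after decoding.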
Re:AI problem? (Score:3, Insightful)
And to reply to myself.. several other posters have noted that taking the DCT of the compression blocks in the image will give information on how highly compressed the image is... there's one example.
Re:use a "difference matte" (Score:3, Insightful)
So, that will show you which parts differ. How do you tell which is higher quality? Sure, you can probably do it by eye. But it sounds like the poster wants a fully automated method.
Re:File size (Score:3, Insightful)
Actually, one of the meta values that is stored is a quality indicator.
And when you save a max quality copy of a min quality jpeg, the picture still looks like crap.
Re:DCT (Score:1, Insightful)
Or just take the 2D FFT of the entire images. Higher JPEG compression should result in fewer high frequency components in an image.
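As a sketch of that (assuming NumPy and Pillow), compare the fraction of spectral energy above some cutoff frequency; the more heavily compressed copy should score lower. The cutoff value here is an arbitrary choice, not anything from the JPEG spec.

```python
from io import BytesIO

import numpy as np
from PIL import Image  # Pillow; assumed available


def high_freq_energy_ratio(jpeg_bytes, cutoff=0.25):
    """Fraction of spectral energy above `cutoff` of the Nyquist frequency."""
    img = np.asarray(Image.open(BytesIO(jpeg_bytes)).convert("L"), dtype=float)
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = spec.shape
    yy, xx = np.ogrid[:h, :w]
    # radial frequency, normalized so the Nyquist frequency is ~1.0
    r = np.hypot((yy - h / 2) / (h / 2), (xx - w / 2) / (w / 2))
    return float(spec[r > cutoff].sum() / spec.sum())
```

This only works for comparing same-sized copies of the same scene; a genuinely smooth photo will score low no matter how it was saved.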
Re:AI problem? (Score:5, Insightful)
Here's a simple but expensive formula:
1. Get the image
2. Compress it severely.
3. Compare the difference between original and the compressed.
The lower the difference, the lower the image quality.
4. Profit!
Or you could just measure the amount of data in the DCT space. Duh.
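The recipe above (minus the profit step) might look like this in Python with NumPy and Pillow, both assumed available. An image that was already heavily compressed changes less when crushed again.

```python
from io import BytesIO

import numpy as np
from PIL import Image  # Pillow; assumed available


def recompression_distance(jpeg_bytes, quality=10):
    """Mean absolute pixel change after recompressing the image severely.

    A copy that was already heavily compressed moves less, so a LOWER
    distance suggests LOWER original quality.
    """
    img = Image.open(BytesIO(jpeg_bytes)).convert("L")
    buf = BytesIO()
    img.save(buf, "JPEG", quality=quality)
    recompressed = Image.open(buf).convert("L")
    a = np.asarray(img, dtype=float)
    b = np.asarray(recompressed, dtype=float)
    return float(np.abs(a - b).mean())
```

It is expensive (a full decode, re-encode, and decode per image), which is the point the parent is making; the DCT-space measurements elsewhere in the thread avoid that.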
Re:File size (Score:4, Insightful)
Unfortunately, that's a subjective number that depends on the 'codec' used to make the JPEG. Not everyone's 100 is the same, nor is everyone working off the same scale (e.g. 1-10 vs. 1-100).
In addition, I bought a program [winsoftmagic.com] (Windows only, sorry) that allows the user to pick the areas of the image that need the most bits. Basically, it allows you to pick the quality for any arbitrary region (using standard selection tools like lasso) when saving the JPEG.
I mostly got it for the batch processing and its excellent image quality when you set it to minimum compression.
Re:File size (Score:5, Insightful)
Try file size on the set of images of interest to you and see if it coincides with your intuition. If it does, you're done.
Re:DCT (Score:4, Insightful)
That works, but only if you have exact, pixel-to-pixel correspondence between the photos. It won't work if you just grab two photos from Flickr that both show the Eiffel Tower and wonder which one is "better".
Luckily, there is a simple way to do it: use jpegtran to extract the quantization table from each image. Pick the one with the smaller values. This can easily be scripted.
Caveat: this will not work if the images have been decoded and re-encoded multiple times.
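The scripted comparison could be sketched with Pillow instead of jpegtran (a hypothetical helper, not the poster's actual script), since Pillow exposes the same tables via the `quantization` attribute:

```python
from PIL import Image  # Pillow; assumed available


def pick_higher_quality(path_a, path_b):
    """Return whichever JPEG file has the smaller quantization table entries,
    i.e. finer quantization and (probably) the higher-quality save."""
    def score(path):
        with Image.open(path) as im:
            return sum(sum(table) for table in im.quantization.values())
    return path_a if score(path_a) <= score(path_b) else path_b
```

Subject to the same caveat as above: it compares encoder settings, not accumulated generation loss.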
Re:AI problem? (Score:3, Insightful)
Things such as thin wires, multi-colored ribbon cable, close-ups of a circuit board, and other images with lots of similar details seem to benefit most from this kind of tweaking, mainly thanks to the placement and qualities of the artifacts, rather than their mere existence or apparent severity.
I've had this happen many times - set an icon for, say, 35% quality and it will probably look kinda grungy, but step it down by just one or two percent and suddenly the artifacts shift around or change their appearance, sometimes in a manner that better suits the image - almost like constructive interference.
Re:AI problem? (Score:3, Insightful)
Of course, what you really need is the NCIS image enhancement package.
Re:AI problem? (Score:3, Insightful)
This just about gets to the heart of it. "Better" is a subjective term, so choosing better-quality images is not going to be something everyone can agree on. Your example nails it: if you have two copies of the same image, and one is higher resolution than the other but saved at a higher compression rate, which is better? The answer is going to be "it depends on whether the noise introduced by the higher compression annoys me more than the reduced information in the lower resolution image."
If the compression on the high resolution image is high enough, you might still have better detail in the lower resolution image. And if the higher resolution image isn't actually higher resolution, just larger dimensions (i.e. it's the smaller image scaled up), it is automatically no better than the smaller one: it contains no extra information (you can recreate the scaled-up image from the smaller one, but not exactly vice versa, since rounding errors cause information loss whenever you scale an image).
There may also be subjective differences like brightness/contrast/tone mapping differences.
Given that the question being asked is a subjective one, the correlation of file size to subjective image quality should be so high that you may gain only a few percent better predictability with an extremely complex algorithm.