Choosing Better-Quality JPEG Images With Software?

Become a fan of Slashdot on Facebook

Choosing Better-Quality JPEG Images With Software? 291

Posted by timothy on Thursday July 16, 2009 @06:02PM from the on-the-tip-of-my-script dept.

kpoole55 writes "I've been googling for an answer to a question and I'm not making much progress. The problem is image collections, and finding the better of near-duplicate images. There are many programs, free and costly, CLI or GUI oriented, for finding visually similar images — but I'm looking for a next step in the process. It's known that saving the same source image in JPEG format at different quality levels produces different images, the one at the lower quality having more JPEG artifacts. I've been trying to find a method to compare two visually similar JPEG images and select the one with the fewest JPEG artifacts (or the one with the most JPEG artifacts, either will serve.) I also suspect that this is going to be one of those 'Well, of course, how else would you do it? It's so simple.' moments."

This discussion has been archived. No new comments can be posted.

Choosing Better-Quality JPEG Images With Software?

Load All Comments

Search 291 Comments Log In/Create an Account

Comments Filter:

Easy (Score:3, Interesting)

by Anonymous Coward writes: on Thursday July 16, 2009 @06:05PM (#28723511)

Paste both images in your image editor of choice, one layer on top of each other, apply a difference/subtraction filter.

Share
twitter facebook
- Re:Easy (Score:4, Insightful)
  
  by Random Destruction ( 866027 ) writes: on Thursday July 16, 2009 @07:06PM (#28724265)
  
  Ok, so you know how two images differ. Which one is closer to the original? You don't know, because you don't have the original to compare.
  
  Parent Share
  twitter facebook
File size (Score:2, Insightful)

by Tanman ( 90298 ) writes:

it is lossy compression, after all . . .
- Re:File size (Score:5, Informative)
  
  by Robotbeat ( 461248 ) writes: on Thursday July 16, 2009 @06:14PM (#28723651) Journal
  
  File size doesn't tell you everything about quality.
  For instance, if you save an image as a JPEG vs. first saving as a dithered GIF and _then_ saving as JPEG, then the second one will have much worse actual quality, even if it has the same filesize (it may well have worse quality AND have a larger file size).
  
  Parent Share
  twitter facebook
  - Re: (Score:3, Interesting)
    
    by Vectronic ( 1221470 ) writes:
    
    Also, stuff like Photoshop, will insert a bunch of meta/exif-bullshit but something like Paint, doesn't... it's usually only about 2 to 3kb, but it's still tainting your results if you are going by size alone.
    - - Re: (Score:3, Insightful)
        
        by Qzukk ( 229616 ) writes:
        
        actually one of the meta values that is stored is a quality indicator.
        And when you save a max quality copy of a min quality jpeg, the picture still looks like crap.
      - Re: (Score:2)
        
        by Chyeld ( 713439 ) writes:
        
        Unfortunately, that's a subjective term based on the 'codec' used to make the jpg. Not everyone's 100 is the same nor is everyone working off the same scale (i.e. 1-10 vs 1-100). It helps if all the images were made by the same program using the same parameters, but breaks down quickly as a valid comparitor after that.
        
        Re:File size (Score:4, Insightful)
        
        by nabsltd ( 1313397 ) writes: on Thursday July 16, 2009 @08:41PM (#28725001)
        
        Unfortunately, that's a subjective term based on the 'codec' used to make the jpg. Not everyone's 100 is the same nor is everyone working off the same scale (i.e. 1-10 vs 1-100).
        In addition, I bought a program [winsoftmagic.com] (Windows only, sorry) that allows the user to pick the areas of the image that need the most bits. Basically, it allows you to pick the quality for any abitrary region (using standard selection tools like lasso) when saving the JPEG.
        I mostly got it for the batch processing and its excellent image quality when you set it to minimum compression.
        
        Parent Share
        twitter facebook
- Re:File size (Score:4, Insightful)
  
  by teko_teko ( 653164 ) writes: on Thursday July 16, 2009 @06:16PM (#28723667) Homepage
  
  File size may not be accurate if it has been converted multiple times at different quality, or if the source is actually lower quality.
  The only way to properly compare is if you have the original as the control.
  If you compare between 2 different JPEG quality images, the program won't know which parts are the artifacts. You still have to decide yourself...
  
  Parent Share
  twitter facebook
  - Possible Method... (Score:2)
    
    by teko_teko ( 653164 ) writes:
    
    I just thought of a possible way to compare...
    Assuming both JPEG aren't at the lowest (or very low) quality:
    1. Take image A, create 10 or 20 more copies using different levels of quality (5, 10, 15, and so on).
    2. Compare each of them with image A, from lowest to highest quality.
    3. Stop where the diff no longer change with the previous image, then we can assume image A is at the previous image's quality level.
    Do the same with image B.
  - - Re:File size (Score:5, Interesting)
      
      by Chyeld ( 713439 ) writes: <chyeld.gmail@com> on Thursday July 16, 2009 @07:43PM (#28724585)
      
      There was a old story my AI teacher used to share back in college about a military contractor that was developing an AI based IFF (identifcation, friend or foe) system for aircraft.
      They trained it using what was, at the time, a vast picture database of every aircraft known. In the lab, they were able to get it down to 99% accurate, with the error favoring 'unknown' as the third option.
      So they took it out for a test run. The first night out the system tried firing on anything and everything it could lock on, including ground targets.
      This was bad. Horribly bad. But they were certain that there was some sort of equipment failure going on. After all their AI was damn near perfect at ID'ing the targets in the lab, the issues must be up the line somewhere.
      So they did a once over of the equipment and couldn't find a problem. Not sure what to do next the team took the system out for another dry run the next day. This time, the system refused to see any ground targets and anything it saw in the air was friendly.
      Now this was getting ridiculous, the team was extremely confused. So they did what they should have done the first time around, they did a third test run looking at what the AI was actually 'thinking'.
      And promptly discovered the problem. While they had a huge database of images to use, they realized that all their 'friendly' craft had pictures taken during the day, while in flight. All their 'hostile' craft however were pictures that had been taken at night during spy runs or from over head satalite shots.
      The AI wasn't keying off the planes, it was keying off whether it was daytime or night time.
      I don't know if the above actually ever happened, but my point is, it doesn't matter how many images you seed your database with. Unless you are there to tell it what is an artifact and what is just part of the picture, you are going to end up with horrible results and comical results.
      
      Parent Share
      twitter facebook
      - Re: (Score:2)
        
        by 4D6963 ( 933028 ) writes:
        
        I heard the same one except with tanks.
        
        Re:File size (Score:5, Funny)
        
        by Minwee ( 522556 ) writes: <dcr@neverwhen.org> on Thursday July 16, 2009 @08:51PM (#28725069) Homepage
        
        And a squad of kanagaroos firing RPGs.
        
        Parent Share
        twitter facebook
        
        Re: (Score:3, Interesting)
        
        by Chyeld ( 713439 ) writes:
        
        I always wondered if that one wasn't an urban legend too, but appearently it was mostly true [hooksprogress.org]:
        The reuse of some object-oriented code has caused tactical headaches for Australia's armed forces. As virtual reality simulators assume larger roles in helicopter combat training, programmers have gone to great lengths to increase the realism of their scenarios, including detailed landscapes and - in the case of the Northern Territory's Operation Phoenix - herds of kangaroos (since disturbed animals might well give
- Re: (Score:3, Insightful)
  
  by Anonymous Coward writes:
  
  File size doesn't tell you anything. If I take a picture with a bunch of noise (eg. poor lighting) in it then it will not compress as well. If I take the same picture with perfect lighting it might be higher quality but smaller file size.
  Why this is modded up, I don't know. Too many morons out there.
  - Re: (Score:2)
    
    by lymond01 ( 314120 ) writes:
    
    I sort of had the impression the person was talking about the exact same picture, saved from the original to two different qualities of JPEG. If he were trying to tell the difference between the amount of JPEG artifacts in two different pictures, I imagine he would get inconsistent results, given many trials, for the reasons you say.
    I suppose he could have meant something different than what he said, but there aren't too many politicians trolling slashdot, I'd guess.
  - Re: (Score:2)
    
    by PitaBred ( 632671 ) writes:
    
    But if they're duplicate pictures (some kind of matching heuristic), then file size most certainly IS appropriate. You're starting from the same point, choosing the result with less lost during compression, and therefore larger, would be quite logical.
  - Re: (Score:2)
    
    by 4D6963 ( 933028 ) writes:
    
    Why is yours modded up higher I wonder. The OP wants to "compare two visually similar JPEG images and select the one with the fewest JPEG artifacts". That means they're the same image. That means file size will help you there, unless they're not the same resolution, although it should do regardless.
    If I take a picture with a bunch of noise (eg. poor lighting) in it then it will not compress as well. If I take the same picture with perfect lighting it might be higher quality but smaller file size.
    That's
  - Re:File size (Score:5, Insightful)
    
    by timeOday ( 582209 ) writes: on Thursday July 16, 2009 @10:11PM (#28725545)
    
    This is the kind of problem you can solve in 2 minutes with 95% accuracy (by using file size), or never finish at all by listening to all the pedants on slashdot. When people know a little too much they love to go on about stuff like entropy and information gain, just because they (sort of) can.
    Try file size on the set of images of interest to you and see if it coincides with your intuition. If it does, you're done.
    
    Parent Share
    twitter facebook
- Re:File size (Score:5, Informative)
  
  by Shikaku ( 1129753 ) writes: on Thursday July 16, 2009 @06:29PM (#28723825)
  
  http://linux.maruhn.com/sec/jpegoptim.html [maruhn.com]
  No. You can compress JPEG lossless.
  
  Parent Share
  twitter facebook
  - - Re:File size (Score:5, Informative)
      
      by Score Whore ( 32328 ) writes: on Thursday July 16, 2009 @06:52PM (#28724123)
      
      ...THERE IS NO LOSSLESS JPEG. PERIOD.
      Except for Lossless JPEG [wikipedia.org] standardized in 1993. But other than that, no there is no lossless jpeg.
      
      Parent Share
      twitter facebook
      - Re: (Score:2, Interesting)
        
        by mezis ( 595240 ) writes:
        
        Every single JPEG is lossy, for three reasons:
        
        a. Source (color) digital images use RGB colorspace (typically, the raw format is "RAW" with a Bayer layout). JPEG compresses three planes, with a YCrCb colorspace.
        Due to colorspace conversion and quantization error, you lose information. That's called "lossy".
        b. Even in lossless JPEG, each 64-pixel block is KR-transformed and quantized. Again, always lossy.
        c. No free lunch.
        
        Typically, even lossless JPEG makes you lose 1-2% of the total information (m
        
        Re:File size (Score:5, Informative)
        
        by Binary Boy ( 2407 ) writes: on Thursday July 16, 2009 @11:12PM (#28725743)
        
        Lossless JPEG and lossless JPEG2000 are both exactly that - lossless. Not perceptually lossless, which is what people often use to refer to high-quality, lossy JPEG/JPEG2000, or JPEG-LS. Lossless JPEG uses a PCM-like encoder, not DCT, AFAIR. Lossless JPEG and lossless JPEG2000 are, in fact, lossless, at least with regards to image data in supported color spaces. This is in part a result of *not* converting to YCrCb, since that conversion is lossy, of course. Not all Lossless JPEGs are 8bit YCrCb.
        Accusoft, for one, has a toolkit for building lossless JPEG applications which supports 16bit RGB and greyscale lossless JPEG modes.
        The near-lossless JPEG you're thinking of is JPEG-LS, which is perceptually lossless, and guarantees a maximum error rate that is generally neglible for almost all applications. This format gets better compression ratios than Lossless JPEG, of course.
        Neither the lossless or near-lossless JPEG modes are common though, outside of niche apps. Lossless JPEG2000 is, however, since almost all JPEG2000 libraries support it alongside the lossy modes.
        
        Parent Share
        twitter facebook
      - Re: (Score:2, Funny)
        
        by Anonymous Coward writes:
        
        >>Except for Lossless JPEG [wikipedia.org] standardized in 1993. But other than that, no there is no lossless jpeg.
        Katie Couric: What did John McCain do to try to stop the housing meltdown?
        Sarah Palin: He voted for legislation to more carefully regulate Fannie Mae and Freddie Mac to stop bad lending practices.
        Katie Couric: ...
        Katie Couric: Well, besides that, what did he do?
        Sarah Palin: ?
        And the funny thing is, we all remember this now as Sarah Palin not knowing the answer to the question, when it was
- NO not file size (Score:2)
  
  by frovingslosh ( 582462 ) writes:
  
  NO. Not file size. File size would be a potential test if all images were from the same original source and if they were only ever jpeg compressed once. Unfortunately, quite often one will come across images that have been jpeg compressed and re-compressed, and the final re-compression was done at "high quality', So the file is large for the image, but it still contains all of the jpeg artifacts from the lower quality compression. You can also see extra artifacts when one file has only been compressed once
I'm not an expert (Score:2)

by Flimzy ( 657419 ) writes:

But what if you saved both images in an uncompressed format (bmp?), then compressed them both using a lossless format (gzip?), and compared the file sizes...
Do it with a bunch of images, and I expect you'll discover that the low-quality-gzipped image will be smaller than the high-quality-gzipped image...
Maybe? *shrug*
- Re: (Score:2)
  
  by gurps_npc ( 621217 ) writes:
  
  I agree that this would probably be the simplest method. Note, I wonder if something as simple as examining the file size of the jpeg would be good enough for most cases.
- Re: (Score:2)
  
  by Bill, Shooter of Bul ( 629286 ) writes:
  
  Good idea. I'm also not an expert. Though, I would think there is a limit to how well this would work. If it were cell shaded to some extent, it might look better than a lossy jpg, but compress to a smaller size. The question is if there would be any point in between where loss of information would actually result in better image quality.
  Imagine a chess board is in the image. If an image is sort of lossy, the lines between the black and white might get a little blurred with some black running into some whit
- Re: (Score:2)
  
  by sznupi ( 719324 ) writes:
  
  I'm also not an expert, but I suspect it might work in the other direction far too often.
  Perhaps artifacts of low-quality jpeg images, embedded in simple stream of bmp, could look more like noise to general purpose compressor; more than "natural" photographs with gradual gradients.
  And random noise is incompressible.
File size or density? (Score:2)

by Durandal64 ( 658649 ) writes:

Have you tried just comparing the files' sizes with respect to the images' dimensions? It'll vary from encoder to encoder, but higher-quality JPEGs will be larger than lower-quality ones. You could just use the number of pixels in the picture and the file size to obtain a rough approximation of "quality per pixel" and choose the image with the higher value. It won't be perfect, but it's a lot easier than trying to pick out JPEG artifacts.

Also, the number of artifacts doesn't tell the full story. One imag
- Re: (Score:2)
  
  by PCM2 ( 4486 ) writes:
  
  And BTW, isn't this what most of us do already when we're searching Google Images?
Share your suggestions (Score:2)

by gehrehmee ( 16338 ) writes:

Given a set a pictures, it would be really nice to see them grouped by "these are several pictures of the same scene/object/subject". This is a tool I'm not aware of yet, and I'd love to hear what open-source tools people are using.
As a next step, it would be neat to pick out the one that's most in focus...
- Re: (Score:2)
  
  by Chabo ( 880571 ) writes:
  
  I saw a piece of software that does something similar to what you're talking about; recently I watched James May's Big Ideas [wikipedia.org], they showed a camera that you wear around to create a lifelog [wikipedia.org].
  The camera took photos every 30 seconds or so, and the software was able to divide sets of photos into "events"; it distinguished between the time the wearer was in the kitchen making breakfast, and when they sat at their computer typing up an article, for instance. I imagine that someone's created similar software for pub
- tineye? (Score:2)
  
  by E IS mC(Square) ( 721736 ) writes:
  
  Check out Tineye - http://tineye.com/faq [tineye.com]
  
  It does not do exactly what above post suggests, but it partially does what submitter asked (finding similar images on the net).
Try compressing both further (Score:2, Insightful)

by Ed Avis ( 5917 ) writes:

I suppose you could recompress both images as JPEG with various quality settings, then do a pixel-by-pixel comparison computing a difference measure between each of the two source images and its recompressed version. Presumably, the one with more JPEG artefacts to start with will be more similar to its compressed version, at a certain key level of compression. This relies on your compression program generating the same kind of artefacts as the one used to make the images, but I suppose that cjpeg with the
Requires original image in loss-less form (Score:2)

by xquark ( 649804 ) writes:

or else the problem is not truly resolvable. The other way is to
assume all the similar images come from the same source, if so then
its as simple as looking at the compression level in the file format
and the various levels of scaling applied to the lossy images.
ImageMagick can give you EXIF data. (Score:5, Informative)

by bcrowell ( 177657 ) writes: on Thursday July 16, 2009 @06:12PM (#28723623) Homepage

The ImageMagick package includes a command called identify, which can read the EXIF data in the JPEG file. You can use it like this:
identify -verbose creek.jpg | grep Quality
In my example, it gave " Quality: 94".
This will not work on very old cameras (from ca. 2002 or earlier?), because they don't have EXIF data. This is different info than you'd get by just comparing file sizes. The JPEG quality setting is not the only factor that can influence file size. File size can depend on resolution, JPEG quality, and other manipulations such as blurring or sharpening, adjusting brightness levels, etc.

Share
twitter facebook
- Re: (Score:3, Informative)
  
  by DotDotSlasher ( 675502 ) writes:
  
  imagemagick can also compare two images, and tell you how different they are. That is -- quantify the differences by returning a floating point number or two (PSNR, RMSE) in a way that a more-compressed JPEG image will return a correspondingly different floating point value. I know the question concerns two JPEG-compressed images, but if you do have an original image -- and you want to test which is closest to the original, ImageMagick can do that. Use the ImageMagick compare function.
  See http://www.imag [imagemagick.org]
- You're assuming a bit too much, aren't ya? (Score:2)
  
  by macraig ( 621737 ) writes:
  
  It's appears that you assume that he wants to compare images for which he himself is the source? What if the images he actually wants to compare are pr0n, of the same hi-res glamour photo sets obtained from different sources? He needs to decide which is the "best" pron to keep, right? (Never mind that he can probably jack off equally well to either/any... he's a COLLECTOR so it matters. :-)
  Such images will almost always have the EXIF data scrubbed from them, so your technique wouldn't work at all for rac
- - Re: (Score:2)
    
    by Jane Q. Public ( 1010737 ) writes:
    
    No guarantee, but the probability is extremely high. If you have two files of the same dimensions that are visually similar (i.e., different versions of the same picture), then the one with the higher dpi rating (which is not directly related to dimension in JPEG) and better quality (lower compression) as shown by EXIF data is almost certainly going to be the one with fewer artifacts in the real world. Of course it is possible to create situations in which that is not so, but they don't usually happen accid
Translation: Please help me with my porn... (Score:5, Insightful)

by Chyeld ( 713439 ) writes: <chyeld.gmail@com> on Thursday July 16, 2009 @06:13PM (#28723627)

Dear Slashdot,
Recently I checked my porn drive and realized that I have over 50 gigibytes of jpg quality porn collected. Unfortunately, I've noticed that a good portion of these are all the same picture of Natlie Portman eating hot grits. Could you please point me to a free program that will allow me to find the highest resolution, best quality version of this picture from my collection and delete the rest?
Many Thanks!

Share
twitter facebook
- Found it a while ago (Score:5, Informative)
  
  by sco08y ( 615665 ) writes: on Thursday July 16, 2009 @06:22PM (#28723749)
  
  I mean, you don't want second rate pictures in your pr0n stash?
  I had problems building it back then, let alone writing the scripts for it and the hassle of figuring out which images were duplicates, but this utility [schmorp.de] seems to fit the bill.
  
  Parent Share
  twitter facebook
- Re: (Score:2)
  
  by Pointy_Hair ( 133077 ) writes:
  
  ... and I've been wanking to it so often I actually -have- gone blind now and can't tell the versions apart, even on my new widescreen monitor!
jpeg quality != image quality (Score:2)

by johnrpenner ( 40054 ) writes:

all things the same, jpeg quality gives a good index to the quality of the image,
but it can be just as true that a lower jpeg quality image might be a better quality image.
for example, two images: the first image might be scanned off a badly faded
colour photocopy of a famous painting - it is saved at 300 dpi - approximately
2800 x 1200 pixels, and the jpeg quality set at 12 -- the second image is a
well lit photograph of the original painting, scanned on a scitex scanner,
and brought in as a tiff original -- a
use the JPEG underlying details (Score:2, Insightful)

by cellurl ( 906920 ) * writes:

To make a JPEG, you cut it into blocks, run the DCT [wikipedia.org] on each block and mess with the 4:2:2 color formula and pkzip the pieces... That said, I would think measuring the number of blocks would be related to number of artifacts... In my barbaric approach to engineering, (assuming there is no other suggested way on slashdot), I would get the source code to the JPEG encoder/decoder and print out statistics (number of blocks, block size) of each image...
It's easy (Score:5, Insightful)

by Anonymous Coward writes: on Thursday July 16, 2009 @06:16PM (#28723673)

Run the DCT and check how much it's been quantized. The higher the greatest common factor, the more it has been compressed.
Alternatively, check the raw data file size.

Share
twitter facebook
quantization tables (Score:3, Insightful)

by angryargus ( 559948 ) writes: on Thursday July 16, 2009 @06:17PM (#28723683)

Others have mentioned file size, but another good approach is to look at the quantization tables in the image as an overall quality factor. E.g., JPEG over RTP (RFC 2435) uses a quantization factor to represent the actual tables, and the value of 'Q' generally maps to quality of the image. Wikipedia's doc on JPEG has a less technical discussion of the topic, although the Q it uses is probably different from the example RFC.

Share
twitter facebook
Measure sharpness? (Score:4, Interesting)

by Anonymous Coward writes: on Thursday July 16, 2009 @06:18PM (#28723693)

Compute the root-mean-square difference between the original image and a gaussian-blurred version?
JPEG tends to soften details and reduce areas of sharp contrast, so the sharper result will probably
be better quality. This is similar to the PSNR metric for image quality.
Bonus: very fast, and can be done by convolution, which optimizes very efficiently.

Share
twitter facebook
- Re: (Score:2)
  
  by PCM2 ( 4486 ) writes:
  
  But this method requires a copy of the original -- or failing that, you'd need to already know which of the JPEGs is the highest quality, which defeats the purpose.
  - Re: (Score:2)
    
    by uhmmmm ( 512629 ) writes:
    
    No it doesn't. This method has another problem (see my replies to it), but other than that, it could work. He's suggesting that to each copy of the image, you look at the difference between that copy and a blurred version of it. This will give you an idea of how sharp that copy is. And since JPEG throws out high frequency information first, resulting in blurring, it would appear at first glance that the sharper image should be the higher quality one.
    As I said in another comment though, JPEG operates on
- Re:Measure sharpness? (Score:4, Insightful)
  
  by uhmmmm ( 512629 ) writes: <.uhmmmm. .at. .gmail.com.> on Thursday July 16, 2009 @06:52PM (#28724109) Homepage
  
  Even faster is look at the DCT coefficients in the file itself. Doesn't even require decoding - JPEG compression works by quantizing the coefficients more heavily for higher compression rates, and particularly for the high frequency coefficients. If more high frequency coefficients are zero, it's been quantized more heavily, and is lower quality.
  Now, it's not foolproof. If one copy went through some intermediate processing (color dithering or something) before the final JPEG version was saved, it may have lost quality in places not accounted for by this method. Comparing quality of two differently-sized images is also not as straightforward either.
  
  Parent Share
  twitter facebook
- Re: (Score:2)
  
  by uhmmmm ( 512629 ) writes:
  
  Also, JPEG works on blocks. While it's true that JPEG gets rid of high frequency details first (and thus results in blurring), this is only useful within each block. You can have high contrast areas at the edge of each block, and this is actually often some of the most annoying artifacting in images compressed at very low quality. So just because it has sharp edges doesn't mean it's high quality.
DCT (Score:5, Informative)

by tomz16 ( 992375 ) writes: on Thursday July 16, 2009 @06:18PM (#28723695)

Just look at the manner in which JPEGs are encoded for your answer!
Take the DCT (discrete cosine transform) of blocks of pixels throughout the image. Examine the frequency content of the each of these blocks and determine the amount of spatial frequency suppression. This will correlate with the quality factor used during compression!

Share
twitter facebook
- Re: (Score:3, Insightful)
  
  by mikenap ( 1258526 ) writes:
  
  This seems to me the best suggestion, and there's a simple visual way to accomplish it! The hardest hit part of the image is going to be the chroma information, which your eye normally has reduced resolution sensitivity for in a normal scene. To overcome this, load your JPEGs into your favorite image editor and crank the saturation to the max(this throws away the luminance data). Now the JPEG artifacts in the chroma information will HIT YOU IN THE FACE, even in images that seemed rather clean before. Pick
- Re:DCT (Score:4, Insightful)
  
  by eggnoglatte ( 1047660 ) writes: on Thursday July 16, 2009 @10:12PM (#28725553)
  
  That works, but only if you have exact, pixel-to-pixel correspondence between the photos. It won't work if you just grab 2 photos from flicker that both show the Eiffel tower, and you wonder which one is "better".
  Luckly, there is a simple way to do it: use jpegtran to extract the quantization table form each image. Pick the one with the smaller values. This can easily be scripted.
  Caveat: this will not work if the images have been decoded and re-coded multiple times.
  
  Parent Share
  twitter facebook
use a "difference matte" (Score:4, Informative)

by Anonymous Coward writes: on Thursday July 16, 2009 @06:20PM (#28723719)

load up both images in adobe after effects or some other image compositing program and apply a "difference matte"
Any differences in pixel values between the two images will show up as black on a white background or vise versa...
adam
BOXXlabs

Share
twitter facebook
- Re: (Score:3, Insightful)
  
  by uhmmmm ( 512629 ) writes:
  
  So, that will show you which parts differ. How do you tell which is higher quality? Sure, you can probably do it by eye. But it sounds like the poster wants a fully automated method.
Try ThumbsPlus (Score:3, Informative)

by Anonymous Coward writes: on Thursday July 16, 2009 @06:21PM (#28723729)

ThumbsPlus is an image management tool. It has a feature called "find similar" that should do what you want as far as identifying to pictures that are the same except for the compression level. Once the similar picture is found you can use ThumbsPlus to look at the file sizes and see which one is bigger.

Share
twitter facebook
Bits per pixel (Score:2)

by Citizen of Earth ( 569446 ) writes:

Compute the number of bits per pixel of the image data.
image quality measures (Score:5, Informative)

by trb ( 8509 ) writes: on Thursday July 16, 2009 @06:28PM (#28723813)

google (or scholar-google) for Hosaka plots, or image quality measures. Ref:
HOSAKA K., A new picture quality evaluation method.
Proc. International Picture Coding Symposium, Tokyo, Japan, 1986, 17-18.

Share
twitter facebook
- Check the quantization (Score:2)
  
  by PhrostyMcByte ( 589271 ) writes:
  
  I remember a Slashdot article of a guy who used JPEG quantization to detect if images were photoshopped... it had an example of a terrorist adding books. Can't find it via google tho.
  - Re:Check the quantization (Score:4, Informative)
    
    by Animaether ( 411575 ) writes: on Thursday July 16, 2009 @07:19PM (#28724379) Journal
    
    http://www.cs.dartmouth.edu/farid/research/tampering.html [dartmouth.edu]
    http://www.cs.dartmouth.edu/farid/publications/tr06a.html [dartmouth.edu]
    
    Parent Share
    twitter facebook
Blur Detection? (Score:2, Informative)

by HashDefine ( 590370 ) writes:

I wonder if out of focus or blue detection methods will give you a metric which varies with the level of jpeg artifcats, after all the jpeg artifacts should make it more difficult to do things like edge detections etc which are the same the things that made more difficult by blurry and out of focus images
A google search for blur detection should bring up things that you can try, Here [kerrywong.com] is series of posts that to do a good job of explaining some of the work involved
Fourier transform (Score:2, Interesting)

by maxwell demon ( 590494 ) writes:

Assuming the only quality loss is due to JPEG compression, I guess a fourier transform should give you a hint: I think the worse quality image should have lower amplitude of high frequencies.
Of course, that criterion may be misleading if the image was otherwise modified. For example noise filters will typically reduce high frequencies as well, but you'd generally consider the result superior (otherwise you woldn't have applied the filter).
Filters (Score:5, Funny)

by mypalmike ( 454265 ) writes: on Thursday July 16, 2009 @06:47PM (#28724045) Homepage

First, make a bumpmap of each image. Then, render them onto quads with a light at a 45 degree angle to the surface normal. Run a gaussian blur on each resulting image. Then run a quantize filter, followed by lens flare, solarize, and edge-detect. At this point, the answer will be clear: both images look horrible.

Share
twitter facebook
Automatic JPEG Artifact Removal (Score:4, Interesting)

by yet-another-lobbyist ( 1276848 ) writes: on Thursday July 16, 2009 @06:55PM (#28724165)

For what it's worth: I remember using Paint Shop Pro 9 a few years ago. It has a function called "Removal of JPEG artifacts" (or similar). I remember being surprised how well it worked. I also remember that PSP has quite good functionality for batch processing. So what you could do is use the "remove artifact" function and look at the difference before/after this function. The image with the bigger difference has to be the one of lower quality.
I am not sure if there is a tool that automatically calculates the difference between two images, but this is a task simple enough to be coded in a few lines (given the right libraries are at hand). For each color channel (RGB) of each pixel, you basically just calculate the square of the difference between the two images. Then you add all these numbers up (all pixels, all color channels). The bigger this number is, the bigger the difference between the images.
Maybe not your push-one-button solution, but should be doable. Just my $0.02.

Share
twitter facebook
compare against the static baseline. (Score:2)

by circusboy ( 580130 ) writes:

compare both images against the original, not each other.
count number of pixels different from the original, then calculate max and average difference between either image and the original.
decide which parameter means more to you.
go forward from there.
- Re: (Score:2)
  
  by circusboy ( 580130 ) writes:
  
  adding to that, you can run the following algorithm on the diff images.
  1. blur image by an arbitrary value,
  2. darken the image by an arbitrary value.
  3. repeat until image is all black.
  count the number of repetitions. given various values for steps one and two, you can tune the algorithm to find images that have large areas of mismatch.
  possibly not useful to you, but have found it good for validation testing for image manipulation software.
How about audio? (Score:2, Interesting)

by bondiblueos9 ( 1599575 ) writes:

I would very much like to do the same with audio. I have so many duplicate tracks in my music collection in different formats and bitrates.
- Re: (Score:2)
  
  by notseamus ( 1295248 ) writes:
  
  If you're running a mac and have all your files in an itunes library, then Dupin [dougscripts.com] is extremely useful. It matches on name, size, length, bit rate, or all at once.
  It's pretty useful, and the freeware version lets your delete from drive as well as library.
  If you're on windows, I searched for years and couldn't find anything :(
Look at the DCT coefficients (Score:4, Informative)

by uhmmmm ( 512629 ) writes: <.uhmmmm. .at. .gmail.com.> on Thursday July 16, 2009 @07:09PM (#28724301) Homepage

JPEG works by breaking the image into 8x8 blocks and doing a two dimensional discrete cosine transform on each of the color planes for each block. At this point, no information is lost (except possibly by some slight inaccuracies converting from RGB to YUV as is used in JPEG). The step where the artifacts are introduced is in quantizing the coefficients. High frequency coefficients are considered less important and are quantized more than low frequency coefficients. The level of quantization is raised across the board to increase the level of compression.
Now, how is this useful? The reason heavily quantizing results in higher compression is because the coefficients get smaller. In fact, many become zero, which is particularly good for compression - and the high frequency coefficients in particular tend towards zero. So partially decode the images and look at the DCT coefficients. The image with more high frequency coefficients which are zero is likely the lower quality one.

Share
twitter facebook
Image Quality Metrics. (Score:2)

by Jeremy Erwin ( 2054 ) writes:

Something like $\frac{1}{N} \sum_{i=1}^{N}(x_i-y_i)^2$, where $x$ and $y$ are arrays of pixels, and $N is the number of pixels in each array?
Is there a way to find out the compression engine? (Score:2)

by ID000001 ( 753578 ) writes:

Does JPEG header have the compression method listed as well as compression ratio? If not, is there any way to figure out what kinda compresison engine is used base on how an image is constructed?

If so, simply do some testing against some of the most popular compression engine base on the artifact to determines what engine is used, then find out their compression ratio (perhaps a simple files size might work?). Then simply pick the images with the best quality base on engine used and ratio?
variation (Score:2)

by superwiz ( 655733 ) writes:

Compute the variance of the Fourier coefficients within each block and then calculate the average for each image. The better quality image should have lower variance. If a block has a lot of edges, then the higher frequency coefficients should have much higher values than the lower ones. If a block is uniform, then the lower frequency coefficients should have higher values. So if you have a good image, it will be easy to see the difference between uniform parts and edges. That is the coefficients of th
It depends what you want.. (Score:2)

by Paracelcus ( 151056 ) writes:

find dupes on the internet http://tineye.com/ [tineye.com]
find dupes on your HDD http://www.bigbangenterprises.de/en/doublekiller/ [bigbangenterprises.de]
Just sort by the size (Score:2)

by MikeBabcock ( 65886 ) writes:

JPEG is pretty efficient at compressing images -- the only way they get smaller on average is by increasing the quality loss. Therefore, the larger of the two images in bytes is probably the better looking copy.
Subjective... (Score:2)

by GWBasic ( 900357 ) writes:

Well, your problem is that image quality is subjective. Can computers make good subjective judgements? Not really.
Let's say you count the number of pixels that are different? Well, what if JPEG usually slightly alters the brightness? You could weight the difference, but what if JPEG sometimes moves an edge by a pixel?
I think if you study a bit about how JPEG works, you might find that you can computationally determine how much information that is lost; but that does not mean that your computed number in
Expert's answer (Score:2, Interesting)

by mezis ( 595240 ) writes:

Exploit JPEG's weakness.

JPEG encodes pixels by using a cosine transform on 8x8 pixel blocks. The most perceptually visible artifacts (and the artifacts most suceptible to cause troble to machine vision algorithms) appear on block boundaries.

Short answer:
a. 2D-FFT your image
b. Use the value of the 8-pixel period response in X and Y direction as your quality metric. The higher, the worse the quality.

This is a crude 1st approximation but works.
Some things aren't doable yet (Score:2)

by PingXao ( 153057 ) writes:

Aside from the mathematical tests some have suggested, my gut tells me this is going to be almost impossible. There are tasks that a human can perform that just aren't doable given the present state of our software systems. The gap has as much to do with our understanding about how we perceive through our senses as it does with algorithms and calculation methodologies. We just don't know yet enough about the underlying processes to make a computer do it.
The same goes for other areas where AI is sorely la
Sorting steps to find originals (Score:3, Informative)

by rwa2 ( 4391 ) * writes: on Friday July 17, 2009 @11:43AM (#28730659) Homepage Journal
You probably don't necessarily want to find the "best quality" image, but rather the image that was closest to the original.
I take it you're either trying to eliminate the low-quality duplicates or thumbnails from a really large collection of pr0n, or trying to write an image search engine that tries to present the "best" rendition of a particular image first.
1. As a quick first pass (after you've run through to collect all the similar images into separate groups), you'd obviously want to find the version of the image with the highest resolution. This might let you easily throw out thumbnails or scaled down versions you might come across. Of course, some dorks will upscale images and post them somewhere, so you might still want to hang on to some of them for the second stage.
2. For the second pass, you'd likely want to scan through the metadata first, especially stuff exposed by EXIF. So you'd want to give higher scores to EXIF data that makes it sound like it came directly off a digital camera or scanner, and bump down the desirability of pictures that appeared to have been edited by any sort of photo editing software.
3. Then maybe you want to look at something that would rank down watermarks or other modifications.
4. Another step would be to compare compression quality, but I think that's what most of the other posts are concentrating on. But this is a difficult step because it can be easily fooled, since idiots can re-save a low quality image with the compression quality cranked all the way up so the file size becomes high even though the actual image quality is worse than the original. You probably need to run it through one of those "photoshop detectors" that could tell you whether the image has been through smoothing or other filters in a photo editor. The originals (especially in raw format and maybe high quality JPEG) will have a certain type of CCD noise signature that your software might be able to detect. In the same vein, a poorly-compressed JPEG will have lots of JPEG quantization artifacts that your software might be able to detect as well. Otherwise, you're kinda left with zooming in on pics and eyeballing it.
5. Finally you might be left with a group of images that are exactly the same but have different file names... you probably want some way to store some of the more useful bits of descriptive text as search/tag metadata, but then choose the most consistent file naming convention or slap on your own based on your own metadata.
Hopefully this gives you a start to important parts of the process that you might have overlooked...
Share
twitter facebook
- Re:AI problem? (Score:4, Interesting)
  
  by Robotbeat ( 461248 ) writes: on Thursday July 16, 2009 @06:10PM (#28723591) Journal
  
  ...it will simply require a human-level brain.
  How about Amazon's Mechanical Turk service?
  https://www.mturk.com/ [mturk.com]
  
  Parent Share
  twitter facebook
  - Re: (Score:3, Funny)
    
    by eikonoklastes ( 530797 ) writes:
    
    Well, of course, how else would you do it? It's so simple.
  - - Re: (Score:2, Informative)
      
      by kpoole55 ( 1102793 ) writes:
      
      I've been lax, in a way, in my pruning of late so the findimagedupes program found about 28000 groups of near duplicate images. Finding that many was a surprise and that's why I started looking to see if a program had been written yet for the next step, finding the better image. I wrote a little script that prunes the identical files but now run into the problem of non-identical files that contain the same or nearly the same image.
- Re: (Score:3, Insightful)
  
  by nametaken ( 610866 ) writes:
  
  You're right, it needs to be done by humans to be sure.
  Amazon's Mechanical Turk should do the trick.
  https://www.mturk.com/mturk/welcome [mturk.com]
- Re:AI problem? (Score:5, Interesting)
  
  by CajunArson ( 465943 ) writes: on Thursday July 16, 2009 @06:49PM (#28724061) Journal
  
  I don't know about "quality", but frankly it shouldn't be too hard to compare similar images just by doing simple mathematical analysis on the results. I'm only vaguely familiar with image compression, but if a "worse" JPEG image is more blocky, would it be possible to run edge detection to find the most clearly defined blocks that indicates a particular picture is producing "worse" results? That's just one idea, I'm sure people who know the compression better can name many other properties that could easily be measured automatically.
  What a computer can't do is tell you if the image is subjectively worse, unless the same metric that the human uses to subjectively judge a picture happens to match the algorithm the computer is using, and even then it could vary by picture to picture. For example, a highly colorful picture might hide the artifacting much better than a picture that features lots of text. While the "blockiness" would be the same mathematically, the subjective human viewing it will notice the artifacts in the text much more.
  
  Parent Share
  twitter facebook
  - Re: (Score:3, Insightful)
    
    by CajunArson ( 465943 ) writes:
    
    And to reply to myself.. several other posters have noted that taking the DCT of the compression blocks in the image will give information on how highly compressed the image is... there's one example.
  - Re: (Score:3, Insightful)
    
    by moderatorrater ( 1095745 ) writes:
    
    Even simpler mathematical analysis would include such techniques as seeing which one takes up more disk space. Last I checked, that was very highly correlated with compression level.
    - Re: (Score:3, Insightful)
      
      by scdeimos ( 632778 ) writes:
      
      That's only a reasonable indicator if the two copies of the same image you are comparing are also the same resolution. It's not hard to have a higher resolution image consume less disk space if the compression level has been bumped up. Also, different programs usually produce different JFIF streams even when set to the same compression level and using the same *uncompressed* source image, making the DCT size approach even less reliable.
      - Re: (Score:3, Insightful)
        
        by nahdude812 ( 88157 ) * writes:
        
        This just about gets to the heart of it. "Better" is a subjective term, so choosing better quality images is not going to be something everyone can agree on. Your example nails it. If you have two copies of the same image, one is higher resolution than the other, but saved with a higher compression rate, which is better? The answer is going to be "it depends on if the noise introduced by the higher compression annoys me more than the reduced information in the lower resolution image."
        If the compression
- Re:AI problem? (Score:5, Informative)
  
  by arose ( 644256 ) writes: on Thursday July 16, 2009 @08:38PM (#28724977)
  
  AI or small utility [schmorp.de]... You never know with computers ;)
  
  Parent Share
  twitter facebook
  - Re:AI problem? (Score:5, Informative)
    
    by bendodge ( 998616 ) writes: <bendodge@bsgproY ... s.com minus poet> on Thursday July 16, 2009 @09:41PM (#28725395) Homepage Journal
    
    Since the mods haven't noticed, and I don't have mod points, let me point out that THIS POST HAS THE ANSWER. A real program that will do what the asker wants. The source is available, but I can't seem to find its license (it includes some of the Independent JPEG Goup's code). Also, doesn't a jpeg's EXIF data or some other tag in the file tell you what quality it was saved at?
    
    Parent Share
    twitter facebook
    - Re: (Score:3, Interesting)
      
      by adolf ( 21054 ) writes:
      
      It almost does what he wants. He doesn't spell it out, but it seems strongly implied that he also wants a system capable of automatically finding these duplicates by itself, and then automatically determining which image is "best."
      Which seems obvious, to me: If he's got enough photos of sufficient disorganization that he can't tell automatically which duplicate is best, then there probably isn't any straight-forward way (with filenames or directory trees or whatever) to find out which ones are dupes to be
      - Re:AI problem? (Score:4, Informative)
        
        by bh_doc ( 930270 ) writes: <brendon@quantumf ... l.net minus city> on Friday July 17, 2009 @02:29AM (#28726599) Homepage
        
        http://www.jhnc.org/findimagedupes/
        There's a bunch, but I know you can construct command line operations with this one. I imagine you could construct a system from this and the parent program that will find dupes, then nuke the poorer quality of each, or whatever.
        
        Parent Share
        twitter facebook
- - Re:AI problem? (Score:4, Funny)
    
    by lunchlady55 ( 471982 ) writes: on Thursday July 16, 2009 @06:22PM (#28723745)
    
    Oh sure, it starts out innocently enough - pick the better image. Next thing you know Skynet's decided that it's the better LIFE-FORM.
    AI - JUST SAY NO!
    Brought to you by the Coalition for Human Survival (C) Aug. 29, 1997
    
    Parent Share
    twitter facebook
    - - Re: (Score:3, Informative)
        
        by fractoid ( 1076465 ) writes:
        
        Thou shalt not make a machine in the likeness of a human mind.
- - Re:AI problem? (Score:5, Insightful)
    
    by Spy der Mann ( 805235 ) writes: <`moc.liamg' `ta' `todhsals.nnamredyps'> on Thursday July 16, 2009 @08:37PM (#28724975) Homepage Journal
    
    Here's a simple but expensive formula:
    1. Get the image
    2. Compress it severely.
    3. Compare the difference between original and the compressed.
    The lower the difference, the lower the image quality.
    4. Profit!
    Or you could just measure the amount of data in the DCT space. Duh.
    
    Parent Share
    twitter facebook
    - Re: (Score:3, Insightful)
      
      by VanessaE ( 970834 ) writes:
      
      Just checking the size of the file (or, I suspect, just the size of the DCT data) won't always work. Sometimes an image can end up growing in size slightly while losing quality, depending on the nature of the image and the settings of the imaging program.
      Things such as thin wires, multi-colored ribbon cable, close-ups of a circuit board, and other images with lots of similar details seem to benefit most from this kind of tweaking, mainly thanks to the placement and qualities of the artifacts, rather than
  - - Re: (Score:3, Insightful)
      
      by SlashWombat ( 1227578 ) writes:
      
      Unfortunately, its not all that easy to compare. In general, the file with the higher byte count will be the better image, BUT ... The problem is there are different ways to compress the same picture. (There are several "controls", even in baseline JPEG. (Where the "quantisation" steps occur, where the high frequency cutoff for each macroblock occurs. Then there are different ways for the JPEG engine to entropy encode the bitstream. IE: Arithmetic coding is allowed by the JPEG standard, however, due to pate
- Re:Filesize is a hint (Score:5, Informative)
  
  by thethibs ( 882667 ) writes: on Thursday July 16, 2009 @06:34PM (#28723887) Homepage
  
  More Noise = Less Compression
  
  Parent Share
  twitter facebook
- Re: (Score:2)
  
  by jhfry ( 829244 ) writes:
  
  Software is well suited to detecting patterns, including patterns that might appear as distracting artifacts in an image. Just because subjectively the pictures are both equally similar to the original, doesn't mean that they are mathematically similar.
  I can imagine a method for comparing two images to an original image and scoring the two based upon how similar they are to the original while detecting and deducting for distracting compression artifacts. Ironically this method would be very similar to the

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Easy (Score:3, Interesting)

Re:Easy (Score:4, Insightful)

File size (Score:2, Insightful)

Re:File size (Score:5, Informative)

Re: (Score:3, Interesting)

Re: (Score:3, Insightful)

Re: (Score:2)

Re:File size (Score:4, Insightful)

Re:File size (Score:4, Insightful)

Possible Method... (Score:2)

Re:File size (Score:5, Interesting)

Re: (Score:2)

Re:File size (Score:5, Funny)

Re: (Score:3, Interesting)

Re: (Score:3, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:File size (Score:5, Insightful)

Re:File size (Score:5, Informative)

Re:File size (Score:5, Informative)

Re: (Score:2, Interesting)

Re:File size (Score:5, Informative)

Re: (Score:2, Funny)

NO not file size (Score:2)

I'm not an expert (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

File size or density? (Score:2)

Re: (Score:2)

Share your suggestions (Score:2)

Re: (Score:2)

tineye? (Score:2)

Try compressing both further (Score:2, Insightful)

Requires original image in loss-less form (Score:2)

ImageMagick can give you EXIF data. (Score:5, Informative)

Re: (Score:3, Informative)

You're assuming a bit too much, aren't ya? (Score:2)

Re: (Score:2)

Translation: Please help me with my porn... (Score:5, Insightful)

Found it a while ago (Score:5, Informative)

Re: (Score:2)

jpeg quality != image quality (Score:2)

use the JPEG underlying details (Score:2, Insightful)

It's easy (Score:5, Insightful)

quantization tables (Score:3, Insightful)

Measure sharpness? (Score:4, Interesting)

Re: (Score:2)

Re: (Score:2)

Re:Measure sharpness? (Score:4, Insightful)

Re: (Score:2)

DCT (Score:5, Informative)

Re: (Score:3, Insightful)

Re:DCT (Score:4, Insightful)

use a "difference matte" (Score:4, Informative)

Re: (Score:3, Insightful)

Try ThumbsPlus (Score:3, Informative)

Bits per pixel (Score:2)

image quality measures (Score:5, Informative)

Check the quantization (Score:2)

Re:Check the quantization (Score:4, Informative)

Blur Detection? (Score:2, Informative)

Fourier transform (Score:2, Interesting)

Filters (Score:5, Funny)

Automatic JPEG Artifact Removal (Score:4, Interesting)

compare against the static baseline. (Score:2)

Re: (Score:2)

How about audio? (Score:2, Interesting)

Re: (Score:2)

Look at the DCT coefficients (Score:4, Informative)

Image Quality Metrics. (Score:2)

Is there a way to find out the compression engine? (Score:2)

variation (Score:2)

It depends what you want.. (Score:2)

Just sort by the size (Score:2)

Subjective... (Score:2)

Expert's answer (Score:2, Interesting)

Some things aren't doable yet (Score:2)

Sorting steps to find originals (Score:3, Informative)