
Choosing Better-Quality JPEG Images With Software? (291 comments)

Posted by timothy on Thursday July 16, @05:02PM
from the on-the-tip-of-my-script dept.
graphics
software
kpoole55 writes "I've been googling for an answer to a question and I'm not making much progress. The problem is image collections, and finding the better of near-duplicate images. There are many programs, free and costly, CLI or GUI oriented, for finding visually similar images — but I'm looking for a next step in the process. It's known that saving the same source image in JPEG format at different quality levels produces different images, the one at the lower quality having more JPEG artifacts. I've been trying to find a method to compare two visually similar JPEG images and select the one with the fewest JPEG artifacts (or the one with the most JPEG artifacts, either will serve.) I also suspect that this is going to be one of those 'Well, of course, how else would you do it? It's so simple.' moments."

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Easy (Score:3, Interesting)

    by Anonymous Coward on Thursday July 16, @05:05PM (#28723511)

    Paste both images in your image editor of choice, one layer on top of each other, apply a difference/subtraction filter.

  • by bcrowell (177657) on Thursday July 16, @05:12PM (#28723623) Homepage

    The ImageMagick package includes a command called identify, which can read the EXIF data in the JPEG file. You can use it like this:

    identify -verbose creek.jpg | grep Quality

    In my example, it gave " Quality: 94".

    This will not work on very old cameras (from ca. 2002 or earlier?), because they don't have EXIF data. This is different info than you'd get by just comparing file sizes. The JPEG quality setting is not the only factor that can influence file size. File size can depend on resolution, JPEG quality, and other manipulations such as blurring or sharpening, adjusting brightness levels, etc.
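To script the identify call above, something like the following might do. It is a minimal sketch, assuming ImageMagick's identify is on the PATH; the helper names parse_quality and jpeg_quality are made up for illustration. My understanding is that ImageMagick estimates this number from the file's quantization tables, so treat it as an approximation of the encoder setting rather than a stored value.

```python
import re
import subprocess


def parse_quality(identify_output):
    """Return the integer after 'Quality:' in identify's verbose output, or None."""
    match = re.search(r"^\s*Quality:\s*(\d+)", identify_output, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def jpeg_quality(path):
    """Run `identify -verbose PATH` (needs ImageMagick installed) and parse the result."""
    result = subprocess.run(
        ["identify", "-verbose", path],
        capture_output=True, text=True, check=True,
    )
    return parse_quality(result.stdout)
```

Calling jpeg_quality on each near-duplicate and keeping the file with the higher number is the whole workflow; parse_quality is split out so it can be checked without ImageMagick installed.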

    • Re: (Score:3, Informative)

      ImageMagick can also compare two images and tell you how different they are. That is, it quantifies the difference as a floating-point number or two (PSNR, RMSE), so a more heavily compressed JPEG scores correspondingly worse against the original. I know the question concerns two JPEG-compressed images, but if you do have an original image and want to test which copy is closest to it, ImageMagick can do that. Use the ImageMagick compare function.
      See http://www.imag [imagemagick.org]
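For anyone who wants to see what the two numbers mean, here is a small pure-Python sketch of RMSE and PSNR. It assumes both images are already decoded into equally sized lists of rows of 0-255 grayscale values; it is not ImageMagick's implementation, just the textbook definitions.

```python
import math


def rmse(a, b):
    """Root-mean-square error between two equally sized grayscale images."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return math.sqrt(total / count)


def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = rmse(a, b)
    return math.inf if err == 0 else 20 * math.log10(peak / err)
```

Lower RMSE (or higher PSNR) against the original means the copy is closer to it.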
  • Dear Slashdot,

    Recently I checked my porn drive and realized that I have over 50 gigabytes of jpg quality porn collected. Unfortunately, I've noticed that a good portion of these are all the same picture of Natalie Portman eating hot grits. Could you please point me to a free program that will allow me to find the highest resolution, best quality version of this picture from my collection and delete the rest?

    Many Thanks!

    • Found it a while ago (Score:5, Informative)

      by sco08y (615665) on Thursday July 16, @05:22PM (#28723749)

      I mean, you don't want second rate pictures in your pr0n stash?

      I had problems building it back then, let alone writing the scripts for it and the hassle of figuring out which images were duplicates, but this utility [schmorp.de] seems to fit the bill.

  • It's easy (Score:5, Insightful)

    by Anonymous Coward on Thursday July 16, @05:16PM (#28723673)

    Run the DCT and check how much it's been quantized. The higher the greatest common factor, the more it has been compressed.

    Alternatively, check the raw data file size.
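The greatest-common-factor idea can be sketched in a few lines. This assumes you already have the dequantized integer coefficients for one frequency position, collected across many blocks (how you get them out of the file is left open here), and estimate_step is a made-up name.

```python
from functools import reduce
from math import gcd


def estimate_step(coefficients):
    """GCD of the non-zero coefficient magnitudes; 0 if every coefficient is zero."""
    return reduce(gcd, (abs(c) for c in coefficients if c != 0), 0)
```

A larger result suggests a coarser quantization step at that frequency. It can overestimate when the quantized levels happen to share a factor, and it will not work on coefficients recomputed from decoded pixels, since those are no longer exact multiples.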

  • by angryargus (559948) on Thursday July 16, @05:17PM (#28723683)

    Others have mentioned file size, but another good approach is to look at the quantization tables in the image as an overall quality factor. E.g., JPEG over RTP (RFC 2435) uses a quantization factor to represent the actual tables, and the value of 'Q' generally maps to quality of the image. Wikipedia's doc on JPEG has a less technical discussion of the topic, although the Q it uses is probably different from the example RFC.
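As a rough illustration of how a Q value maps to tables, here is a sketch of the scaling rule given in RFC 2435 (the same rule the IJG library uses), applied to the example luminance table from Annex K of the JPEG standard. The table is written in row-major order here, and estimate_q is a hypothetical helper that simply searches for the Q whose scaled table is closest to the one found in a file.

```python
# Example luminance quantization table (JPEG standard, Annex K), row-major order.
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scale_table(q, base=BASE_LUMA):
    """Scale the base table for quality factor q (1..99), clamping entries to 1..255."""
    factor = min(max(q, 1), 99)
    scale = 5000 // factor if factor < 50 else 200 - 2 * factor
    return [min(max((v * scale + 50) // 100, 1), 255) for v in base]


def estimate_q(table, base=BASE_LUMA):
    """Return the q in 1..99 whose scaled table is closest to `table`."""
    def distance(q):
        return sum(abs(a - b) for a, b in zip(scale_table(q, base), table))
    return min(range(1, 100), key=distance)
```

This only gives a sensible answer for encoders that scale the Annex K tables; software with custom tables will land on whatever Q is nearest, which may mean little.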

  • Measure sharpness? (Score:4, Interesting)

    by Anonymous Coward on Thursday July 16, @05:18PM (#28723693)

    Compute the root-mean-square difference between the original image and a gaussian-blurred version?
    JPEG tends to soften details and reduce areas of sharp contrast, so the sharper result will probably
    be better quality. This is similar to the PSNR metric for image quality.

    Bonus: very fast, and can be done by convolution, which optimizes very efficiently.
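A pure-Python sketch of that sharpness score, assuming a decoded grayscale image as a list of rows. It uses a 3x3 binomial kernel as a small stand-in for a proper Gaussian blur, and the function names are made up.

```python
import math


def blur3(img):
    """3x3 binomial blur (a small Gaussian approximation) with edge replication."""
    h, w = len(img), len(img[0])
    weights = (1, 2, 1)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                yy = min(max(y + dy, 0), h - 1)
                for dx in (-1, 0, 1):
                    xx = min(max(x + dx, 0), w - 1)
                    acc += weights[dy + 1] * weights[dx + 1] * img[yy][xx]
            row.append(acc / 16)
        out.append(row)
    return out


def sharpness(img):
    """RMS difference between the image and its blurred version."""
    blurred = blur3(img)
    h, w = len(img), len(img[0])
    total = sum((img[y][x] - blurred[y][x]) ** 2 for y in range(h) for x in range(w))
    return math.sqrt(total / (h * w))
```

Of two copies of the same picture at the same size, the one with the higher score kept more fine detail. Note that blocking artifacts add hard edges too, so a very blocky copy can score deceptively well.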

    • by uhmmmm (512629) <uhmmmm&gmail,com> on Thursday July 16, @05:52PM (#28724109) Homepage

      Even faster is look at the DCT coefficients in the file itself. Doesn't even require decoding - JPEG compression works by quantizing the coefficients more heavily for higher compression rates, and particularly for the high frequency coefficients. If more high frequency coefficients are zero, it's been quantized more heavily, and is lower quality.

      Now, it's not foolproof. If one copy went through some intermediate processing (color dithering or something) before the final JPEG version was saved, it may have lost quality in places not accounted for by this method. Comparing the quality of two differently-sized images is also not as straightforward.

  • DCT (Score:5, Informative)

    by tomz16 (992375) on Thursday July 16, @05:18PM (#28723695)

    Just look at the manner in which JPEGs are encoded for your answer!

    Take the DCT (discrete cosine transform) of blocks of pixels throughout the image. Examine the frequency content of each of these blocks and determine the amount of spatial frequency suppression. This will correlate with the quality factor used during compression!
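Here is a slow but readable sketch of that idea: a direct 8x8 DCT-II with the usual JPEG normalisation, plus a hypothetical high_frequency_energy helper that sums the squared coefficients above an arbitrary cutoff. It assumes the block is a list of 8 rows of 8 grayscale values aligned to the JPEG block grid, and it skips the level shift by 128 that a real encoder applies (which only affects the DC term).

```python
import math


def dct_8x8(block):
    """2D DCT-II of an 8x8 block; result[v][u] is vertical freq v, horizontal freq u."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            s = 0.0
            for y in range(8):
                for x in range(8):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[v][u] = 0.25 * c(u) * c(v) * s
    return out


def high_frequency_energy(block, cutoff=4):
    """Sum of squared coefficients whose frequency index u + v is at least `cutoff`."""
    coeffs = dct_8x8(block)
    return sum(coeffs[v][u] ** 2
               for v in range(8) for u in range(8) if u + v >= cutoff)
```

Averaged over all blocks, the copy with less high-frequency energy has probably had more detail suppressed.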

       

    • Re: (Score:3, Insightful)

      This seems to me the best suggestion, and there's a simple visual way to accomplish it! The hardest hit part of the image is going to be the chroma information, which your eye normally has reduced resolution sensitivity for in a normal scene. To overcome this, load your JPEGs into your favorite image editor and crank the saturation to the max (this throws away the luminance data). Now the JPEG artifacts in the chroma information will HIT YOU IN THE FACE, even in images that seemed rather clean before. Pick
    • Re:DCT (Score:4, Insightful)

      by eggnoglatte (1047660) on Thursday July 16, @09:12PM (#28725553)

      That works, but only if you have exact, pixel-to-pixel correspondence between the photos. It won't work if you just grab 2 photos from Flickr that both show the Eiffel tower, and you wonder which one is "better".

      Luckily, there is a simple way to do it: use jpegtran to extract the quantization table from each image. Pick the one with the smaller values. This can easily be scripted.

      Caveat: this will not work if the images have been decoded and re-coded multiple times.
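One way to script the table comparison without any external tool is to read the DQT segments straight out of the file. This is a minimal sketch using only the standard library; it walks the marker segments up to the start of scan and collects every quantization table it finds. The names read_quant_tables and mean_quant are made up, and comparing the plain mean of all table entries is just one simple choice of "smaller values".

```python
import struct


def read_quant_tables(data):
    """Return {table_id: [64 values in file (zigzag) order]} from raw JPEG bytes."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    tables = {}
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:                       # fill byte before a marker
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:   # markers with no length field
            pos += 2
            continue
        if marker in (0xD9, 0xDA):               # end of image or start of scan
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos + 4:pos + 2 + length]
        if marker == 0xDB:                       # DQT: one or more tables
            i = 0
            while i < len(segment):
                precision, table_id = segment[i] >> 4, segment[i] & 0x0F
                i += 1
                if precision == 0:               # 8-bit entries
                    values = list(segment[i:i + 64])
                    i += 64
                else:                            # 16-bit entries
                    values = list(struct.unpack(">64H", segment[i:i + 128]))
                    i += 128
                tables[table_id] = values
        pos += 2 + length
    return tables


def mean_quant(data):
    """Mean of all quantization table entries; lower means finer quantization."""
    values = [v for table in read_quant_tables(data).values() for v in table]
    if not values:
        raise ValueError("no quantization tables found")
    return sum(values) / len(values)
```

Read both files with open(path, "rb").read() and keep the one with the lower mean_quant. The caveat above still applies: a file that was recompressed at high quality after a low-quality save will look better here than it really is.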

  • by Anonymous Coward on Thursday July 16, @05:20PM (#28723719)

    load up both images in Adobe After Effects or some other image compositing program and apply a "difference matte"

    Any differences in pixel values between the two images will show up as black on a white background or vice versa...

    adam
    BOXXlabs

    • Re: (Score:3, Insightful)

      So, that will show you which parts differ. How do you tell which is higher quality? Sure, you can probably do it by eye. But it sounds like the poster wants a fully automated method.

  • Try ThumbsPlus (Score:3, Informative)

    by Anonymous Coward on Thursday July 16, @05:21PM (#28723729)

    ThumbsPlus is an image management tool. It has a feature called "find similar" that should do what you want as far as identifying two pictures that are the same except for the compression level. Once the similar picture is found you can use ThumbsPlus to look at the file sizes and see which one is bigger.

  • by trb (8509) on Thursday July 16, @05:28PM (#28723813)
    google (or scholar-google) for Hosaka plots, or image quality measures. Ref:

    HOSAKA K., A new picture quality evaluation method.
    Proc. International Picture Coding Symposium, Tokyo, Japan, 1986, 17-18.

  • Filters (Score:5, Funny)

    by mypalmike (454265) on Thursday July 16, @05:47PM (#28724045) Homepage

    First, make a bumpmap of each image. Then, render them onto quads with a light at a 45 degree angle to the surface normal. Run a gaussian blur on each resulting image. Then run a quantize filter, followed by lens flare, solarize, and edge-detect. At this point, the answer will be clear: both images look horrible.

  • by yet-another-lobbyist (1276848) on Thursday July 16, @05:55PM (#28724165)
    For what it's worth: I remember using Paint Shop Pro 9 a few years ago. It has a function called "Removal of JPEG artifacts" (or similar). I remember being surprised how well it worked. I also remember that PSP has quite good functionality for batch processing. So what you could do is use the "remove artifact" function and look at the difference before/after this function. The image with the bigger difference has to be the one of lower quality.
    I am not sure if there is a tool that automatically calculates the difference between two images, but this is a task simple enough to be coded in a few lines (given the right libraries are at hand). For each color channel (RGB) of each pixel, you basically just calculate the square of the difference between the two images. Then you add all these numbers up (all pixels, all color channels). The bigger this number is, the bigger the difference between the images.
    Maybe not your push-one-button solution, but should be doable. Just my $0.02.
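The "few lines" for the difference calculation could look like this. It is a sketch assuming both images are already decoded to the same size, as lists of rows of (r, g, b) tuples; the function name is made up.

```python
def sum_squared_difference(img_a, img_b):
    """Sum of squared per-channel differences over all pixels of two RGB images."""
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")
    return sum(
        (ca - cb) ** 2
        for row_a, row_b in zip(img_a, img_b)
        for pa, pb in zip(row_a, row_b)
        for ca, cb in zip(pa, pb)
    )
```

Run it on each image against its artifact-removed version; by the reasoning above, the copy with the bigger number is the lower-quality one.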
  • by uhmmmm (512629) <uhmmmm&gmail,com> on Thursday July 16, @06:09PM (#28724301) Homepage

    JPEG works by breaking the image into 8x8 blocks and doing a two dimensional discrete cosine transform on each of the color planes for each block. At this point, no information is lost (except possibly by some slight inaccuracies converting from RGB to YUV as is used in JPEG). The step where the artifacts are introduced is in quantizing the coefficients. High frequency coefficients are considered less important and are quantized more than low frequency coefficients. The level of quantization is raised across the board to increase the level of compression.

    Now, how is this useful? The reason heavily quantizing results in higher compression is because the coefficients get smaller. In fact, many become zero, which is particularly good for compression - and the high frequency coefficients in particular tend towards zero. So partially decode the images and look at the DCT coefficients. The image with more high frequency coefficients which are zero is likely the lower quality one.
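Counting the zeros is straightforward once you have the coefficients. The sketch below assumes the quantized coefficients have already been pulled out of the file by some other means, as 8x8 lists of integers with row index as vertical frequency; the cutoff of 4 for "high frequency" is an arbitrary choice and the function name is made up.

```python
def zero_high_freq_fraction(blocks, cutoff=4):
    """Fraction of high-frequency coefficients (u + v >= cutoff) that are zero."""
    zeros = 0
    total = 0
    for block in blocks:
        for v in range(8):
            for u in range(8):
                if u + v >= cutoff:
                    total += 1
                    if block[v][u] == 0:
                        zeros += 1
    return zeros / total if total else 0.0
```

The copy with the higher fraction is likely the more heavily quantized one, subject to the same caveats about intermediate processing and differing image sizes.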

    • Re:AI problem? (Score:4, Interesting)

      by Robotbeat (461248) on Thursday July 16, @05:10PM (#28723591) Journal

      ...it will simply require a human-level brain.

      How about Amazon's Mechanical Turk service?
      https://www.mturk.com/ [mturk.com]

    • Re: (Score:3, Insightful)

      You're right, it needs to be done by humans to be sure.

      Amazon's Mechanical Turk should do the trick.

      https://www.mturk.com/mturk/welcome [mturk.com]

    • Re:AI problem? (Score:5, Interesting)

      by CajunArson (465943) on Thursday July 16, @05:49PM (#28724061) Journal

      I don't know about "quality", but frankly it shouldn't be too hard to compare similar images just by doing simple mathematical analysis on the results. I'm only vaguely familiar with image compression, but if a "worse" JPEG image is more blocky, would it be possible to run edge detection to find the most clearly defined blocks that indicates a particular picture is producing "worse" results? That's just one idea, I'm sure people who know the compression better can name many other properties that could easily be measured automatically.
      What a computer can't do is tell you if the image is subjectively worse, unless the same metric that the human uses to subjectively judge a picture happens to match the algorithm the computer is using, and even then it could vary by picture to picture. For example, a highly colorful picture might hide the artifacting much better than a picture that features lots of text. While the "blockiness" would be the same mathematically, the subjective human viewing it will notice the artifacts in the text much more.
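The blockiness idea can be tried with something much cruder than a full edge detector. This sketch assumes a decoded grayscale image (list of rows) whose 8x8 block grid starts at the left edge, and compares the average jump across vertical block boundaries with the average jump between other neighbouring pixels. Only horizontal neighbours are checked to keep it short, and the function name is made up.

```python
def blockiness(img, block=8):
    """Ratio of the mean jump across block boundaries to the mean jump elsewhere."""
    edge_sum = edge_n = inner_sum = inner_n = 0
    for row in img:
        for x in range(1, len(row)):
            d = abs(row[x] - row[x - 1])
            if x % block == 0:      # this pixel pair straddles a block boundary
                edge_sum += d
                edge_n += 1
            else:
                inner_sum += d
                inner_n += 1
    if edge_n == 0 or inner_n == 0:
        return 0.0
    edge_mean = edge_sum / edge_n
    inner_mean = inner_sum / inner_n
    if inner_mean == 0:
        return float("inf") if edge_mean else 1.0
    return edge_mean / inner_mean
```

A ratio near 1 means the block boundaries look like the rest of the picture; a much larger ratio suggests visible blocking. As noted above, this says nothing about how bad the result looks to a person.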

      • Re: (Score:3, Insightful)

        And to reply to myself.. several other posters have noted that taking the DCT of the compression blocks in the image will give information on how highly compressed the image is... there's one example.

      • Even simpler mathematical analysis would include such techniques as seeing which one takes up more disk space. Last I checked, that was very highly correlated with compression level.
        • Re: (Score:3, Insightful)

          That's only a reasonable indicator if the two copies of the same image you are comparing are also the same resolution. It's not hard to have a higher resolution image consume less disk space if the compression level has been bumped up. Also, different programs usually produce different JFIF streams even when set to the same compression level and using the same *uncompressed* source image, making the DCT size approach even less reliable.
    • Re:AI problem? (Score:5, Informative)

      by arose (644256) on Thursday July 16, @07:38PM (#28724977)
      AI or small utility [schmorp.de]... You never know with computers ;)
      • Re:AI problem? (Score:5, Informative)

        Since the mods haven't noticed, and I don't have mod points, let me point out that THIS POST HAS THE ANSWER. A real program that will do what the asker wants. The source is available, but I can't seem to find its license (it includes some of the Independent JPEG Group's code). Also, doesn't a jpeg's EXIF data or some other tag in the file tell you what quality it was saved at?
        • Re: (Score:3, Interesting)

          It almost does what he wants. He doesn't spell it out, but it seems strongly implied that he also wants a system capable of automatically finding these duplicates by itself, and then automatically determining which image is "best."

          Which seems obvious, to me: If he's got enough photos of sufficient disorganization that he can't tell automatically which duplicate is best, then there probably isn't any straight-forward way (with filenames or directory trees or whatever) to find out which ones are dupes to be

          • Re:AI problem? (Score:4, Informative)

            by bh_doc (930270) <blhiggins&gmail,com> on Friday July 17, @01:29AM (#28726599) Homepage

            http://www.jhnc.org/findimagedupes/

            There's a bunch, but I know you can construct command line operations with this one. I imagine you could construct a system from this and the parent program that will find dupes, then nuke the poorer quality of each, or whatever.

      • by lunchlady55 (471982) on Thursday July 16, @05:22PM (#28723745)

        Oh sure, it starts out innocently enough - pick the better image. Next thing you know Skynet's decided that it's the better LIFE-FORM.

        AI - JUST SAY NO!

        Brought to you by the Coalition for Human Survival (C) Aug. 29, 1997

      • Here's a simple but expensive formula:

        1. Get the image
        2. Compress it severely.
        3. Compare the difference between original and the compressed.

        The lower the difference, the lower the image quality.
        4. Profit!

        Or you could just measure the amount of data in the DCT space. Duh.

        • Re: (Score:3, Insightful)

          Just checking the size of the file (or, I suspect, just the size of the DCT data) won't always work. Sometimes an image can end up growing in size slightly while losing quality, depending on the nature of the image and the settings of the imaging program.

          Things such as thin wires, multi-colored ribbon cable, close-ups of a circuit board, and other images with lots of similar details seem to benefit most from this kind of tweaking, mainly thanks to the placement and qualities of the artifacts, rather than

        • Re: (Score:3, Insightful)

          Unfortunately, it's not all that easy to compare. In general, the file with the higher byte count will be the better image, BUT ... The problem is there are different ways to compress the same picture. (There are several "controls", even in baseline JPEG. (Where the "quantisation" steps occur, where the high frequency cutoff for each macroblock occurs. Then there are different ways for the JPEG engine to entropy encode the bitstream. IE: Arithmetic coding is allowed by the JPEG standard, however, due to pate
    • Re:File size (Score:5, Informative)

      by Robotbeat (461248) on Thursday July 16, @05:14PM (#28723651) Journal

      File size doesn't tell you everything about quality.

      For instance, if you save an image as a JPEG vs. first saving as a dithered GIF and _then_ saving as JPEG, then the second one will have much worse actual quality, even if it has the same filesize (it may well have worse quality AND have a larger file size).

      • Re: (Score:3, Interesting)

        Also, stuff like Photoshop will insert a bunch of meta/EXIF bullshit, but something like Paint doesn't... it's usually only about 2 to 3 KB, but it's still tainting your results if you are going by size alone.

          • Re: (Score:3, Insightful)

            actually one of the meta values that is stored is a quality indicator.

            And when you save a max quality copy of a min quality jpeg, the picture still looks like crap.

            • Re:File size (Score:4, Insightful)

              by nabsltd (1313397) on Thursday July 16, @07:41PM (#28725001)

              Unfortunately, that's a subjective term based on the 'codec' used to make the jpg. Not everyone's 100 is the same nor is everyone working off the same scale (i.e. 1-10 vs 1-100).

              In addition, I bought a program [winsoftmagic.com] (Windows only, sorry) that allows the user to pick the areas of the image that need the most bits. Basically, it allows you to pick the quality for any arbitrary region (using standard selection tools like lasso) when saving the JPEG.

              I mostly got it for the batch processing and its excellent image quality when you set it to minimum compression.

    • Re:File size (Score:4, Insightful)

      by teko_teko (653164) on Thursday July 16, @05:16PM (#28723667) Homepage

      File size may not be accurate if it has been converted multiple times at different quality, or if the source is actually lower quality.

      The only way to properly compare is if you have the original as the control.

      If you compare between 2 different JPEG quality images, the program won't know which parts are the artifacts. You still have to decide yourself...

        • Re:File size (Score:5, Interesting)

          by Chyeld (713439) <chyeld AT newsguy DOT com> on Thursday July 16, @06:43PM (#28724585)

          There was an old story my AI teacher used to share back in college about a military contractor that was developing an AI-based IFF (identification, friend or foe) system for aircraft.

          They trained it using what was, at the time, a vast picture database of every aircraft known. In the lab, they were able to get it down to 99% accurate, with the error favoring 'unknown' as the third option.

          So they took it out for a test run. The first night out the system tried firing on anything and everything it could lock on, including ground targets.

          This was bad. Horribly bad. But they were certain that there was some sort of equipment failure going on. After all their AI was damn near perfect at ID'ing the targets in the lab, the issues must be up the line somewhere.

          So they did a once over of the equipment and couldn't find a problem. Not sure what to do next the team took the system out for another dry run the next day. This time, the system refused to see any ground targets and anything it saw in the air was friendly.

          Now this was getting ridiculous, the team was extremely confused. So they did what they should have done the first time around, they did a third test run looking at what the AI was actually 'thinking'.

          And promptly discovered the problem. While they had a huge database of images to use, they realized that all their 'friendly' craft had pictures taken during the day, while in flight. All their 'hostile' craft however were pictures that had been taken at night during spy runs or from overhead satellite shots.

          The AI wasn't keying off the planes, it was keying off whether it was daytime or night time.

          I don't know if the above actually ever happened, but my point is, it doesn't matter how many images you seed your database with. Unless you are there to tell it what is an artifact and what is just part of the picture, you are going to end up with horrible results and comical results.

    • Re: (Score:3, Insightful)

      by Anonymous Coward

      File size doesn't tell you anything. If I take a picture with a bunch of noise (eg. poor lighting) in it then it will not compress as well. If I take the same picture with perfect lighting it might be higher quality but smaller file size.

      Why this is modded up, I don't know. Too many morons out there.

      • Re:File size (Score:5, Insightful)

        by timeOday (582209) on Thursday July 16, @09:11PM (#28725545)
        This is the kind of problem you can solve in 2 minutes with 95% accuracy (by using file size), or never finish at all by listening to all the pedants on slashdot. When people know a little too much they love to go on about stuff like entropy and information gain, just because they (sort of) can.

        Try file size on the set of images of interest to you and see if it coincides with your intuition. If it does, you're done.

    • Re:File size (Score:5, Informative)

      by Shikaku (1129753) on Thursday July 16, @05:29PM (#28723825)

      http://linux.maruhn.com/sec/jpegoptim.html [maruhn.com]

      No. You can compress JPEG losslessly.

        • Re:File size (Score:5, Informative)

          by Score Whore (32328) on Thursday July 16, @05:52PM (#28724123)

          ...THERE IS NO LOSSLESS JPEG. PERIOD.

          Except for Lossless JPEG [wikipedia.org] standardized in 1993. But other than that, no there is no lossless jpeg.

            • Re:File size (Score:5, Informative)

              by Binary Boy (2407) on Thursday July 16, @10:12PM (#28725743)

              Lossless JPEG and lossless JPEG2000 are both exactly that - lossless. Not perceptually lossless, which is what people often use to refer to high-quality, lossy JPEG/JPEG2000, or JPEG-LS. Lossless JPEG uses a PCM-like encoder, not DCT, AFAIR. Lossless JPEG and lossless JPEG2000 are, in fact, lossless, at least with regards to image data in supported color spaces. This is in part a result of *not* converting to YCrCb, since that conversion is lossy, of course. Not all Lossless JPEGs are 8bit YCrCb.

              Accusoft, for one, has a toolkit for building lossless JPEG applications which supports 16bit RGB and greyscale lossless JPEG modes.

              The near-lossless JPEG you're thinking of is JPEG-LS, which is perceptually lossless, and guarantees a maximum error rate that is generally negligible for almost all applications. This format gets better compression ratios than Lossless JPEG, of course.

              Neither the lossless nor the near-lossless JPEG modes are common though, outside of niche apps. Lossless JPEG2000 is, however, since almost all JPEG2000 libraries support it alongside the lossy modes.
