How Much PEG Would a JPEG PEG if a JPEG Could PEG PEG?

Most of the time, when I write blog articles, I write about successes or methodology for success.  That doesn’t mean that I don’t have my share of failures too.  I had some R&D time on my hands yesterday, so I set myself the task of trying to determine the approximate compressed size of an AtalaImage using a particular Quality setting of the JpegEncoder that Atalasoft provides.

JPEG image compression is a lossy compression mechanism.  This means that if I compress an image then decompress it, the resulting image will be different from the original.  Image data is discarded to help compression with the hope that the discarded data will not be visually noticeable.  This tends to work fairly well with photographic images, but tends to fail badly with text.
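
To see what "lossy" means in practice, here's a quick round-trip check. It's a sketch in plain Java using javax.imageio rather than our classes (the point is the idea, not a particular API), and it assumes an RGB image with no alpha channel:

    import javax.imageio.ImageIO;
    import java.awt.image.BufferedImage;
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.IOException;

    public class LossyRoundTrip {
        public static void main(String[] args) throws IOException {
            BufferedImage original = ImageIO.read(new File(args[0]));

            // Compress to JPEG in memory, then decompress it again.
            ByteArrayOutputStream jpegBytes = new ByteArrayOutputStream();
            ImageIO.write(original, "jpeg", jpegBytes);
            BufferedImage roundTripped =
                    ImageIO.read(new ByteArrayInputStream(jpegBytes.toByteArray()));

            // Count how many pixels changed in the round trip.
            long changed = 0;
            for (int y = 0; y < original.getHeight(); y++) {
                for (int x = 0; x < original.getWidth(); x++) {
                    if (original.getRGB(x, y) != roundTripped.getRGB(x, y)) {
                        changed++;
                    }
                }
            }
            System.out.printf("%d of %d pixels differ after the round trip%n",
                    changed, (long) original.getWidth() * original.getHeight());
        }
    }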

The simplest approach to measuring compression would be to just compress the image – but since compression can be a resource intensive operation, it might be nice to be able to approximate the size of the resulting image.  This could be handy in a UI to show approximate file sizes and to be able to show the changes dynamically without slowing down the UI.
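
For concreteness, here's what the brute-force measurement looks like, again sketched in plain Java with javax.imageio rather than Atalasoft's JpegEncoder (I'm not reproducing our API here, and the class and method names are mine):

    import javax.imageio.IIOImage;
    import javax.imageio.ImageIO;
    import javax.imageio.ImageWriteParam;
    import javax.imageio.ImageWriter;
    import javax.imageio.stream.ImageOutputStream;
    import java.awt.image.BufferedImage;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;

    public final class JpegSizer {
        // Compress the image to an in-memory JPEG at the given quality (0.0-1.0)
        // and return the number of bytes produced.
        public static long compressedSize(BufferedImage image, float quality) throws IOException {
            ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);

            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ImageOutputStream output = ImageIO.createImageOutputStream(buffer)) {
                writer.setOutput(output);
                writer.write(null, new IIOImage(image, null, null), param);
            } finally {
                writer.dispose();
            }
            return buffer.size();
        }
    }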

My initial approach was to take a fairly large sample of images and compress a fraction of each one.  I tried compressing from 1/8 to 7/8 of the total image area at every Quality setting and looked for a consistent relationship between the size of the partial compression and the size of the full compression.  If that relationship exists, then the full size can be predicted from the partial one.
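
Roughly, the idea looks like this sketch (it reuses the compressedSize helper from above; the choice of a horizontal band for the sample region is just for illustration, not the exact measurement code I used):

    import java.awt.image.BufferedImage;
    import java.io.IOException;

    public final class FractionalEstimate {
        // Hypothetical estimator: compress only a window covering `fraction` of
        // the image's area, then scale the result by the inverse of that fraction.
        public static long estimateFullSize(BufferedImage image, float quality, double fraction)
                throws IOException {
            int width = image.getWidth();
            // Keep the full width and take a horizontal band, so the sample sees
            // the same range of columns the full image does.
            int bandHeight = Math.max(1, (int) Math.round(image.getHeight() * fraction));
            BufferedImage band = image.getSubimage(0, 0, width, bandHeight);

            long bandBytes = JpegSizer.compressedSize(band, quality);
            double actualFraction = (double) bandHeight / image.getHeight();
            return Math.round(bandBytes / actualFraction);
        }
    }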

After poring over 25,000 cells in a generated spreadsheet and looking at graphs, I found that there is no consistently predictable relationship between fractional and full compression.  There are trends, yes, but the error ranged from 0% to 105% or more in some cases.  The data I collected was really not usable.

I had high hopes for the approach, since I thought that compressing a smaller amount of the source data should exhibit trends similar to compressing the entire image.

Rather than giving up, I decided to use a moderately sized sampling of images (I grabbed a few hundred from various photos I've accumulated) and calculate the average compression ratio at each quality setting for both color and gray images.  Hopefully, the use of a large data set would give me a decent representation of the trend.
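
Building that table looks something like the following sketch (again reusing the compressedSize helper; the separate handling of color and gray images is left out for brevity, and the class name is made up):

    import javax.imageio.ImageIO;
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    public final class RatioTable {
        // Average compressed bytes-per-pixel for each quality setting, computed
        // over a directory of sample images.
        public static Map<Float, Double> build(File sampleDir, float[] qualities) throws IOException {
            Map<Float, Double> sums = new HashMap<>();
            int count = 0;
            for (File f : sampleDir.listFiles()) {
                BufferedImage img = ImageIO.read(f);
                if (img == null) continue;                 // skip non-image files
                long pixels = (long) img.getWidth() * img.getHeight();
                for (float q : qualities) {
                    double ratio = (double) JpegSizer.compressedSize(img, q) / pixels;
                    sums.merge(q, ratio, Double::sum);
                }
                count++;
            }
            final int n = count;
            sums.replaceAll((q, sum) -> sum / n);          // sums -> averages
            return sums;
        }

        // The estimate is then just pixel count times average bytes-per-pixel.
        public static long estimate(BufferedImage image, float quality, Map<Float, Double> table) {
            return Math.round(table.get(quality) * image.getWidth() * image.getHeight());
        }
    }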

Unfortunately, no.  I got answers that more or less followed the actual behavior.  Sometimes the error was entirely acceptable (within 20%), but sometimes the estimate was off by 400% or more.  I was better off with the small-area calculations.

The main issue is that I’m trying to use a straightforward linear relationship on one variable to distill a trend from many, many different variables at once.  This is clearly something that needs more time and thought to get a solid answer.

As a final approach, I decided to unask the question and revisit my instinct about whether or not encoding the entire image is acceptable.  I wrote a sample app that opens an image and lets me pull a slider around to measure the compression at each quality setting.  As I suspected, the feel of the slider was unacceptable when used on a typical digital camera image (8 MP, if I recall correctly).

I think the true solution involves:

  1. Better modelling (hard)
  2. Precaching the image sizes at the various quality settings in a background thread (tricky; see the sketch below)
  3. Keeping prior measurements around so as not to do them twice (cumbersome)
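
As a rough illustration of what items 2 and 3 could look like together, here's a sketch of a cache that measures sizes on a single worker thread (again leaning on the compressedSize helper from earlier; a real UI would also want to cancel pending work when the image changes):

    import java.awt.image.BufferedImage;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public final class SizeCache {
        private final Map<Float, Long> sizes = new ConcurrentHashMap<>();
        private final ExecutorService worker = Executors.newSingleThreadExecutor();

        // Kick off background compression for every quality setting we care about.
        // Results land in the cache as they finish; the UI polls getCachedSize().
        public void precache(BufferedImage image, float[] qualities) {
            for (float q : qualities) {
                worker.submit(() -> {
                    try {
                        sizes.putIfAbsent(q, JpegSizer.compressedSize(image, q));
                    } catch (Exception e) {
                        // A failed measurement just stays missing from the cache.
                    }
                });
            }
        }

        // Returns null until the background measurement for this quality completes.
        public Long getCachedSize(float quality) {
            return sizes.get(quality);
        }

        public void shutdown() {
            worker.shutdownNow();
        }
    }
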
Published Wednesday, April 29, 2009 10:56 AM by Steve Hawley
