Return to Digital Photography Articles

JPEG Compression, Quality and File Size

When resaving a digital photo, one is often faced with a decision as to what "quality setting" (level of compression) to use. The JPEG file format (more properly, JFIF) allows one to select an appropriate trade-off between file size and image quality. It is important to understand that JPEG (like nearly all lossy file formats) is not suitable for intermediate editing, because repeated saves will generally diminish the working file's quality. In addition to the cumulative introduction of visual artefacts (error), repeated recompression can also introduce destructive color changes. For these reasons, "lossless" file formats (such as TIFF, PSD, BMP, etc.) are better choices for intermediate processing. JPEG should only be used for storing the final image (ie. after editing) and possibly the initial capture.

How does JPEG compression work?

When a JPEG file is opened in an image editor, a large number of steps must be performed before the raw image (one RGB triplet per pixel) can be displayed or edited. It is easier to understand the process by looking at the reverse process — ie. what happens when one generates a JPEG file (ie. save) from the raw image data. In summary, the JPEG algorithm involves the following stages:

  • Color Space Conversion - The image first undergoes a color space conversion, where it is remapped from RGB (Red, Green, Blue) triplets into YCbCr (Luminance, Chrominance Blue, Chrominance Red) triplets. This conversion facilitates the use of different quantization tables (one for luminance, the other for chrominance).
  • Segmentation into Blocks - The raw image data is divided into 8x8 pixel blocks (the building blocks of the Minimum Coded Unit, or MCU). This means that the JPEG compression algorithm depends heavily on the position and alignment of these block boundaries.
  • Discrete Cosine Transformation (DCT) - The image is transformed from a spatial domain representation to a frequency domain representation. This is perhaps the most confusing of all steps in the process and hardest to explain. Basically, the contents of the image are converted into a mathematical representation that is essentially a sum of wave (sinusoidal) patterns. For example, the binary sequence 101010 can be represented by a wave that repeats every two pixels. The sequence 1100110011 can be represented by a wave that repeats every four pixels. Similarly, the sequence 1001101011 can be represented by the sum of several simpler waves. Now imagine that this mapping to wave equations (known as the DCT basis functions) is done in both the X and Y directions.
  • Quantization - The wave equations resulting from the DCT step are sorted in order from low-frequency components (changes that occur over a longer distance across the image block) to high-frequency components (changes that might occur every pixel). It is widely known that humans are more critical of errors in the low-frequency information than in the high-frequency information. The JPEG algorithm discards many of these high-frequency (noise-like) details and preserves the slowly-changing image information. This is done by dividing all equation coefficients by a corresponding value in a quantization table and then rounding the result to the nearest integer. Components that either had a small coefficient or a large divisor in the quantization table will likely round to zero. The lower the quality setting, the greater the divisor, causing a greater chance of a zero result. Conversely, the highest quality setting would have quantization table values of all 1's, meaning that all of the original DCT data is preserved.

    An important point to realize here is that the quantization table used for this step differs between nearly all digital cameras and software packages. Since this is the most significant contributor of compression or recompression "error", one is almost always going to suffer image degradation when resaving from a different compressor / source. Camera manufacturers independently choose an arbitrary "image quality" name (or level) to assign to the 64-value quantization matrix that they devise, and so the names cannot be compared between makes or even between models by the same manufacturer (eg. Canon's "Fine" vs Nikon's "Fine").

    Please see my article on JPEG Quantization Tables for the actual tables used by Canon, Nikon and Sigma digital cameras, as well as by Photoshop CS2 and IrfanView.
  • Zigzag Scan - The resulting matrix after quantization will contain many zeros. The lower the quality setting, the more zeros will exist in the matrix. By re-ordering the matrix from the top-left corner into a 64-element vector in a zig-zag pattern, the matrix is essentially sorted from low-frequency components to high-frequency components. As the high-frequency components are the most likely to round to zero, one will typically end up with a run of zeros at the end of the 64-entry vector. This is important for the next step.
  • DPCM on DC component - On a block-by-block basis, the difference in the average value (across the entire block, the DC component) is encoded as a change from the previous block's value. This is known as Differential Pulse Code Modulation.
  • RLE on AC components - The individual entries in the 64-element vector (the AC components) are stored with a Run Length Encoding: since the vector contains many zeros, it is more efficient to store each non-zero value together with the count of zeros that precede it. The RLE stores a skip and a value, where skip is the number of zeros before this component, and value is the next non-zero component.
  • Entropy Coding / Huffman Coding - A dictionary is created which represents commonly-used strings of values with a shorter code. More common strings / patterns are assigned shorter codes (encoded in only a few bits), while less frequently used strings use longer codes. So long as the dictionary (Huffman table) is stored in the file, it is an easy matter to look up the encoded bit string to recover the original values. See my JPEG Huffman Coding tutorial.
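The front half of this pipeline (level shift, 2-D DCT, quantization and the zig-zag scan) can be sketched in a few lines. This is an illustrative sketch only, using the example luminance quantization table from Annex K of ITU T.81; real encoders scale a table like this according to the quality setting, and the gradient test block is an arbitrary choice.

```python
import numpy as np

# Example luminance quantization table from Annex K of ITU T.81.
Q_LUM = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

# Orthonormal 8-point DCT-II basis matrix: row k holds basis function k.
k = np.arange(8)
C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / 16) * np.sqrt(2 / 8)
C[0, :] /= np.sqrt(2)

def encode_block(block):
    """Level shift, 2-D DCT and quantization for one 8x8 luminance block."""
    shifted = block.astype(float) - 128   # center samples around zero
    coeffs = C @ shifted @ C.T            # 2-D DCT (rows, then columns)
    return np.round(coeffs / Q_LUM).astype(int)

# Zig-zag order: walk the anti-diagonals, alternating direction.
ZIGZAG = sorted(((r, c) for r in range(8) for c in range(8)),
                key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

# A smooth horizontal gradient block: most of the energy lands in a few
# low-frequency coefficients, and the rest quantize to zero.
block = np.tile(np.linspace(100, 160, 8), (8, 1))
vec = [encode_block(block)[r, c] for r, c in ZIGZAG]
print(vec)
```

Running this on the gradient block yields a nonzero DC value, a couple of nonzero low-frequency AC values, and a long run of trailing zeros, which is exactly what the zig-zag / RLE stages exploit.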

Examine your JPEG Files!

I have written a free Windows utility that examines and displays all of the details described above in your JPEG files.

Download JPEGsnoop here.

Where does the error come from?

By far the biggest contributor to the error (ie. file size savings) in the JPEG algorithm is the quantization step. This is also the step that allows tuning by the user. A user may choose to have a slightly smaller file while preserving much of the original (ie. high quality, or low compression ratio), or a much smaller file size with less accuracy in matching the original (ie. low quality, or high compression ratio). The tuning is simply done by selecting the scaling factor to use with the quantization table.

The act of rounding the coefficients to the nearest integer results in a loss of image information (or more specifically, adds to the error). With larger quality scaling factors (ie. low image quality setting or high numbers in the quantization table), the amount of information that is truncated or discarded becomes significant. It is this stage (when combined with the Run Length Encoding that compresses the zeros) that allows for significant compression capabilities.
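A toy example shows how the size of the divisor drives the round-off error (the coefficient values here are made up for illustration):

```python
# A hypothetical DCT coefficient, quantized with a small and a large divisor.
coeff = 187.0
for divisor in (10, 50):
    quantized = round(coeff / divisor)   # the integer stored in the file
    recovered = quantized * divisor      # what the decoder reconstructs
    print(divisor, quantized, abs(coeff - recovered))

# A small coefficient paired with a large divisor rounds away entirely,
# which is how whole high-frequency components get discarded:
print(round(20.0 / 50) * 50)
```

The divisor of 10 leaves a reconstruction error of at most 5, while the divisor of 50 allows errors up to 25, and turns small coefficients into exact zeros ready for run-length encoding.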

There are other contributors to the compression error, such as the color space conversions, but the quantization step is the most important.

Please see my results in the JPEG Quantization Table article for a more accurate comparison between software packages and their quality settings.

JPEG Chroma Subsampling

In order to further improve JPEG compression rates, chroma subsampling is used to reduce the amount of image information to compress. Please refer to my article on Chroma Subsampling for more information on the 2x1 and 2x2 subsampling typically used in digital cameras and image editors such as Photoshop.

Breakthrough in JPEG compression?

Until now, it has been widely assumed that JPEG image compression is about as good as it gets as far as compression rates are concerned (unless one uses fractal compression, etc.). Compressing JPEG files again with Zip or other generic compression programs typically offers no further improvement in size (and often does the reverse, increasing the size!).
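This is easy to demonstrate with a generic compressor: already-compressed data is close to random, and random data is incompressible. (The sketch below uses random bytes as a stand-in for a JPEG's entropy-coded scan data.)

```python
import os
import zlib

# Random bytes approximate the high-entropy scan data inside a JPEG file.
data = os.urandom(100_000)
packed = zlib.compress(data, level=9)

# zlib finds no redundancy to exploit, so the output is no smaller --
# usually it is slightly larger, due to the container's framing overhead.
print(len(data), len(packed))
```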

As documented in a whitepaper (no longer available) written by the authors of StuffIt (Allume Systems, formerly Aladdin Systems), they have apparently developed software that will further compress JPEG files by up to an additional 30%! Considering how many years the JPEG algorithm has been around, it is surprising to see any new development that offers this degree of increased compression. Note that the "StuffIt Image Format" (SIF) applies lossless compression to the lossy-compressed original JPEG image, so there is no further image quality reduction in this additional compression step.

There have been many theories as to how this additional compression could be achieved; many feel that it must involve either a replacement of the Huffman coding stage (using arithmetic coding instead) or an alternative to the zig-zag reordering scan. The consensus seems to be that SIF uses an implementation of arithmetic coding.

At first glance, this would seem to have the potential to revolutionize the photo industry. Imagine how this could affect online image hosts, or personal archiving needs. Saving 30% of the file size is a significant improvement. Unfortunately, a few significant problems are immediately apparent, possibly killing the adoption of this format:

  • Proprietary Standard - I cannot see this format taking off, simply because a single company owns it. Would you trust your entire photo collection to a single company's utility? The company can charge whatever it likes, and there are no guarantees about its future. At least Adobe tried to do things right by releasing their DNG (Digital Negative) specification to the open community, allowing many other developers to back the format. Allume / StuffIt sees this as a potential financial jackpot.
  • Processor Intensive / Slow - Unlike the methods used in the standard JPEG file compression scheme, the SIF method is apparently extremely slow. As tested by ACT (Archive Comparison Test website), a 1.8 GHz Pentium computer took nearly 8 seconds to compress or decompress a 3 megapixel file. While this is less of an issue for those wishing to archive photos to CD, for example, it is obvious that this would prevent the algorithm from ever being supported in most embedded applications (including within a digital camera).

Resaving and workflow

When resaving after making changes, I strive to preserve the quality of the original as much as possible and not lose additional detail to compression round-off error. Therefore, one should keep in mind a few suggestions about resaving:

Original | Save as...  | Notes
---------|-------------|------
TIFF     | TIFF        | If the original was uncompressed, then it makes sense to resave it as uncompressed.
BMP      | BMP         | If the original was uncompressed, then it makes sense to resave it as uncompressed.
JPG      | TIFF or BMP | Best approach: allows the best preservation of detail by saving in a lossless format. Unfortunately, this approach complicates things, as most catalog programs don't handle the change of file type very well (since it changes the filename).
JPG      | JPG         | Alternate approach: while not as good as the previous approach (saving in a lossless format), this can be adequate if the compression algorithm is the same (ie. same quantization tables & quality settings). If this is not possible, then resaving with a quality setting that is high enough (ie. striving for less compression than the original) might be the only choice.

If one intends to edit a JPEG file and resave it as a JPEG, the issue of recompression error should be given consideration. If a little additional error is not going to be of much concern (especially if it is nearly invisible), then resaving to match file size might be an adequate solution. If, however, the goal is to preserve the original image's detail as much as possible, then one has to take a closer look at the way the files are saved.

Option 1 - Resaving with no recompression error

All of the software applications that advertise "lossless" operations (such as lossless rotation) will resave the file with no additional recompression error. The only way that this can work is if the settings used in the compression algorithm match the settings of the original, identically. Any differences in the settings (more specifically, the quantization table and the quality setting / factor) will cause additional compression error.

Unfortunately, it is very difficult (as a user) to determine what these settings were, let alone have any control over them (besides the quality factor). Besides cjpeg, I haven't seen any other programs that actually allow you to configure the quantization tables yourself.

In an attempt to identify whether or not this option is even possible, I have compared the quantization tables of my digital camera to those of a couple of imaging applications.

Fortunately, if one is resaving an image in the application that originally created it (eg. saved an image in Photoshop, re-opened it, edited it and then resaved it in Photoshop), one can almost achieve this by simply resaving with the same quality settings as were used the previous time. As the quantization table is hardcoded, the user must ensure that the quality setting matches the original exactly (neither higher nor lower). If one forgot what settings were used in the original, it is possible to make an educated guess by performing a couple of test saves and comparing file sizes across quality settings to get a very rough idea.
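This educated guess can be automated. A sketch using Pillow (the synthetic test image, the quality-85 "unknown" original, and the 50-95 sweep are all arbitrary choices for illustration): resave the decoded image at several quality settings and keep the one whose output size lands closest to the original file's.

```python
from io import BytesIO
from PIL import Image

# Build a deterministic, texture-heavy test image.
w, h = 256, 256
img = Image.new("RGB", (w, h))
px = img.load()
for y in range(h):
    for x in range(w):
        px[x, y] = (x % 256, y % 256, (x * y) % 256)

# Pretend this JPEG arrived from an unknown source (quality unknown to us).
orig = BytesIO()
img.save(orig, "JPEG", quality=85)
orig_size = orig.tell()

# Resave the decoded image at several quality settings and keep the one
# whose file size comes closest to the original's.
decoded = Image.open(BytesIO(orig.getvalue()))
best_q, best_diff = None, None
for q in range(50, 100, 5):
    buf = BytesIO()
    decoded.save(buf, "JPEG", quality=q)
    diff = abs(buf.tell() - orig_size)
    if best_diff is None or diff < best_diff:
        best_q, best_diff = q, diff
print(best_q)
```

Because file size grows monotonically with the quality setting, the closest-size quality is a reasonable (if rough) estimate of the setting the original encoder used, within that encoder.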

Option 2 - Resaving with minimal recompression error

If one is resaving a photo with a different program than the one that created the original (eg. Photoshop CS resaving an edited version of a photo direct from a digital camera), it is not possible to resave without some additional "loss" (recompression error). The problem here is that the quantization tables and quality settings either are not known or cannot be set. This is the most likely scenario for users editing their digital photos.

In this scenario, the goal is no longer "lossless resaving" but minimizing the additional recompression error that will be introduced. Making a very rough assumption, one can get an equivalent level of detail by resaving with settings that use similar quantization tables. There are many reasons why this ends up being a rough assumption, but it should give a close approximation to the same level of detail.

Compression Quality and File Size

The following details the effect of JPEG quality on file size from several popular image editing programs. Unfortunately, each graphics program tends to use its own compression quality scale and quantization tables, and therefore, one can't simply transfer quality settings from one application to another.

As described in the section above, if one cannot guarantee lossless resaving (because of differences in the quantization tables), then it is worth looking at the quantization table comparison for a guideline.

Knowing what quality level is roughly equivalent to the original image helps in determining an appropriate quality level for resaving. Ideally, one doesn't want to resave at a lower quality level (and thereby lose image detail / quality); on the other hand, one shouldn't save at a higher quality setting, as it simply wastes space and can in fact introduce extra recompression noise!

Digital Photo Source Characteristics

The source file for comparison purposes is a JPEG image shot with a Canon 10D digital SLR camera, recording a 6 megapixel image (3072x2048 pixels) at ISO 400 and super-fine quality. The source file size is 2280 KB.

Photoshop CS - JPEG Compression

For more detailed information, please see the article: Photoshop Save As vs Save for Web.


  • Photoshop CS2 allows a range of quality settings in the Save As dialog box from 0..12 in integer increments.
  • Photoshop CS2 allows a range of quality settings in the Save For Web dialog box from 0..100 in integer increments.
  • A JPEG quality setting of around 11 achieved a similar file size to what was originally produced by the Canon 10D digital SLR camera (in super-fine mode).
  • Photoshop CS has three modes of saving JPEGs: Baseline, Baseline Optimized and Progressive. The difference in file size between these three modes (progressive was set to 3-scan) was on the order of about 20 KB for a 1 MB file. Of course this is dependent upon the content of the image, but it demonstrates the rough order of magnitude expected between the different modes.
  • An ICC profile was attached to the image, but it is simply the default sRGB, which is relatively insignificant (~ 4 KB).

Photoshop CS and Chroma Sub-sampling

Although it is not advertised, I have determined that Photoshop CS uses chroma subsampling only in certain quality level settings.

Photoshop does not allow the user to select whether or not Chroma Subsampling is used in the JPEG compression. Instead, 2x2 subsampling is used for all Save As at Quality 6 and below, while it is disabled (ie. 1x1 subsampling) for Save As at Quality 7 and higher. Similarly, it is used for all Save For Web operations at Quality 50 and below, while it is not used in Save For Web at Quality 51 and above.

Irfanview - JPEG Compression


  • Irfanview allows a range of quality settings from 0..100 in integer increments.
  • It also allows one to select whether or not Chroma Subsampling is used in the JPEG compression.
  • With Chroma Subsampling disabled, it appears that a quality setting of 96 achieves comparable file size to the original.
  • With Chroma Subsampling enabled, a quality setting of around 98-99 creates a comparable file size to the original. However, it should be noted that the digital camera itself uses chroma subsampling (2x1), so this figure is not particularly useful. In other words, if the original source used chroma subsampling, then there is no point in resaving it without chroma subsampling: the additional CbCr color information is already gone. In the example image used for the above analysis, chroma subsampling offered approximately a 25% savings in file size over the same JPEG compression without it.
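Pillow is another editor that exposes this choice, via the `subsampling` argument to its JPEG encoder, which makes the size difference easy to measure. (The synthetic test image and the quality value below are arbitrary choices for illustration.)

```python
from io import BytesIO
from PIL import Image

# A colorful deterministic test image with plenty of chroma detail.
img = Image.new("RGB", (256, 256))
px = img.load()
for y in range(256):
    for x in range(256):
        px[x, y] = ((x * 3) % 256, (y * 5) % 256, ((x + y) * 7) % 256)

def jpeg_size(subsampling):
    """Encode and return the file size; 0 = 4:4:4 (off), 2 = 4:2:0 (2x2)."""
    buf = BytesIO()
    img.save(buf, "JPEG", quality=90, subsampling=subsampling)
    return buf.tell()

size_444 = jpeg_size(0)
size_420 = jpeg_size(2)
print(size_444, size_420)
```

For an image with significant color detail, the 4:2:0 file comes out noticeably smaller, since the Cb and Cr planes carry only a quarter of the samples.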

Miscellaneous Topics

Beware of Photoshop Save For Web

Although Photoshop's Save for Web dialog seems great, as it lets one interactively set the image dimensions, palette depth and compression quality, it has one potentially disastrous side effect: the removal of EXIF metadata!

You will find that the file sizes after using "Save for Web" will be smaller than if you had simply chosen "Save As..." with equivalent compression settings. The difference is in the lost image metadata (time / date / aperture / shutter speed / etc.).

The only time it is worth using "Save for Web" is when one is creating web graphics (where you are optimizing for speed) or when one deliberately wants to eliminate the metadata (eg. for photo galleries, privacy, etc.).


Reader's Comments:

Please leave your comments or suggestions below!
 Hi Calvin! Great effort to enlighten minds.

I've read almost all your articles and comments but couldn't find a solution (as on many other web sources).
In "Resaving and workflow" you wrote a kind of guideline on how images should be processed while editing / resaving. But it's unclear to me: if I have a JPEG that came directly from a phone's camera with Quality = 96.95% & subsampling factor = 4:2:0, and I need to downsize this image (4000x3000 pixels) by half, what would be the best settings for minimizing the loss of pixel info while preserving an adequate file size?
Downsize / downscale the image via IM / XnConvert with which settings: Q=97 with subsampling 4:2:0, Q=100 with 4:4:4, or Q=97 with 4:4:4?
Could you give me your opinion?

Thanks in advance!
2016-06-22 Anderson Lima
 Hi, great article. I have been using it, and finally I can understand how JPEG files are organized. But I still have some doubts. Can you show in detail how to perform the IDCT?
 Using this online service you can compress a JPEG, deleting its EXIF data and making it progressive:
 Thanks for sharing this!
 Thank you for the link, interesting and useful.

You wrote: "... depending on the input (in-camera) color space, it is possible for the color reproduction to be impacted."

Is there a way (a software) to determine the input color space, so we can use the same one when saving to PNG and thus avoid the color shift?
 Hi Kevin -- The identification of color space is unfortunately not standardized for all JPEG images. In some cases additional APPn markers are used; however, it may be best to examine the SOF component index values as a hint at the color space. I plan to integrate color space detection into the upcoming version of JPEGsnoop (ie. 1.7.5+).
 You wrote: "Note that the Stuffit Image Format (SIF) uses lossless compression of the lossy-compressed original JPEG image. Therefore, there is no further image quality reduction in this Stuffit additional compression."

Does this apply to all the formats using lossless compression when saving from JPG?

I would like to crop a JPG image and then save it as a PNG image. IIUC, it involves a conversion from JPG to PNG. Does it have any negative effects on the final image (introducing artifacts, changes in color reproduction etc)?

 Good question... saving as a 24-bit PNG should not introduce further image quality reduction (as the compression is indeed lossless). However, depending on the input (in-camera) color space, it is possible for the color reproduction to be impacted. Have a look at the following article for some discussion and empirical comparison:
 Sir, your tutorials are really very helpful in understanding the JPEG algorithm in detail.
Please provide suggestions for the following problems in the implementation of a JPEG encoder:
1. If the image is of size 72x72, we will obtain 36x36-sized Cb and Cr components because of 4:2:0 downsampling. But 36 is not a multiple of 8. In that case, how can we adjust these 36x36 blocks to obtain distinct 8x8 blocks of Cb and Cr components?
2. Is there any way to separate the EOB marker of the luma components from the Huffman codes of the chroma components? For example, the EOB marker for the luma part is '1010', while '1010' is itself a code in the Huffman table for the chroma components. So how can we deal with such conflicts between the codes for EOB and actual values?
Thanks in advance.
 With 4:2:0 downsampling, the image will be broken down into 16x16 pixel MCUs. As you realized, a 72x72 pixel image is not an integer multiple of 16x16 MCUs. What will happen is that the original image will be temporarily "extended" from 72x72 to 80x80 before JPEG encoding. Then, in the image header, the actual size of 72x72 will be recorded, causing the decoders to trim off the "partial MCUs".
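The padding arithmetic is simply rounding each dimension up to the next multiple of the MCU size; a quick illustration:

```python
def padded_size(width, height, mcu=16):
    """Round image dimensions up to the next multiple of the MCU size."""
    round_up = lambda n: (n + mcu - 1) // mcu * mcu
    return round_up(width), round_up(height)

# A 72x72 image with 4:2:0 subsampling (16x16 MCUs) is encoded as 80x80;
# the SOF header still records 72x72, so decoders trim the partial MCUs.
print(padded_size(72, 72))
```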
2015-01-30 Bright Chuh

I can NOT find your article on JPEG Quantization Tables, could you please give me a link? Thanks a lot!
 Hi -- I fixed the link to the page after a recent server change. There is also related content in the huffman coding page.
 Hi Calvin - I have gone through many of your pages on JPEG and found them very detailed and explanatory. Thanks. Almost all the resources on JPEG (other internet pages too) mention the steps right up to the Huffman codes. But these resources are silent on the details of how the compressed values are written into the JFIF file. I mean, after compression you would have the lumas and chromas compressed, but in what order will they be written?

Please can you elaborate on this. Thanks in advance.
 Hi Anil -- have a look at my detailed page / tutorial on huffman coding. There, I show how the bitstream is formed. From that, you can read ITU T.81 to see how the scan data is then formatted between JPEG markers.
 I want to write computer programs for converting image formats from one to another, eg. JPEG to TIFF or JPEG to JPEG2000, but I don't know where to begin.
 Dear Calvin,
Do you know why the quantization table for Y is not symmetrical? The distribution of these coefficients over the table looks really strange.

 Great question -- although I am not certain, I suspect that those originally developing the tables discovered that the human eye has more sensitivity along one axis than the other (much in the same way that chroma subsampling takes advantage of the limitations of the human visual system).
2012-10-04 Geri Rogers
 I use Picasa and Photosketcher and have been saving as JPEG. I could shoot raw but never have.

If I edit mainly in Picasa, can I save as something other than JPEG?

Thank you
 If luminance is always present in any photo, why should the image be remapped to YCbCr? I supposed that all colour images had RGB plus Y... am I wrong? Thanks for the clarification.
2011-12-10 Alex Masolin
 Hi Calvin!
Great website...
I have some comments on your amazing piece of software, some ideas on how to improve it, and I'd like to share some thoughts I have about digital image compression, including the unreasonable failure of wavelet-based algorithms :(

Just to let you know that the link to the software from StuffIt is broken.

Hope to hear from you by email.
 Hi Alex -- thanks very much for your kind words. I would be interested in hearing your ideas, so feel free to send me an email. Regarding Stuffit, it seems that the company has removed the whitepaper -- thanks for letting me know.
 We have 2 types of scanning devices which produce different-size JPEG files at 300 dpi resolution for the same document (one is 2 MB and the other is 4 MB).
How can we make them the same size?
Best regards
 The difference is probably due to the fact that the JPEG encoder within each scanner program uses different quantization tables (ie. different compression levels). Some scanner software may enable you to adjust the compression ratio manually, but in many cases the control is quite limited. I have managed to change the compression ratio (by "hacking" the JPEG encoder within the scanner software), but this is probably not worth the effort. An alternate approach is to perform batch recompression on your images after scanning, while being aware of the impact to image quality when this is done.
2010-10-15 Mike Mills
 I have used a utility which is included with the full version of Irfanview called the RIOT plugin. This allows the saving of larger pictures in a smaller size. Of course there is a quality loss. The exif data is not removed.
In my case I am making a dvd collection of all the birds of the world, and I believe that reducing many of my pictures to less than 30k enables this. For most computers today reduction below 30k size seems to be pointless from the point of view of saving disk space, because of the block storage scheme used.
I think that the savings in space are significant, and the appearance is satisfactory.
RIOT is available as a stand-alone product, but much handier as a plugin for Irfanview. It is very fast.
2010-10-13 curlon dragon
 Thanks for the info, it was quite informative.
I wanted to know: how is the compression ratio calculated in the compression stats section of JPEGsnoop?

Thank you
 The compression ratio is calculated as follows: DimensionX * DimensionY * NumComponents * 8b / (# bytes in Scan Data segment).
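Expressed in code (interpreting the formula as uncompressed size, at one byte per 8-bit component sample, divided by the scan-data size; the example numbers below are hypothetical):

```python
def compression_ratio(width, height, num_components, scan_bytes):
    """Uncompressed size (8 bits = 1 byte per component sample) / scan data size."""
    uncompressed_bytes = width * height * num_components
    return uncompressed_bytes / scan_bytes

# eg. a 3072x2048 3-component image whose scan data occupies exactly 2 MB:
print(compression_ratio(3072, 2048, 3, 2 * 1024 * 1024))   # 9.0, ie. 9:1
```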
2010-04-27 Yves Lemaire
 I'd like to know if it's possible to calculate a CompressionQuality factor from a JPEG not recorded with the IJG algorithm, and with non standard-based quantization tables.

I guess it would be only an approximation, but perhaps useful for comparing pictures. I didn't find a satisfying solution.

 Thanks for your website and tools, which were very useful to me.
 Hi Yves -- The IJG algorithm is only really applicable to deriving the scaling factor from tables scaled up from the "standard" quantization table. If arbitrary quantization tables have been used, then you could apply the same formula, but the result may not be very meaningful.

If a) the camera used a similar scaling factor algorithm and b) the camera based its pre-scaled table on something similar to the table listed in the ITU-T standard, then it may be possible to perform some approximate comparisons of "quality".
 Suppose you are the implementor of an image processing system which allows the user to specify the
image quality using a number q, whose value varies from 0 to 1 inclusive. Design a way to control the
quality and size of JPEG file produced that makes use of q.
 I'm not sure exactly what StuffIt is doing, but I don't think it is anything too special. Huffman tables can only compress one aspect of data (frequency), and won't do so efficiently if the values aren't on nice even 2**n boundaries. Arithmetic coding fixes this.

But other aspects are repeated values, small deltas, spectra, etc. The fundamental question in compression is "how can I use the fact that the probability of the next bit being a 1 or 0 is not exactly 50%". Huffman takes the static statistical histogram into account but nothing else. Run length takes sequences into account. But there's lots more which can be done.

For many pictures, each line will differ from the one above or below by a small margin, and adjacent pixels won't differ. You will have many outliers, and JPEG's use of differences helps, but a bunch of differences near zero still result in a long string of near-zero codes, which are repeated vertically.

 The "lossless" compression is not very efficient, but it is easy to implement in devices like cameras where you have some processing power and memory. (A microcontroller like a PIC or Atmel would have trouble; a PowerPC, MIPS, or ARM would work.) We've had Moore's law going on for a few years, so these newer techniques are now possible, but remembering the technology back then (when much would be done in a customized gate array), this was the best compromise.
 Great comments. I agree that taking advantage of any correlation to prior-row could theoretically provide considerable compression benefits for many photos.
After reading your articles I've got an idea:
I'm wondering if it's possible to recompress a file back to its
original state. I mean:

  1. I have a valid JPEG file (nicepic.jpg).
  2. Using PHP GD function I load that file using imagecreatefromjpeg()
    function (it cuts MIME/EXIF data like author, comments, etc. and
    converts it to the RGB)
  3. Using the imagejpeg() function with the chosen parameters, the image is
    converted back to the JPEG format and saved to a new file (otherpic.jpg)
The images nicepic.jpg and otherpic.jpg are very different (besides
the markers like the Huffman tables, SOS, JFIF and quantization segments)... I
mean, they differ at the binary level (while being visually the same).

Here is my question: how can I reverse the compression / decompression
of otherpic.jpg back to the original state (nicepic.jpg), before
all the steps I described above?

And another question came up... Is it possible to foresee (more or less accurately)
the binary result of a JPEG after the described conversions? I mean, I
want to get a specific sequence of bytes after conversion, and need to
know what the input file should look like to produce that sequence of bytes.

GD library uses The Independent JPEG Group's jpeglib.
 If you have already converted the original JPEG to RGB (in memory), then there is no way to guarantee recovery of the original JPEG compression characteristics (quantization tables, etc.) from the RGB data alone. Taking this RGB raw data and then recompressing with JPEG again is lossy and will likely be with a different set of parameters.

Your second question is more difficult to answer. I believe you are asking: can one determine what "nicepic.jpg" values might produce the output "otherpic.jpg" after the intermediate RGB conversion? The answer is, for the most part, no. Even though you can perform a JPEG decompression on "otherpic.jpg" to RGB, you would then need to know exactly what compression characteristics (quantization tables, Huffman tables, subsampling, etc.) are required in the JPEG compression step to create a specific "nicepic.jpg" sequence of bytes.
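The core of why recovery is impossible can be sketched with the quantization step alone. The following is a minimal illustration (not the GD library's actual code path; the table value of 16 is an arbitrary example): once DCT coefficients are quantized, distinct originals collapse to the same stored value, so the inverse mapping is not unique.

```python
# Minimal sketch of why JPEG quantization is irreversible.
# (Illustration only -- not the actual GD / libjpeg code.)

def quantize(coeff, q):
    """Forward step: divide a DCT coefficient by the table entry and round."""
    return round(coeff / q)

def dequantize(level, q):
    """Decode step: multiply back by the same table entry."""
    return level * q

q = 16  # one entry from a hypothetical quantization table

# Two different original coefficients...
a, b = 100, 98

# ...map to the same quantized level, so the originals are unrecoverable.
assert quantize(a, q) == quantize(b, q) == 6
print(dequantize(quantize(a, q), q))  # both decode back to 96
```

Since many different inputs decode to the same 96, no amount of cleverness can tell which one the original file contained.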
2009-10-12 Veeranjaneyulu Toka
 Currently we have our own decoder and encoder, developed based on the ITU-T T.81 JPEG specification. We are very much interested in optimizing it. Could you please suggest some optimization techniques (especially for the decoder), or refer me to any link that would help me with this?

2009-10-11 Veeranjaneyulu Toka
 Especially when we talk about thumbnails in a JPEG, I have mostly seen them sit in the EXIF header. Could you please let me know whether it is possible to have a thumbnail in a JFIF image itself (one which does not have the EXIF header)?

 Thumbnails are usually stored in one of a few different places: APP0 (JFIF, JFXX), APP1 (EXIF) or APP2 (Flashpix, > 64KB). So, yes, it should be possible to have only an APP0 thumbnail.
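As an illustrative sketch of how one might check which of these locations a given file uses, the following walks a JPEG byte stream's marker structure and reports each APPn segment. This is a deliberately simplified scanner (it stops at Start of Scan and ignores restart markers), not a full parser:

```python
import struct

def list_app_segments(data: bytes):
    """Walk JPEG marker segments, yielding (marker_name, length) for APP0-APP15.

    Simplified sketch: assumes a well-formed stream, stops at SOS (0xDA),
    and does not handle markers embedded in entropy-coded data.
    """
    assert data[:2] == b"\xff\xd8", "not a JPEG (missing SOI marker)"
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break
        marker = data[pos + 1]
        if marker == 0xDA:          # SOS: compressed image data follows
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if 0xE0 <= marker <= 0xEF:  # APP0..APP15
            segments.append(("APP%d" % (marker - 0xE0), length))
        pos += 2 + length           # length field counts itself + payload
    return segments
```

Finding an APP0 segment with a JFXX payload (and no APP1) would indicate a JFIF-only thumbnail, as described above.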
2009-09-18Veeranjaneyulu Toka

I have just started to dig into the internals of the JPEG format, and found this to be a nice article.

Recently I came across a file which has 4 components in the Start of Frame (FF C0). Could you please give me more information on the case where a file contains more than three components, and what its purpose is? Any material on this would be a great help.

 If you observe 4 components in the SOF, it often implies that the underlying image data is CMYK rather than YCC. Adobe uses a custom tag (APP14) to indicate the color space, which may suggest that the color space is either YCCK (converted from CMYK), CMYK or YCbCrA (alpha layer).
 FYI - The Independent JPEG Group
has just released version 7 of its library for JPEG image compression.

The 27-Jun-2009 release includes many new features, including JPEG arithmetic coding compression/decompression/re-compression.

Looks like the software patents on arithmetic coding have finally expired after 17 years!
 Great news! Thank you for the update!
 Excellent site. I think I already have my answer, but I'll ask anyway.

I have some C++ code that I use to encode/decode JPEGs, and there's no way to estimate the size of the buffer that will hold the encoded image.

Currently, I send in a buffer equal to height * width * bands * sizeof(unsigned char). I will set the buffer to a minimum size (4096) if it is too small.

There must be a better way to estimate this. I know the ultimate size is dependent on the image contents, but I just need a reasonable maximum that won't fail most of the time. Maybe half the size of the original image? I could use the image quality (0-100), the chroma subsampling (4:1:1, 4:2:2, 4:4:4) and the type of compression (baseline, progressive, or lossless) in the equation.

Thanks for such an informative site. FYI, I'm using the Intel IPP library for the heavy lifting, if it matters.

 An interesting problem, and to be honest, I'm not aware of a foolproof method. Given a particular non-optimized Huffman coding table, it is possible that certain DCT coefficient values may take more bits to encode than the 8 bpc uncompressed representation, meaning that setting an upper bound can be tough.

The biggest problem is that the compression ratio is largely determined by the image content (at least for conventional JPEG). Taking a random sampling of MCUs from the image might give you an approximate estimate, but it is not a guarantee.

If anyone knows of a good strategy, please feel free to add!
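One pragmatic approach, sketched below, is simply to budget for the raw image size plus some header slack and accept the waste, shrinking only if profiling shows it matters. This is an assumed heuristic for illustration, not a mathematically proven bound:

```python
def jpeg_buffer_estimate(width, height, bands, header_slack=4096):
    """Conservative JPEG output-buffer estimate (hypothetical heuristic).

    JPEG rarely expands real photographic data beyond its raw size, but
    pathological content plus a non-optimized Huffman table can exceed
    8 bpc, so raw size + slack is a pragmatic ceiling, not a guarantee.
    """
    raw = width * height * bands          # uncompressed size in bytes (8 bpc)
    return max(raw + header_slack, 4096)  # never allocate below a small floor

# A 640x480 RGB image: ~900 KB raw, so the buffer is raw + 4 KB slack.
print(jpeg_buffer_estimate(640, 480, 3))
```

For memory-constrained targets, one could instead compress in passes and grow the buffer on overflow, trading speed for a tighter allocation.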

Is it possible to send you a JPEG file that is in JPEG-B abbreviated format (without quantization and Huffman tables)? The spec suggests using the standard K tables to decode, but I failed to do so. JPEGsnoop also can't open it. I would appreciate your help with decoding it.
 Hi Artur -- I presume you are referring to Motion JPEG-B. From what I had understood, the format still contains the quantization and Huffman tables in the header, but no markers or byte stuffing are used. The current version of JPEGsnoop will not decode Motion JPEG-B, but I could consider adding it. If you can, please email me a sample. Thanks.
Have a question about IrfanView.
I tried to create a new image with IrfanView, but I can't create a new image with 4 or 1 BPP.
Could you give me some help?
Thanks a lot!!!
 I have a question that may seem stupid but I keep reading conflicting info on it so I was hoping you could help me.

Let's say I have a photo called '01.jpg' and I bring it into Photoshop, edit some values then save as '02.jpg' to avoid overwriting the original '01.jpg'. Is the original '01.jpg' decompressed and recompressed? Is there any quality loss to the original image over hundreds of "save as" operations or batch resize operations?

Basically, if I NEVER overwrite '01.jpg' am I going to have any degradation of the image through all of this editing? I can't find a solid answer on this so I'd appreciate your help. Thanks a lot.
 Not a stupid question at all! There is certainly a lot of confusion about this, so hopefully I can clarify it:

The act of opening up an image and then closing it (without making any modifications or "resaving") should never cause any changes to the file itself -- and therefore no recompression will occur either.

The concerns about recompression and generation "losses" only appear when a user instructs the software to save or output the loaded image.

The presumption is that most image editing programs will not resave an image upon closing a file unless you explicitly request it to.
I am a programmer at a mobile phone company, and I basically deal with mobile phone camera device drivers and JPEG decoders/encoders.

Though I study online material related to it, I want to enhance my knowledge and reach some benchmark in my field.

Kindly tell me if there is any kind of online certification available in these areas?

 I'm not aware of any certification, but I would strongly recommend reading the ITU-T spec. From there, if you'd like to confirm your understanding, you can always try to write your own JPEG encoder/decoder! If anyone is aware of an existing certification that would be appropriate, please feel free to comment.

I want to study the complete JPEG file format, the way markers are arranged in a JPEG file, and also the functionality of each marker.

I searched for material but didn't find anything worthwhile. I would be grateful if you could suggest a book or some online material that would give me complete information about it.
 The best resource is to read through the JPEG standard (ITU-T T.81), which is freely available on the web. There are many great tutorials available online that may help explain the standard from an easier perspective. Good luck!
Reading about the compression process of JPEG images, one doubt has come to my mind.
According to the compression procedure, the compression depends on the quantization table (a different quantization table must result in different image quality). Suppose I take an image with my mobile phone and look at the stored image on the phone, and then I copy the same JPEG image to my PC and look at it in any viewer.
Will a different quantization table be used on the phone and on the PC, and what effect will that have on the final image?
Also, as far as I know the quantization table is present in the JPEG image itself, so which quantization table will be used if I view the image on the PC?
 Whether you look at the image on your mobile phone or on your PC, both JPEG decoders will be using the same quantization table -- the one stored within the JPEG file. The JPEG file was originally generated by your mobile phone, so that table will be the one that appears within the file. Simply viewing an image will not cause recompression or the changing of the DQT.
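That stored table travels inside the file's DQT (FFDB) marker segment. As a simplified sketch of how a viewer reads it (not JPEGsnoop's actual code, and assuming 8-bit precision with a single table per segment), one might parse the payload like this:

```python
def parse_dqt_payload(payload: bytes):
    """Parse one 8-bit quantization table from a DQT segment payload.

    payload = the bytes following the segment's 2-byte length field.
    Simplified sketch: assumes 8-bit precision and a single table;
    real files may pack multiple tables into one DQT segment.
    """
    pq_tq = payload[0]
    precision = pq_tq >> 4       # 0 = 8-bit entries, 1 = 16-bit
    table_id = pq_tq & 0x0F      # 0 is typically the luminance table
    assert precision == 0, "16-bit tables not handled in this sketch"
    table = list(payload[1:65])  # 64 entries, stored in zig-zag order
    return table_id, table
```

Both the phone's viewer and the PC's viewer hand this same table to their dequantizer, which is why the decoded image is the same on both.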
 Nice information you provided in this article.
As you may already know, there's a new application called Image Compressor that claims to create better JPEGs.
Have you tried it and compared the results?
I think you can get one from here:

What do you think? Is it worth it?
 Thanks for pointing out this program; I had not heard of it until now. I'll take a more in-depth look at Image Compressor soon and report my analysis of its compression claims.
 Dear Calvin,

I'm a web designer, and when I get photos from clients, sometimes they are already in great shape: sharp at 960 pixels wide for a file size of around 75 KB. Others come in huge (3 MB, 2500 pixels wide), but if I use Photoshop CS3 to resize them and drop the ppi count to 72, they look terrible, with artifacts, etc.! Why is this? Was the first set shot in some particular mode? Will I ever be able to get the big ones down to less than 130 KB with the same clarity as the first set?

Thanks for any help you can give me.
 Hi Nick -- without seeing the images, it will be really hard to determine the cause. The following will be an over-simplification of the issue. However, it is very likely that the first set were resized, sharpened and then resaved with moderate compression (in order to have that file size). Whether or not you will be able to get a given image down to a particular file size is completely determined by the image content (how much high-frequency content there is, vs smooth regions of color). By careful application of sharpening and selection of appropriate compression levels (eg. in Save for Web), you should be able to produce equivalent images as in the first set, provided that the image content is not too "uncompressible".
 A recurring problem: although the Photoshop "Save for Web" settings say my image is going to save at a file size of (just an example) 356 bytes, after I've hit "Save" the image becomes 4 KB. No matter how small my images claim to be in the "Save for Web" settings, they never come out less than 4 KB. Any thoughts? Thanks!
 When you say they come out at 4KB, where are you seeing that? Are you using the file properties in Windows Explorer (ie. "Size on Disk: 4.00 KB")? I just performed a similar test (creating a JPEG image in Photoshop that Save For Web indicated would be 401 bytes), and the resulting file is in fact 401 bytes. However, the "Size on Disk" will usually show 4KB since that is the smallest unit of measure that the hard drive can use to store your image file -- this is known as the cluster size (usually 4KB, but it depends on how large your hard drive is). If you were to save this for use on the web (which I assume you are doing), I believe only the 401 bytes will be sent, not the full 4KB (plus the usual overhead associated with transmitting files on the web).
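The "Size on Disk" rounding described above can be sketched as a one-line ceiling calculation, assuming a hypothetical 4 KB cluster size:

```python
def size_on_disk(file_size, cluster=4096):
    """Round a file size up to the next whole cluster (ceiling division).
    The 4096-byte default is an assumed, typical cluster size."""
    return ((file_size + cluster - 1) // cluster) * cluster

print(size_on_disk(401))   # a 401-byte JPEG still occupies one 4096-byte cluster
print(size_on_disk(4097))  # one byte over a cluster boundary costs a whole extra cluster
```

The actual bytes stored (and transmitted over the web) remain 401; only the allocated disk space is rounded up.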
2008-09-02 gary heiden
 hi guys, I'm being told I made a big mistake when I used Adobe Photo Album to sharpen or Smart Fix my hi-res JPEG images for sale on the internet. The question is: what can I do to correct my folly? Can I just revert these edited images back to their originals? If not, is there anything I can do other than just sell them as prints or "as-is" images? Thanks ever so much. You provide a real service. This tech stuff is mind-blowing!!! gh
 Unfortunately, if you resaved the JPEG images, it is unlikely that you'll be able to recover the originals. There is no edit history in the JPEG / JFIF file format. Hopefully the software produced reasonable modified images that you can continue to sell as-is, assuming it didn't over-sharpen the images or reduce the resolution. Hope it works out.

Thanks a lot for your software JPEGsnoop; it helped me a great deal. I was searching for a way to know whether pictures coming from IP cameras were encoded in 4:2:2, 4:1:1 or something else. I thought JPEGsnoop and your explanations gave enough information to be able to determine it, but apparently it isn't enough. I managed to compare two pics and conclude they had the same 4:x:y, even though the first was 4:2:2 and the second 4:1:1...

Is there a quick, simple and error-proof way to know the 4:x:y of a picture with your software (or even without it)?

If so, you would help me a great deal again by saying what should be done.

Thanks again,

 Hi !
I'm trying to understand how Camera phones store their images. I was wondering if you could help me with that. Thanks !
 Camera phones generally use the same mechanism as other digital cameras; however, it appears that many of them are more advanced when it comes to selecting a suitable compression level (quantization table), probably because most phones don't have much on-board memory available to store photos.
 Hi Calvin,
After comparing much information (also in many applications and docs) about the DCT in JPEG, related to the flood point setting versus the old default mode(s): I'm interested in why the flood point is given such a bad position. They say you will see no difference, but it can drop the file size, which I have already tested many times.
We are talking about several KBs, and it's technically supported by close to all CPUs these days.
 Hello Hans -- I believe you must be referring to floating point mode. I'm not sure I understand your question completely, particularly about the bad position or size? The DCT algorithm can be performed with fixed point math or floating point math. The fixed point arithmetic is much less accurate but generally faster. Conversely, the floating point math results in more accurate conversions but tends to be slower.

Nowadays, many processors can handle floating point calculations without as much impact to performance because they have a separate floating point processor (FPU). In my tests in developing JPEGsnoop, I implemented my JPEG color conversion calculations in both integer and floating point routines. Turns out that the performance difference was less than 10 percent!
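As a rough illustration of the trade-off (a sketch, not JPEGsnoop's actual routines), here is the standard BT.601 RGB-to-luminance conversion done both ways. The fixed-point version pre-scales the coefficients by 2^16 (a common choice for such implementations) and shifts back after the multiply-accumulate:

```python
def luma_float(r, g, b):
    """Floating-point BT.601 luminance, as used in JFIF's YCbCr conversion."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def luma_fixed(r, g, b):
    """Fixed-point version: coefficients pre-scaled by 2^16, shifted back.

    Faster on FPU-less hardware, but each coefficient is rounded once up
    front and the final shift truncates, costing a fraction of a level.
    """
    return (19595 * r + 38470 * g + 7471 * b) >> 16

# The two agree to within a fraction of a grey level on typical inputs.
r, g, b = 200, 120, 30
print(luma_float(r, g, b), luma_fixed(r, g, b))
```

Note that the three scaled coefficients sum to exactly 65536, so pure grey inputs convert without error; the discrepancy only shows up on mixed colors.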
 I used Photoshop "Save As" to save an image (size: 98 KB) as another JPEG image (quality: 12), and the file size increased to 300 KB. Why?
 The reasons are fairly complicated, but here's an over-simplification: the original image was produced with a different program (or camera), meaning that when Photoshop saves your image it will need to rethink how the image content gets compressed. When you request quality level 12, you are asking Photoshop to retain nearly all detail to a very precise level. Unfortunately, some of these minor details are compression artefacts ("errors") from the previous program's lower-quality compression techniques. By resaving at a higher-quality compression level, the software tries to record all of these little nuances, which causes it to waste space needlessly.

In general, you don't ever need to save in Photoshop above level 10. At that point the increase in file storage requirements typically outweighs any increase in quality. And if the original file was saved at a lower quality to begin with, saving at a higher quality level can't improve the image quality further.
 A new program that uses the packjog DLL to recompress many formats including JPG, PNG, GIF... After v0.38 it seems to work with movies (AVI) captured with my Casio EZ-1050 camera, providing around a 20% size reduction.
 After reading your essential info about the compression of pictures, I missed a hint about which is the technically best Discrete Cosine Transform (DCT) mode. Even the official wiki page about JPEG has no good advice; it just talks about how the DCT works, but doesn't describe the several modes.

I'm just interested in this because XnView, another free pic viewer with basic editing abilities (like IrfanView has), supports DCT with a floating point setting. After a trip through time and many papers about the invention of the MMX and SSE instructions and the past of the x86, I'm more confused than before, because a few said this calculation technique is unreliable. So what are the minimum requirements to use it? My PC supports SSE2 (P4), so it should be technically possible, but I've never read a recommendation anywhere; maybe nobody realized the resulting file is always the smallest of all settings, compared to the fast and the slow default modes.

Surely, somebody can say use it or ignore it, but this setting can really drop the size, and with bigger files a bigger difference is recognizable. For example, a wallpaper at 1204x768 can be ~10 KB smaller, while a little banner gets a maximum reduction of ~1 KB. With a few hundred photo snapshots from a digital camera at around 2 MB per file, this could be a good thing that nobody is thinking of.

The problem is, I'm just not yet sure how accurate it is visually, because I see no difference, while from the technical calculation side it takes only <1 sec with today's common hardware. I'm looking forward to your opinion about this.

We live in a time where SIMD and other facilities exist, but I think not enough people use them, even though everybody could use it for free, because it's possible and a reality. I forgot to say: without negative effects on stability.
 Hi Cody -- If I haven't misunderstood, I think you're asking about the accuracy of the floating-point DCT over fixed-point / integer DCT methods and its applicability within an image editor. As you're well aware, the floating point performance is generally slower but with a gain in accuracy. However, you have also referred to a change in the resulting file size upon saving. I would be interested in comparing samples generated by the same editor using different DCT modes during save. My presumption is that most editors (during a save to JPEG operation) perform the bitmap to DCT conversion with the most accurate algorithmic implementations in floating point before rounding to integer (in the quantization step), versus doing the DCT itself in fixed point (as performance during saves is usually a non-issue). Perhaps you could send me some example files so that I understand the issue more clearly.

That said, I'd be hesitant to save my images using the faster, less precise implementations until I had evaluated the differences in the quality of output. With the cost of storage falling so greatly, a file savings of 10KB is not worth it to most unless the reduction in quality is absolutely imperceptible.
 Does anyone know how to turn off chroma subsampling while resizing a JPEG file?

I need to prevent quality loss when I process and resize photos.
Any help will be highly appreciated. I need to do it in code rather than with some kind of tool like IrfanView.
 Calvin -

I have a question regarding compounded errors due to recompression. It makes sense that recompressing with a different quality factor and/or different quantization matrices would increase the error each time. And if an image is edited, even using the same quality factor and quantization tables would increase the error. The part that I'm unclear on is this: doesn't the JPEG file contain enough info for you to be able to replicate the compression? That is, if the decompressed image isn't edited, couldn't you use the quantization tables and scale to produce a JPEG file that has no additional error? (I'm not looking to do this, I just want to make sure I understand the concepts correctly.) I've been reading a bunch of the pages on this site today and I haven't seen how the JPEG stores the scaling that goes with the quantization, but I think that value needs to be in the file to avoid having the result be saturated or muted.

BTW - this is by far the best set of pages on jpeg I've seen. I wish I'd found your site a few months ago!
 Great question Brett --

In some sense, yes, the JPEG does store enough information to do the resave with the same parameters. The quantization tables (DQT) dictate the array of constants used to scale/descale during compression/decompression. However, the missing link is that most image editing software programs out there do not support the use of arbitrary quantization tables during save. Only a few programs out there do this, and it's really not a difficult thing to do! It would add some value if the major photo editing apps out there provided it as an option (but still allow the user to specify their own compression level if they choose).
This info helped me a lot... I have a question: I am looking for an image format which has sub-object-level intelligence, but I learned here that JPEG treats the whole image as a single object. Can you tell me of any other image formats having this intelligence, or is this even a relevant question to ask about image formats?
 Hi, guys :

Recently I have been looking for code for JPEG compression in the C or C++ language based on a self-defined quality table. Can anyone give me some information about this topic? Thank you very much!
 If you are looking for JPEG compression with a custom quantization table, then the easiest way is to reuse the IJG JPEG library (see the JPEG Decoder Source Code page links). Even though I don't use the IJG code in my own applications, it is one of the most reliable C coded implementations out there. In the source, you'll find a jpeg_add_quant_table() function that will define the custom table for you.
 I wanted you guys to explain in depth how JPEG compression works for my assignment, but at least I got something, though I'm not sure I have all the information.
 Is there a particular question you have about compression that you are interested in knowing?
 I had some old building drawings saved in JPEG format to save space, as TIFFs would have been too big for my computer. The physical size of the drawings is about 65x50 cm, and the file sizes are about 2.5-4 MB. They look great when printed at A3, which was my purpose.

I had to add some text to them and save them again. Trying to decide on the quality setting, I experimented with different settings and noticed that the file size increased at anything above 5 or 6 in Photoshop. However, contrary to most advice I have found on the web, I could see no deterioration in quality even at 5, zooming in to actual pixels or even closer. True, I did it only once. These drawings are beautiful watercoloured drawings from the late 1800s and early 1900s; in terms of image properties they are perhaps something between normal photographs and line drawings, so JPEG is perhaps not too bad a format for them. At actual pixels you can see halos around black lines, but zooming to print size they vanish.

Can you explain why the file size grows? I have noticed this with some photos also, but only at maximum settings. Also, I usually have a hard time finding those infamous JPEG artefacts when playing with low quality settings.

