What is an Optimized JPEG?
An optimized JPEG is simply a JPEG file that includes custom Huffman tables that were created after statistical analysis of the image's unique content.
What is Huffman Coding?
The Huffman coding scheme used in JPEG compression reduces file size further by replacing fixed-size (e.g. 12-bit) codes with variable-length codes (1-16 bits). The hope is that, on average, the replacement codes will add up to fewer bits than the originals. If 10% fewer bits are needed on average to store the quantized DCT coefficients, this will translate almost directly into a 10% file size reduction. The difficult part is deciding which codes to replace with which variable-length bit-strings. The Huffman tables define these mappings and are stored inside every JPG image file. A separate Huffman table is required for each of the following four components: luminance DC component, luminance AC components, chrominance DC component and chrominance AC components.
[Image: The example photo used for Optimization]
How much reduction in file size?
The following shows an example of the file size reductions that are possible when using an optimized Huffman table instead of the standard tables. A 6 megapixel digital photo (from a Canon 10d, #1) was opened in Photoshop and resaved via both Save As (#3, #4) and Save for Web (#5, #6). The original photo was also passed through jpegtran to perform a lossless Huffman optimization step (#2). Photoshop cannot do this without applying its own quantization tables, and hence its optimization isn't lossless.
In the two Photoshop tests, both Optimized and Standard modes were used. The two files in each of the three comparison pairs have identical quantization tables (meaning that the JPEG compression quality is the same). In other words, #1 & #2 have the same tables, #3 & #4 have the same tables and #5 & #6 have the same tables. However, each pair (1+2, 3+4 and 5+6) uses different tables from the other pairs.
| File | Image | Mode | Total Size | Overhead | Scan Data | Savings |
|---|---|---|---|---|---|---|
| 1 | Canon 10d | Original | 2,212,302 | 9,822 | 2,202,480 | - |
| 2 | Canon 10d (jpegtran) | Optimized | 2,150,174 | 403 | 2,149,771 | 2.4% |
| 3 | Photoshop CS2 - Save As - Quality 11 | Standard | 1,999,226 | 31,031 | 1,968,195 | - |
| 4 | Photoshop CS2 - Save As - Quality 11 | Optimized | 1,872,068 | 30,774 | 1,841,294 | 6.4% |
| 5 | Photoshop CS2 - Save For Web - Quality 85 | Standard | 1,970,500 | 641 | 1,969,859 | - |
| 6 | Photoshop CS2 - Save For Web - Quality 85 | Optimized | 1,836,522 | 410 | 1,836,112 | 6.8% |
As the above results show, the file size savings over and above the standard tables (which are used by File #1, the original photo) are not particularly great. A lot of extra computation is required to devise the optimum tables, all for the sake of 2-7% savings in file size.
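The Savings column can be reproduced from the Scan Data column. The following is a minimal sketch, assuming (as the numbers suggest) that the savings are measured on the scan data rather than on the total file size; the dictionary keys and the function name are made up for this illustration.

```python
# Scan data sizes in bytes, copied from the table above: (standard, optimized).
scan_data = {
    "Canon 10d vs. jpegtran": (2_202_480, 2_149_771),
    "Photoshop CS2 Save As, quality 11": (1_968_195, 1_841_294),
    "Photoshop CS2 Save For Web, quality 85": (1_969_859, 1_836_112),
}


def savings_percent(standard: int, optimized: int) -> float:
    """Reduction in size as a percentage of the standard-table size."""
    return 100.0 * (standard - optimized) / standard


savings = {
    name: round(savings_percent(standard, optimized), 1)
    for name, (standard, optimized) in scan_data.items()
}
for name, percent in savings.items():
    print(f"{name}: {percent}%")
```

The three results round to 2.4%, 6.4% and 6.8%, matching the table.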
Does JPEG Optimization affect Image Quality?
No! The huffman table optimization is a lossless (reversible) process that has absolutely no effect on the resulting image quality. If one has the option, it is almost always best to enable JPEG optimization. The extra file size savings can't hurt. However, as it may potentially reduce compatibility with some bad JPEG decoders, this may be enough of a reason for you to disable it.
Are JPEG images from Digital Cameras Optimized?
No! It would appear safe to say that every single JPEG photo generated by a recent digital camera uses the "typical" Huffman tables provided in the JPEG standard, ITU-T T.81 (Tables K.3, K.4, K.5 and K.6, reproduced below). I have confirmed this to be the case after examining photos from over a hundred digital cameras, from high-end full-frame dSLRs to point & shoot digicams.
Why don't digital cameras optimize their JPEG photos?
As you will see later in the section detailing how the optimization is done, generating the optimized Huffman tables is quite laborious. It requires considerable counting, sorting and reorganization to produce an optimum result. Most high-end digital SLRs probably have the computing power to do this for every photo, but the extra complexity and performance penalty are probably not justified by the slight reduction in average file size compared with using the standard's suggested Huffman tables. For a 6% file size savings, the additional hardware may not be justifiable.
While optimizing JPG images is likely not worth implementing in digital camera hardware, it certainly is a viable option for any photo editing computer program. The amount of time taken to compute the tables is insignificant when compared with the average user's response time after initiating a save operation!
Are Quantization Tables ever Optimized?
When you select optimized JPEG in Photoshop or other photo editing programs, you are almost certainly not getting any optimization (or change) of the quantization tables. Nearly every program includes some predefined quantization tables that are used (along with a scaling thereof), depending on the user's desired quality setting. Trying to calculate an optimum quantization table (given targets in file size and knowledge about the characteristics of the human visual system) would be very difficult. I believe this has been the focus of a few published theses, but it has not been adopted in any popular computer software.
How to Calculate an Optimized Huffman Table
NOTE: The following sections are somewhat complicated, so only read on if you are truly curious about the details!
There are many ways of calculating the optimized JPG's huffman table, but one of the simplest methods is to do the following. First, we start with the entropy coding of code-words:
- Calculate the quantized DCT coefficients for all MCU blocks in the image, repeating for the Y, Cb and Cr components
- Reorder the coefficients using the zig-zag sequence starting with the DC coefficient and then all 63 AC coefficients.
- Replace each continuous run of zeros, together with the non-zero AC coefficient that follows it, with an appropriate code word. The first nybble defines the number of zeros in the run (i.e. from 0-15). The second nybble defines the size, i.e. the number of bits used to represent the signed coefficient that follows the run of zeros (from 1-10). For example, to encode a run of 7 zeros and then the AC coefficient +35, we would select the code word 0x76 (run = 7 zeros, size = 6 bits). Note that the calculation of the size field (e.g. 6) from the coefficient value (e.g. +35) is described by Table 5 "Huffman DC Value Encoding" in the Huffman Encoding Tutorial page.
- Replace the DC coefficient of each block (which JPEG stores as the difference from the previous block's DC value) with the Huffman-coded size field ("category"), followed by the value itself. For example, if the DC value were -13, then the code word would be 04 followed by the bit-string "0010". The size field is defined by Table 5 as in the previous step. For this particular example, one would look up -13 in Table 5 and recognize that 4 bits are required to encode any value in the ranges -15...-8 and 8...15. Counting through that sequence, one then determines that the "additional bits" required to define -13 are the binary string 0010 (0000 is -15, 0001 is -14, 0010 is -13, etc.).
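The two worked examples above (AC coefficient +35 after a run of 7 zeros, and DC value -13) can be checked with a short sketch. The function names are hypothetical and chosen only for this illustration; the size calculation assumes the usual JPEG convention that a category of n bits covers magnitudes from 2^(n-1) to 2^n - 1.

```python
def size_category(value: int) -> int:
    """Number of bits needed for the magnitude of value (0 when value is 0)."""
    return abs(value).bit_length()


def additional_bits(value: int) -> str:
    """Bit-string that follows the Huffman-coded size field."""
    size = size_category(value)
    if size == 0:
        return ""
    # Negative values are offset by (2**size - 1), so the most negative
    # value in the category maps to all zeros.
    raw = value if value >= 0 else value + (1 << size) - 1
    return format(raw, f"0{size}b")


def ac_code_word(run: int, value: int) -> int:
    """Run/size byte for a non-zero AC coefficient preceded by `run` zeros."""
    if not 0 <= run <= 15 or value == 0:
        raise ValueError("run must be 0..15 and value must be non-zero")
    return (run << 4) | size_category(value)


print(f"{ac_code_word(7, 35):02X}", additional_bits(35))   # 76 100011
print(f"{size_category(-13):02X}", additional_bits(-13))   # 04 0010
```

Runs longer than 15 zeros and the end-of-block case are deliberately left out of this sketch.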
The above steps simply create the code-words in preparation for optimization. Now, we can perform the optimization. The following is only a very brief summary of the huffman algorithm, so it would be worth looking at other useful references first (e.g. ASCII string into Huffman codes, or any other general huffman coding tutorials).
- Separately for each of the four sets of image data (code for the DC luminance of each block, code for the DC chrominance of each block, codes for the AC luminance of each block and codes for the AC chrominance for each block), count the number of times that each code word is used.
- Sort the code-word list (e.g. for DC luminance) in order from most frequent to least frequent.
- Select two least-frequent codes and add them to the bottom leaf nodes of a binary tree. Record the frequency of each code. Above this pair, create a node with a value equal to the sum of the two nodes added below.
- Again, find the two least-frequent codes (with the new combined nodes replacing the two component nodes). Group these two and add the frequencies together.
- Repeat until all codes have been paired up into a binary tree with a root node equal to 1 (i.e. 100% frequency).
- Assign binary 0 and 1 to each branch of the tree from top to bottom. Each node will have a variable-length bitstring associated with it, which is simply the concatenation of the binary value of each path taken from the root node.
- Traversing the binary tree starting with level 1, from left-to-right, read out each code word and add to the DHT (huffman table definition).
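The pairing steps above can be sketched in a few lines. This is a minimal illustration of plain Huffman coding, not the exact procedure used by any particular encoder: it only computes the code length of each code word, and it omits the extra JPEG requirement that no code may be longer than 16 bits. The function name and the frequency counts are made up for the example.

```python
import heapq
from itertools import count


def huffman_code_lengths(frequencies: dict[int, int]) -> dict[int, int]:
    """Return {code word: code length in bits} for a plain Huffman tree."""
    if len(frequencies) == 1:
        return {symbol: 1 for symbol in frequencies}
    tie_breaker = count()  # keeps heap entries comparable when counts tie
    heap = [(freq, next(tie_breaker), [symbol])
            for symbol, freq in frequencies.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(frequencies, 0)
    while len(heap) > 1:
        # Take the two least-frequent nodes and merge them under a new node.
        freq_a, _, symbols_a = heapq.heappop(heap)
        freq_b, _, symbols_b = heapq.heappop(heap)
        merged = symbols_a + symbols_b
        for symbol in merged:
            lengths[symbol] += 1  # every merge pushes these one level deeper
        heapq.heappush(heap, (freq_a + freq_b, next(tie_breaker), merged))
    return lengths


# Hypothetical usage counts for six DC size categories.
counts = {0x00: 5, 0x01: 9, 0x02: 12, 0x03: 13, 0x04: 16, 0x05: 45}
lengths = huffman_code_lengths(counts)
total_bits = sum(counts[s] * lengths[s] for s in counts)
print(lengths, total_bits)
```

With these counts the most frequent code word gets a 1-bit code and the two rarest get 4-bit codes, for 224 bits in total, compared with 300 bits if every code word used a fixed 3-bit code.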
Photoshop and Huffman Tables
Photoshop Save As provides three JPEG save modes: Baseline Standard, Baseline Optimized and Progressive. Photoshop's Save for Web provides either Optimized or not Optimized. For the purposes of this discussion, Progressive mode will be ignored. Save As defaults to no Optimization, whereas Save for Web defaults to Optimization enabled.
Compatibility of Optimized JPEG
According to Adobe's documentation with Photoshop:
"Optimized: Creates an enhanced JPEG with a slightly smaller file size. The Optimized JPEG format is recommended for maximum file compression; however, some older browsers do not support this feature."
It is highly unlikely that you will ever run into any problems using "optimized" JPEGs. Optimized JPEGs don't use any extra feature of the format; the only real difference is that the encoder has written Huffman tables that use different values than the most common tables in use.
Any standards-compliant JPEG decoder must parse the quantization and Huffman tables in the file, so there should never be an issue with optimized versus standard. That said, it seems that there may have been a bug in one of the earliest (i.e. 10 years ago) web browsers that did make an assumption about the Huffman tables (it probably never bothered to decode the DHT entries), hence the incompatibility. It is highly improbable that anyone would be using such a web browser today!
Optimization according to JPEG Quality
It is interesting to note that in Photoshop CS2, the Save As for the Baseline Standard operation uses different huffman tables depending on the quality level setting you choose. None of these tables match the huffman tables of the JPEG standard. Some academic research has shown that the frequency distribution of huffman codes differs predictably according to the JPEG compression quality (defined by the quantization tables). This is essentially a half-way optimization -- using tables that are optimized for the compression quality but not the image content itself. As this involves using "non-standard" huffman tables, I see very little point in using Baseline Standard over Baseline Optimized.
| Photoshop Quality Levels with same Huffman Table | DC Lum # codes | DC Chr # codes | AC Lum # codes | AC Chr # codes |
|---|---|---|---|---|
| 0, 1, 2 | 12 | 12 | 98 * | 91 * |
| 3, 4, 5 | 12 | 12 | 114 * | 111 * |
| 6, 7, 8, 9, 10 | 12 | 12 | 162 | 162 |
| 11 | 12 | 12 | 162 | 162 |
| 12 | 12 | 12 | 162 | 162 |
An interesting detail about the use of tables tailored per quality level is shown in the number of code words in the Huffman tables written into the JPEG files from Photoshop. For quality levels 0 through 5 (relatively poor quality), not all code words have been included in the Huffman tables. The implication is that, when saving images in Photoshop at these quality levels (using the built-in quantization tables), it must be impossible to generate the Run+Size AC code words that were left out. At quality level 5, for example, 48 of the 162 possible code words are missing from the AC luminance table and 51 are missing from the AC chrominance table.
In fact, what you may notice is that (for example, at quality levels 3-5) the AC code words only have a maximum "size" field of 7 bits, not 10 bits (the maximum code word is 0xF7, not 0xFA). This would imply that it is impossible to have a quantized AC DCT coefficient with a magnitude greater than 127.
IrfanView and Huffman Table Optimization
IrfanView can perform lossless re-optimization of a JPEG photo: select the Option->JPG Lossless Operations menu option, choose None for Transformations and enable the Optimize JPG File checkbox. In doing so, one will achieve exactly the same optimization performance (and tables) as one would get through the IJG utilities such as jpegtran.
By default, IrfanView saves all images with huffman table optimization enabled. There doesn't appear to be any way to save with the standard non-optimized tables -- and realistically, one shouldn't need to anyway.
Note about DHT Ordering
For some reason, the newer Canon dSLR and point & shoot digicams store the Huffman tables in a different order than all other digital cameras and the older Canon models. I have no idea why this was changed. The last models within each model series to use the more traditional ordering (DC-Luminance, DC-Chrominance, AC-Luminance, AC-Chrominance) are the Canon 10d, 300d, A70, G6, Pro1, S40, etc. All newer models store the Huffman tables in the order DC-Luminance, AC-Luminance, DC-Chrominance, AC-Chrominance. This should have absolutely no impact on the image.
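To see the order in which a particular file stores its Huffman tables, one can walk the marker segments and read each DHT segment. The following is a minimal sketch: the function name is hypothetical, it only looks at tables that appear before the first scan, and it does no validation beyond the basics.

```python
import struct


def list_dht_tables(data: bytes) -> list[tuple[str, int, int]]:
    """Return (class, destination ID, number of code words) per table, in file order."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    tables = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # SOS or EOI: stop before the scan data
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos + 4:pos + 2 + length]
        if marker == 0xC4:  # DHT; one segment may hold several tables
            offset = 0
            while offset + 17 <= len(segment):
                table_class = "AC" if segment[offset] >> 4 else "DC"
                dest_id = segment[offset] & 0x0F
                # 16 bytes give the number of codes of each length 1..16.
                num_codes = sum(segment[offset + 1:offset + 17])
                tables.append((table_class, dest_id, num_codes))
                offset += 17 + num_codes
        pos += 2 + length
    return tables
```

Assuming the common convention that destination ID 0 is used for luminance and ID 1 for chrominance (the file's scan header is what actually assigns tables to components), the traditional ordering would show up as DC 0, DC 1, AC 0, AC 1, and the newer Canon ordering as DC 0, AC 0, DC 1, AC 1. A file could be checked with `list_dht_tables(open("photo.jpg", "rb").read())`, where the filename is a placeholder.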
Standard Huffman Tables
The following tables are provided in the JPEG standard / specification as examples of "typical" huffman tables. They were apparently generated from "the average statistics of a large set of video images with 8-bit precision". Presumably, these will provide a reasonably close approximation to the compression performance of a true custom-optimized table for the majority of photographic content. The advantage of using these tables is that no compute-intensive second-pass analysis would then be required prior to encoding into the JPEG file format.
You will note that the standard Huffman tables provide lookups for all possible code word bytes. In the DC tables, the first nybble is always 0, and the size field can only be from 0 to 11 (0x0 to 0xB). Therefore, code words will run from 0x00 to 0x0B. Code word 0x00 is a special value: a size of zero means that no additional bits follow. For the DC entry this simply means that there is no change in value from the previous block. So, the standard Huffman tables for luminance and chrominance both include all 12 possible DC code words.
For the AC tables, the first nybble encodes the run-length of zeros that precede the non-zero quantized DCT coefficient. This can be from 0 to 15 (0x0 to 0xF). The second nybble in the code word indicates the size in bits of the non-zero coefficient that follows the run of zeros. This can be from 1 to 10 (0x1 to 0xA). Together, this would give 16 x 10 = 160 possible code words. However, there are two additional code words that are used in describing the AC scan entries: 0x00 and 0xF0. 0x00 represents an End of Block (EOB), which indicates that there are no more non-zero AC coefficients in this block, and that the decoder/encoder will move on to the next block. 0xF0 represents a Zero Run Length (ZRL), which is used when there is a run of more than 15 zeros. This code word represents a run of 16 zeros, and will be followed by another code word that is either another ZRL or a normal run + size code word. So, there are in total 162 possible AC code words. The standard Huffman AC tables include all 162.
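The counts of 12 DC and 162 AC code words can be confirmed by enumerating them. This is a small sketch for baseline JPEG with 8-bit precision; the function and variable names are invented for the illustration.

```python
def describe_ac_code_word(code: int) -> str:
    """Human-readable meaning of a run/size byte from an AC Huffman table."""
    run, size = code >> 4, code & 0x0F
    if code == 0x00:
        return "EOB (end of block)"
    if code == 0xF0:
        return "ZRL (zero run length)"
    if not 1 <= size <= 10:
        raise ValueError(f"0x{code:02X} is not a valid baseline AC code word")
    return f"run of {run} zeros, then a {size}-bit coefficient"


# Every valid AC code word: 16 run lengths x 10 sizes, plus EOB and ZRL.
ac_code_words = [(run << 4) | size for run in range(16) for size in range(1, 11)]
ac_code_words += [0x00, 0xF0]
# Every valid DC code word: size categories 0 through 11.
dc_code_words = list(range(0x00, 0x0C))

print(len(dc_code_words), len(ac_code_words))  # 12 162
print(describe_ac_code_word(0x76))
```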
| Standard DC Luminance Huffman Table | |
| Code Length (bits) | Code Words (Size or Category) |
|---|---|
| 2 | 00 |
| 3 | 01 02 03 04 05 |
| 4 | 06 |
| 5 | 07 |
| 6 | 08 |
| 7 | 09 |
| 8 | 0A |
| 9 | 0B |
| Total number of code words in table: 12 | |
| Standard DC Chrominance Huffman Table | |
| Code Length (bits) | Code Words (Size or Category) |
|---|---|
| 2 | 00 01 02 |
| 3 | 03 |
| 4 | 04 |
| 5 | 05 |
| 6 | 06 |
| 7 | 07 |
| 8 | 08 |
| 9 | 09 |
| 10 | 0A |
| 11 | 0B |
| Total number of code words in table: 12 | |
| Standard AC Luminance Huffman Table | |
| Code Length (bits) | Code Words (Run / Size) |
|---|---|
| 2 | 01 02 |
| 3 | 03 |
| 4 | 00 04 11 |
| 5 | 05 12 21 |
| 6 | 31 41 |
| 7 | 06 13 51 61 |
| 8 | 07 22 71 |
| 9 | 14 32 81 91 A1 |
| 10 | 08 23 42 B1 C1 |
| 11 | 15 52 D1 F0 |
| 12 | 24 33 62 72 |
| 15 | 82 |
| 16 | 09 0A 16 17 18 19 1A 25 26 27 28 29 2A 34 35 36 37 38 39 3A 43 44 45 46 47 48 49 4A 53 54 55 56 57 58 59 5A 63 64 65 66 67 68 69 6A 73 74 75 76 77 78 79 7A 83 84 85 86 87 88 89 8A 92 93 94 95 96 97 98 99 9A A2 A3 A4 A5 A6 A7 A8 A9 AA B2 B3 B4 B5 B6 B7 B8 B9 BA C2 C3 C4 C5 C6 C7 C8 C9 CA D2 D3 D4 D5 D6 D7 D8 D9 DA E1 E2 E3 E4 E5 E6 E7 E8 E9 EA F1 F2 F3 F4 F5 F6 F7 F8 F9 FA |
| Total number of code words in table: 162 | |
| Standard AC Chrominance Huffman Table | |
| Code Length (bits) | Code Words (Run / Size) |
|---|---|
| 2 | 00 01 |
| 3 | 02 |
| 4 | 03 11 |
| 5 | 04 05 21 31 |
| 6 | 06 12 41 51 |
| 7 | 07 61 71 |
| 8 | 13 22 32 81 |
| 9 | 08 14 42 91 A1 B1 C1 |
| 10 | 09 23 33 52 F0 |
| 11 | 15 62 72 D1 |
| 12 | 0A 16 24 34 |
| 14 | E1 |
| 15 | 25 F1 |
| 16 | 17 18 19 1A 26 27 28 29 2A 35 36 37 38 39 3A 43 44 45 46 47 48 49 4A 53 54 55 56 57 58 59 5A 63 64 65 66 67 68 69 6A 73 74 75 76 77 78 79 7A 82 83 84 85 86 87 88 89 8A 92 93 94 95 96 97 98 99 9A A2 A3 A4 A5 A6 A7 A8 A9 AA B2 B3 B4 B5 B6 B7 B8 B9 BA C2 C3 C4 C5 C6 C7 C8 C9 CA D2 D3 D4 D5 D6 D7 D8 D9 DA E2 E3 E4 E5 E6 E7 E8 E9 EA F2 F3 F4 F5 F6 F7 F8 F9 FA |
| Total number of code words in table: 162 | |
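The tables above list only the length of each code, not the bit-strings themselves. A decoder reconstructs the actual bit-strings from the lengths by assigning consecutive binary values within each length and appending a zero bit whenever the length increases. The following sketch applies that rule to the standard DC luminance table; the function name is hypothetical, and the code is an illustration of the idea rather than a copy of any decoder's implementation.

```python
def canonical_codes(symbols_by_length: dict[int, list[int]]) -> dict[int, str]:
    """Assign bit-strings given {code length: [code words in table order]}."""
    codes = {}
    code = 0
    for length in range(1, 17):
        for symbol in symbols_by_length.get(length, []):
            codes[symbol] = format(code, f"0{length}b")
            code += 1
        code <<= 1  # moving to the next length appends a 0 bit
    return codes


# Standard DC luminance table, transcribed from the table above.
standard_dc_luminance = {
    2: [0x00],
    3: [0x01, 0x02, 0x03, 0x04, 0x05],
    4: [0x06],
    5: [0x07],
    6: [0x08],
    7: [0x09],
    8: [0x0A],
    9: [0x0B],
}

dc_codes = canonical_codes(standard_dc_luminance)
for symbol, bits in dc_codes.items():
    print(f"{symbol:02X} -> {bits}")
```

For example, code word 00 becomes the bit-string 00, code words 01 through 05 become 010 through 110, and 0B becomes 111111110.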
Optimized Huffman Tables
In contrast to the above tables that are provided in the ITU-T standard, the following is an example of the huffman tables used by Photoshop CS2 when optimization is enabled for an image from a 6 megapixel dSLR.
When comparing this to the non-optimized tables, you'll note a few differences. Most importantly: fewer codewords are available. Not all combinations of runs and AC values are used. A few observations you might notice:
- For the DC Chrominance table: there are no 0A or 0B code words. This indicates that the source photograph did not have any two adjacent blocks with extreme 10 or 11-bit swings in average color value.
- For the DC Luminance table, the standard table assigns the 00 code word (meaning no change in DC value from the previous block) a 2-bit code. While this may be optimum for encoding images with wide areas of constant luminance (e.g. a perfect white background), this characteristic is not prevalent in the example photo (shown at the top of this page). Instead, the 00 code word is assigned a 5-bit string.
Overall, there are fewer entries in the tables, and on average the entries consume fewer bits to encode. Therefore, the end result is a more efficient representation of the code words in the final JPEG file (meaning a smaller file size). The following tables were derived from a recoding of the JPEG photo (from above) by using jpegtran's lossless optimization method. This method will preserve the original image content and quantization tables, allowing us to isolate the effects of huffman table optimization.
You will note that the tables (besides the DC luminance table) don't include all possible code words. This is because the example photo didn't need to use these run+size combinations, and so they could be eliminated. By eliminating these codes, other code words could occupy shorter variable-length codes, leading to a decreased file size.
| Optimized example of DC Luminance Huffman Table | |
| Code Length (bits) | Code Words (Size or Category) |
|---|---|
| 2 | 03 04 |
| 3 | 02 05 |
| 4 | 01 06 07 |
| 5 | 00 |
| 6 | 08 |
| 7 | 09 |
| 8 | 0A |
| 9 | 0B |
| Total number of code words in table: 12 out of a possible 12 | |
| Optimized example of DC Chrominance Huffman Table | |
| Code Length (bits) | Code Words (Size or Category) |
|---|---|
| 2 | 02 03 04 |
| 3 | 01 |
| 4 | 05 |
| 5 | 00 |
| 6 | 06 |
| 7 | 07 |
| 8 | 08 |
| 9 | 09 |
| Total number of code words in table: 10 out of a possible 12 | |
| Optimized example of AC Luminance Huffman Table | |
| Code Length (bits) | Code Words (Run / Size) |
|---|---|
| 2 | 01 |
| 3 | 11 02 03 |
| 4 | 00 21 |
| 5 | 31 41 12 04 |
| 6 | 51 61 71 05 |
| 7 | 81 22 13 06 |
| 8 | 91 A1 B1 07 |
| 9 | F0 C1 D1 32 14 |
| 10 | E1 F1 42 23 |
| 11 | 52 08 |
| 12 | 15 |
| 13 | 62 33 24 |
| 15 | 09 |
| 16 | 72 16 82 43 17 92 A2 53 34 25 0A B2 C2 D2 63 44 35 18 F2 73 26 |
| Total number of code words in table: 63 out of a possible 162 | |
| Optimized example of AC Chrominance Huffman Table | |
| Code Length (bits) | Code Words (Run / Size) |
|---|---|
| 2 | 00 01 |
| 3 | 11 02 |
| 4 | 31 03 |
| 5 | 21 12 |
| 6 | 41 51 |
| 7 | 13 |
| 8 | 61 22 32 04 |
| 9 | 71 |
| 10 | 05 |
| 11 | F0 81 91 A1 B1 42 14 |
| 12 | E1 52 23 06 |
| 13 | C1 D1 |
| 14 | F1 |
| 15 | 62 33 |
| 16 | 15 43 07 72 24 16 82 |
| Total number of code words in table: 44 out of a possible 162 | |
Reader's Comments:
If you run an optimized JPEG through JPEGsnoop, you can report out the frequency distribution of the different huffman codes. This will show you how well the optimization worked. Note that you'll need to compare all DHT tables at once (as it is the combination of all of these that dictates overall compression performance).
For example, I ran a standard JPEG photo through jpegtran -optimize and get the following results (trimmed):
Compression stats: Compression Ratio: 10.84:1, Bits per pixel: 2.21:1
Huffman code histogram stats (number of codes of each length, per Huffman table):
| Code length (bits) | Dest ID: 0, Class: DC | Dest ID: 1, Class: DC | Dest ID: 0, Class: AC | Dest ID: 1, Class: AC |
|---|---|---|---|---|
| 01 | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| 02 | 27069 (22%) | 88201 (71%) | 1176666 (47%) | 320855 (53%) |
| 03 | 47782 (38%) | 15300 (12%) | 252266 (10%) | 143907 (24%) |
| 04 | 42360 (34%) | 11205 (9%) | 355289 (14%) | 62023 (10%) |
| 05 | 5400 (4%) | 7626 (6%) | 381904 (15%) | 33520 (6%) |
| 06 | 2122 (2%) | 2381 (2%) | 173519 (7%) | 19568 (3%) |
| 07 | ... | ... | ... | 4956 (1%) |
| 08 | ... | ... | ... | 11638 (2%) |
| 09 | ... | ... | ... | 3126 (1%) |
| ... | ... | ... | ... | ... |
You'll see that in most cases the optimization has produced a selection of shorter code symbols, which will produce smaller overall files. For more details in how the huffman tree is generated, have a look at the htest_one_block function call in IJG's jchuff.c.
The paragraph above lists the same order, not different order.