
Designing a JPEG Decoder & Source Code

This page is intended to provide some further details about the process of decoding JPEG images, based on my experience in writing the JPEGsnoop application. Source code examples are provided in C++.

Chroma Subsampling and the Huffman Scan data

Decoding the Scan Data segment (Marker: JFIF SOS, 0xFFDA) becomes a little more complicated when you need to deal with chroma subsampling. Since nearly all JPEG images are compressed with chroma subsampling, it is important that the decoder support the most common types of subsampling:

Subsampling HxV | Description | Comments
1x1 | No subsampling | Very few digital cameras (Sigma SD is a notable exception) output images without subsampling the color channels. Instead, you'll generally find these files generated by image editing applications when a high quality setting was used.
2x1 | Horizontally subsampled | The majority of reasonable-quality digital cameras produce JPEG images with 2x1.
1x2 | Vertically subsampled | Typically from 2x1 subsampled images that have been rotated by 90 degrees.
2x2 | Horizontally & vertically subsampled | Cheaper digital cameras and image editors saving photos at low quality settings.
Others | | Not very common at all for digital photos. Other sources such as miniDV camcorders will use 4x1 chroma subsampling.

 

In each case, the encoding sequence always completes the luminance blocks (one or more) of an MCU before going on to the chrominance blocks. Rather than encoding half or a quarter as many chrominance blocks, the scan contains two or four times as many luminance blocks per MCU.

Coded Sequence for 1x1 (4:4:4)

The following 3 components make up one MCU, and are repeated in this sequence for every MCU. Note that the MCU represents a pixel area of 8x8 pixels.

[Y0(dc), Y0(ac)],
[Cb(dc), Cb(ac)],
[Cr(dc), Cr(ac)]

Coded Sequence for 2x1 (4:2:2)

The following 4 blocks make up one MCU. The MCU represents a pixel area of 16x8 pixels. Y0 and Y1 refer to two 8x8 pixel regions adjacent horizontally, while Cb and Cr refer to the single 16x8 pixel region that covers Y0 and Y1. To perform the encoding process, the Cb and Cr channels are first subsampled horizontally (this is usually done by averaging each pair of horizontal pixels into a single value). Each chrominance block is then made up of 8 averaged pixels horizontally by 8 original pixels vertically. The DC and AC coefficients are calculated on this new 8x8 region.

[Y0(dc), Y0(ac)],
[Y1(dc), Y1(ac)],
[Cb(dc), Cb(ac)],
[Cr(dc), Cr(ac)]
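
As a rough illustration of the horizontal averaging described above, the following sketch halves the width of one row of chrominance samples. The function name downsample_row_2x1 and the round-to-nearest choice are assumptions made for this example; real encoders may use different filters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from JPEGsnoop): average each horizontal pair
// of chroma samples into one value. Assumes the row width is even; pad
// the row beforehand if it is not.
std::vector<std::uint8_t> downsample_row_2x1(const std::vector<std::uint8_t>& row) {
    assert(row.size() % 2 == 0);
    std::vector<std::uint8_t> out(row.size() / 2);
    for (std::size_t i = 0; i < out.size(); ++i) {
        // Sum in int to avoid 8-bit overflow; the +1 rounds to nearest.
        int sum = row[2 * i] + row[2 * i + 1];
        out[i] = static_cast<std::uint8_t>((sum + 1) / 2);
    }
    return out;
}
```

Applying this to each of the 8 rows of a 16x8 chrominance region yields the 8x8 block that is then passed to the DCT.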


Coded Sequence for 2x2 (4:2:0)

The following 6 blocks make up one MCU. The MCU represents a pixel area of 16x16 pixels. Y00 and Y10 refer to two 8x8 pixel regions adjacent horizontally, while Y01 and Y11 refer to the two 8x8 pixel regions below Y00 and Y10. Cb and Cr refer to a single 16x16 pixel region that covers all four luminance blocks (Y00, Y10, Y01 and Y11). To perform the encoding process, the Cb and Cr channels are first subsampled horizontally and vertically (this is usually done by averaging each 2x2 group of four pixels into a single value). Each chrominance block is then made up of 8 averaged pixels horizontally by 8 averaged pixels vertically (which are based on the original region of 16x16 pixels). The DC and AC coefficients are calculated on this new 8x8 region.

[Y00(dc), Y00(ac)],
[Y10(dc), Y10(ac)],
[Y01(dc), Y01(ac)],
[Y11(dc), Y11(ac)],
[Cb(dc), Cb(ac)],
[Cr(dc), Cr(ac)]
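
The three coded sequences above follow one pattern, which can be sketched as a small helper. The function name mcu_block_order is hypothetical, and the sketch assumes the chrominance components use 1x1 sampling factors, as in the cases listed above. It labels luminance blocks as Y<column><row>, so the 2x1 case comes out as Y00, Y10 rather than Y0, Y1.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: list the 8x8 blocks of one MCU in coded order,
// given the luminance sampling factors h (horizontal) and v (vertical).
// Assumes Cb and Cr are each sampled at 1x1.
std::vector<std::string> mcu_block_order(int h, int v) {
    assert(h >= 1 && v >= 1);
    std::vector<std::string> order;
    // Luminance blocks come first: left to right, then top to bottom.
    for (int row = 0; row < v; ++row) {
        for (int col = 0; col < h; ++col) {
            order.push_back("Y" + std::to_string(col) + std::to_string(row));
        }
    }
    // One block per chrominance component follows.
    order.push_back("Cb");
    order.push_back("Cr");
    return order;
}
```

For h = 2, v = 2 this produces Y00, Y10, Y01, Y11, Cb, Cr, and the MCU covers 8*h by 8*v pixels (16x16).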

Variable-length Huffman Codes

Reading DHT Tables source code

Interpreting the DHT table in the JPEG JFIF file stream is not as intuitive as it may appear. It was designed to be stored in the most space-efficient manner, which makes reading it much harder. On paper, it is a fairly simple process to fill out the binary tree with the Huffman codes as they appear in the DHT marker segment, but this doesn't translate well to code.

Instead, observe that the codes within the same bit length increment by one. Moving to the next bit length requires the next code value to be doubled (shifted left by one bit) before continuing to count by one. This pattern is much more apparent when looking at the binary values that the codes correspond to.

The following example source code is completely unoptimized, as it is intended to show the basic mechanism of reading the DHT table and decoding the variable-length JPEG Huffman codes. It is based upon my experiences in developing the JPEGsnoop JPEG decoder from scratch.

Note that a two-dimensional array is used here to retain all of the variable-length codes. The first array index references the DHT table that we are processing (e.g. AC, DC, Luminance, Chrominance, etc.). The second index covers all of the possible code strings (max 162).

Locate the DHT Marker

repeat {
  // High nibble = table class (0 = DC, 1 = AC), low nibble = destination ID
  class & dest_id = <Read Byte>

  table_set = dest_id*2 + class // Arbitrary mapping
  table_ind = 0
  dht_ind = 0

  Reset DhtCodeList[0..255] = 0xFFFF (dummy termination)
  DhtCodesLen[1..16] = <Read Bytes (x16)>

  // Read the code values, grouped by increasing bit length
  for (ind_len=1; ind_len <= 16; ind_len++) {
    for (ind_code=0; ind_code < DhtCodesLen[ind_len]; ind_code++) {
      DhtCodeList[dht_ind++] = <Read Byte>
    }
  }

  // Generate variable-length binary bit strings
  code_val = 0;
  dht_ind = 0;

  for (bit_len=1; bit_len <= 16; bit_len++) {
    for (code_ind=1; code_ind <= DhtCodesLen[bit_len]; code_ind++) {

      tmp_mask = ( (1 << (bit_len))-1 ) << (32-bit_len);
      tmp_bits = code_val << (32-bit_len);
      tmp_code = DhtCodeList[dht_ind];

      dht_lookup_bitlen [table_set][table_ind] = bit_len;
      dht_lookup_bits   [table_set][table_ind] = tmp_bits;
      dht_lookup_mask   [table_set][table_ind] = tmp_mask;
      dht_lookup_code   [table_set][table_ind] = tmp_code;

      table_ind++;
      code_val++;
      dht_ind++;
    }
    code_val <<= 1;
  }

  dht_lookup_size[table_set] = table_ind;

} until all DHT tables processed
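
For readers who prefer compilable code, the code-generation step above can be sketched in C++ roughly as follows. This is a minimal sketch for a single table, not the JPEGsnoop source: the HuffEntry struct and the build_huff_table function are names invented for this example, and the counts and symbols are assumed to have already been read from the DHT segment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record mirroring the dht_lookup_* arrays for one code.
struct HuffEntry {
    std::uint32_t bits;   // code bits, left-aligned to bit 31
    std::uint32_t mask;   // 1s over the bit positions that hold the code
    std::uint8_t  bitlen; // code length in bits (1..16)
    std::uint8_t  code;   // 8-bit value this bit string decodes to
};

// counts[0..15] = number of codes of length 1..16 (as stored in DHT);
// symbols       = the code values in the order they appear in DHT.
std::vector<HuffEntry> build_huff_table(const std::uint8_t counts[16],
                                        const std::vector<std::uint8_t>& symbols) {
    std::vector<HuffEntry> table;
    std::uint32_t code_val = 0;
    std::size_t sym_ind = 0;
    for (int bit_len = 1; bit_len <= 16; ++bit_len) {
        for (int n = 0; n < counts[bit_len - 1]; ++n) {
            assert(sym_ind < symbols.size());
            HuffEntry e;
            e.mask   = ((1u << bit_len) - 1u) << (32 - bit_len);
            e.bits   = code_val << (32 - bit_len);
            e.bitlen = static_cast<std::uint8_t>(bit_len);
            e.code   = symbols[sym_ind++];
            table.push_back(e);
            ++code_val;      // codes of the same length count up by one
        }
        code_val <<= 1;      // moving to the next length doubles the code
    }
    return table;
}
```

As a small check of the pattern: with no 1-bit codes, two 2-bit codes and one 3-bit code, the generated bit strings are 00, 01 and 100.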

Once the variable-length codes have been stored in structures that are easy to search, locating them in the bitstream is straightforward:

dht_lookup_mask[][0..# codes] = Left-most bit mask for bitstring compare
dht_lookup_bits[][0..# codes] = Left-most bits for bitstring compare
dht_lookup_code[][0..# codes] = The 8-bit code value corresponding to this bitstring
dht_lookup_bitlen[][0..# codes] = The number of bits in the variable-length code

Searching the DHT Huffman Codes source code

The simplest way to locate a variable-length code is to search linearly through the list of codes using the bitmask, looking for a match:

scan_buff = <Next unused bits from file (at least 16), left-aligned to bit 31>

// Slow search for variable-length huffman code
ind = 0;
bits_used = 0;
done = false;
found = false;
while (!done) {
  // Does the bit string match?
  if ((scan_buff & dht_lookup_mask[set][ind]) == dht_lookup_bits[set][ind]) {
    code = dht_lookup_code[set][ind];
    bits_used += dht_lookup_bitlen[set][ind];
    found = true;
    done = true;
  }
  ind++;

  // Just in case no match was found, check for end of array
  if (ind >= dht_lookup_size[set]) {
    done = true;
  }
}

After the above, bits_used indicates how long the variable-length code portion is, which can be used to shift the scan_buff for the next bit sequence.
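
The same linear search can be written as a small C++ function. This is only a sketch under assumptions: the HuffCode struct and find_code function are names made up for this example (they stand in for one row of the dht_lookup_* arrays), and refilling scan_buff from the file is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record: one row of the dht_lookup_* arrays.
struct HuffCode {
    std::uint32_t bits;   // code bits, left-aligned to bit 31
    std::uint32_t mask;   // which of the top bits take part in the compare
    std::uint8_t  bitlen; // code length in bits
    std::uint8_t  code;   // decoded 8-bit value
};

// Linear search: returns true and fills code / bits_used when the top
// bits of scan_buff match an entry, false when no entry matches.
bool find_code(const std::vector<HuffCode>& table, std::uint32_t scan_buff,
               std::uint8_t& code, int& bits_used) {
    for (const HuffCode& e : table) {
        if ((scan_buff & e.mask) == e.bits) {
            code = e.code;
            bits_used = e.bitlen;
            return true;
        }
    }
    return false;
}
```

After a successful match, the caller would shift the buffer with scan_buff <<= bits_used and top it up with fresh bits from the file. Because Huffman codes are prefix-free, at most one entry can match, so the order of the search does not affect the result.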

JPEG Decoder Source Code

The following are some free resources that provide full source code implementations of other JPEG decoders. Several of these implementations take very different approaches, some simpler than others. Generally, the less refined / optimized versions are easier to understand.

  • IJG 6b - The most widely-used JPEG Encoder / Decoder library, written in C. Fairly complex source code as it has been optimized heavily and is designed to work with a wide range of exception cases. Includes sample command-line applications (such as cjpeg, djpeg and jpegtran).
  • Smaller Animals - Visual C++ code that demonstrates a simple application making use of the IJG 6a JPEG decoder libraries.
  • Cristi Cuturicu - Very simple decoder source in C++. Some ASM optimization. Limited comments.
  • Rich Geldreich - JPEG decoder. Includes MMX optimizations, progressive images. Limited comments.
  • Cornell University - Lossless JPEG only
  • Stanford University - Designed for research and experimentation (not optimized for production). Supports Lossless JPEG.
  • JPEGsnoop (Calvin Hass) - JPEG decoder and Windows application. Fresh ground-up decoder library in Visual C++ with lots of comments :). Full source code not yet available.

Other Sections Coming soon:

JPEG JFIF Markers Parsing

Hex Marker | Marker Name | Description | Comments
0xFFC0 | SOF0 | Start of Frame 0 | Baseline DCT
0xFFC1 | SOF1 | Start of Frame 1 | Extended Sequential DCT
0xFFC2 | SOF2 | Start of Frame 2 | Progressive DCT
0xFFC3 | SOF3 | Start of Frame 3 | Lossless (sequential)
0xFFC4 | DHT | Define Huffman Table |
0xFFC5 | SOF5 | Start of Frame 5 | Differential sequential DCT
0xFFC6 | SOF6 | Start of Frame 6 | Differential progressive DCT
0xFFC7 | SOF7 | Start of Frame 7 | Differential lossless (sequential)
0xFFC8 | JPG | JPEG Extensions |
0xFFC9 | SOF9 | Start of Frame 9 | Extended sequential DCT, Arithmetic coding
0xFFCA | SOF10 | Start of Frame 10 | Progressive DCT, Arithmetic coding
0xFFCB | SOF11 | Start of Frame 11 | Lossless (sequential), Arithmetic coding
0xFFCC | DAC | Define Arithmetic Coding |
0xFFCD | SOF13 | Start of Frame 13 | Differential sequential DCT, Arithmetic coding
0xFFCE | SOF14 | Start of Frame 14 | Differential progressive DCT, Arithmetic coding
0xFFCF | SOF15 | Start of Frame 15 | Differential lossless (sequential), Arithmetic coding
0xFFD0 | RST0 | Restart Marker 0 |
0xFFD1 | RST1 | Restart Marker 1 |
0xFFD2 | RST2 | Restart Marker 2 |
0xFFD3 | RST3 | Restart Marker 3 |
0xFFD4 | RST4 | Restart Marker 4 |
0xFFD5 | RST5 | Restart Marker 5 |
0xFFD6 | RST6 | Restart Marker 6 |
0xFFD7 | RST7 | Restart Marker 7 |
0xFFD8 | SOI | Start of Image |
0xFFD9 | EOI | End of Image |
0xFFDA | SOS | Start of Scan |
0xFFDB | DQT | Define Quantization Table |
0xFFDC | DNL | Define Number of Lines | (Not common)
0xFFDD | DRI | Define Restart Interval |
0xFFDE | DHP | Define Hierarchical Progression | (Not common)
0xFFDF | EXP | Expand Reference Component | (Not common)
0xFFE0 | APP0 | Application Segment 0 | JFIF - JFIF JPEG image; AVI1 - Motion JPEG (MJPG)
0xFFE1 | APP1 | Application Segment 1 | EXIF Metadata, TIFF IFD format; JPEG Thumbnail (160x120); Adobe XMP
0xFFE2 | APP2 | Application Segment 2 | ICC color profile; FlashPix
0xFFE3 | APP3 | Application Segment 3 | (Not common) JPS Tag for Stereoscopic JPEG images
0xFFE4 | APP4 | Application Segment 4 | (Not common)
0xFFE5 | APP5 | Application Segment 5 | (Not common)
0xFFE6 | APP6 | Application Segment 6 | (Not common) NITF Lossless profile
0xFFE7 | APP7 | Application Segment 7 | (Not common)
0xFFE8 | APP8 | Application Segment 8 | (Not common)
0xFFE9 | APP9 | Application Segment 9 | (Not common)
0xFFEA | APP10 | Application Segment 10 | (Not common) PhoTags; ActiveObject (multimedia messages / captions)
0xFFEB | APP11 | Application Segment 11 | (Not common) HELIOS JPEG Resources (OPI Postscript)
0xFFEC | APP12 | Application Segment 12 | Picture Info (older digicams); Photoshop Save for Web: Ducky
0xFFED | APP13 | Application Segment 13 | Photoshop Save As: IRB, 8BIM, IPTC
0xFFEE | APP14 | Application Segment 14 | (Not common)
0xFFEF | APP15 | Application Segment 15 | (Not common)
0xFFF0 ... 0xFFF6 | JPG0 ... JPG6 | JPEG Extension 0 ... JPEG Extension 6 | (Not common)
0xFFF7 | JPG7 / SOF48 | JPEG Extension 7 / JPEG-LS | Lossless JPEG
0xFFF8 | JPG8 / LSE | JPEG Extension 8 / JPEG-LS Extension | Lossless JPEG Extension Parameters
0xFFF9 | JPG9 | JPEG Extension 9 | (Not common)
0xFFFA | JPG10 | JPEG Extension 10 | (Not common)
0xFFFB | JPG11 | JPEG Extension 11 | (Not common)
0xFFFC | JPG12 | JPEG Extension 12 | (Not common)
0xFFFD | JPG13 | JPEG Extension 13 | (Not common)
0xFFFE | COM | Comment |
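
As a small illustration of how the marker table might be used while walking a file, the sketch below maps the byte that follows 0xFF to its marker name. The function names marker_name and is_standalone are hypothetical and chosen for this example only. The list of standalone markers (those with no length field) is limited to RSTn, SOI and EOI here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: name the marker whose second byte is m
// (the first byte is always 0xFF).
std::string marker_name(std::uint8_t m) {
    // Three values inside the SOF range are not frame markers.
    if (m == 0xC4) return "DHT";
    if (m == 0xC8) return "JPG";
    if (m == 0xCC) return "DAC";
    if (m >= 0xC0 && m <= 0xCF) return "SOF" + std::to_string(m - 0xC0);
    if (m >= 0xD0 && m <= 0xD7) return "RST" + std::to_string(m - 0xD0);
    switch (m) {
        case 0xD8: return "SOI";
        case 0xD9: return "EOI";
        case 0xDA: return "SOS";
        case 0xDB: return "DQT";
        case 0xDC: return "DNL";
        case 0xDD: return "DRI";
        case 0xDE: return "DHP";
        case 0xDF: return "EXP";
        case 0xFE: return "COM";
        default: break;
    }
    if (m >= 0xE0 && m <= 0xEF) return "APP" + std::to_string(m - 0xE0);
    if (m >= 0xF0 && m <= 0xFD) return "JPG" + std::to_string(m - 0xF0);
    return "unknown";
}

// RSTn, SOI and EOI carry no length field or payload of their own.
bool is_standalone(std::uint8_t m) {
    return m >= 0xD0 && m <= 0xD9;
}
```

A segment parser could call is_standalone() to decide whether to read the two-byte length that follows every other marker in the table.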

 

Counting AC Huffman Codes

Makernotes

YCC to RGB Color Conversion

Restart Markers

Variable Length Decode Optimization

 

 


Reader's Comments:

2011-02-01 Zachary
 Hi, I'm in grad school for photography and am interested in doing some work editing the source code of a JPEG. The problem is, I know nothing about programming. All that I really want is to be able to access the code of an image, switch some things around and then turn the code back into an image. Is there any software that will allow me to do this? I am not concerned with the final image looking correct. Really, I basically just want to cause errors.
Thanks,
Zachary
2011-01-24 roopa
 Hello sir

Thanks a lot for give me the information about application marker used in jpeg files.

Now m working on Jpeg file repair and recovery software. For that I want to know that how we can repair a file with the help of demaged file .Please tell me the step which are necessry to create a new file with the help of another.I searched a lot and i know the detail about all markers of jpeg file . Please help me sir



Thanks a lot

Roopa saini
2011-01-11 roopa
 please mention the application marker format also from app0.........app15 and different application marker images in photo gallery .

Thanks
 I have put some definitions of the APP markers in the table on this page (moved your comment). Hope that helps!
2010-11-19 Ian Ollmann
 Chintu wrote @ 2008-07-03:
> Can u tell me why some images contain more than FFD8 and
> FFD9 markers( I found some jpegs which has more than
> 5).Can any form of data or any other information be equivalent
> to FFD8 or FFD9 or may be any other marker like Restart markers
> if searched when image viewed in Hex editor.
> If so then how to distinguish b/w correct markers and and other
> data information equivalent to Markers( FFDA,FFD8,FFD9,FFDD..)

Restart markers only have codes between 0xffd0 and 0xffd7. Values larger than that are not restart markers. ffd8 is start of image and ffd9 is end of image. As to why there are so many restart markers, if you look at the spec it just cycles between them in a modulo fashion (Fig. E3). That being the case, if there is a data loss that spans several restart markers, you should still be able to reconstruct the remaining data so long as no more than seven restart markers worth of data was lost.
 Thanks Ian! It would have been nice if more digicam manufacturers had standardized on using embedded restart markers, especially given the benefits to data recovery.
2010-11-02 Chris Lundquist
 The stuff byte sequence for entropy encoded data is 0xFF00, but is shown in the table as 0xFFFF. Perhaps I misunderstood the table.

Chris Lundquist
 That value should not have appeared in the markers table as it is fully handled by the encoded data segment (as 0xFF00, like you pointed out). I have removed this line. Thanks!
2010-10-16 kitsemen
 thanks, good article
2010-09-30 spcmicro
 Please let me know when you get back and send me your e-mail address so that I can send you a couple of sample files. I really appreciate your taking a look at this, Calvin.
 Email sent.
2010-09-23 spcmicro
 Hello,

Excellent source of information on this site. The JPEG format is not very easy to understand and your site is a big help. I have been working on an application that strips individual JPEG frames from an MPEG stream coming from a web camera. I'm saving each frame as a JPEG and almost everything looks correct except for the DHT markers. There are 4 tables and they are each marked by the DHT marker. I've seen other files with 4 Huffman tables defined by a single marker. I looked at the specification and it doesn't say you can't do that so I think that's OK. One thing I did find is that the lengths for the 2 AC Huffman tables seem off by -12 bytes. JPEGsnoop fails on the first AC table with the error: Expected marker 0xFF, got E9 @ offset 0x000000176. I tried patching the lengths by adding +12 and JPEGsnoop really doesn't like that. It gets by the bad table now but dumps a lot of errors that I think are self inflicted. I used another tool, jpegdump and it just looks for markers and finds everything in the file. It evidently doesn't use the lengths and just looks for markers. I have looked at many JPEG files I've been saving and they look correct except for the lengths. I would love to have someone look at a sample file for me. I can send a sample file if anyone is interested. Would anyone be willing to take a peek for me?

Thanks in advance...
 Yes... the standard allows you to combine multiple Huffman tables into a single DHT entry or repeat the DHT markers for individual tables. The length includes the header overhead, so this could be part of the difference.

I am currently in India, but could take a look at an example file when I return.
2010-09-09 br
 Hi,

I am working on a jpeg encoder in matlab and was experimenting with quantization tables set to all 1's and no sub sampling.
Unfortunately, the quality of the output image wasn't as good as i expected. There were no pixelations as such but the contrast was low compared to the original bmp.

Trying to figure out the problem, i opened the output jpeg file in jpegsnoop and recorded the MCU (0,0) for the jpeg image as
Y(:,:,1) =[...
 -471   -19     5     0   -15     1    -2     3
  -51    37   -41    -5    11     0     3    -5
  -25   -10    42     5    -8     5    -8     5
   11     8   -14     5     6   -12    11     1
   -9   -11    15     8    -1    13    -8     7
  -15     5   -21    -1     6    -4     3   -12
   12    -3    14     1     2     4     2    12
   -6     1   -14    -2     7     4    -4   -11]

Cb(:,:,2)=[ ...
 -165     6     2    -1     5    -2     1     0
   14   -10    11     5    -3    -1     0     2
   12     0   -13    -4     3    -1     1    -3
   -8     0     3    -2    -2     2    -2    -2
    6     2    -3    -2    -1    -3     2     1
    2     4     2    -2    -1    -1     0     2
   -1    -1    -2     1    -2    -4     0    -1
    3     1     2     0    -2     1     0     0]

Cr(:,:,3) =[ ...                   
  215    -2    -3     2    -4    -1     0    -1
  -13     7   -11    -5     3     1    -1     0
  -16     4    11     2    -2    -3     1     0
    5    -2    -1     2     3     0     1     3
   -5     0     2    -1     1     1     0    -1
   -3    -4    -1     3     2    -1    -1    -2
   -1     3    -2    -1     0     2     1     1
   -2    -2    -1     1     3    -1     0     0]
Reverse calculating the original RGB values i got RGB(1,1) =89,41,16
Logic i used is -
Inverse DCT
Add 128
Round off
Convert YCbCr to RGB.

But on opening that output jpeg file in matlab/photoshop, i see that it has decoded the values to RGB(1,1) = 93,51,29.

Really confused here, can you please help me out on why i am seeing this difference.
2010-06-13 subhajit chatterjee
 very good paper..........excited..........and thank u.it really helps in my project work.if u know anything abot how to extract pixel value (RGB)from jpeg file...plzz inform me in my email..waiting for ur knd reply
2010-05-12 Jeff Mather
 Excellent source of information, especially the marker table. Thanks for sharing this.
 Glad it was useful!
2010-05-03 Hi..
 please help me to understand following.

Image resolution is 1280x960,but LCD size on which we display this image has only 128x160.What we do in backgroud to display this big picture on small LCD?
 Most digicams resize for the longest side to fit the LCD display and then use a black background to adapt to the aspect ratio difference.
2010-04-28 edie
 Hi..
I am compiling the memory requirements of jpeg decoder as function of size, # of components and mode of operation. Do you already have an analysis for that? Also, in progressive decoding, as compared to baseline, an extra buffer is required ( huge one, capable of holding an entire frame's worth of dct coefficients). is there any other memory constraint in progressive?
2010-04-13 JM
 Hi,
I'm having problem decoding a jpeg image with no EOI.
Is a EOI (0xffd9) marker always required to signal End of data to the JPEG decoder? Or is it sufficient to check that all output samples have been generated?
 According to the ITU-T standard, the EOI marker must be used to terminate the image. However, a number of decoders are probably robust enough to stop processing once they have decoded the specified number of MCUs.
2010-03-17 CoolA
 Hi,
I am new in image programming. I just land in a job where i have to find the difference between two jpeg files. By difference i means that one of the jpeg file has some extra information stored in it. So i need to find out what extra information stored in it so that i can extract it and put it another jpeg file. If anybody can help me i will really appreciate that. If not, can any tell me how to read the jpeg header and print in a reader format means in a text file.
2010-03-14 Mae
 Interesting... For example, I have a small 8x8 pixel image with 4:2:0 (2:2) subsampling. If four blocks in the MCU are actually representing a 16x16 region, then how can I get my 8x8 image having these four blocks?
 In this case, the decoder will treat the image as having a complete MCU (16x16), but then during display it will only show the top-left portion that is within the range of dimensions. In the past I've referred to this case as a "partial MCU".
2010-03-09 Mae
 I really thank you for your help. Although I'm still a bit confused. Please excuse my incompetence, but I know by now how to convert from YCbCr triplet to RGB. What I don't understand is how FOUR blocks (Y00, Cb, Cr), (Y10, Cb, Cr), (Y01, Cb, Cr) and (Y11, Cb, Cr) from a single MCU finally become this ONE 8x8 pixel image block again. How these 16x16 pixels become one 8x8 pixel block? How are they mapped or truncated to this single block? Or in other words, when you have just one Y, you take Y, Cb, Cr, convert to RGB and that's it - you have your decoded 8x8 pixel image block. But what about when you have Y00, Y10, Y01, Y11, Cb and Cr? What to do with those four Y00, Y10, Y01, Y11?

Could you please provide some sort of an algorithm or other means of understanding. I really appreciate your help. It seems that you really do care. Thank you.
 The key point is that the four blocks in the MCU are actually representing a 16x16 region, not an 8x8 region. For details, please see the JPEG chroma subsampling page.
2010-03-09 Mae
 Please help me... I really need an answer... I'm decoding a small 4:2:0 (2:2) image. I've come to the part where I have decoded the MCU and now I have Y00, Y10, Y01, Y11, Cb and Cr. The question is how exactly should I get a single rgb block out of these four blocks (Y00, Cb, Cr), (Y10, Cb, Cr), (Y01, Cb, Cr) and (Y11, Cb, Cr)?
 You might want to have a look at the JPEG color conversion page.
2010-02-03 stefan
 I wrote my own jpeg dumper to understand the format better. I am right now stuck @ SOS. Length is 12, but there seems to be the actual scan data behind. Is there an easy way to calculate the size? Or does this always span to EOI (ffd9)?
 The SOS header length can be 12 and the actual scan data does follow as you've noticed. However, there is no easy way to precalculate the data length as it depends upon the image content and the efficiency of the huffman coding.
2010-01-30 Wayne
 Hi Calvin, this is a great site with practical information about the JPEG format. I am a little confused though when it comes to stuff bytes.

Does byte stuffing only occur in the huffman coded data (i.e. after the Start of Scan header)? What happens if an FF appears in say the length field of a header or in the header data itself? are these bytes stuffed so as to not appear as JFIF markers or is FF ignored within the known length of the header?

I find the itu-81 standard document to be a little vague on the point. Hope you answer me because I am writing a light weight lossless jpeg transform library for use in java CLDC apps.

Thanks!
 The stuff byte mechanism is only used in the scan data segment. There are a few marker values that can appear within this segment (such as RST), which are denoted by 0xFF. The stuff bytes remove any ambiguity between these in-band markers and Huffman-coded data that happens to produce a byte-aligned run of eight 1 bits (an 0xFF byte). In the other sections of a JFIF file, the format is fairly well constrained, meaning that most fields are of a defined length, and an 0xFF value (in a header field, for example) is not ambiguous.
2009-09-16 Jaime
 I have a question. Do I have to be a programmer to decode an image for jpeg? Please somebody can teach me how can I do that? thanks in advance. JM
 This depends on how you are trying to decode the JPEG image. If you are writing a program, what language are you using? (eg. Visual C++?) In Visual C++ it is very easy to decode and display a JPEG -- there are many examples on the CodeProject website.
2009-07-13 mmahmoud
 thanks for your reply...i finally found the error. it is the huffman table.
it was not because i write it wrongly in the file,nor because that the table is wrong
There are some criteria needed to be satisfied ...i found the error using a software called Bad Peggy.
The error was "Bogus Huffman table definition ". i still did not know why my table is wrong... so i used a standared huffman table which works well.

these are some errors that can appear when reading jpeg image
http://svn.ghostscript.com/ghostscript/tags/jpeg-6b/jerror.h
 Great to hear that you managed to localize the problem to your Huffman table! It is very easy to make a mistake in generating these tables, so it usually makes sense to start with a pre-built table and then optimize later on.
2009-07-12 
 i have made my own encoder,and i write a jpeg file ...the problem is that this file can not be viewed by any viewer although it can be decoded correctly and i make sure that all the coefficients are correct
 There are many reasons why an image may not decode correctly (ie. not viewable), but one would need to take a deeper look at the file structure to figure out why it isn't loading. If you email me the file, I may be able to take a look at some point for you.
2009-05-17 sabala
 How can i convert the RBG data to Ycbcr ?

I have a situation where in i have a image which is in the RGB color space and the image was created using Irfan viewer. However when tried to view though some other editors the color appears to be different like if i have color as Red it will appear as green. I am trying to save an image of 1x1 pixel with red color in Irfan viewer and when tried to open in the windows gallery the color appears to be green. If some one has encountered this problem will it be possible for them share.
 
2009-05-17 Debasis
 Hi ,I was working in JPEG Decoder,I found problem to decoder a particluar image.

In our decoder first we check that whether the stream contains the standard Huffman table or not by comparing no of coeff for a particular length and the corresponding Run/Size value if all matches then we decode its a JPEG stnd table and we decode from the prestored coeffnt.Otherwise we create a huffman tres correnponding to a particular length where we store the code word and the Run/Size.

For this particular image,it is having standard huffman table but the code word is different for a particlur Run/Size and gives corrupted output.

My question is that do we need to compare the code word along with Run/Size to make a decision for a standard JPEG huffman table or not?
 I assume that you are using this particular method as an optimization. However, I don't think this method will work as there is far too much flexibility in how Huffman tables are generated. As you have noted, between different JPEG images, the codeword Run/Size pairs are not always consistent (it depends on the JPEG encoder and whether Huffman table optimization has been used).
2009-04-12 dexders
 How to convert 4:2:0 YUV to RGB?
2009-04-06 helpme
 I want to know how should the output of JPEG decoder look like in memory?
 This really depends on the application. Most often, I would expect that the decoder will store the data in a DIB (Device-Independent Bitmap) array, for use with standard Windows graphics routines for display.
2009-03-30 Hoangvu
 Thank for your article. Can your tell me more detail about how to process the YCbCr data in a MCU. I mean, I have 3 matrixs 8x8 data of Y, Cb and Cr, how can I downsample these matrixs
2009-03-14 Larry Fischer
 I love your site and the program. It's amazing how many photos I've found that I would have sworn were unretouched that weren't.
I am just starting to scan all my pictures from my old photo albums to archive and was wondering if you had any thoughts about which scanners do a decent enough job for such a task and/or which to avoid. I have a Dell All-in-One scanner/printer that seems to do okay. But I'd hate to find out hundreds of scans into this project that it's not a good choice in the long run. Any suggestions or can you suggest a site that covers this topic? Thanks!
 Great question... As the technology has changed considerably over the past I would recommend posting a question on dpreview. There are many serious photo enthusiasts there who will have qualitative experience with various scanner models and can help you decide which to use before starting on your project.

That said, I had begun to do a similar task, scanning in all of my old photo albums. Originally I had looked at scanners with document feeders (eg. one from HP), but I didn't like the way that the photos were spun through the feeder (concerned about damaging / creasing the prints). I have since moved on to use my Canon flatbed scanner (8400) and use the multi-scan mode to capture & crop 3 or 4 4x6 prints at a time.

Good luck!
2009-03-11 mzatanoskas
 Thanks for the quick reply! I promise you I won't be bothering you much longer!

Quick question about this line from JpegSnoop:

[0x00000286.0]: ZRL=[ 1] Val=[ 3] Coef=[02..03] Data=[0x DF AA CA 5F = 0b (1101111- -------- -------- --------)]
What does the "1101111" part mean? I assumed it was where the program had matched one or more codes. That made sense in the previous lines. However this code cannot be split up into any set of smaller codes because of the consecutive 1s, yet I can't find an instance of a "1101111" code in either the Huffman tables I generated or the ones JpegSnoop generated (which are the same as far as I checked). This is what is making me think I'm misunderstanding something...
2009-03-10 mzatanoskas
 Thanks again Calvin, I think I get the gist of it now, but I'll stick to my slow hacky way until I get the rest of the decoding/encoding process down.

So now that I can programatically recreate the huffman tables and have a basic search thing going, I'm back to manually decoding the actual image data so that I can understand what my program will have to do.

Can I check a couple of points with you?

1) The image data must be stripped of its byte stuffings before/during decoding. I think wikipedia says that a byte stuffing is 0xFF00, but in your table you say it is 0xFFFF... could it be both or am I misunderstanding something?

2) If there are 4 Huffman tables and the ffc0 marker says there are 3 image components then Huffman Table 1 = lum DC, Huffman Table 2 = lum AC, Huffman Table 3 = Chrm Cb & Cr DC, Huffman Table 4 = Chrm Cb & Cr AC.

3) The order of the decoding process is:
a) Lum DC - (First code is length of useful bits. Followed by this length). In my example I get: 1110=0x06, so the next 6 bits are: 010101. 010=0x01 and 101=0x04. So the Huffman decoded part of lum dc is 0x06,0x01,0x04.
b) Lum AC - (First code is split into run, size.) I get 100=0x03 so a run of 0, then 3 bits of useful stuff. The next 3 bits are also 100. So final component is 0x03,0x03.
c) Chrm Cb DC - (First code is length of useful bits. Followed by this length)

This is where my decoding breaks down. I get a binary code of 110=0x03. The next 3 bits are 111 which is illegal. When I compare what I've done with JpegSnoop, I notice that JpegSnoop has all the above entries under a Lum table:
Lum (Tbl #0), MCU=[0,0]
[0x00000284.0]: ZRL=[ 0] Val=[ -42] Coef=[00= DC] Data=[0x E5 64 DF AA = 0b (11100101 01------ -------- --------)]
[0x00000285.2]: ZRL=[ 0] Val=[ 4] Coef=[01..01] Data=[0x 64 DF AA CA = 0b (--100100 -------- -------- --------)]
[0x00000286.0]: ZRL=[ 1] Val=[ 3] Coef=[02..03] Data=[0x DF AA CA 5F = 0b (1101111- -------- -------- --------)]

The first two entries seem to match my decoding, but why are they under a Lum table? The third entry gives 1101111. I've tried using all sorts of combinations of different Huffman tables but I can't work out how you could get this result...
I also noticed the ZRL=1 bit but in my Huffman Tables there are only two F0 symbols and the codes for them are completely different to what I have here.

Basically I'm completely lost here and clearly doing stuff wrong!
 You're very close. To answer your questions:

  1. The byte stuffs in the coded bitstream are indeed 0xFF00. I believe there may be allowances for a string of 0xFF's to appear, but don't recall running into this.
  2. This is the usual sequence, but you have to check the "class" codes to ensure that the channel vs AC/DC selection is in the order you expect.
  3. I think the place where you are going wrong is with your Lum AC. You need to keep decoding coefficients until you have either encountered an "EOB" or you have filled all 64 coefficients in the MCU. This must be done before moving on to the Chrominance components.
Hope that helps!
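
To make points 1 and 3 a little more concrete, here is a minimal C++ sketch. It is not the JPEGsnoop source: the function names RemoveByteStuffing and ExtendValue are made up for illustration, and the unstuffing routine assumes the buffer holds only entropy-coded scan data with no restart or other markers in it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Drop the 0x00 byte that the encoder stuffs after every 0xFF data byte.
// Assumption: the input is scan data only (no RSTn or other markers).
std::vector<uint8_t> RemoveByteStuffing(const std::vector<uint8_t>& scan) {
    std::vector<uint8_t> out;
    out.reserve(scan.size());
    for (std::size_t i = 0; i < scan.size(); ++i) {
        out.push_back(scan[i]);
        if (scan[i] == 0xFF && i + 1 < scan.size() && scan[i + 1] == 0x00) {
            ++i;  // skip the stuffed zero byte
        }
    }
    return out;
}

// Convert the "size" value bits that follow a Huffman code into a signed
// coefficient. A leading 1 bit means positive, a leading 0 bit means negative.
int ExtendValue(unsigned bits, unsigned size) {
    assert(size <= 16);
    if (size == 0) return 0;
    if (bits & (1u << (size - 1))) return static_cast<int>(bits);
    return static_cast<int>(bits) - ((1 << size) - 1);
}
```

With the values from the question, ExtendValue(0x15, 6) (bits 010101) gives -42 and ExtendValue(0x4, 3) (bits 100) gives 4, which agrees with the first two JPEGsnoop lines. If the file uses the standard luminance AC table, the 7 bits 1101111 would split into the 5-bit code 11011 (symbol 0x12: run of 1, size 2) followed by the 2 value bits 11, which is consistent with ZRL=1 and Val=3.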
2009-03-06mzatanoskas
 Sorry to pester you Calvin, but it's not so much the Huffman decoding itself I don't understand (I spent a couple of hours the other day drawing out the binary trees!) but it's how you implement it in the code.

Specifically:

tmp_mask = ( (1 << (bit_len))-1 ) << (32-bit_len);
tmp_bits = code_val << (32-bit_len);
tmp_code = DhtCodeList[dht_ind];

These lines seem to be the all important ones of getting the actual binary code for the symbol in question, but I just can't work out how they work!
 Basically, the intent of these lines is to generate each coded bitstring in "tmp_bits", but have it start in the most significant bits (i.e. from bit 31 down). A similar operation produces the mask, which indicates how many of these most significant bits will be compared in the Huffman code search. I use the most significant bits because the "bit buffer" that I use to process the data stream is always shifted so that the next bits to process are aligned with the most significant bit (i.e. bit 31). In the above code, code_val contains the next coded value to save in the table. You'll notice that we generate all codes of one bit length, then multiply the code by 2 (shift left by 1) and repeat the process for the next bit length (the code sequence generating part is a little less intuitive).
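
As a rough illustration of the above, the following C++ sketch builds left-aligned codes and masks from the 16 code-length counts and the symbol list of a DHT segment. Only tmp_mask, tmp_bits, code_val and bit_len come from the snippet quoted in the question; the HuffEntry struct and the BuildHuffTable and FindCode functions are hypothetical names, and the linear search is chosen for clarity rather than speed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct HuffEntry {
    uint32_t bits;    // code, left-aligned so its first bit sits in bit 31
    uint32_t mask;    // ones over the top "len" bits
    uint8_t  len;     // code length in bits (1..16)
    uint8_t  symbol;  // decoded symbol
};

// counts[i] = number of codes of length i+1; symbols are in DHT order.
std::vector<HuffEntry> BuildHuffTable(const uint8_t counts[16],
                                      const std::vector<uint8_t>& symbols) {
    std::vector<HuffEntry> table;
    uint32_t code_val = 0;
    std::size_t sym_ind = 0;
    for (unsigned bit_len = 1; bit_len <= 16; ++bit_len) {
        for (unsigned n = 0; n < counts[bit_len - 1]; ++n) {
            assert(sym_ind < symbols.size());
            uint32_t tmp_mask = ((1u << bit_len) - 1u) << (32 - bit_len);
            uint32_t tmp_bits = code_val << (32 - bit_len);
            table.push_back({tmp_bits, tmp_mask,
                             static_cast<uint8_t>(bit_len), symbols[sym_ind++]});
            ++code_val;      // next code of the same length
        }
        code_val <<= 1;      // moving to the next bit length doubles the code
    }
    return table;
}

// Match the top bits of a 32-bit buffer against every code in the table.
const HuffEntry* FindCode(const std::vector<HuffEntry>& table, uint32_t bit_buffer) {
    for (const HuffEntry& e : table) {
        if ((bit_buffer & e.mask) == e.bits) return &e;
    }
    return nullptr;  // no code matched
}
```

With the standard luminance DC table from the JPEG specification (counts 0,1,5,1,1,1,1,1,1,0,0,0,0,0,0,0 and symbols 0x00 to 0x0B), this produces the 4-bit code 1110 for symbol 0x06. After a match, the decoder would shift the bit buffer left by the matched entry's len before reading the value bits.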
2009-03-05mzatanoskas
 Hi Calvin, I've moved to this page now! Your site truly is the one stop shop for all jpeg decoding purposes!

I hope you don't mind me asking for a bit more help. I've moved on from hex editors (decoding by hand is just too painful) and on to Java now. Unfortunately I'm just too thick to get how you figure out the binary codes for each symbol within your nested for loops. I tried to work it out by myself as well but can only come up with very convoluted methods which are clearly highly inefficient.

I wonder if you could explain that step to me with an example or two?!
 Thanks! I suggest you have a look through the full example on my Huffman Coding Example page.
2009-01-14Rcmaniac25
 Hello again.
As I stated in a previous comment I am working on a web capture system. I got my mjpeg decoder done (thanks for the help on that) but I am getting an image with very distorted color information (when compared to JPEGsnoop's output). I traced my problem to the inverse DCT component used in my decoder. The values I get are either too high or too low. I used the equation that was in the JPEG standard, what equation did you use?
 When I first wrote the decoder, I was basing the algorithm directly on the widely-published IDCT formulae. However, my original attempts also had issues which implied that my overall constant multipliers were incorrect. In all likelihood you'll find that you are missing a division by a factor of 4 or 8, for example. This will depend on the range you are using in the frequency domain (e.g. for fixed point math, etc.)
2008-12-29OOmph
 How do I verify the output of JPEG Decoder (8x8 pixel out) without reconstructing JPEG image? I want to compare the output of hardware JPEG decoder with a software JPEG decoder in 8x8 pixel format (YCBCr).
2008-12-18Rcmaniac25
 Your website is great, I have a question regarding MJPEGs. I am working on a Web cam capture program and I want to include the codecs internally so I don't have to carry around any extra DLLs. I am having trouble finding information on MJPEG decoding. I only know that they are missing the Huffman table, and are very similar to JPGs (guess that's why it has JPG in the name ;). Your JPEGsnoop program helps a lot. Is there anything specific I have to do to decode them or do I just decode them the same way I would a JPG?
 I found a lot of information on the AVI RIFF file format useful. From there, you can generally just force the standard Huffman table. The rest is relatively comparable to a single-frame JPEG decoder.
2008-12-04Josh
 Hey again Calvin,

I appreciate your quick response, and I'm sure I'm missing something, but the tutorial you'd provided only provided an example with a DC encoded value, and leaving all AC values as 0. My problem is that I'm struggling to understand how to decode a DCT with non-zero AC values. The way I understand it, it's not as easy as just adding the AC to the DC and you have the specified pixel. I believe I have to calculate an arccosine on the values somehow, or something to that effect. Can you point me in the right direction, taking it step by step? I'd be very appreciative, thank you.
 You're right... I had forgotten that I only detailed the DC. I will consider writing up a Part 2 to give an AC example as I think many others would also be interested. The AC component decode adds quite a bit more complexity to the process. Essentially once you end up with your 8x8 coefficient matrix, you will need to perform an Inverse Discrete Cosine Transformation (IDCT). The IDCT process will result in another 8x8 matrix that is in fact representing intensity values (per channel) in the spatial domain (ie. no longer a frequency domain representation). Unfortunately, I don't have much time right now to document the process in detail, but thankfully this particular step is fairly well described in many online papers.
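
Until a fuller write-up exists, here is a minimal sketch of the IDCT step in C++. It is a direct (slow) evaluation of the textbook 8x8 formula, not the JPEGsnoop implementation, and Idct8x8 is a hypothetical name. It assumes the 64 coefficients are already dequantized, are stored in row-major order (not zigzag), and that the DC term is the absolute value rather than the coded difference. The 1/4 factor and the 1/sqrt(2) weights on the u=0 and v=0 terms are the constant multipliers that are easy to get wrong; the +128 level shift and the clamp bring the result back to an unsigned 8-bit range.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Direct 8x8 inverse DCT followed by level shift and clamp to 0..255.
// coef:   64 dequantized coefficients, row-major, coef[0] = absolute DC.
// pixels: 64 output samples for one component, row-major.
void Idct8x8(const int coef[64], uint8_t pixels[64]) {
    assert(coef != nullptr && pixels != nullptr);
    const double kPi = 3.14159265358979323846;
    for (int y = 0; y < 8; ++y) {
        for (int x = 0; x < 8; ++x) {
            double sum = 0.0;
            for (int v = 0; v < 8; ++v) {
                for (int u = 0; u < 8; ++u) {
                    double cu = (u == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
                    double cv = (v == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
                    sum += cu * cv * coef[v * 8 + u]
                         * std::cos((2 * x + 1) * u * kPi / 16.0)
                         * std::cos((2 * y + 1) * v * kPi / 16.0);
                }
            }
            // 1/4 normalization, then the +128 level shift.
            long value = std::lround(sum / 4.0 + 128.0);
            if (value < 0) value = 0;
            if (value > 255) value = 255;
            pixels[y * 8 + x] = static_cast<uint8_t>(value);
        }
    }
}
```

For a block whose only non-zero coefficient is a DC of 992, every output sample works out to 992 / 8 + 128 = 252. A real decoder would normally replace the four nested loops with a separable or fast IDCT.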
2008-12-02Josh
 Hey Calvin

This product and your explanations have been tremendous. I'm just having one apparently hard time trying to figure out how to implement the IDCT. I've found the matrix transform formula, but I don't understand how to iterate through the matrices and come up with the original pixel luminance and chrominance values. Can you give me an example of how to go through one MCU and figure out the values before level shift?

This is from JPEGsnoop, in MCU[0,0]. This was with subsampling set to 4:2:0. My idea was to go into MS Paint and set the image width and height to 16x16, and set the top left pixel to a red color. Then I saved the image to JPEG. This is what I came up with.

Y0 table:
DCT Matrix=
[  992   -30   -30   -24   -24   -20     0     0]
[  -30   -42   -42   -40   -26   -29   -30     0]
[  -28   -42   -40   -36   -20   -29     0     0]
[  -28   -36   -33   -30   -26     0     0     0]
[  -18   -33   -38   -28   -34     0     0     0]
[  -12   -18   -28   -32     0     0     0     0]
[    0   -32     0     0     0     0     0     0]
[    0     0     0     0     0     0     0     0]

Cr table:
DCT Matrix=
[    0     9     0     0     0     0     0     0]
[    9    11    13     0     0     0     0     0]
[    0    13     0     0     0     0     0     0]
[    0     0     0     0     0     0     0     0]
[    0     0     0     0     0     0     0     0]
[    0     0     0     0     0     0     0     0]
[    0     0     0     0     0     0     0     0]
[    0     0     0     0     0     0     0     0]

All other tables' values were all set to 0. (Y1, Y2, Y3 and Cb). So again, I'm looking to decode these values to the RGB space myself, and I'm having trouble understanding the IDCT formula. Could you walk me through this?
 Sure. Because you only set a single pixel, you are creating a "step function" which ends up like a "sinc" function in the frequency domain. This is what you are observing in the Y0 table.

I did exactly this sort of walkthrough in the JPEG Huffman Coding Tutorial page. Hopefully that should point you in the right direction.
2008-11-05Brian
 Need for speed...

I have a JPEG Decoder that I've written based on varying sources for use in Silverlight and coded in C#.
All works well albeit slow - I only need to generate a small thumbnail from the source JPEG and so I have gone the path of decoding some of the MCUs and sampling the results - images are too ugly to use.
I noticed your App does either DC or DC and AC ... how do I do same for performance gain?
I have an IDCT but am uncertain of how to work with just the DC Table vs DC and AC.
thanx.
 Because I did not write the decoding algorithms in JPEGsnoop for performance (the goal was more robustness and information-gathering), the AC decode was not particularly fast when I first implemented it. By just decoding and representing the DC components, you can save a huge percentage of your overall decoding time.

So, if you simply rely on DC components only, you will still need to do all of the Huffman VLC decode for all the DC and AC components, but you don't have to perform the inverse discrete cosine transform (IDCT). Simply take each of the DC components and convert them to the "average" color for the entire MCU. If you're just after creating a thumbnail, this is an easy way to display an approximate 1/8 view of the original image.
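
A DC-only preview along these lines might look like the following C++ sketch. The names Rgb, ClampToByte and DcThumbnail are hypothetical, and the code assumes one dequantized, absolute DC value per component for each output pixel (so no chroma subsampling handling) and the usual JFIF YCbCr to RGB conversion.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { uint8_t r, g, b; };

static uint8_t ClampToByte(double v) {
    return static_cast<uint8_t>(std::min(255L, std::max(0L, std::lround(v))));
}

// One output pixel per 8x8 block. Each input vector holds the dequantized,
// absolute DC coefficient of the corresponding block for that component.
std::vector<Rgb> DcThumbnail(const std::vector<int>& dc_y,
                             const std::vector<int>& dc_cb,
                             const std::vector<int>& dc_cr) {
    assert(dc_y.size() == dc_cb.size() && dc_y.size() == dc_cr.size());
    std::vector<Rgb> out;
    out.reserve(dc_y.size());
    for (std::size_t i = 0; i < dc_y.size(); ++i) {
        // With the usual IDCT scaling, DC / 8 is the block average
        // before the +128 level shift.
        double y  = dc_y[i] / 8.0 + 128.0;
        double cb = dc_cb[i] / 8.0;  // already centered on zero
        double cr = dc_cr[i] / 8.0;
        out.push_back({ClampToByte(y + 1.402 * cr),
                       ClampToByte(y - 0.344136 * cb - 0.714136 * cr),
                       ClampToByte(y + 1.772 * cb)});
    }
    return out;
}
```

The Huffman decode of every AC coefficient is still required to stay in sync with the bitstream; only the IDCT is skipped.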
2008-09-09IKM
 Hi
I am having a tough time extracting thumbnails from Thumbs.db. Images extracted from a Windows 2000 Thumbs.db do not seem to be standard JPEG. Can you point out how I can handle the situation? Shall I try to send you one or two sample pics (<10K)?
Regards,
IKM
 I've never looked at the Thumbs.db file format, so I'm not sure how it is arranged. If you send me an example file that you were having trouble with, I could take a look.
2008-07-03Chintu
 Hello
Nice page
Can u tell me why some images contain more than one pair of FFD8 and FFD9 markers (I found some jpegs which have more than 5). Can any form of data or any other information be equivalent to FFD8 or FFD9, or maybe any other marker like Restart markers, if searched when the image is viewed in a Hex editor?
If so then how to distinguish b/w correct markers and other data equivalent to markers (FFDA, FFD8, FFD9, FFDD..)?
Please answer asap if possible.
Thanks
2008-06-15ashu
 Hi
I want to know how to find these block refers to Y , Cb or Cr??
Also How we do decoding like for one 8*8 pixel values we have 8*8 MCU for Y Cb Cr in series or do we have all Y values and then CC values.....while decoding what is expected.
Also while using JPEG snoop I couldn't know what offset marker means.....also tell me how to find coded stream of bits of any image using JPEG snoop like in the one u discussed in a example of 8*16 black white image...I am asking about colored images also
...
2008-05-15giritharan
 hi,

thks for reply. How are the DRI markers written into the JPEG file? It is required for my decoder part.

thks in advance,
giri
2008-05-12giritharan
 hi,
Why is the DRI marker used, and when is it required?

thanks ,
giri
 DRI Markers are used to define whether or not restart markers (RST) are used, and how often they should be expected within the scan data (entropy coded segment). Restart markers are very useful in that they help a decoder regain synchronization if there is an error in the file. Plus, in some cases it can help a decoder skip forward in a file without having to decode all of the rest of the image data.
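
As a small illustration, the DRI segment is just the marker (0xFFDD), a 2-byte length of 4, and a 2-byte big-endian restart interval counted in MCUs. The C++ sketch below reads it; ParseDri and IsRestartMarker are hypothetical helper names, not JPEGsnoop functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read a DRI segment starting at p (pointing at the 0xFF of the marker).
// Returns the restart interval in MCUs, or -1 if this is not a valid DRI.
// An interval of 0 means restart markers are disabled.
int ParseDri(const uint8_t* p, std::size_t len) {
    assert(p != nullptr);
    if (len < 6) return -1;
    if (p[0] != 0xFF || p[1] != 0xDD) return -1;
    unsigned seg_len = (static_cast<unsigned>(p[2]) << 8) | p[3];
    if (seg_len != 4) return -1;  // length field counts itself plus the interval
    return (static_cast<int>(p[4]) << 8) | p[5];
}

// RST0..RST7 are the marker codes 0xD0..0xD7 (following a 0xFF byte).
bool IsRestartMarker(uint8_t marker_code) {
    return marker_code >= 0xD0 && marker_code <= 0xD7;
}
```

When the interval is non-zero, the decoder should expect an RSTn marker in the scan data after every "interval" MCUs; at each one the bit buffer is realigned to a byte boundary and the DC predictors are reset to zero.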
2008-04-08chunchang
 I want to decode a Jpeg file into Bitmap in SoC Linux, who can show me how to do ?
2008-04-01Casiopaya
 Hello,

which part of the JPEG encoding takes the most calculation time? I think either the DCT or the Huffman encoding. Does someone know a source or a website which explains _in detail_ the calculation time used by the different parts of JPEG encoding?

I'm also (or especially!) interested in knowing this about JPEG 2000. Why does the encoding take so much time: is it the DWT or the arithmetic coding (in the case of JPEG 2000 it's the special EBCOT)?

thank you very much,
Casiopaya
2008-03-30Jo
 Hi, I need help with this, given Y, CB and CR values, how do I extract DC and AC components? Are they somehow interrelated?
 Going from YCC to DC + AC is part of the JPEG compression process, and it is dependent upon previous pixels in the image. To keep things simple, let's assume that the entire image was created with 8x8 pixel blocks (of the same color/intensity, and no chroma subsampling). The average YCC value of each block (or MCU) is calculated. This is coded (with huffman coding) into the DC coefficient. Each coded DC coefficient is calculated as the change in DC YCC value from the previous DC coefficient's YCC value. AC components are more complicated in that they represent the variation in color/intensity of the 64 pixels in each 8x8 pixel block, but do so in the frequency domain (see discrete cosine transform).
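
The differential DC coding described above can be sketched in a few lines of C++. The function names are hypothetical, and the sketch handles a single component with no restart markers; each component (Y, Cb, Cr) keeps its own predictor.

```cpp
#include <cassert>
#include <vector>

// Decoder side: accumulate the coded differences into absolute DC values.
std::vector<int> DiffsToAbsoluteDc(const std::vector<int>& diffs) {
    std::vector<int> dc;
    dc.reserve(diffs.size());
    int predictor = 0;  // starts at zero for the first block of the scan
    for (int d : diffs) {
        predictor += d;
        dc.push_back(predictor);
    }
    assert(dc.size() == diffs.size());
    return dc;
}

// Encoder side: each coded value is the change from the previous block's DC.
std::vector<int> AbsoluteDcToDiffs(const std::vector<int>& dc) {
    std::vector<int> diffs;
    diffs.reserve(dc.size());
    int previous = 0;
    for (int value : dc) {
        diffs.push_back(value - previous);
        previous = value;
    }
    return diffs;
}
```

It is the accumulated (absolute) DC value that goes into position 0 of the coefficient matrix before the inverse DCT is applied.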
2008-03-26Chuck
 Finally, a direct and simple explanation of JPEG H & V sampling to the mysterious 4:4:4, 4:2:2, and 4:2:0 notations.

JPEGsnoop is very nice also. You may want to add these JPEG decoder implementations: Cornell and Stanford.
Please see: http://www.faqs.org/faqs/jpeg-faq/part2/section-15.html

The information about ijg not supporting lossless is not correct - you just have to find the right patches.

The link for the Stanford version is dead: try: http://hpux.cs.utah.edu/ftp/hpux/X11/Graphics/JPEG-1.2.1/JPEG-1.2.1-src-11.00.tar.gz
 Hey Chuck -- Thanks very much for providing these additional sources -- it's very helpful.
2008-02-09Maria
 Btw the dc value in the dct matrix (in jpeg snoop) is still the relative value, but I should use the absolute value when doing dct right? Thanks!
2008-02-09Maria
 Hi Calvin,

I've successfully got the 64 DCT coefficients (after Huffman decoding) and then I apply the inverse DCT to them. The problem is that after the IDCT, each value is not 8 bit, it's a lot more. I've checked my DCT coefficients using JPEGsnoop and they're correct. Then I checked my IDCT result using Matlab (I input the DCT coefficients into the idct function) and again the result is correct. How can I make the IDCT result become an 8-bit unsigned value? Thanks a lot.

 

