ࡱ> PRO7 GbjbjUU ">7|7|El0008h|l)&T,C)E)E)E)E)E)E)$* ,vi)...i)8~)888.C)8.C)88P"{$$ 7Tgc0B$$|)0)$,P-8P-$8*Jt6HYPERLINK "Information Theory.doc"Previous lectureHYPERLINK "..\\Dxf\\CAD_dxf.doc"Next Lecture HYPERLINK "..\\320Syllabus.doc" SyllabusHYPERLINK "..\\..\\Homework\\Week2\\Week 2 Homework Assignment.doc"HomeworkSimple cryptography One of the oldest methods of encoding a message is to construct a table of character transformations. This was typical of military and diplomatic codes used 100 years ago. Message: PULL THE TROOPS ON YOUR LEFT FLANK BACK TO THE BRIDGE. Table ABCDEFGHIJKLMNOPQRSTUVWXYZGHJKLFDSAZXCVBNMPIOUYTREWQ Encoded message: MYCC USL UISSMO NB WNYI CLFU FCGBX HGJX UN USL HIKDL. This method of encryption is not used today because it is too easy to crack. See for example the Sherlock Holmes story The Adventure of the Dancing Men by A. Conan Doyle. In English, some letters such as E and T occur very frequently, while others, such a Q and Z are rare. You can construct a table of how often characters occur and decode the message. For example, the short message has 5 Us, 5 Ns, 4 Cs and 4 Ls. We can guess that two of these four symbols represent T and E. Seeing the string USL twice we can guess that it means THE and we are on our way to cracking the code. [Miano Table 6.1] Huffman coding The idea that some letters occur more often than others can be used for data compression. Why should every letter get 8 bits? We can represent the most common characters (such as space, E, T) in 3 or 4 bits and use 12 or more bits for the unusual characters. This is the idea behind Huffman encoding, invented by D.A. Huffman in 1952. This idea works as long as some letters (or colors) occur much more often than others. If we are encoding white noise (any character is equally likely) then Huffman encoding will not achieve any compression. In Huffman coding, we construct a usage table for a typical message [or for a specific message]. We then use the frequencies of usage to construct a binary tree from the bottom up. The binary tree then gives the bit code for each character. Method: Find the two values with the lowest frequencies and remove them from the pool. Ties can be broken arbitrarily. [see below] Make a new tree node whose branches are the two items from the previous step. Make the frequency of the tree node be the sum of the branch frequencies. Add the new node to the pool. Repeat until the pool has only one symbol When the tree is done, assign 0 to all the left branches and 1 to all the right branches. The code for each symbol is found by tracing the tree. Examples [Miano pp. 63-65] [Sayood Example 3.2.1] Minimum Variance Sayood, Figure 3.4 shows an example of two different Huffman codes that have the same average length per symbol. However, one of them has codewords of size 1 to 4 bits and the other has only 2 and 3 bit codewords. Minimum variance is a desirable property, since it can reduce the buffer size needed on a channel. To get minimum variance, place combined codes as high on the list as possible. In other words, when there are several choices for what to put into a node on the tree, favor single letters. JPEG Huffman coding We have shown that a binary tree can be used to generate a table of Huffman codes. A tree is not the most computationally efficient way to generate the codes. It is better to first make a table of the code lengths for each symbol. [Miano, pp.66-67] Once we know how many bits we need for each symbol, we can use the following algorithm to generate the codes: void GenerateHuffmanCodes ( int NumberOfCodes, /* in; number of symbols to encode */ unsigned char CodeLengths[], /* in; # bits in each code index: 0 to NumberOfCodes-1 */ int Codes[] /* out; Huffman code; index: 0 to NumberOfCodes-1 */ ) { int HuffmanCodeCounter = 0; unsigned char CodeLengthCounter = 1; for (int i = 0; i < NumberOfCodes; ) { if (CodeLengths[i] == CodeLengthCounter) { Codes[i] = HuffmanCodeCounter++; } else { HuffmanCodeCounter <<= 1; CodeLengthCounter++; } if (i == CodeLengthCounter) i++; } } [Miano, pp. 68-69] Notice that GenerateHuffmanCodes() doesnt know or care what symbols go with the codes. The association of symbols to codes is implied by the order in which the symbols are sorted. What is the most compact way to store the Huffman codes? A binary tree structure has lots of overhead. We could store the Huffman table as arrays of bytes: SymbolCode LengthCode (binary)A10Space3100N3101L41100M41101P41110C511110. (period)511111 We can see that we are wasting space for the codes (64 bits). We could replace the third column with: 01001011100110111101111011111 This is 29 bits, or 32 bits if we pad it to fit in four bytes. In fact, we dont need to store the third column in our file at all. When we need it, we can generate it in RAM using GenerateHuffmanCodes(). The second column has 8 entries, one for each symbol. If we know that the maximum size of a Huffman code is 5 bits, then we can replace the second column with a list of the number of codes of each length. implied index#codes of this length1120324352 In JPEG, the maximum length of a Huffman code is 16 bits. JPEG Huffman tables are formatted as: Field sizeDescription1 byteTable class16 bytesCount of Huffman codes of size 1 to 16.Sum of values in previous 16 bytes1-byte symbols sorted by Huffman code Limiting the size of Huffman codes If you want to reduce the maximum size of a Huffman code, you can rearrange tree nodes to make it so. [Miano fig. 6.2] Working with an array of length counts, we can reduce a symbol. Find a symbol whose code length is at least 2 less than the length to be reduced. Change this symbol to a branch, and shift the longer symbols. HYPERLINK "PCX.doc"Previous lectureHYPERLINK "..\\Dxf\\CAD_dxf.doc"Next Lecture HYPERLINK "..\\320Syllabus.doc" SyllabusHYPERLINK "..\\..\\Homework\\Week2\\Week 2 Homework Assignment.doc"Homework #$%5678XYZfghi[m6 Tklmj Uj] Uj U B*ph5>* CJOJQJOJQJj1UjUjU0JjU jUC7higeeeee$$Ifl\:,"04 la$If EF   !#%$If%&(*,.02468:<>@BDFHJLNPRTVXZ$IfFfZ[\m   x o p A B 5 6 >  R p ,5G & FFfPG^_pjk"j|}D*,.Nw%/Uvw(6$If679;=>DFJKMOSTV|||4|||$|||(|$If|$$IflF bUr0    4 laVX]^`bghjlqrtv||(|(|,|$$IflF bUr0    4 la$If|}VP|||zzzzzz||$If|$$IflF bUr0    4 la $Ifi$$Ifl0J04 laT_kls!P(i$$Ifl0 04 la$If!klCDg$$Ifl\:,"04 la$If 789ABGj U0J jU DEFG 1h/ =!"#$%DyK F INFORM~1.DOC2,Information Theory.docDyK FDxf\CAD_dxf.doc$Dxf\CAD_dxf.docDyK F320Syllabus.docDyK F.Homework\Week2\Week 2 Homework Assignment.doc`$$Ifl@6)z m `SF9,} !QQQQQQQQQQQQQQQQQQQQQQQQQQ0hhhh4 la`$$Ifl@6)z m `SF9,} !QQQQQQQQQQQQQQQQQQQQQQQQQQ0hhhh4 laDyK FPCX.docPCX.docDyK FDxf\CAD_dxf.doc$Dxf\CAD_dxf.docDyK F320Syllabus.docDyK F.Homework\Week2\Week 2 Homework Assignment.doc i8@8 NormalCJ_HaJmH sH tH T@T Heading 2$<@& 56CJOJQJ\]^JaJN@N Heading 3$<@&5CJOJQJ\^JaJ<A@< Default Paragraph Font.U@. Hyperlink >*B*ph>V@> FollowedHyperlink >*B* ph,, Header  !, ", Footer  !G > z z z z zl xG9#7h   !#%&(*,.02468:<>@BDFHJLNPRTVXZ[\m xopAB56> R p , 5 G ^ _ p j k " j | } D*,.Nw%/Uvw(679;=>DFJKMOSTVX]^`bghjlqrtv|}VT_kls!klCDH000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0 0 0 0 00000000000000000000000000000000000000000000000(00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000G%ZG6V|!DGF$57Yfhl8AGXXXXXXXX cs320_PCXcs320_Dictionary _Hlt496510091 _Hlt521469712 _Hlt514209246 _Hlt514209247 CS320_HuffmanCS320_Huffman_more _Hlt525969598 _Hlt525969599-\l H@@@@@ @.]l H67gh5 6 ;  < E i j k p   Va256H`q ;MarlBDH67gh j k l y  HP *.25RZ{~ $%-lBDH33333333333333333333tfolsom1D:\DigiPen\Sum2001\CS320\Lectures\RLE\Huffman.doctfolsomZC:\WINNT\Profiles\tfolsom\Application Data\Microsoft\Word\AutoRecovery save of Huffman.asdtfolsomZC:\WINNT\Profiles\tfolsom\Application Data\Microsoft\Word\AutoRecovery save of Huffman.asd Tyler Folsom2D:\DigiPen\Fall2001\CS320\Lectures\RLE\Huffman.doc Tyler Folsom+\\Dit_euclid\CS320\Lectures\RLE\Huffman.doc Tyler FolsomZC:\WINNT\Profiles\tfolsom\Application Data\Microsoft\Word\AutoRecovery save of Huffman.asd Tyler FolsomZC:\Winnt\Profiles\tfolsom\Application Data\Microsoft\Word\AutoRecovery save of Huffman.asd Tyler Folsom2F:\DigiPen\Fall2003\CS420\Lectures\RLE\Huffman.doc*)̡:^`o()^`.pLp^p`L.@ @ ^@ `.^`.L^`L.^`.^`.PLP^P`L.*)7h   !#%&(*,.02468:<>@BDFHJLNPRTVXZ[(679;=>DFJKMOSTVX]^`bghjlqrtv|}T_klslCDH@ 66dgg6?6  G@UnknownGz Times New Roman5Symbol3& z Arial?5 z Courier New"1hU8{xh *$0=2QSimple cryptographytfolsom Tyler FolsomOh+'0   < H T`hpxSimple cryptographyimptfolsomfolfol Normal.dott Tyler Folsomgra8leMicrosoft Word 9.0@pT@H@?gc՜.+,D՜.+,\ hp   DigiPen Institute Of Technology*  Simple cryptography Title 8@ _PID_HLINKSAT0OY4..\..\Homework\Week2\Week 2 Homework Assignment.doc ...\320Syllabus.docQ...\Dxf\CAD_dxf.docl" PCX.docOY 4..\..\Homework\Week2\Week 2 Homework Assignment.doc ...\320Syllabus.docQ...\Dxf\CAD_dxf.doc9.Information Theory.doc !"#$%&')*+,-./0123456789:;<=>@ABCDEFHIJKLMNQRoot Entry F4>TgcSData  1Table(P-WordDocument">SummaryInformation(?DocumentSummaryInformation8GCompObjjObjectPool4>Tgc4>Tgc  FMicrosoft Word Document MSWordDocWord.Document.89q