1


2

 We wish Dr. Folsom a speedy recovery…
 …and need to pick up where he left off
 Thanks for being flexible!

3

 Proceed from week 3
 Makeup for last week as we go
 Rely on readings and office hours

4

 Follow syllabus
 Introduce material based on my experience, emphasizing an engineering perspective

5

 Seniors
 3 textbooks!
 Discuss Implications
 Not going to reread texts in class

6

 Attendance (please sign in)
 Figure out where we are
 Resubmit assignments
 Make up the missing week

7

 Who am I?
 Contacting me
 Office Hours
 Website
 Assignments/Grading
 Miscellaneous

8

 Who am I? (a Digital Media guy)
 Silicon Graphics – Digital Media
 Microsoft – DirectAnimation
 DigiPen – Developed CS245
 Intro to Dynamic Audio Synthesis
 Audio Anecdotes:
 Volume 1 in Library
 Volume 2 sent to print
 (Will have occasional readings)

9

 Contacting me
 Email: keng@sworks.com
 Best way!
 (I don’t check my DigiPen email often)
 Cell: 206.310.5253

10

 Office Hours:
 Immediately before or after class?
 Best to contact me in advance

11

 Website:
 http://cs420.sworks.com
 Will make class materials available

12

 Practical implications of compression
 Tradeoffs

13

 What algorithm would you suggest?
 Why?
 Tradeoffs?

14


15

 Lossy/Lossless
 Which? (RLE, JPEG, OCR)
 Why?
 Artifacts?
 Reuse?

16

 Engineering is about managing tradeoffs
 How strong should a bridge be?
 The Model T’s axle
 Color TV
 Amazing tradeoffs based on understanding human physiology

17


18

 Application?
 Programming environment?
 Operating System?
 File System?
 Hardware?

19


20

 Digital Negative Specification
 Proposed universal standard
 (replaces proprietary ‘raw’ formats)
 Free software tool to convert 65 cameras’ raw formats to DNG
 What is wrong with JPEG?
 Why is ‘raw’ data so important?

21

 All the information captured by the camera sensor
 Before any in-camera processing

22

 How is info stored?
 Endianness?
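
Endianness determines the byte order a value is stored in, so a raw file format must pin it down. A minimal sketch with Python's stdlib `struct` module (the value is arbitrary):

```python
# Hypothetical illustration of endianness: the same 32-bit integer
# serialized little-endian vs big-endian with the stdlib struct module.
import struct

value = 0x01020304
little = struct.pack("<I", value)  # least-significant byte first
big = struct.pack(">I", value)     # most-significant byte first

print(little.hex())  # 04030201
print(big.hex())     # 01020304

# Reading bytes back with the wrong endianness scrambles the value:
wrong = struct.unpack("<I", big)[0]
print(hex(wrong))  # 0x4030201
```

A format that fails to specify byte order produces exactly the last kind of scramble when files move between machines.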

23

 For instance Symes states:
 Teleconferencing as an example of symmetric compression
 I disagree for an important subset of cases
 Discuss

24

 Motion JPEG
 Video as sequence of JPEG images
 I worked on COSMO
 Professional MJPEG for SGI
 Why MJPEG and not MPEG?

25


26


27

 What are the characteristics of
 Computer generated images?
 Text?
 Photos?
 How does this relate to Analog vs. Digital?

28

 Representing analog as digital
 Everyone clear on
 Sampling theory?
 Sources of error?
 Eliminating aliasing?
 Limiting quantization noise?

29

 Please resubmit
 Questions?

30

 Please resubmit
 Discuss
 Questions?

31

 What is entropy?
 What is information?

32

 Information is
 Value
 Unique, unpredictable

33

 Assumes generation of information is probabilistic
 Event E
 Occurs with probability P(E)
 Contains I(E) units of information
 I(E) = log2(1/P(E))
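
The self-information formula above translates directly into code; a minimal sketch:

```python
# Self-information from the slide: I(E) = log2(1 / P(E)).
import math

def self_information(p):
    """Bits of information carried by an event with probability p."""
    return math.log2(1.0 / p)

print(self_information(1/4))   # 2.0 bits
print(self_information(1/2))   # 1.0 bit, e.g. a fair coin flip
print(self_information(1.0))   # 0.0 bits: a certain event is uninformative
```

Rarer events carry more information; a certain event carries none.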

34

 Example
 P(E)=1/4
 Then I(E) = log2(1/(1/4))
 = log2(4)
 = 2 bits of information

35

 Source can produce symbol set
 A = {a1, a2, …, aJ}
 (called a source alphabet)
 These can be:
 English letters, words, or even pixel values

36

 Given:
 P(aj)
 Probability source will produce symbol aj
 I(aj) = log2(1/P(aj))
 Amount of information generated by that symbol

37

 If a large number, k, of symbols is generated
 Then on average
 Each symbol will occur kP(aj) times

38

 If a large number, k, of symbols is generated
 Average amount of information from k outputs:
 Σj k P(aj) I(aj) bits

39

 First order entropy of the info source:
 H = Σj P(aj) log2(1/P(aj)) bits/symbol
 Pairs of symbols provide 2^{nd} order entropy:
 H2 = (1/2) Σi,j P(ai, aj) log2(1/P(ai, aj))
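
The first-order entropy, H = Σj P(aj) log2(1/P(aj)), can be sketched directly:

```python
# First-order entropy: H = sum_j P(aj) * log2(1/P(aj)) bits/symbol.
import math

def first_order_entropy(probs):
    """Average information (bits/symbol) of an i.i.d. source."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# A uniform 4-symbol alphabet carries log2(4) = 2 bits per symbol:
print(first_order_entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0
# A skewed alphabet carries less:
print(first_order_entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75
```

The `if p > 0` guard follows the convention that zero-probability symbols contribute nothing.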

40

 Entropy of the source
 Limit of an infinitely long sequence of symbols
 This is the same as the first order entropy
 When there is no correlation between symbols
 (independent and identically distributed)

41

 Entropy
 Is the avg. amount of info provided by source
 Maximized when all symbols are equally probable
 Ex: Can estimate by examining a histogram of luminance values in an image
 Max amount of compression achieved by variable-length coding <= 1^{st} order entropy
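
Estimating entropy from a histogram of sample values, as the slide suggests for luminance, is a few lines; the toy pixel data below is made up:

```python
# First-order entropy estimated from a histogram of symbol counts.
import math
from collections import Counter

def entropy_from_samples(samples):
    """Estimate bits/symbol from observed frequencies."""
    counts = Counter(samples)
    n = len(samples)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

pixels_uniform = [0, 1, 2, 3] * 8          # all values equally probable
pixels_skewed = [0] * 28 + [1, 2, 3, 3]    # one value dominates
print(entropy_from_samples(pixels_uniform))  # 2.0, the maximum for 4 values
print(entropy_from_samples(pixels_skewed))   # much lower
```

The uniform histogram hits the maximum log2(4) = 2 bits/pixel, illustrating that entropy is maximized when all symbols are equally probable.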

42

 Most efficient coding
 Uniquely decodable (no ambiguity)
 Avg bits per symbol >= entropy of source

43


44

 Histogram / 1^{st} order estimates

45

 H = −(3/8) log2(3/8) − (1/8) log2(1/8) − (1/8) log2(1/8) − (3/8) log2(3/8)
 = −[(3/4) log(3/8) + (1/4) log(1/8)] / log(2)
 = [0.75 × 0.42597 + 0.25 × 0.90309] / 0.30103
 = 1.811 bits / pixel
 Recall: log2(x) = log(x) / log(2)
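
The arithmetic on this slide can be checked directly for the probabilities 3/8, 1/8, 1/8, 3/8:

```python
# Verifying the slide's first-order entropy for the histogram
# {3/8, 1/8, 1/8, 3/8}.
import math

H = sum(p * math.log2(1 / p) for p in [3/8, 1/8, 1/8, 3/8])
print(round(H, 3))  # 1.811 bits/pixel
```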

46

 Since
 4 × 8 = 32 pixels in the original image
 Best compression (using variable-length coding): 32 × 1.811 ≈ 58 bits

47

 2^{nd} order estimate of source entropy
 Analyzes pairs of pixels
 Let’s examine horizontal pairs and assume image wrapping

48

 One 2^{nd} order estimate
 H=1.25 bits / pixel
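
A second-order estimate can be sketched by treating horizontal pixel pairs as super-symbols, taking their entropy, and dividing by two for a bits-per-pixel figure. The 1×8 toy image below is illustrative, not the slide's figure:

```python
# Sketch of a 2nd-order entropy estimate from horizontal pixel pairs,
# with wraparound at the row edge as the slide assumes.
import math
from collections import Counter

def second_order_estimate(rows):
    """Entropy of horizontal pixel pairs, in bits per pixel."""
    pairs = []
    for row in rows:
        for i in range(len(row)):
            pairs.append((row[i], row[(i + 1) % len(row)]))  # wrap
    n = len(pairs)
    counts = Counter(pairs)
    h_pairs = sum((c / n) * math.log2(n / c) for c in counts.values())
    return h_pairs / 2  # two pixels per pair

image = [[0, 0, 0, 0, 1, 1, 1, 1]]  # one run of 0s, one run of 1s
print(round(second_order_estimate(image), 3))  # 0.906
```

For this image the first-order estimate is 1.0 bit/pixel, so the lower pair-based figure exposes the intersymbol redundancy in the runs.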

49

 Difference between
 1^{st} order entropy estimate
 Higher order entropy estimates
 Gives the amount of intersymbol redundancy
 If symbols are statistically independent
 Then entropy estimates are equal

50

 1^{st} order: 1.81 bits/pixel
 2^{nd} order: 1.25 bits/pixel
 Difference: 0.56 bit/pixel

51

 Applying a subtractive filter, the image becomes:

52

 First order estimate:
 H = −(1/2) log2(1/2) − (1/8) log2(1/8) − (3/8) log2(3/8) = 1.41 bits/pixel
 Thus we can represent this image with
 32 × 1.41 ≈ 46 bits (still short of the 2^{nd} order entropy of 1.25 bits/pixel)
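
A subtractive (differencing) filter and the 1.41 bits/pixel figure can both be sketched; the sample row is made up:

```python
# A simple subtractive filter plus a check of the slide's entropy figure
# for the filtered value probabilities {1/2, 1/8, 3/8}.
import math

def subtractive_filter(row):
    """Replace each pixel with its difference from the previous pixel;
    the first pixel is kept as-is."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

print(subtractive_filter([10, 10, 10, 12, 12, 11]))  # [10, 0, 0, 2, 0, -1]

H = sum(p * math.log2(1 / p) for p in [1/2, 1/8, 3/8])
print(round(H, 2))  # 1.41 bits/pixel
```

The filter decorrelates neighboring pixels, concentrating the histogram around zero, which is why the filtered image's first-order entropy drops.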

53

 Can compute:
 Average length of a code by
 Adding up the number of bits in each symbol
 Multiplied by the probability of the symbol

54

 Efficiency of a code is the
 Ratio of entropy to average code length
 Ex:
 If entropy is 2.14 bits/symbol
 And a code has avg. length of 2.2 bits/symbol
 Then efficiency is 2.14/2.2 ≈ 0.97
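
Both definitions are one-liners; the code lengths in the example are illustrative, the 2.14/2.2 figures are the slide's:

```python
# Average code length and coding efficiency as the slides define them.
def average_length(lengths, probs):
    """Expected bits per symbol: sum of codeword length * probability."""
    return sum(l * p for l, p in zip(lengths, probs))

def efficiency(entropy, avg_len):
    """Ratio of source entropy to average code length (at most 1)."""
    return entropy / avg_len

print(average_length([1, 2, 3, 3], [0.5, 0.25, 0.125, 0.125]))  # 1.75
print(round(efficiency(2.14, 2.2), 3))                          # 0.973
```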

55

 Huffman code provides the most efficient coding if symbols must be individually coded
 Higher order entropy analysis and Shannon tell us we can do better
 Take advantage of intersymbol correlation
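
A minimal Huffman coder can be sketched with the stdlib heap; the symbol probabilities below are illustrative:

```python
# Minimal Huffman coding sketch: repeatedly merge the two least
# probable subtrees, prepending a bit to each side's codewords.
import heapq
import itertools

def huffman_code(probs):
    """probs: dict symbol -> probability. Returns dict symbol -> bitstring."""
    counter = itertools.count()  # tiebreaker so the heap never compares dicts
    heap = [(p, next(counter), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

code = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
print(code)  # codeword lengths 1, 2, 3, 3
```

For these dyadic probabilities the average length equals the entropy (1.75 bits/symbol), the best any symbol-by-symbol code can do.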

56

 A code is worthless if it can’t be unambiguously decoded

57

 Comparison of different codings:

58

 Code 1
 Most efficient
 Provides the most compression
 Dubious
 Can’t distinguish between ‘e’ and ‘t’

59

 Code 2
 Ambiguous
 The sequence 100 decodes as either “tee” or “ts”

60

 Codes 3, 4 are unambiguous
 Code 4
 Receiver doesn’t know it has a complete letter until it receives the 0 starting the next codeword

61

 Code 3 is instantaneous
 Receiver can recognize each letter as its last bit is received
 All codewords are leaves when the code is drawn as a tree
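
A prefix code's instantaneous property is easy to demonstrate: the decoder emits a symbol the moment its buffered bits match a codeword. The table below uses the e/t/s symbols from the comparison slides, but the bit patterns are illustrative, not the actual "code 3":

```python
# Decoding with a prefix (instantaneous) code: no lookahead needed.
def decode_prefix(bits, table):
    """table: dict bitstring -> symbol, assumed prefix-free."""
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in table:           # complete codeword recognized at once
            out.append(table[buf])
            buf = ""
    return "".join(out), buf       # buf is empty iff the message is complete

code3 = {"0": "e", "10": "t", "110": "s"}
print(decode_prefix("100110", code3))  # ('tes', '')
```

Because no codeword is a prefix of another, the decoder never has to wait for the next codeword to begin before emitting a letter.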

62

 For any unambiguous non-prefix code
 It is always possible to find a prefix code
 With the same compression

63

 Concerns?
 What do you want to tell me?
