1
|
|
2
|
- We wish Dr. Folsom a speedy recovery…
- …and need to pick up where he left off
- Thanks for being flexible!
|
3
|
- Proceed from week 3
- Makeup for last week as we go
- Rely on readings and office hours
|
4
|
- Follow syllabus
- Introduce material based on my experience, emphasizing an engineering perspective
|
5
|
- Seniors
- 3 textbooks!
- Discuss Implications
- Not going to re-read texts in class
|
6
|
- Attendance (please sign in)
- Figure out where we are
- Re-submit assignments
- Make up the missing week
|
7
|
- Who am I?
- Contacting me
- Office Hours
- Website
- Assignments/Grading
- Miscellaneous
|
8
|
- Who am I? (a Digital Media guy)
- Silicon Graphics - Digital Media
- Microsoft - Direct Animation
- DigiPen – Developed CS245
- Intro to Dynamic Audio Synthesis
- Audio Anecdotes:
- Volume 1 in Library
- Volume 2 sent to print
- (Will have occasional readings)
|
9
|
- Contacting me
- Email: keng@sworks.com
- Best way!
- (I don’t check my DigiPen email often)
- Cell: 206.310.5253
|
10
|
- Office Hours:
- Immediately before or after class?
- Best to contact me in advance
|
11
|
- Website:
- http://cs420.sworks.com
- Will make class materials available
|
12
|
- Practical implications of compression
- Tradeoffs
|
13
|
- What algorithm would you suggest?
- Why?
- Tradeoffs?
|
14
|
|
15
|
- Lossy/Lossless
- Which? (RLE, JPEG, OCR)
- Why?
- Artifacts?
- Re-use?
|
16
|
- Engineering is about managing tradeoffs
- How strong should a bridge be?
- Model T’s axle
- Color TV
- Amazing tradeoffs based on understanding human physiology
|
17
|
|
18
|
- Application?
- Programming environment?
- Operating System?
- File System?
- Hardware?
|
19
|
|
20
|
- Digital Negative Specification
- Proposed universal standard
- (replaces proprietary ‘raw’ formats)
- Free software tool converts 65 cameras’ raw formats to DNG
- What is wrong with JPEG?
- Why is ‘raw’ data so important?
|
21
|
- All the information captured by the camera sensor
- Before any in-camera processing
|
22
|
- How is info stored?
- Endianness?
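An illustrative sketch (my addition, not from the slides) of how byte order changes the stored bytes of the same 32-bit value, using Python's struct module:

```python
import struct

value = 0x12345678

# Pack the same 32-bit integer with explicit byte orders.
big    = struct.pack(">I", value)   # big-endian:    12 34 56 78
little = struct.pack("<I", value)   # little-endian: 78 56 34 12
print(big.hex(" "), "|", little.hex(" "))

# A reader assuming the wrong byte order recovers a different number:
print(hex(struct.unpack(">I", little)[0]))  # 0x78563412
```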
|
23
|
- For instance, Symes states:
- Teleconferencing as an example of symmetric compression
- I disagree for an important subset of cases
- Discuss
|
24
|
- Motion JPEG
- Video as sequence of JPEG images
- I worked on COSMO
- Professional MJPEG for SGI
- Why MJPEG and not MPEG?
|
25
|
|
26
|
|
27
|
- What are the characteristics of
- Computer generated images?
- Text?
- Photos?
- How does this relate to analog vs. digital?
|
28
|
- Representing analog as digital
- Everyone clear on
- Sampling theory?
- Sources of error?
- Eliminating aliasing?
- Limiting quantization noise?
|
29
|
- Please re-submit
- Questions?
|
30
|
- Please re-submit
- Discuss
- Questions?
|
31
|
- What is entropy?
- What is information?
|
32
|
- Information is
- Value
- Unique, non-predictable
|
33
|
- Assumes generation of information is probabilistic
- Event E
- Occurs with probability P(E)
- Contains I(E) units of information
- I(E) = log2(1/P(E)) = -log2 P(E)
|
34
|
- Example
- P(E)=1/4
- Then I(E) = -log2(1/4)
- = log2(4)
- = 2 bits of information
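A minimal Python sketch of the definition above; the probability is the slide's example:

```python
from math import log2

def self_information(p):
    """I(E) = log2(1/P(E)): the information content, in bits, of an
    event that occurs with probability p."""
    return log2(1 / p)

print(self_information(1 / 4))  # 2.0 bits, matching the example above
```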
|
35
|
- Source can produce symbol set
- A = {a1, a2, …, aJ}
- (called a source alphabet)
- These can be:
- English letters, words, or even pixel values
|
36
|
- Given:
- P(aj)
- Probability source will produce symbol aj
- I(aj) = -log2 P(aj)
- Amount of information generated by that symbol
|
37
|
- If a large number, k, of symbols is generated
- Then on average
- Each symbol aj will occur kP(aj) times
|
38
|
- If a large number, k, of symbols is generated
- Average amount of information from the k outputs:
- Σj kP(aj) I(aj) = -k Σj P(aj) log2 P(aj)
|
39
|
- First order entropy of the info source:
- H = -Σj P(aj) log2 P(aj) bits/symbol
- Pairs of symbols provide the 2nd order entropy:
- H2 = -(1/2) Σi,j P(ai, aj) log2 P(ai, aj) bits/symbol
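A sketch of the first-order estimate, with observed frequencies standing in for the true probabilities; the test string is an arbitrary example of mine:

```python
from collections import Counter
from math import log2

def first_order_entropy(symbols):
    """H = -sum_j P(aj) log2 P(aj), with P estimated from frequencies."""
    n = len(symbols)
    return -sum(c / n * log2(c / n) for c in Counter(symbols).values())

print(first_order_entropy("mississippi"))  # ~1.82 bits/symbol
```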
|
40
|
- Entropy of the source
- Limit of an infinitely long sequence of symbols
- This is the same as the first order entropy
- When there is no correlation between symbols
- (independent and identically distributed)
|
41
|
- Entropy
- Is the average amount of information provided by the source
- Maximized when all symbols are equally probable
- Ex: can estimate it from a histogram of luminance values in an image
- Maximum compression achievable by variable-length coding is bounded by the 1st order entropy (avg bits/symbol >= H)
|
42
|
- Most efficient coding
- Uniquely decodable (no ambiguity)
- Avg bits per symbol >= entropy of source
|
43
|
|
44
|
- Histogram / 1st order estimates
|
45
|
- H = -(3/8) log2(3/8) - (1/8) log2(1/8) - (1/8) log2(1/8) - (3/8) log2(3/8)
- = [-(3/4) log(3/8) - (1/4) log(1/8)] / log(2)
- = [-0.75 * -0.42597 - 0.25 * -0.90309] / 0.30103
- = 1.811 bits/pixel
- Recall: log2(x) = log(x) / log(2)
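The arithmetic can be checked directly; the four probabilities are the gray-level frequencies from the 4x8 example image on this slide:

```python
from math import log2

probs = [3/8, 1/8, 1/8, 3/8]  # gray-level probabilities from the example

H = -sum(p * log2(p) for p in probs)
print(f"{H:.3f} bits/pixel")  # 1.811
```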
|
46
|
- Since
- 4*8 = 32 pixels in the original image
- Best compression (using variable-length coding): 32 * 1.811 ≈ 58 bits
|
47
|
- 2nd order estimate of source entropy
- Analyzes pairs of pixels
- Let’s examine horizontal pairs and assume image wrapping (see the sketch below)
|
48
|
- One 2nd order estimate
- H=1.25 bits / pixel
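A sketch of the pair-based estimate, assuming horizontal pairs that wrap at the end of each row as stated above (the example image itself is not reproduced here):

```python
from collections import Counter
from math import log2

def second_order_estimate(rows):
    """Entropy of horizontal pixel pairs, halved to give bits/pixel.
    Pairs wrap around at the end of each row."""
    pairs = [(row[i], row[(i + 1) % len(row)])
             for row in rows for i in range(len(row))]
    n = len(pairs)
    H_pairs = -sum(c / n * log2(c / n) for c in Counter(pairs).values())
    return H_pairs / 2  # two pixels per pair
```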
|
49
|
- Difference between
- 1st order entropy estimate and
- Higher order entropy estimates
- Gives the amount of inter-symbol redundancy
- If symbols are statistically independent
- Then the entropy estimates are equal
|
50
|
- 1st order: 1.81 bits/pixel
- 2nd order: 1.25 bits/pixel
- Difference: 0.56 bit/pixel
|
51
|
- Applying a subtractive filter, the image becomes:
|
52
|
- First order estimate:
- H = -(1/2) log2(1/2) - (1/8) log2(1/8) - (3/8) log2(3/8) = 1.41 bits/pixel
- Thus we can represent this image with
- 32 * 1.41 ≈ 46 bits (still short of the 2nd order entropy of 1.25 bits/pixel)
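A sketch of one common subtractive (difference) filter, predicting each pixel from its left neighbor; whether the slide's filter works exactly this way is an assumption:

```python
from collections import Counter
from math import log2

def difference_filter(rows):
    """Keep each row's first pixel; replace the rest with the
    difference from the left neighbor."""
    return [[row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]
            for row in rows]

def entropy(values):
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())

# Usage: estimate bits/pixel of the filtered image.
# filtered = difference_filter(image_rows)
# print(entropy([v for row in filtered for v in row]))
```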
|
53
|
- Can compute:
- Average length of a code by
- Adding up the number of bits in each codeword
- Multiplied by the probability of its symbol
|
54
|
- Efficiency of a code is the
- Ratio of entropy to average code length
- Ex:
- If entropy is 2.14 bits/symbol
- And a code has avg. length of 2.2 bits/symbol
- Then efficiency is 2.14/2.2 ≈ 0.97
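A small sketch of the calculation; the three-symbol source and code here are hypothetical examples of mine, not from the slides:

```python
def average_length(code, probs):
    """Expected bits/symbol: codeword length times symbol probability."""
    return sum(len(code[s]) * p for s, p in probs.items())

# Hypothetical source and code for illustration.
probs = {"a": 0.5, "b": 0.25, "c": 0.25}
code  = {"a": "0", "b": "10", "c": "11"}

L = average_length(code, probs)  # 1.5 bits/symbol
H = 1.5                          # entropy of this source (it happens to match)
print("efficiency:", H / L)      # 1.0: this code is optimal for this source
```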
|
55
|
- Huffman coding provides the most efficient code if symbols must be individually coded (see the sketch below)
- Higher order entropy analysis and Shannon tell us we can do better
- Take advantage of inter-symbol correlation
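A compact sketch of Huffman's algorithm, repeatedly merging the two least-probable subtrees; the four-symbol source is a hypothetical example:

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary Huffman code from a {symbol: probability} map."""
    tie = count()  # tie-breaker so the heap never compares code dicts
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # two least-probable subtrees
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

print(huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
# e.g. {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
```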
|
56
|
- A code is worthless if it can’t be unambiguously decoded
|
57
|
- Comparison of different codings:
|
58
|
- Code 1
- Most efficient
- Provides the most compression
- Dubious
- Can’t distinguish between ‘e’ and ‘t’
|
59
|
- Code 2
- Ambiguous
- “100” decodes as either “tee” or “ts”
|
60
|
- Codes 3 and 4 are unambiguous
- Code 4
- Receiver doesn’t know it has a complete letter until it receives the 0 starting the next codeword
|
61
|
- Code 3 is instantaneous
- Receiver can recognize any letter as its last bit is received
- All codewords are leaves when the code is represented as a tree (a prefix code)
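A sketch of why a prefix code is instantaneous: the decoder can emit a symbol the moment the buffered bits match a codeword, since no codeword is a prefix of another. The code table here is a hypothetical example, not one of the slide's numbered codes:

```python
def decode_prefix(bits, code):
    """Instantaneous decoding of a prefix code."""
    lookup = {w: s for s, w in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in lookup:            # a match is always final:
            out.append(lookup[buf])  # no codeword extends another
            buf = ""
    return out

code = {"e": "0", "t": "10", "s": "110"}  # a prefix code
print(decode_prefix("100110", code))      # ['t', 'e', 's']
```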
|
62
|
- For any unambiguous non-prefix code
- Always possible to find a prefix code
- With same compression
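This is the Kraft–McMillan result: any uniquely decodable code's codeword lengths satisfy the Kraft inequality, so a prefix code with the same lengths exists. A one-line check:

```python
def kraft_sum(lengths):
    """A prefix code with these codeword lengths exists iff the sum <= 1."""
    return sum(2.0 ** -l for l in lengths)

print(kraft_sum([1, 2, 3, 3]))  # 1.0 -> a complete prefix code exists
```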
|
63
|
- Concerns?
- What do you want to tell me?
|