1
|
|
2
|
- Attendance
- Exams graded but not curved
- Can see yours over break
- Will have them back next week (or in office?)
|
3
|
- Discuss
- Lossy Compression
|
4
|
|
5
|
|
6
|
- A backbone of compression
- Used as backend to most lossy algs.
- (need to understand this as we move forward)
- Many people failed to construct the tree
|
7
|
- 3A: Which has higher entropy? The throw of one die, or of two dice?
- I intended this to be a reasoning problem
|
8
|
- Unfortunately, there are two conflicting factors to reason about:
- 2 dice have more possibilities (11 outcomes instead of 6) -> RAISES ENTROPY
- 2 dice have an uneven distribution -> LOWERS ENTROPY
- So it instead requires solving the entropy calculation
- (or determining the relative effect of the factors -> consider the rock-in-canoe puzzle)
|
9
|
- 1 Die
- 6 possibilities, each with p(e) = 1/6
- H(X) = -1/6 log2(1/6) - … - 1/6 log2(1/6)
- = -6 · 1/6 log2(1/6)
- = -log2(1/6) = log2(6)
- = 2.585 bits
- We know it must be < 3 bits
- since 3 bits can encode 2^3 = 8 values…
|
10
|
- 2 Dice:
- 36 possibilities, 11 outcomes
- P(2)=1/36, P(3)=2/36, P(4)=3/36, P(5)=4/36, P(6)=5/36, P(7)=6/36, P(8)=5/36, P(9)=4/36, P(10)=3/36, P(11)=2/36, P(12)=1/36
- We know that H(X) < 4 bits since
- 4 bits can encode 2^4=16 values…
|
11
|
- Crunching the numbers for 2 dice (see the sketch below):
- H(X) = 3.2744 bits
- Lower entropy than one fair 11-sided die (log2 11 ≈ 3.459 bits)
- Higher entropy than one fair 6-sided die (2.585 bits)
- You will receive credit for 3A
- (Which has greater entropy)
- So long as you attempted to answer it
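The arithmetic above can be checked with a short program. A minimal C sketch (the probabilities come straight from the slides; the fair 11-sided die appears only for the comparison made above):

```c
#include <stdio.h>
#include <math.h>

/* Entropy H(X) = -sum p_i log2 p_i, in bits. */
static double entropy(const double *p, int n) {
    double h = 0.0;
    for (int i = 0; i < n; i++)
        if (p[i] > 0.0)
            h -= p[i] * log2(p[i]);
    return h;
}

int main(void) {
    /* One fair die: six outcomes, each with probability 1/6. */
    double one_die[6];
    for (int i = 0; i < 6; i++) one_die[i] = 1.0 / 6.0;

    /* Sum of two dice: 11 outcomes (2..12); count the 36 equally
       likely ordered pairs that produce each sum. */
    double two_dice[11] = {0};
    for (int a = 1; a <= 6; a++)
        for (int b = 1; b <= 6; b++)
            two_dice[a + b - 2] += 1.0 / 36.0;

    printf("H(one die)  = %.4f bits\n", entropy(one_die, 6));   /* 2.5850 */
    printf("H(two dice) = %.4f bits\n", entropy(two_dice, 11)); /* 3.2744 */
    printf("H(11-sided) = %.4f bits\n", log2(11.0));            /* 3.4594 */
    return 0;
}
```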
|
12
|
- Best compression for the roll of 1 die:
- Not compressible (uniform distribution)
- Huffman would yield the optimal coding
- Best compression for the roll of 2 dice:
- Huffman compression (see the construction sketch below)
- Uneven probability distribution
- No inter-symbol dependencies
- (the result of one roll does not depend on the previous)
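For illustration, a compact Huffman construction over the two-dice distribution (a sketch, not the code from class): it repeatedly merges the two lightest unmerged nodes, then reads each symbol's code length off as its leaf depth. The average length should land just above the 3.2744-bit entropy:

```c
#include <stdio.h>

#define N 11  /* sums 2..12 */

int main(void) {
    /* Counts of the 36 ordered dice pairs producing each sum. */
    int    count[N] = {1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1};
    double weight[2 * N - 1];
    int    parent[2 * N - 1];
    int    nodes = N;

    for (int i = 0; i < N; i++) { weight[i] = count[i] / 36.0; parent[i] = -1; }

    /* Huffman: repeatedly merge the two lightest unmerged nodes. */
    while (nodes < 2 * N - 1) {
        int lo1 = -1, lo2 = -1;
        for (int i = 0; i < nodes; i++) {
            if (parent[i] != -1) continue;
            if (lo1 < 0 || weight[i] < weight[lo1]) { lo2 = lo1; lo1 = i; }
            else if (lo2 < 0 || weight[i] < weight[lo2]) lo2 = i;
        }
        weight[nodes] = weight[lo1] + weight[lo2];
        parent[nodes] = -1;
        parent[lo1] = parent[lo2] = nodes;
        nodes++;
    }

    /* A symbol's code length equals its leaf depth in the tree. */
    double avg = 0.0;
    for (int i = 0; i < N; i++) {
        int len = 0;
        for (int j = i; parent[j] != -1; j = parent[j]) len++;
        printf("sum %2d: p = %d/36, code length %d bits\n", i + 2, count[i], len);
        avg += weight[i] * len;
    }
    printf("average %.4f bits/symbol vs. entropy 3.2744\n", avg);
    return 0;
}
```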
|
13
|
- unsigned char *buf = malloc(256*3 + 100*100); /* see the layout sketch below */
- Grey scale palette
- Each component contains same value
- (have to assume that our RGB space is already normalized to perception)
- Luminance = Sum of components
- (we discussed this during color matching)
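A minimal sketch of the layout this allocation implies (the buffer name and the palette-before-pixels layout are assumptions for illustration): 256 three-byte RGB palette entries followed by 100×100 one-byte pixel indices. Filling the palette with R = G = B entries makes it grey-scale:

```c
#include <stdlib.h>

int main(void) {
    /* 256-entry RGB palette followed by 100x100 one-byte pixel indices. */
    unsigned char *buf = malloc(256 * 3 + 100 * 100);
    if (!buf) return 1;

    /* Grey-scale palette: each entry has R = G = B, so a pixel's
       luminance is proportional to its palette index. */
    for (int i = 0; i < 256; i++) {
        buf[3 * i + 0] = (unsigned char)i;  /* R */
        buf[3 * i + 1] = (unsigned char)i;  /* G */
        buf[3 * i + 2] = (unsigned char)i;  /* B */
    }
    free(buf);
    return 0;
}
```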
|
14
|
|
15
|
- Delta coding ‘work’ for everybody?
|
16
|
- Huffman wasn’t supposed to offer much compression
- Why did gzip compress the audio?
|
17
|
- Anybody design a better predictor?
|
18
|
- Fascinating idea:
- A reversible filter can:
- ‘Decorrelate’ a signal
- Eliminating statistical dependencies
- The result can then be entropy coded (see the delta-coding sketch below)
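A minimal sketch using the simplest such reversible filter, delta coding (the variable names are illustrative): the forward pass replaces each sample by its difference from the previous one, and the inverse pass reconstructs the signal exactly. A smooth signal turns into a run of small, highly skewed deltas, which an entropy coder can then exploit:

```c
#include <stdio.h>

/* Forward: d[0] = x[0], d[i] = x[i] - x[i-1]. */
void delta_forward(const int *x, int *d, int n) {
    int prev = 0;
    for (int i = 0; i < n; i++) { d[i] = x[i] - prev; prev = x[i]; }
}

/* Inverse: x[0] = d[0], x[i] = x[i-1] + d[i]. */
void delta_inverse(const int *d, int *x, int n) {
    int prev = 0;
    for (int i = 0; i < n; i++) { prev += d[i]; x[i] = prev; }
}

int main(void) {
    int x[] = {100, 102, 103, 103, 101, 98};
    int d[6], y[6];
    delta_forward(x, d, 6);
    delta_inverse(d, y, 6);
    for (int i = 0; i < 6; i++)
        printf("x=%d d=%d recovered=%d\n", x[i], d[i], y[i]);
    return 0;
}
```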
|
19
|
- Pre-midterm
- Post-midterm
- Lossy
- But also:
- Frequency domain techniques
|
20
|
- Sampling and Quantization
- Coding
- Psychologically/Physiologically inspired detail removal
|
21
|
- We have covered this already:
- Quantization is data representation
- Converting an analog world into digital samples
|
22
|
- Sampling frequency
- Spatial resolution (images)
- Temporal resolution (sound)
- Quantization
- Range of values
- (record signal 0 to 96dB)
- (record brightness 0 to 100 candela)
- Resolution (see the quantizer sketch below)
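A hedged sketch of uniform quantization over a fixed range (the 0-96 dB range is from the slide; the bin-center reconstruction is a common convention, not necessarily the one used in class):

```c
#include <math.h>
#include <stdio.h>

/* Map an analog value in [lo, hi] to an n-bit integer code. */
int quantize(double v, double lo, double hi, int bits) {
    int levels = 1 << bits;
    int q = (int)floor((v - lo) / (hi - lo) * levels);
    if (q < 0) q = 0;
    if (q >= levels) q = levels - 1;
    return q;
}

/* Reconstruct the center of the quantization bin. */
double dequantize(int q, double lo, double hi, int bits) {
    int levels = 1 << bits;
    return lo + (q + 0.5) * (hi - lo) / levels;
}

int main(void) {
    /* e.g. a 0..96 dB signal recorded with 8-bit resolution */
    double v = 40.2;
    int q = quantize(v, 0.0, 96.0, 8);
    printf("%.2f dB -> code %d -> %.3f dB\n", v, q, dequantize(q, 0.0, 96.0, 8));
    return 0;
}
```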
|
23
|
- Somewhat cut and dried
- We basically know what we want to capture, and we capture it
|
24
|
- Spent 1st ½ of class on this
- Information theory
- How to remove redundant information
|
25
|
- Coding really creates an optimal representation
- Consider Huffman encoding an ASCII list of die rolls
- From 16 bits to 3 bits -> 5.3:1 improvement
|
26
|
- We know the theoretical coding limits
- And how close to the limit we can get
- Room for improvement:
- Use of predictive filters
- However, our techniques are very good
- Huffman coding: optimal for cases w/o inter-symbol coherence
- Turbo codes: remarkably close to the Shannon limit
|
27
|
- Understanding JNDs (just-noticeable differences) lets us discard undetectable detail
- Lossy techniques
- Tremendous opportunities
|
28
|
- ΔI/I = k (worked example below)
- For many phenomena, the ratio of the just-detectable change in intensity ΔI to the baseline intensity I is constant
- k is called the Weber constant
- Only true for ‘prothetic’ sensations
- Relating to increases in intensity
- Not ‘metathetic’ sensations, relating to changes in quality, e.g. the pitch of a sound
- E. H. Weber experimented on thresholds of perception of lifted weights
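As a worked example (the Weber fraction used here is illustrative, not a value from the lecture): if k = 0.02 for lifted weight, then at a baseline of I = 100 g the just-detectable change is ΔI = kI = 2 g, while at I = 1000 g it grows to ΔI = 20 g. The heavier the baseline, the larger the change must be to be noticed.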
|
29
|
- For instance, the threshold could refer to the point correctly detected on 75% of trials
|
30
|
- Abstracts
- The 2 essential aspects of decision making
- Into a statistical model
- Enabling valid conclusions from data
|
31
|
- Stimulus occurs
- Processing is conducted by the subject
- Subject decides whether they experienced a sensation
|
32
|
|
33
|
- Test range
- Sound preceded strike by up to 2 ms
- Sound followed strike by up to 2 ms
- Subjects questioned
- Data
- Random responses across the range
- What is going on?
|
34
|
- Obviously the tested range was insufficient
- It fails a simple thought experiment:
- Hitting a glass at arm’s length
- Assume a 2-foot arm
- Sound travels roughly 1 ms per foot
- So the sound alone arrives ~2 ms after the touch, and a ±2 ms test range is obviously not long enough
|
35
|
- Should have varied the timing until
- The subject achieved 75% correct judgments
- Signal Detection Theory helps explain this value
|
36
|
- Study of mapping between stimulus and sensation
- Stimulus threshold value
- Level of intensity or duration of a stimulus
- Below which a human cannot sense it
- Example
- Sound/Touch precedence undetectable at 2ms
|
37
|
- Present subject a series of stimuli
- Varying above/below estimated threshold
- Count number of “yes” responses
- Define the threshold as the value receiving “yes” responses on 75% of trials
- Any problem with this proposal?
|
38
|
- Fails to consider the effect of non-sensory variables on the human decision to say “yes”
- This method assumes that the stimulus completely determines the probability of a “yes” response
- Example: (eye test)
- If the task is to detect a blinking light
- The subject is biased toward saying “yes”
|
39
|
- Does not include the possibility that any part of the experiment has
some uncertainty
|
40
|
- In the new model of detection
- The monolithic threshold
- Is replaced with a statistical model
|
41
|
- To win:
- Be vigilant
- Any twig snapping could spell danger!
|
42
|
- Noise and bias added to detection:
|
43
|
- Important addition to model
- Obvious:
- Detection impossible in tremendous noise
- Noise found in both:
- Environment
- e.g. rustling leaves on the paintball field
- Sensory system
|
44
|
- Can model noise as a Gaussian probability distribution ‘blob’:
|
45
|
- Y axis: probability of a given X-axis value
- Pressure, temperature, etc.
- Peak of the curve indicates:
- Which X value is most likely
- (for a symmetric blob)
|
46
|
- Spread of blob
- Indicates the likely range of X values
- (if we took successive measurements)
- Loosely: the variance
|
47
|
- Gaussian noise distributions with increasing variance from left to
right:
|
48
|
- Leftmost blob is the least noisy
- Very narrow range of expected values
- Rightmost blob is the most noisy
- Very broad range of expected values
|
49
|
- Model combines
- External (environment)
- Internal (sensory system) noise
- Represents result by a single blob
- Blobs are normalized to an area of 1.0
- Equates areas with probabilities
|
50
|
- The signal can also be represented as a statistical blob
- With internal/external signal noise in the same blob
- It is revealing to:
- Plot the signal and noise on the same axes
- View the Signal + Noise blob as
- A copy of the Noise blob shifted by the intensity of the signal
|
51
|
|
52
|
- If signal is weak compared to background noise
- (Paintball forest)
- The signal can be partially masked by a noise source
|
53
|
- The weaker the signal
- (perhaps a distant twig snap)
- The more the signal is masked by noise
- Correspondingly, it is more difficult to discern
- The intersection between the Noise and Signal + Noise curves represents the amount of uncertainty.
- A stimulus in the intersection range could be:
- Just Noise
- Or Signal (which we desperately want to detect)
|
54
|
- Uncertainty where signal, noise overlap
|
55
|
- When the signal and noise blobs are far apart
- Near 100% certainty is possible
- As the signal decreases in intensity
- (the blobs eventually overlap)
- The chance of detecting the event decreases to 50%
- Pure chance
- The same result one would achieve by guessing (worked version below).
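A worked version of this limit under the equal-variance Gaussian model (an assumption; the same model appears in the ROC sketch later in these notes): with the decision criterion midway between the two blobs, P(correct) = Φ(d′/2), where d′ is the separation between the blob centers measured in standard deviations. As the signal weakens, d′ → 0 and P(correct) → Φ(0) = 0.5, pure chance.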
|
56
|
- The following blobs show a sequence of progressively weaker signals:
|
57
|
- Bias is present in any decision
- It is inherent in the signal detector
- Us
- Even a machine/algorithm
- We are told that bias is bad
- But it is unavoidable
- And not inherently bad
|
61
|
|
62
|
- In paintball
- Price of inaction outweighs price of action
- Four outcomes:
|
63
|
- Given this cost/benefit analysis
- Paintball warrior subconsciously biased to:
- Favor false positive
- Over false negative
|
64
|
- Bias may be represented on our graph:
- Stimuli to the right of the bar are considered detected
|
65
|
- Shaded regions depict probabilities
- Wrong in article! (articles are pre-edits)
|
66
|
- Every situation has different
- Signal strength
- Noise level
- Appropriate Detector Bias
- Based on benefits/costs of
- False positive
- False negative
- Correct detection
- Correct rejection
|
67
|
|
68
|
- ROC:
- Receiver
- Operating
- Characteristic
- Curve
- Secret WWII RADAR program
|
69
|
|
70
|
- Each point on the curve
- Represents a different bias threshold
- For the given Noise and Signal + Noise blobs
|
71
|
- (shading wrong in pre-print)
|
72
|
- For a given Noise and Signal + Noise pair
- The ROC is the curve traced out
- from (x,y) = (0,0) to (1,1)
- as decision threshold is swept
- from positive infinity to negative infinity
|
73
|
- x-coordinate is the area of the Noise blob to the right of the threshold
- y-coordinate is the area of the Signal + Noise blob to the right of the threshold
- (see the sketch below)
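A minimal C sketch of that sweep, assuming the equal-variance Gaussian model with an illustrative separation d′ = 1 (neither value is from the slides): each threshold produces one (x, y) point, and integrating the curve yields the bias-free area discussed on the following slides:

```c
#include <stdio.h>
#include <math.h>

/* Standard normal CDF, using erf() from C99 <math.h>. */
static double Phi(double z) { return 0.5 * (1.0 + erf(z / sqrt(2.0))); }

int main(void) {
    /* Assumed model: Noise ~ N(0,1), Signal+Noise ~ N(d',1). */
    double dprime = 1.0;

    /* A few points on the ROC: each threshold c gives one (x, y) pair. */
    for (double c = -2.0; c <= 3.0; c += 1.0) {
        double x = 1.0 - Phi(c);           /* P(false alarm) */
        double y = 1.0 - Phi(c - dprime);  /* P(hit)         */
        printf("threshold %+.1f -> (%.3f, %.3f)\n", c, x, y);
    }

    /* Area under the ROC by the trapezoid rule: the bias-free
       measure of detectability discussed on the next slides. */
    double auc = 0.0, px = 1.0, py = 1.0;
    for (double c = -6.0; c <= 6.0; c += 0.01) {
        double x = 1.0 - Phi(c), y = 1.0 - Phi(c - dprime);
        auc += 0.5 * (y + py) * (px - x);
        px = x; py = y;
    }
    printf("AUC = %.3f (closed form: Phi(d'/sqrt(2)) = %.3f)\n",
           auc, Phi(dprime / sqrt(2.0)));
    return 0;
}
```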
|
74
|
- Interesting ROC points:
- Pascal’s Wager
- Paintball Warrior
- Walk in the park
|
75
|
- Blobs are normalized
- have an area of 1.0
- equivalent to probabilities
- Probability of False Positive is
- 1.0 - probability of Correct Reject
- Probability of Correct Detect is
- 1.0 - probability of False Reject
- Consequently: Each point on the ROC curve
- Encodes all signal and threshold info
|
76
|
|
77
|
- For a given Signal and Noise
- the area under the corresponding ROC curve
- defines the performance of a subject
- in a signal detection task
- independent of any particular bias
- By taking the area under the curve, we factor out the influence of bias! (closed form below)
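Under the equal-variance Gaussian model assumed in the sketch above, this area even has a closed form: AUC = Φ(d′/√2). For d′ = 1 that is roughly 0.76; as d′ → 0 the area falls to 0.5 (pure chance), and as d′ grows it approaches 1.0 (perfect performance).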
|
78
|
- Area under curve
- Chance of correct detection
|
79
|
- Signal is devoid of noise
|
80
|
- Extreme plots aren’t interesting
- Pure chance
- Perfect performance
|
81
|
- Can’t ask a human to change bias
- Would confuse them w/o extensive training
- (interestingly can set a machine’s threshold)
- So with human subjects
- Can only change the strength of the signal
|
82
|
- Original experiment A
- Signal strength (time magnitude) too low
|
83
|
- Discrete change
- Continuous change
|
84
|
- Smallest object/highest frequency
- (minutes of retinal arc subtended)
- Dimmest object
- Smallest movement
- Smallest displacement
|
85
|
- Smallest timing difference
- Fusion frequency
|
86
|
- Smallest intensity difference
- Smallest hue difference
- Highest frequency seen in shadow
|
87
|
- Smallest intensity difference
- Smallest frequency difference
- Smallest timing difference
- Audio masking
- (sounds undetectable in presence of related sound)
- Largest jitter undetectable
- Largest error undetectable
|
88
|
- Smallest pressure
- Smallest pressure change (weight)
- Smallest temperature change
|
89
|
- Smallest concentration of chemical
- Smallest change in concentration
|
90
|
- Largest distortion undetectable
- Microsoft Talisman: affine transforms instead of re-rendering
|