1	Graphics File Format and Data Compression Techniques (take 2) Week 5
2	Business Attendance (please sign in) Break(s) Show’n’Tell Textbooks
3	Breaks I will try my best to have two 10 minute breaks 1^st around 7:00 and the 2^nd about an hour later Please be sure to make it back and be settled within the 10 minutes
4	SIGGRAPH 2004 Proceedings SIGGRAPH = Sexiest computer graphics, animation, art and multimedia research Seattle SIGGRAPH Chapter Signup for announcements: Website: http://seattle.siggraph.org Special DigiPen Video Review meeting Looking for student volunteers (contact keng@sworks.com)
5	Audio Anecdotes Volume2 Advanced copy arrived after class last week Will be sharing copies of articles that pertain to class
6	Shown’n’tell Please take a look at the books over break
7	Textbooks Borders group order? Only one response so far? 20% discount on one book (other is full price)
8	Homework 4 What did you think of the HW?
9	HW4 Media Types Compression ratios .txt .c .wave .mp3 .bmp
10	HW4 gzip worst case How did you construct the file What is going on? Is there a max expansion?
11	HW4 Classifier Anybody get it to work? What went wrong? How should I have specified the comparison?
12	Fragility of bitstreams What a difference a bit can make Flipped or dropped bit Turn one pixel dark OR Change a $0 balance to 4 $294,967,296 OR Render the data unrecoverable
13	Effect of Compression on Robustness Compression by removing redundancy in data simultaneously reduces robustness If a file could be reconstructed w/o a given bit then that bit is unnecessary
14	Bit errors in compressed formats Consider implication of single bit error in increasingly more efficient schemes: RLE Huffman Encoding LZW w/Dynamically constructed Codebooks
15	Sources of bit errors Bit errors are caused by physical realities: High energy Gamma rays or alpha particles Can change the charge on the capacitors that store bits in DRAM Scratch/smudge on a CD CD pressing is imperfect Flakey hunk of Ferris oxide (rust) on tape/disk Noise on phone line disrupt MODEM Noise on digital bus
16	Statistical noise and information theory Noise is inescapable Noise may be modeled statistically as a Gaussian normal probability density Relative loudness of signal and noise define how much information may be communicated over any channel
17	Shannon Limit Theoretically states max information capacity for any channel: C <= W log2(1+S/N) C: channel capacity (bits per second) W: raw bandwidth in Hz S: Signal strength (raw power not dB) N: Noise strength (raw power not dB)
18	Shannon Limit and intuition A channel with: infinite bandwidth should be able to carry infinite information Less clear is how the relationship of signal to noise affects capacity
19	Effect of noise on a channel If the Signal to Noise ratio is too low You can’t determine the signal from the noise Ex: My speaking very softly OK, but why would a channel with: LIMITED bandwidth but No noise Yield a channel with infinite carrying capacity?
20	0 Noise = Infinite Capacity Consider quantization (from last lecture) Theoretically: W/O noise There are infinite levels of amplitude Between 0 and the stated signal strength This also explains why infinite signal strength yields infinite capacity Infinite levels above noise floor
21	Fundamentals Capacity is large when: Bandwidth is high Signal is strong Noise is low
22	Error Detection vs Correction Noise is inevitable (consider the ‘tails’ on the Gaussian curve) Bits will get flipped What do we do?
23	Simple strategies Shout! Repeat everything 3 times (and use best out of 3)
24	Best out of Three Not bad… Will correct single errors Can detect up to two errors Just not very efficient
25	Error Correcting Codes Parity CRC’s (cyclic redundancy checks) Crypto Digests (SHA1, MD5) Hamming Code Reed-Solomon Codes Turbo Codes
26	Parity Used in memory systems Count number of 1’s in a word Add a 0 or 1 parity bit to make the count Even (even parity) Odd (odd parity)
27	Parity Calculation Even Parity XOR all the data bits Odd Parity Negate the XOR of all the bits
28	Even Parity Example Data: 0 0 1 0 0 0 1 1 Parity = 0 ^ 0 ^ 1 ^ 0 ^ 0 ^ 0 ^ 1 ^ 1 Parity = 1 Change any bit? Change 2 bits?
29	Parity Detects single bit errors (an even number of incorrect bits is undetectable) Can not determine which bit is incorrect ie can’t reconstruct the correct word Practically The computer or program will crash instead of providing bad results
30	General ECC “Shout 3 times and vote” seems better than parity Provides: 1 bit error correction 2 bit error detection But Requires sending 3x the bits Fortunately there are better error correcting codes
31	Hamming Codes Developed in 1950’s by a Bell Labs researcher To combat errors in punch card readers Consider: In 7 bit message 7 possible 1 bit errors So (potentially): 3 bits could identify and correct error
32	Hamming Rule d + p + 1 <= 2^p d number of data bits p number of parity bits c = d+p size of code word (c,d) set describing code
33	Hamming (7,4) Most used code 3 control bits added to every 4 data bits Can detect 2 bit errors and correct 1 bit errors Is considered lossless since a channel with 2 out of 4 bit error rate wouldn’t be used
34	Reed-Solomon Codes Data sent as encoded block Encodes message as pts in a polynomial Coefficients of polynomial are the data Plot over determines the coefficients Missing coefs may be recovered from subsets of plotted pts. (consider correcting a curve by interpolating over a gap) Best when errors come in bursts Multiple bit errors in the same value count as a single error
35	Reed-Solomon Code usage Invented in 1960 Too expensive to then implement Used to send images back from Voyager Launched 1977 Used on CD’s 1980 Most advanced versions within 1.5 dB of Shannon Limit
36	Turbo codes Introduced in a 1993 paper (but no one believed them) Remarkably close to achieving Shannon Limit! Presently used in deep space probes Issues: Relatively large latency (hence its use in probes)
37	Latency We discuss latency in my Audio Class Also covered in Audio Anecdotes V1 Discussed it briefly in lecture
38	Now for something different
39	Binary File Format What factors complicate binary file formats?
40	File Format Challenges How to: Formally describe the format Be sure you have implemented the format correctly Understand the implications of platform architecture Write code that will work across platforms
41	ASCII Impunity ASCII representations can be passed from machine to machine with impunity Binary representations have complications
42	Architecture issues Word size Padding Alignment Signed number representation Floating point representation ‘endianess’
43	Endianess Big-endian Store information big end first MSB in lower address LSB in higher address Historically most machines big-endian IBM 370, Motorola CPU’s (68000) Most RISC architectures Some (like MIPS) are ‘bi-endian’ Internet protocols are big-endian
44	Little Endian Send the little end first Machines: DEC PDP-11 and VAX Intel Microprocessors (hence WINTEL PCs) RS-232 serial protocols Send least significant bit first
45	What does this print? short ptr; char array[2]; ptr = (short )array; *ptr = 0x1234; printf(“1st value: %02x\n, array[0]); printf(“2nd value: %02x\n, array[1]);
46	Depends My little endian WINTEL laptop: 1st value: 34 2^nd value: 12 My big-endian MIPS R4000 SGI: 1^st value: 12 2^nd value: 34
47	Inter-computer Problem Problem whenever one computer reads binary information another writes: Tapes, disks, files Network
48	Compilers add problems Padding Packing Alignment
49	Solution Carefully specify formats/protocols Write code that runs consistently (has same semantics) on any architecture/compiler NOT like the previous example!
50	Standardization Can now safely assume: 2’s complement signed integers IEEE floating point Word sizes of 1,2,4 and now 8 8 bit bytes
51	Now for something different
52	What is color? I believe to understand and utilize phenomena one needs to study The physics of the phenomenon The related human perception psychology & physiology
53	Color By understanding light & vision We can make intelligent decisions as to what to: Capture Store Re-synthesize (render) (Will get us ready for lossy algorithms)
54	Color What phenomena do we perceive as color? How does out color perception work? How do we capture & store this information?
55	The eye
56	Visual Angle Angle subtended by a structure When seen from nodal pt inside the eye
57	Diopter Measure of the power of a lens Reciprocal of the focal length of lens in meters Ex: lens with focal length of .1m Has power of 10 diopters
58	Cornea Human cornea has optical power of 40 diopters Due to curvature and refraction Strongest focusing element in the eye
59	Lens Surrounded by ciliary body Pushes at sides of lens When relaxed Lens is stretched radially Flat Reduced optical power Light focused as far from lens as possible
60	Lens When ciliary muscles tensed Push on sides of lens Thickens lens Optical power increases Focal pt moves closer to lens
61	Lens summary When relaxed Longest focal length (focus at infinity?) When tensed Shortest focal length (focused on close objects) Accommodation: Ability to stretch Elasticity reduces with age Young children lens accomodates from 10 to 30 diopters After 45: lens largely inelastic
62	Retina Light focused by the lens falls on the retina Covers 200° of the back of the eye Contains photoreceptors Rods Cones
63	Photoreceptors Cones Responsible for color perception Rods Intensity perception 10x more sensitive to light Smaller than cones (Can be packed tighter == higher res)
64	Fovea Most of retina is photosensitive Fovea Small region at center of visual axis Subtends less than 2° of visual angle Contains only cones (densest concentration of cones on retina) ≈147,000 Cones per mm² Contrast to hawks with ≈1,000,000 receptors per mm²
65	Photoreceptor distribution Increasingly higher concentration of rods as distance from fovea increases Retina contains approximately 120 million rods 6 million cones
66	Optic Nerve Optic nerve contains ≈1 million fibers Hence much processing takes place in eye The blind spot Where optic nerve meets retina No receptors there at all