1
|
|
2
|
- Attendance
- Final
- Monday December 13th
- 6:00-7:30
- Michelangelo
|
3
|
- Lots of unanticipated changes
- Thanks for being flexible
- I have enjoyed this class
|
4
|
- Wednesday 6:00, Plato
- (might move the room)
- Robert Quattlebaum, Voria Studios
- 2D’s Not Dead
- On building an animation studio on your own tools + open source
- http://seattle.siggraph.org
|
5
|
- Keep those volunteers coming
- See me after class
- Email me: keng@sworks.com
|
6
|
- Blow through
- MPEG-4
- Wavelets
- Fractal Compression
- Final Review
|
7
|
- Covers many forms of digital audio
- Low bitrate speech
- Full bandwidth audio
- Structured Audio
|
8
|
- 6 Components
- SAOL: Orchestra Language
- SASL: Score Language
- SASBF: Sample Bank Format
- MIDI semantics
- Describe how to control SAOL with MIDI
- Scheduler
- Takes above and creates sound
- AudioBIFS
- Make audio soundtracks in MPEG-4 using tools and effects processing
techniques
|
9
|
- Synthesis language
- Describes synthesizers
- Most any technique may be described
- FM, physical modeling, sampling, granular synth, subtractive synth,
etc.
|
10
|
- Simple language to control SAOL synthesizers
- Tells SAOL
- What note to play
- How loud to play note
- Tempo
- How long notes last
- How to modulate notes
- Very Similar to MIDI
|
11
|
- Format to encode/transmit banks of sound samples
- Used in wavetable/sampling sythesis
- Somewhat compatible with DLS (MIDI downloadable sounds)
|
12
|
- Binary Format for Scene Description
- Describes how different objects in structured audio scene fit together
- MPEG-4 components:
- Video clips
- Sounds
- Animation
- BIFS was based on VRML (virtual reality modeling language)
- For example BIFS might describe the audio ‘mix’ for different
instruments, and vocals including attributes like reverb
|
13
|
- Similar to OSI stack’s Session Layer
- One below presentation layer
- One above transport layer mpeg
- Ties video, audio, and meta data into a common stream for transmission
- This is where synchronization and media interleaving is performed
|
14
|
- Based on
- Quantum field theory
- Signal Analysis
- Function Space Theory
- Replaces
|
15
|
- Replacement for FFT in
- Also used for
|
16
|
- Breaks a signal into constituent
- Fourier Analysis
- Series of sinewaves of different frequency
- Wavelet Analysis
- Wavelets
- Shifted/scaled versions of the mother wavelet
|
17
|
- Localized in frequency
- Not in time
- (small change in freq. make changes everywhere in time domain)
|
18
|
- Localized in frequency (scale)
- Localized in time
|
19
|
- Sine wave: smooth, infinite duration
- Wavelet: irregular, localized
- Supports signals with discontinuities or sharp transitions
- Localize features
- Images: Joshua Altmann
|
20
|
- Continuous
- Smooth (all derivative should exist)
- Orthogonal
- Oscillatory
- Should integrate to 0
- ‘Waves’ as much above 0 as below 0
|
21
|
- Diminutive ‘let’
- Well localized
- Zero outside of a finite region
- Easy calculation of direct and inverse transform
- Admissibility (band-pass like spectrum)
|
22
|
- Stable representation
- Small change to image yield small change to transform
- Uniqueness
- Transform for an image should be unique
|
23
|
- FFT (Fast Fourier Transform)
- Fast Wavelet Transform
|
24
|
- Simplest/Oldest Wavelet (used for >80 years)
- Step function
- 1 on [0,1/2)
- -1 on [1/2,1)
- Can approximate any continuous function
|
25
|
|
26
|
- Not continuous
- Hence uneconomical approximation of smooth functions
- Many ‘families’ of wavelets
|
27
|
- If
- One wavelet acts as a bandpass filter
- Then
- Multiple dilated filters act as a filter bank
- (subband coding)
- Can analyze a signal by iterating over filter bank
- (each time splitting the signal)
|
28
|
|
29
|
- .jp2
- Motivated by 8x8 blockiness of JPEG
- Wavelet based
- Very Large Images
- 232 x 232 vs JPEG’s 216 x 216
- Bitrate of < 0.25bpp (for highly detailed B/W images)
- Progressive transmission
|
30
|
- Perceived Patent danger
- Highly protected area of mathematics
- JPEG2000 is not license-free
- But licenses for core tech made freely available
|
31
|
- JPEG2000 has a lossless mode
- Not goot at coding large number of identical pixels
- Hence
- JPEG2000 will be used for photos
- PNG for diagrams
- (Consider GIF and JPEG presently)
|
32
|
|
33
|
- JPEG2000 on the left
- JPEG on the right
- Both Images
- 0.5bpp, 1:48 compression ratio
- Images from Aleks Jakulin
|
34
|
|
35
|
- Representing images as sum of smooth oscillating waves
- Good for smooth gradients
- Ringing on sudden transitions
- JPEG2000, Original, JPEG
|
36
|
- JPEG 2000 Pros:
- Eliminates blocky artifacts
- JPEG 2000 Cons:
- Creates blurring/smoothness
- Slightly increases ringing
- (The cons are seen as less objectionable)
|
37
|
- Consider Real world content
- Magazine articles
- TV News
- Websites
- Mixture of
- Images/gradients
- Text/Line art
|
38
|
- Separate text and photo layers in image
- Compress each layer using most appropriate compression scheme:
- Photos: JPEG like
- Diagrams: PNG like
- Text: OCR then store ASCII, font and color info
|
39
|
- JBIG, DjVu, Microsoft’s document format
- Example compression:
- Tiff raw image: 31MB
- JPEG: 604kB
- DjVu: 70kB
- In general these techniques produce high quality representations in
roughly the same size as an HTML page: 50KB
- Viewers available free
- Compressors expensive
|
40
|
- Interesting application of fractals
- Explored by Michael Barnsley and later by Arnaud Jacquin in the 1980’s
- Some early commercial ‘wins’
- Used in Microsoft’s Encarta Encyclopedia
- In part hasn’t caught on due to patents
|
41
|
- Highly compressed fractal images look better than JPEG images
- Potentially very quick to decode
- Resolution independent
- Appear more ‘natural’ when zoomed in
- Barnsley claims an effective 2,500:1 compression ratio
- But this is based on an expanded (zoomed) image
|
42
|
- Slower (than JPEG) to encode
- requires human intervention?
- Patented
|
43
|
- Original
- Enlarged JPEG
- Enlarged Fractal
- Notice ‘new’ realistic detail in the fractal?
|
44
|
- Self similar patterns
- Interesting properties
- Fractional Dimensionality
- Shoreline is somewhere between 1D and 2D
- Example:
|
45
|
- How long is a shoreline?
- Answer
- It all depends on what scale you mention length!
- Imagine going into each inlet, cove, grain of sand…
|
46
|
- Point: 0D
- Line: 1D
- Plane: 2D
- Space: 3D
- Jaggy shoreline: ≈1.25
- Smooth shoreline: ≈1.0
|
47
|
- Mandelbrot Set
- Zn + 1 = Zn x Zn + C
- Iterate at each coordinate (in
imaginary plane) until equation diverges to infinity
- Coordinates that never diverge
- Are part of set
- Colored black
- Other coordinates are colored based on how quickly they diverge
|
48
|
- Entire set
- Fully ‘zoomed out’
- Notice self similar ‘blobs’
|
49
|
|
50
|
- Recursive family of equations first explored around 1905
- Required fast computers to analytically explore the space
- 1975 Mandelbrot released Les Objets Fractals: Forme, Hasard et Dimension
|
51
|
- Consider a photo copier
- It shrinks the original image
- Then prints multiple overlapping copies
- Repeat ad-infinitim
- Affine transforms may be applied:
- Translate, scale, shear, rotate images
|
52
|
- When a given IFS is run
- A unique image emerges
- Examples:
- Sierpinski’s triangle
- Barnsley’s fern
- Interactive demo at:
|
53
|
- Real objects may be defined/constructed in a similar way:
- Real Ferns
- Brains
- Lungs
- Bones
|
54
|
- Proves for a large class of real-world images
- Compact fractal representations must exist
- However no general purpose algorithm is known
|
55
|
- Stephen Wolfram
- (Creator of Mathematica)
- Spent 10 years creating the book
- Makes argument that all phenomena may be explained by cellular automata
theory
|
56
|
- Fabulously complicated objects (ie Mandelbrot set) may be encoded in
tiny form
- Probably not very interesting for image compression (wavelet approaches
currently better)
- Unless the infinite zoom capability is desireable
- A great way to encode:
- Textures, clouds, ferns, etc.
|
57
|
- From the Greek: Hidden message
- Hiding things in plain site
- Nobody but the intended recipient knows existence of message
|
58
|
- Named from Johannes Trithemius’s 1492 book Steganographia
- Cryptographic/Steganographic treatise disguised as a book on black
magic!
- Spirits communicating over a distance
- See: http://www.esotericarchives.com/tritheim/stegano.htm
|
59
|
- Hiding the meaning of the message
- Not the presence of the message
- A steganographic message often may 1st be encrypted
|
60
|
- ‘Hot’ topics
- Watermarking
- Used by recording industry
- Communication w/o interception
|
61
|
- Used to mark media so as to identify owner
- Media
|
62
|
- Can’t degrade quality
- But by definition: have to change the content
|
63
|
- Has to be robust (Difficult to remove) over common operations
- Imperfect copy
- Lossy compression
- Cropping
- Change of eq/brightness
|
64
|
- Then the delivery mechanism is secondary
- You don’t care about:
- Degradation of the carrier unless it exposes the presence of a message
- How robust the message is to change, so long as it reaches its
intended destination
|
65
|
- Designing text such that every nth letter is part of message
|
66
|
- Need a student volunteer to turn these in
|
67
|
- Emphasizing material from 2nd ˝
- Big concepts from 1st ˝ that pertain to lossy compression
will be there too
|
68
|
- Expect much the same mix of problems as midterm
- Show me that you understand the concepts and can apply them
|
69
|
- People lost potential credit on midterm
- For full credit:
- Show all work
- Show units for all values
- Start a problem even if you can’t complete it
|
70
|
- Do all readings!
- Review all notes (online)
- Understand how to ‘work’ classic problems/algorithms
|
71
|
- Same as midterm
- Closed book
- No computers/calculators
|
72
|
- JPEG
- Color space conversion
- Color reduction
- Macro blocks
- DCT
- Quantization/Q-Tables
|
73
|
- MPEG
- MP3
- Wavelets
- Fractals
- IFS
- Fractal Compression
- Steganography
|