Notes for the course GAMES 101 - Modern Computer Graphics. This course
gives a comprehensive introduction to four fundamental areas of
modern computer graphics: 1. rasterization, 2. geometry, 3. ray tracing,
4. animation and simulation.
The course is originally given in Chinese, but the course notes are
available in English.
1. Overview of Computer
Graphics
Why study computer graphics?
Fundamental intellectual
challenges
* Create and interact with realistic virtual worlds
* Requires an understanding of all aspects of the physical world
* New computing methods, displays, technologies
Technical challenges
* Math of projections, curves, surfaces
* Physics of lighting and shading
* Representing / operating shapes in 3D
* Animation / simulation
* 3D graphics software programming and hardware
Course topics
Rasterization
* Project geometry primitives (3D triangles / polygons) onto the screen
* Break projected primitives into fragments (pixels)
* Gold standard in Video Games (Real-time applications)
Curves and meshes
* How to represent geometry in Computer Graphics
Ray tracing
* Shoot rays from the camera through each pixel
* Calculate intersection and shading
* Continue to bounce the rays till they hit light sources
* Gold standard in animations / movies (offline applications)
Animation / simulation
* Key frame Animation
* Mass-spring system
2. Review of Linear Algebra
Graphics' Dependencies
Basic mathematics
Linear algebra, calculus, statistics
Basic physics
Optics, mechanics
Misc
Signal processing
Numerical analysis
A bit of aesthetics
Cross product in graphics
Determine left or right
Determine inside or outside
3. Transformation
Why study transformation
Modeling
Translation
Scaling
Viewing
(3D to 2D) projection
2D transformations
Representing transformations using matrices
Rotation, scale, shear
Scale transform
Non-uniform scale transform
Reflection matrix
Horizontal reflection
Shear matrix
For example, horizontal shift is 0 at $y = 0$ and $a$ at $y = 1$; vertical shift is always 0. The shear matrix is $\begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix}$.
Rotate (about the origin $(0, 0)$, counterclockwise by default): $R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$
Linear
transformation = Matrix multiplication
Homogeneous coordinate
Translation
Why homogeneous coordinates
Translation
cannot be represented in matrix form
But we don't want translation to be a special case
Is there a unified way to represent all transformations (and what is
the cost)?
Solution: Homogenous
coordinates
Add a third coordinate (the $w$-coordinate)
2D point = $(x, y, 1)^T$
2D vector = $(x, y, 0)^T$
The reason why $w = 0$ for a 2D vector is that a vector is invariant under translation
An operation is valid if the $w$-coordinate of the result is 1 or 0
vector + vector = vector
point - point = vector
point + vector = point
point + point = midpoint (see the following definition)
In homogeneous coordinates, $(x, y, w)^T$ represents the 2D point $(x/w, y/w, 1)^T$, $w \neq 0$
Matrix representation of translations: $\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$
Affine transformations
Affine map = linear map + translation: $\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} t_x \\ t_y \end{pmatrix}$
Using homogeneous coordinates: $\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$
Inverse transform
$M^{-1}$ is the inverse of transform $M$ in both the matrix and the geometric sense
Composing transformations
Transform ordering matters
Matrix multiplication is not commutative
Note that matrices are applied right to left
Sequence of affine transforms $A_1, A_2, A_3, \dots$
Compose by matrix multiplication: $A_n(\dots A_2(A_1(\mathbf{x}))) = \mathbf{A}_n \cdots \mathbf{A}_2 \cdot \mathbf{A}_1 \cdot \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$
Very important for performance
Pre-multiply matrices to
obtain a single matrix representing combined transform
Decomposing complex
transforms
How to rotate by an angle $\alpha$ around a given point $c$?
Translate the center to the origin: $T(-c)$
Rotate: $R(\alpha)$
Translate back: $T(c)$
Matrix representation: $M = T(c) \cdot R(\alpha) \cdot T(-c)$ (see the sketch below)
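A minimal NumPy sketch of this composition (my own helper names, not course code): it builds homogeneous matrices $T(c)$, $R(\alpha)$, $T(-c)$ and multiplies them right to left.

```python
import numpy as np

def translate(tx, ty):
    """2D homogeneous translation matrix."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotate(alpha):
    """2D homogeneous rotation about the origin (counterclockwise)."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def rotate_about(alpha, cx, cy):
    """Rotate about point (cx, cy): translate to origin, rotate, translate back."""
    return translate(cx, cy) @ rotate(alpha) @ translate(-cx, -cy)

# Rotating (2, 1) by 90 degrees about (1, 1) gives (1, 2).
p = np.array([2.0, 1.0, 1.0])            # point in homogeneous coordinates
print(rotate_about(np.pi / 2, 1.0, 1.0) @ p)
```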
3D transforms
Use homogeneous coordinates again:
3D point = $(x, y, z, 1)^T$
3D vector = $(x, y, z, 0)^T$
In general, $(x, y, z, w)^T$ is the 3D point $(x/w, y/w, z/w)$ for $w \neq 0$
Use $4 \times 4$ matrices for affine transformations
3D transformations
Scale
Translation
Rotation around x-, y-, or
z-axis
Compose any 3D rotation from $R_x$, $R_y$, $R_z$? $R_{xyz}(\alpha, \beta, \gamma) = R_x(\alpha)\, R_y(\beta)\, R_z(\gamma)$
So-called Euler angles
Often used in flight simulators: roll, pitch, yaw
Rodrigues' rotation formula
Rotation by angle $\alpha$ around axis $\mathbf{n}$: $R(\mathbf{n}, \alpha) = \cos\alpha\, \mathbf{I} + (1 - \cos\alpha)\, \mathbf{n}\mathbf{n}^T + \sin\alpha \begin{pmatrix} 0 & -n_z & n_y \\ n_z & 0 & -n_x \\ -n_y & n_x & 0 \end{pmatrix}$ (see the sketch below)
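A small NumPy sketch of Rodrigues' formula exactly as stated above (my own helper; assumes n is a unit vector):

```python
import numpy as np

def rodrigues(n, alpha):
    """Rotation matrix for angle alpha (radians) about unit axis n."""
    n = np.asarray(n, dtype=float)
    N = np.array([[0.0,  -n[2],  n[1]],   # cross-product (skew-symmetric) matrix
                  [n[2],  0.0,  -n[0]],
                  [-n[1], n[0],  0.0]])
    return (np.cos(alpha) * np.eye(3)
            + (1.0 - np.cos(alpha)) * np.outer(n, n)
            + np.sin(alpha) * N)

# Rotating the x-axis by 90 degrees about the z-axis gives the y-axis.
print(rodrigues([0, 0, 1], np.pi / 2) @ np.array([1.0, 0.0, 0.0]))
```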
View / camera transformation
What is view transformation?
Think about how to take a photo
Find a good place and arrange people (model transformation)
Find a good "angle" to put the camera (view transformation)
Cheese! (projection transformation)
How to perform view
transformation?
Define the camera first
Position $\vec{e}$
Look-at / gaze direction $\hat{g}$
Up direction $\hat{t}$ (assumed perpendicular to the look-at direction)
Key observation
If the camera and all objects move together, the "photo" will be the
same
How about we always transform the camera to
The origin, up at $Y$, looking at $-Z$
And transform the objects along with the camera
Transform the camera by $M_{view}$
So it's located at the origin, up at $Y$, looking at $-Z$
$M_{view}$ in math:
Let's write $M_{view} = R_{view} T_{view}$
$T_{view}$ translates $\vec{e}$ to the origin
$R_{view}$ rotates $\hat{g}$ to $-Z$
Rotates $\hat{t}$ to $Y$
Rotates $\hat{g} \times \hat{t}$ to $X$
Writing $R_{view}$ directly is hard, so consider the inverse rotation $R_{view}^{-1}$: $X$ to $\hat{g} \times \hat{t}$, $Y$ to $\hat{t}$, $Z$ to $-\hat{g}$:
$T_{view} = \begin{pmatrix} 1 & 0 & 0 & -x_e \\ 0 & 1 & 0 & -y_e \\ 0 & 0 & 1 & -z_e \\ 0 & 0 & 0 & 1 \end{pmatrix}$, $R_{view}^{-1} = \begin{pmatrix} x_{\hat{g} \times \hat{t}} & x_t & x_{-g} & 0 \\ y_{\hat{g} \times \hat{t}} & y_t & y_{-g} & 0 \\ z_{\hat{g} \times \hat{t}} & z_t & z_{-g} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$, so $R_{view} = (R_{view}^{-1})^T$ since rotation matrices are orthogonal (a code sketch follows below)
Transform objects together with the camera
Also known as ModelView Transformation
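A sketch of the view matrix built from exactly these pieces (NumPy; my own function name; e is the eye position, g the gaze direction, t the up vector, assumed perpendicular to g):

```python
import numpy as np

def view_matrix(e, g, t):
    """M_view = R_view @ T_view: camera at the origin, looking down -Z, up +Y."""
    g = np.asarray(g, float); g = g / np.linalg.norm(g)      # gaze direction
    t = np.asarray(t, float); t = t / np.linalg.norm(t)      # up direction
    x = np.cross(g, t)                                       # camera x-axis = g x t
    T = np.eye(4); T[:3, 3] = -np.asarray(e, float)          # translate e to the origin
    R = np.eye(4)
    R[0, :3], R[1, :3], R[2, :3] = x, t, -g                  # R_view = (R_view^-1)^T
    return R @ T

# A camera at (0, 0, 5) already looking down -Z gets only a translation.
print(view_matrix([0, 0, 5], [0, 0, -1], [0, 1, 0]))
```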
Projection transformation
Projection in computer graphics
3D to 2D
Orthographic projection
The camera is considered to be located infinitely far away
A simple way of understanding it
Camera located at the origin, looking at $-Z$, up at $Y$
Drop the $Z$ coordinate
Translate and scale the resulting rectangle to $[-1, 1]^2$
In general
We want to map a cuboid $[l, r] \times [b, t] \times [f, n]$ to the "canonical" cube $[-1, 1]^3$
Method: a slightly different order from the "simple way"
Center the cuboid by translating
Scale it into the "canonical" cube
Transformation matrix
Translate (center to origin) first, then scale (length / width / height to 2): $M_{ortho} = \begin{pmatrix} \frac{2}{r - l} & 0 & 0 & 0 \\ 0 & \frac{2}{t - b} & 0 & 0 \\ 0 & 0 & \frac{2}{n - f} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & -\frac{r + l}{2} \\ 0 & 1 & 0 & -\frac{t + b}{2} \\ 0 & 0 & 1 & -\frac{n + f}{2} \\ 0 & 0 & 0 & 1 \end{pmatrix}$ (see the sketch below)
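A sketch of this matrix in NumPy (my own helper; it assumes the looking-down-$-Z$ convention, so $n > f$):

```python
import numpy as np

def ortho(l, r, b, t, f, n):
    """Map [l,r] x [b,t] x [f,n] to the canonical cube [-1,1]^3: translate, then scale."""
    scale = np.diag([2.0 / (r - l), 2.0 / (t - b), 2.0 / (n - f), 1.0])
    trans = np.eye(4)
    trans[:3, 3] = [-(r + l) / 2.0, -(t + b) / 2.0, -(n + f) / 2.0]
    return scale @ trans
```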
Caveat
Looking at / along $-Z$ makes near and far not intuitive ($n > f$)
FYI: that is why OpenGL uses a left-handed coordinate system here
Perspective projection
The camera is considered as a point at a certain distance
Most common in computer graphics, art, visual system
Further objects are smaller
Parallel lines not parallel; converge to single point
Recall a property of homogeneous coordinates: $(x, y, z, 1)$, $(kx, ky, kz, k)$ with $k \neq 0$, and $(xz, yz, z^2, z)$ with $z \neq 0$ all represent the same point $(x, y, z)$ in 3D
How to do perspective
projection
First "squish" the frustum into a cuboid ,
Do orthographic projection (, already known)
In order to find a transformation
Recall the key idea: find the relationship between transformed
point and
the original points :
In homogeneous coordinates,
So the "squish" (persp to ortho) projection does this:
This is already enough to figure out part of $M_{persp \to ortho}$: $M_{persp \to ortho} = \begin{pmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ ? & ? & ? & ? \\ 0 & 0 & 1 & 0 \end{pmatrix}$
Observation: the third row is responsible for $z'$
Any point on the near plane ($z = n$) will not change
So the third row must be of the form $(0, 0, A, B)$ with $An + B = n^2$
Here $A$ and $B$ have nothing to do with $x$ and $y$
Any point's $z$ on the far plane ($z = f$) will not change, which gives $Af + B = f^2$
The two conditions give $A = n + f$ and $B = -nf$
Finally, every entry in $M_{persp \to ortho}$ is filled: $M_{persp \to ortho} = \begin{pmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n + f & -nf \\ 0 & 0 & 1 & 0 \end{pmatrix}$ (see the sketch below)
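Putting the derivation together, a sketch of the "squish" matrix (my own helper; n and f are the signed near/far z values, and the full projection reuses the ortho() helper sketched earlier):

```python
import numpy as np

def persp_to_ortho(n, f):
    """'Squish' matrix derived above: keeps the near plane and the far-plane center fixed."""
    return np.array([[n, 0, 0,     0],
                     [0, n, 0,     0],
                     [0, 0, n + f, -n * f],
                     [0, 0, 1,     0]], dtype=float)

# Full perspective projection (using the ortho() sketch from the orthographic section):
# M_persp = ortho(l, r, b, t, f, n) @ persp_to_ortho(n, f)
```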
4. Rasterization
Perspective projection
What are the near plane's $l$, $r$, $b$, $t$?
If explicitly specified, good
Sometimes people prefer a vertical field-of-view ($fovY$) and an aspect ratio = width / height
Assume symmetry, i.e. $l = -r$, $b = -t$
How to convert from $fovY$ and aspect to $l, r, b, t$? $\tan\frac{fovY}{2} = \frac{t}{|n|}$ and $\text{aspect} = \frac{r}{t}$
What is after MVP?
Model transformation (placing objects)
View transformation (placing camera)
Projection transformation
Orthographic projection (cuboid to "canonical" cube $[-1, 1]^3$)
Perspective projection (frustum to "canonical" cube)
Canonical cube to screen
What is a screen?
An array of pixels
Size of the array: resolution
A typical kind of raster display
Raster == screen in German
Rasterize == drawing onto the screen
Pixel (FYI, short for "picture element")
For now: a pixel is a little square with uniform color
Color is a mixture of (red, green, blue)
Defining the screen space
Slightly different from the "tiger book"
$(0, 0)$ is at the lower-left corner.
The $x$-axis points right, the $y$-axis points up
Pixels' indices are of the form $(x, y)$, where both $x$ and $y$ are integers
Pixels' indices go from $(0, 0)$ to $(width - 1, height - 1)$
Pixel $(x, y)$ is centered at $(x + 0.5, y + 0.5)$
The screen covers the range $(0, 0)$ to $(width, height)$
Irrelevant to $z$
Transform in the $xy$-plane: $[-1, 1]^2$ to $[0, width] \times [0, height]$
Viewport transform matrix: $M_{viewport} = \begin{pmatrix} \frac{width}{2} & 0 & 0 & \frac{width}{2} \\ 0 & \frac{height}{2} & 0 & \frac{height}{2} \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$ (see the sketch below)
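A sketch (my own helper, not course code) that performs the perspective divide and then the viewport mapping above on one clip-space point:

```python
import numpy as np

def to_screen(p_clip, width, height):
    """Perspective divide, then map xy from [-1,1]^2 to pixel coordinates."""
    p_ndc = p_clip[:3] / p_clip[3]                 # divide by w
    x = (p_ndc[0] + 1.0) * 0.5 * width             # [-1,1] -> [0, width]
    y = (p_ndc[1] + 1.0) * 0.5 * height            # [-1,1] -> [0, height]
    return np.array([x, y, p_ndc[2]])              # keep z for the depth buffer
```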
Rasterization: Drawing
to raster displays
Triangles -
Fundamental shape primitives
Why triangles
Most basic polygon
Break up other polygons
Unique properties
Guaranteed to be planar
Well-defined interior
Well-defined method for interpolating values at vertices over
triangle (barycentric interpolation)
What pixel values
approximate a triangle?
Input: position of triangle vertices projected on screen
Output: set of pixel values approximating triangle
A simple approach: sampling
Sampling a function: Evaluating a function at point is sampling. We
can discretize a function by sampling
Sampling is a core idea in graphics. We sample time (1D), area (2D),
direction (2D), volume (3D) ...
Thus define a binary function inside(tri, x, y); here x, y are not necessarily integers:
```
inside(t, x, y) = 1  if point (x, y) is in triangle t
                  0  otherwise
```
Rasterization =
Sampling a 2D indicator function:
```cpp
for (int x = 0; x < xmax; ++x)
    for (int y = 0; y < ymax; ++y)
        image[x][y] = inside(tri, x + 0.5, y + 0.5);
```
Evaluating inside(tri, x, y):
Three cross products should give same sign
Edge cases: you can define the situation by yourself
Do not check all pixels on the screen. Use an axis-aligned
bounding box (AABB)
A faster way: incremental triangle traversal: suitable for thin and
rotated triangles
Will lead to aliasing (Jaggies)
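Before moving on to antialiasing, here is a minimal Python sketch (my own helper names) of the cross-product inside test and the pixel-center sampling loop described above; it counts boundary samples as inside, which is one of the acceptable edge-case choices:

```python
def inside(tri, x, y):
    """1 if (x, y) is inside triangle tri = [(x0,y0), (x1,y1), (x2,y2)], else 0."""
    signs = []
    for i in range(3):
        ax, ay = tri[i]
        bx, by = tri[(i + 1) % 3]
        # z-component of (B - A) x (P - A): its sign says which side of edge AB the point is on
        signs.append((bx - ax) * (y - ay) - (by - ay) * (x - ax))
    return 1 if all(s >= 0 for s in signs) or all(s <= 0 for s in signs) else 0

def rasterize(tri, width, height):
    """Sample each pixel center; returns a binary coverage image."""
    return [[inside(tri, x + 0.5, y + 0.5) for x in range(width)]
            for y in range(height)]
```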
Antialiasing
Sampling
Sampling is ubiquitous in computer graphics
Photograph = Sampling image sensor plane
Video = Sampling time
Sampling
artifacts (errors / mistakes / inaccuracies) in computer graphics
Jaggies (staircase pattern)
This is also an example of "aliasing" - a sampling error
Moiré patterns in imaging
Occur because of skipping odd rows and columns
Wagon Wheel Illusion (False motion)
Problem with human eye - sampling in time slower than movement of an
object
Many more...
Behind the aliasing artifacts
Signals are changing too fast (high frequency), but sampled too
slowly
Antialiasing
idea: Blurring (Pre-Filtering) before sampling
Antialiased edges in rasterized triangle where pixel values take
intermediate values
But why aliasing and
pre-filtering?
Why does undersampling introduce aliasing?
Why does pre-filtering followed by sampling achieve antialiasing?
Fundamental reasons
Frequency domain
Fourier transform
Represent a function as a weighted sum of sines and cosines
Fourier transform decomposes a signal into frequencies
A function in the spatial domain is Fourier transformed into a new
function in the frequency domain
From a function in frequency domain, we can see that:
For low-frequency signal: sampled adequately for reasonable
reconstruction
A high-frequency signal is insufficiently sampled: the reconstruction incorrectly appears to come from a low-frequency signal
Undersampling creates frequency aliases
Two frequencies that are indistinguishable at a given sampling rate are called "aliases"
Filtering
= Getting rid of certain frequency contents. Also, Filtering =
Convolution (=Averaging)
Convolution
Point-wise average on a sliding box
Convolution theorem
Convolution in the spatial domain is equal to multiplication in the
frequency domain, and vice versa
Option 1:
Filter by convolution in the spatial domain
Option 2:
Transform to frequency domain (Fourier transform)
Multiply by Fourier transform of convolution kernel
Transform back to spatial domain (inverse Fourier)
Box filter
Box function = "Low pass" filter
Wider filter kernel = Lower Frequencies
Sampling = Repeating
frequency contents
Aliasing = Mixed frequency contents
Dense sampling = the repeated frequency contents do not overlap
Sparse sampling = the repeated frequency contents overlap, which gives aliasing
How can we reduce aliasing
error?
Option 1: Increase sampling rate
Essentially increasing the distance between replicas in the Fourier
domain
Higher resolution displays, sensors, framebuffers
But: costly & may need very high resolution
Option 2: Antialiasing
Making Fourier contents "narrower" before repeating
i.e. Filtering out high frequencies before sampling
Limiting, then repeating
A practical pre-filter
A 1 pixel-width box filter (low pass, blurring)
Antialiasing by computing average pixel value
In rasterizing one triangle, the average value of inside(triangle, x, y) over a pixel's area equals the fraction of the pixel covered by the triangle
Antialiasing by supersampling (MSAA)
Supersampling
Approximate the effect of the 1-pixel box filter by sampling multiple locations within a pixel and averaging their values (sketched below)
Step 1: take NxN samples within each pixel
Step 2: average the NxN samples "inside" each pixel
No free lunch! What is the cost of MSAA?
More computation
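A sketch of NxN supersampling for one triangle's coverage, reusing the inside() helper from the rasterization sketch above (my own code, not the course's):

```python
def msaa_coverage(tri, x, y, n=2):
    """Average of an n x n grid of samples inside pixel (x, y); value in [0, 1]."""
    total = 0
    for i in range(n):
        for j in range(n):
            sx = x + (i + 0.5) / n          # sample position inside the pixel
            sy = y + (j + 0.5) / n
            total += inside(tri, sx, sy)
    return total / (n * n)
```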
More antialiasing methods: milestones
FXAA (Fast Approximate AA)
TAA (Temporal AA)
Super resolution / super sampling
From low resolution to high resolution
Essentially still "not enough samples" problem
DLSS (Deep Learning Super Sampling)
Visibility / occlusion
Painter's algorithm
Inspired by how painters paint: Paint from back to front, overwrite
in the framebuffer
Requires sorting in depth ($O(n \log n)$ for $n$ triangles)
Can have unresolvable depth order (ordering in a cycle)
Not very good, thus we use Z-buffer algorithm
Z-buffering
Idea
Store current min. z-value for each sample (pixel)
Needs an additional buffer for depth values
Frame buffer stores color values
Depth buffer (z-buffer) stores depth
Important: For simplicity we suppose that $z$ is always positive
Smaller $z$ is closer, larger $z$ is farther
The Z-Buffer algorithm
Initialize depth buffer to $\infty$
During rasterization:
```cpp
for (each triangle T)
    for (each sample (x, y, z) in T)
        if (z < zbuffer[x, y])           // closest sample so far
            framebuffer[x, y] = rgb;     // update color
            zbuffer[x, y] = z;           // update depth
        else
            ;                            // do nothing, this sample is occluded
```
Z-Buffer complexity
Complexity: $O(n)$ for $n$ triangles (assuming constant coverage per triangle)
Assuming that each triangle is not too big
How is it possible to sort
triangles in linear time?
Because we did not sort them, just find the minimum
Drawing triangles in different orders?
Working fine
Most important visibility algorithm
Implemented in hardware for all GPUs
5. Shading
Definition
In the Merriam-Webster dictionary
The darkening or coloring of an illustration or diagram with
parallel lines or a block of color
In this course
The process of applying a material to an object
Properties of shading
Shading is local:
Compute light reflected toward the camera at a specific shading point
Shading will ignore all other objects
Thus no shadows will be generated
Shading != shadow
Inputs:
Viewer direction: $\hat{v}$
Surface normal: $\hat{n}$
Light direction: $\hat{l}$ (for each of many lights)
Surface parameters: color, shininess, ...
$\hat{v}$, $\hat{n}$, $\hat{l}$ are all unit vectors
A simple
shading model (Blinn-Phong reflectance model)
Perceptual observations
and modeling
Specular highlights
Intensity depends on view direction
Bright near mirror reflection direction
$\hat{v}$ close to the mirror direction ⟺ half vector $\hat{h}$ near the normal $\hat{n}$, where $\hat{h} = \frac{\hat{v} + \hat{l}}{\|\hat{v} + \hat{l}\|}$. Measure "near" by the dot product of unit vectors:
$L_s = k_s\, (I / r^2)\, \max(0, \hat{n} \cdot \hat{h})^p$
Here
$L_s$: specularly reflected light
$k_s$: specular coefficient
$p$: the plain cosine decreases too slowly, so increasing $p$ narrows the reflection lobe
Usually $p$ is set to 100-200 in the Blinn-Phong model
Diffuse reflection
Light is scattered uniformly in all directions
Surface color is the same for all viewing directions
But how much light (energy) is received?
Lambert's cosine law
In general, light per unit area is proportional to $\cos\theta = \hat{l} \cdot \hat{n}$
Lambertian (diffuse) shading: shading independent of view direction
$L_d = k_d\, (I / r^2)\, \max(0, \hat{n} \cdot \hat{l})$
Here:
$L_d$: diffusely reflected light
$k_d$: diffuse coefficient (color)
$I / r^2$: energy arriving at the shading point
$\max(0, \hat{n} \cdot \hat{l})$: energy received by the shading point
Ambient lighting
Shading that does not depend on anything: $L_a = k_a I_a$
Add a constant color to account for disregarded illumination and fill in black shadows
The Blinn-Phong model sums the three terms: $L = L_a + L_d + L_s$ (see the sketch below)
Best to get vertex normals from the underlying geometry
e.g. consider a sphere
Otherwise have to infer vertex normals from triangle faces
Simple scheme: average surrounding face normals
Shade each pixel (Phong shading)
Interpolate normal vectors across each triangle
Compute full shading model at each pixel
Note: Phong shading is a shading frequency, not the Blinn-Phong reflectance model
Defining per-pixel normal
vectors
Barycentric interpolation of vertex normals
Don't forget to normalize the interpolated directions
Graphics (Real-time
rendering) pipeline
Application => Input is vertices in 3D
Vertex processing => Vertices positioned in screen space
Triangle processing => Triangle positioned in screen space
Rasterization => Fragments (one per covered sample)
Fragment processing => Shaded fragments
Framebuffer operations => Output: image (pixels)
Display
Graphics pipeline
implementation: GPUs
Specialized processors for executing graphics pipeline
computations
Heterogeneous, Multi-core processor
Modern GPUs offer 2 - 4
Tera-FLOPs of performance for executing vertex and fragment shader
programs
Shader programs
Program vertex and fragment processing stages
Describe operation on a single vertex (or fragment)
Shader function executes once per fragment
Outputs color of surface at the current fragment's screen sample
position
For example a GLSL fragment shader program. This shader performs
a texture lookup to obtain the surface's material color at this point,
then performs a diffuse lighting calculation
```glsl
uniform sampler2D myTexture;  // program parameter
uniform vec3 lightDir;        // program parameter
varying vec2 uv;              // per fragment value (interp. by rasterizer)
varying vec3 norm;            // per fragment value (interp. by rasterizer)

void diffuseShader() {
    vec3 kd;
    kd = texture2D(myTexture, uv).rgb;            // material color from texture
    kd *= clamp(dot(-lightDir, norm), 0.0, 1.0);  // Lambertian shading model
    gl_FragColor = vec4(kd, 1.0);                 // output fragment color
}
```
Goal: Highly complex
3D scenes in Realtime
100's of thousands to millions of triangles in a scene
Complex vertex and fragment shader computations
High resolution (2 - 4 megapixel + supersampling)
30 - 60 frames per second (even higher for VR)
Texture mapping
Surfaces are 2D
Surface lives in 3D world space
Every 3D surface point also has a place where it goes in the 2D
image (texture)
Visualization of texture coordinates
Each triangle vertex is assigned a texture coordinate
Textures can be used multiple times (tiled)
Interpolation
across triangles: Barycentric coordinates
Interpolation across
triangles
Why do we want to interpolate?
Specify values at vertices
Obtain smoothly varying values across triangles
What do we want to interpolate?
Texture coordinates, colors, normal vectors, ...
How do we do interpolation?
Barycentric coordinates
Barycentric coordinates
A coordinate system for triangles: $(\alpha, \beta, \gamma)$
Every point on the surface of this triangle can be computed as $(x, y) = \alpha A + \beta B + \gamma C$, with $\alpha + \beta + \gamma = 1$
Here $A$, $B$, $C$ are the vertices of the triangle. The point is inside the triangle if all three coordinates are non-negative
Geometric viewpoint - proportional areas; for a point in the triangle: $\alpha = \frac{A_A}{A_A + A_B + A_C}$, $\beta = \frac{A_B}{A_A + A_B + A_C}$, $\gamma = \frac{A_C}{A_A + A_B + A_C}$, where $A_A$ is the area of the sub-triangle opposite vertex $A$ (and similarly for $A_B$, $A_C$)
What is the barycentric coordinate of the centroid? $(\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$
There is also an explicit formula for any point, which can be found in all kinds of references
Using Barycentric
coordinates
Linearly interpolate values at vertices: $V = \alpha V_A + \beta V_B + \gamma V_C$
This gives the value $V$ inside a triangle with values $V_A$, $V_B$, $V_C$ at its vertices
$V$ can be positions, texture coordinates, colors, normals, depth, material attributes, ...
However, barycentric coordinates are not invariant under projection (see the sketch below)
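A sketch of computing $(\alpha, \beta, \gamma)$ from signed sub-triangle areas and using them to interpolate a per-vertex attribute (2D screen space only, so it is subject to the projection caveat above; my own helper names):

```python
def barycentric(p, a, b, c):
    """(alpha, beta, gamma) of 2D point p w.r.t. triangle (a, b, c), via signed areas."""
    def cross(o, u, v):                      # 2x the signed area of triangle (o, u, v)
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    area = cross(a, b, c)
    alpha = cross(p, b, c) / area            # sub-area opposite vertex a
    beta = cross(p, c, a) / area             # sub-area opposite vertex b
    gamma = 1.0 - alpha - beta
    return alpha, beta, gamma

def interpolate(p, tri, values):
    """Interpolate per-vertex values (colors, uvs, depth, ...) at point p."""
    alpha, beta, gamma = barycentric(p, *tri)
    return alpha * values[0] + beta * values[1] + gamma * values[2]
```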
Applying textures
Simple texture mapping:
diffuse color
```
for each rasterized screen sample (x, y):
    (u, v) = evaluate texture coordinate at (x, y)
    texcolor = texture.sample(u, v)
    set sample's color to texcolor
```
Texture
magnification (What if the texture is too small?)
Easy case
Generally don't want this - insufficient texture resolution
A pixel on a texture - a texel
Bilinear interpolation
Want to sample texture value f(x, y) at a non-integer
position
Take the 4 nearest sample locations, with texture values $u_{00}, u_{10}, u_{01}, u_{11}$, and fractional offsets $(s, t)$ to these locations
Linear interpolation (1D): $\text{lerp}(x, v_0, v_1) = v_0 + x(v_1 - v_0)$
Two helper lerps (horizontal): $u_0 = \text{lerp}(s, u_{00}, u_{10})$, $u_1 = \text{lerp}(s, u_{01}, u_{11})$
Final vertical lerp, to get the result: $f(x, y) = \text{lerp}(t, u_0, u_1)$ (see the sketch below)
Bilinear interpolation usually gives pretty good results at
reasonable costs
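A sketch of bilinear texture sampling following the lerps above (my own helper; the texture is a 2D list indexed tex[row][col], assumed at least 2x2, with coordinates inside the texture):

```python
def lerp(x, v0, v1):
    """1D linear interpolation."""
    return v0 + x * (v1 - v0)

def sample_bilinear(tex, u, v):
    """Sample texture tex at non-integer (u, v) with bilinear filtering."""
    h, w = len(tex), len(tex[0])
    x0 = min(max(int(u - 0.5), 0), w - 2)   # left texel of the 2x2 neighborhood
    y0 = min(max(int(v - 0.5), 0), h - 2)   # top texel of the 2x2 neighborhood
    s = u - 0.5 - x0                        # fractional offsets
    t = v - 0.5 - y0
    u0 = lerp(s, tex[y0][x0],     tex[y0][x0 + 1])      # horizontal lerps
    u1 = lerp(s, tex[y0 + 1][x0], tex[y0 + 1][x0 + 1])
    return lerp(t, u0, u1)                              # final vertical lerp
```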
Hard case: What if
the texture is too large?
Point sampling texture
Upsampling: magnification
Downsampling: minification
Will supersampling do antialiasing?
Yes, high quality but costly
When highly minified, many texels in pixel footprint
Signal frequency too large in a pixel
Need even higher sampling frequency
Let's understand this problem in another way
What if we don't sample
Just need to get the average value within a range
Point query vs range query
Mipmap
Allowing (fast, approximate, square) range queries
"Mip" comes from the Latin "multum in parvo", meaning a multitiude
in a small space
Mip hierarchy
Storage overhead: only $\frac{1}{3}$ more than the original picture (the whole pyramid totals $\frac{4}{3}$ of the original)
Trilinear interpolation
Bilinear lookups on the two nearest mipmap levels, then linear interpolation between them based on the continuous mipmap level $D$
Mipmap limitations
Overblur
Partially solved by anisotropic filtering (ripmaps)
Better for rectangular footprints, but not for irregular ones
Storage overhead: up to 3 times the original picture
Solved, more slowly, by EWA filtering
Use multiple lookups
Weighted average
Mipmap hierarchy still helps
Can handle irregular footprints
Many, many uses for texturing
In modern GPUs, texture = memory + range query (filtering)
General method to bring data to fragment calculations
Many applications
Environment lighting
Store microgeometry
Procedural textures
Solid modeling
Volume rendering
...
Cube map
A vector maps to cube point along that direction. The cube is
textured with 6 squares
Gives much less distortion
Need dir -> face computation
Textures can affect shading
Textures don't
have to represent only colors
What if it stores the height / normal?
Bump / normal mapping
Fake the detailed geometry
Bump mapping
Normal mapping
Adding surface detail without adding more triangles
Perturb surface normal per pixel (for shading computations
only)
"Height shift" per texel defined by a texture
How to modify vector?
How to perturb the normal (in flatland)
Original surface normal n(p) = (0,1)
Derivative at p is
dp = c * [h(p+1) - h(p)]
Perturbed normal is then
n(p) = (-dp, 1).normalized()
How to perturb the normal in 3D
Original surface normal n(p) = (0, 0, 1)
Derivatives at p are:
dp/du = c1 * [h(u + 1) - h(u)]
dp/dv = c2 * [h(v + 1) - h(v)]
The perturbed normal is
n = (-dp/du, -dp/dv, 1).normalized()
Note that this is in the local (tangent) coordinate frame; it still has to be transformed to world space (see the sketch below)
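A sketch of the 3D perturbation above in the local frame, from a scalar height map h (my own helper; c1 and c2 are user-chosen bump scales, and (u, v) is assumed to be an interior texel):

```python
import numpy as np

def perturbed_normal(h, u, v, c1=1.0, c2=1.0):
    """Local-frame normal at texel (u, v) of height map h[v][u]; original normal is (0, 0, 1)."""
    dpdu = c1 * (h[v][u + 1] - h[v][u])      # finite-difference derivative along u
    dpdv = c2 * (h[v + 1][u] - h[v][u])      # finite-difference derivative along v
    n = np.array([-dpdu, -dpdv, 1.0])
    return n / np.linalg.norm(n)             # still needs a tangent-to-world transform
```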
Displacement
mapping - a more advanced approach
Uses the same texture as in bump mapping
But here, we actually move the vertices
Requires many more triangles, and thus more resources
3D Procedural noise + Solid
modeling
For example, Perlin noise
Provide precomputed shading
For example, ambient occlusion texture map
Shadows
How to draw shadows using rasterization?
Shadow mapping!
Shadow mapping
An image-space algorithm
No knowledge of scene's geometry during shadow computation
Must deal with aliasing artifacts
Key idea:
The points that are NOT in shadow must be seen both by the light and
by the camera
Pass 1: Render from light
Depth image from light source
Pass 2A: project to light
Project visible points in the eye view back to the light source
If the reprojected depths from the light and the eye match, the point is visible to the light
If the reprojected depths from the light and the eye are not the same, the point is blocked (in shadow)
Well known rendering technique
Basic shadowing technique for early animations and in EVERY 3D video
game
Problems with shadow maps
Hard shadows (point lights only)
Quality depends on shadow map resolution (a general problem with image-based techniques)
Involves equality comparisons of floating-point depth values, which means issues of scale, bias, and tolerance
Hard shadows vs. soft shadows
0 or 1 shadows
Non-binary shadows
6. Geometry
Many ways to represent
geometry
Implicit geometry
For example:
Algebraic surface
Level sets
Distance functions
...
Based on classifying points
Points satisfy some specified relationship
E.g. sphere: all points in 3D where $x^2 + y^2 + z^2 = 1$
More generally: $f(x, y, z) = 0$
Sampling can be hard
What points lie on $f(x, y, z) = 0$?
Inside / outside test is easy: just check the sign of $f(x, y, z)$
Pros:
Compact description (e.g. a function)
Certain queries easy (inside object, distance to surface)
Good for ray-to-surface intersection
For simple shapes, exact description / no sampling error
Easy to handle changes in topology (e.g. fluid)
Cons:
Difficult to model complex shapes
More
implicit representations in computer graphics
Algebraic surfaces (Implicit)
Surface is the zero set of a polynomial in $x$, $y$, $z$
More complex shapes?
Constructive solid geometry (CSG)
Combine implicit geometry via Boolean operations
Distance functions
Instead of Booleans, gradually blend surfaces together using
distance functions:
Giving minimum distance (could be signed distance) from anywhere to
object
Level set methods
Closed-form equations have a hard time describing complex shapes
Alternative: store a grid of values approximating the function
Surface is found where interpolated values equal zero
Provides much more explicit control over shape (like a texture)
Level sets encode, e.g., constant tissue density
Fractals
Exhibit self-similarity, detail at all scales
"Language" for describing natural phenomena
Hard to control shape
Explicit geometry
For example:
Point cloud
Polygon mesh
Subdivision, NURBS
...
All points are given directly or via parameter mapping
Generally:
Sampling is easy
What points lie on this surface? Just plug in values
Inside / Outside test is hard
More
explicit representations in computer graphics
Point cloud
Easiest representation: list of points (x, y, z)
Easily represent any kind of geometry
Useful for LARGE datasets (>> 1 point/pixel)
Often converted into polygon mesh
Difficult to draw in undersampled regions
Polygon mesh
Store vertices & polygons (often triangles or quads)
Easier to do processing / simulation, adaptive sampling
The Wavefront Object file (.obj) format
Commonly used in graphics research
Just a text file that specifies vertices, normals, texture
coordinates and their connectivity
No "best"
representation - geometry is hard
Curves
Bézier curves
Defining a cubic Bézier curve with tangents
Evaluating
Bézier curves (de Casteljau algorithm)
Consider three points (quadratic Bézier)
Repeat recursively
Cubic Bézier curve - de Casteljau
Four input points in total
Same recursive line interpolations
Algebraic formula
de Casteljau algorithm gives a pyramid of coefficients
Example: quadratic Bézier curve from three points
General algebraic formula
Bernstein form of a Bézier curve of order $n$: $\mathbf{b}^n(t) = \sum_{j=0}^{n} \mathbf{b}_j\, B_j^n(t)$, where
$\mathbf{b}^n(t)$ is the Bézier curve of order $n$ (vector polynomial of degree $n$)
$\mathbf{b}_j$ are the Bézier control points (vectors in $\mathbb{R}^N$)
$B_j^n(t)$ is the Bernstein polynomial (scalar polynomial of degree $n$), which is:
$B_j^n(t) = \binom{n}{j} t^j (1 - t)^{n - j}$ (see the sketch below)
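A sketch of the de Casteljau algorithm for a Bézier curve of any order (pure Python; points as (x, y) tuples; my own helper name):

```python
def de_casteljau(points, t):
    """Evaluate a Bezier curve at parameter t by repeated linear interpolation."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        # lerp each consecutive pair of points, shrinking the list by one per pass
        pts = [((1 - t) * p[0] + t * q[0],
                (1 - t) * p[1] + t * q[1])
               for p, q in zip(pts, pts[1:])]
    return pts[0]

# Cubic example: the endpoints are interpolated, as stated in the properties below.
ctrl = [(0, 0), (1, 2), (3, 2), (4, 0)]
print(de_casteljau(ctrl, 0.0), de_casteljau(ctrl, 0.5), de_casteljau(ctrl, 1.0))
```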
Properties of Bézier curves
Interpolates endpoints
For cubic Bézier: $\mathbf{b}(0) = \mathbf{b}_0$; $\mathbf{b}(1) = \mathbf{b}_3$
Tangent to end segments
Cubic case: $\mathbf{b}'(0) = 3(\mathbf{b}_1 - \mathbf{b}_0)$; $\mathbf{b}'(1) = 3(\mathbf{b}_3 - \mathbf{b}_2)$
Affine transformation property
Transform curve by transforming control points
Convex hull property
Curve is within convex hull of control points
Piecewise Bézier curves
Higher-order Bézier curves?
Very hard to control
Piecewise Bézier curves
Instead, chain many low-order Bézier curves. Piecewise cubic Bézier is the most common technique, and it is widely used
Continuity (for adjacent segments with control points $\mathbf{a}_0 \dots \mathbf{a}_n$ and $\mathbf{b}_0 \dots \mathbf{b}_n$):
$C^0$ continuity: $\mathbf{a}_n = \mathbf{b}_0$
$C^1$ continuity: $\mathbf{a}_n = \mathbf{b}_0 = \frac{1}{2}(\mathbf{a}_{n-1} + \mathbf{b}_1)$
Other types of splines
Spline
A continuous curve constructed so as to pass through a given set of
points and have a certain number of continuous derivatives
In short, a curve under control
B-splines
Short for basis splines
Require more information than Bézier curves
Satisfy all important properties that Bézier curves have (i.e.
superset)
Bézier surfaces
Extend Bézier curves to surfaces
Bicubic Bézier surface patch
$4 \times 4$ control points form a surface
Evaluating Bézier surfaces
Evaluating surface position for parameters $(u, v)$
For a bi-cubic Bézier surface patch
Input: $4 \times 4$ control points
Output: a 2D surface parameterized by $(u, v)$ in $[0, 1]^2$
Method: separable 1D de Casteljau algorithm
Use de Casteljau to evaluate the point at $u$ on each of the 4 Bézier curves in $u$; this gives 4 control points for the "moving" Bézier curve
Use 1D de Casteljau to evaluate the point at $v$ on the "moving" curve
Mesh operations: Geometry
processing
Mesh subdivision
Upsampling: Increase resolution
Mesh simplification
Downsampling: Decrease resolution; try to preserve shape /
appearance
Mesh regularization
Same number of triangles: modify sample distribution to improve
quality
Mesh subdivision algorithm
Loop subdivision
Common subdivision rule for triangle meshes
First, create more triangles (vertices)
Split each triangle into four
Second, tune their positions
Assign new vertex positions according to weights
New / old vertices updated differently
Update
For new vertices: update to
3/8 * (A + B) + 1/8 * (C + D), where A, B are the endpoints of the edge the new vertex lies on, and C, D are the opposite vertices of the two triangles sharing that edge
For old vertices: update to
(1 - n*u) * original_position + u * neighbor_position_sum, where n is the vertex's degree, u = 3/16 if n = 3, and u = 3 / (8n) otherwise
Catmull-Clark
subdivision (General mesh)
Quad face: face with # edges == 4
Non-quad face: face with # edges != 4
Extraordinary vertex: vertex with degree != 4
Each subdivision step:
Add vertex in each face
Add midpoint on each edge
Connect all new vertices
After one subdivision
All non-quad faces disappear, while each non-quad face adds an extraordinary vertex
Catmull-Clark vertex update rules
Separated into face point, edge point, vertex point
Mesh simplification
algorithm
Goal: reduce number of mesh elements while maintaining the overall
shape
Collapsing an edge
Suppose we simplify a mesh using edge collapsing
Quadric error metrics
How much geometric error is introduced by simplification?
Not a good idea to perform local averaging of vertices
Quadric error: new vertex should minimize its sum of square distance
(L2 distance) to previously related triangle planes
Which edges? Assign score with quadric error metric
Approximate distance to surface as sum of distances to planes
containing triangles
Iteratively collapse edge with smallest score
Greedy algorithm gives great results
7. Ray tracing
Why ray tracing?
Rasterization couldn't handle global effects well
(Soft) shadows
And especially when the light bounces more than once
Rasterization is fast, but quality is relatively low
Ray tracing is accurate, but is very slow
Rasterization: real-time, ray tracing: offline
~10K CPU core hours to render one frame in production
Basic ray tracing algorithms
Light rays
Three ideas about light rays
Light travels in straight lines (though this is wrong)
Light rays do not "collide" with each other if they cross (though
this is still wrong)
Light rays travel from the light sources to the eye (but the physics
is invariant under path reversal - reciprocity)
"And if you gaze long into an abyss, the abyss also gazes into
you"
Ray casting
Generate an image by casting one ray per pixel
Check for shadows by sending a ray to the light
Recursive (Whitted-style)
ray tracing
"An improved illumination model for shaded display". Simulate how
light bounces around
Ray-surface intersection
Ray equation
A ray is defined by its origin and a direction vector: $\mathbf{r}(t) = \mathbf{o} + t\mathbf{d}$, $0 \le t < \infty$. Here
$\mathbf{r}(t)$: a point along the ray
$t$: "time"
$\mathbf{o}$: origin
$\mathbf{d}$: (normalized) direction
Ray intersection with sphere
Sphere: $(\mathbf{p} - \mathbf{c})^2 - R^2 = 0$
The intersection must satisfy both the ray equation and the sphere equation. So: $(\mathbf{o} + t\mathbf{d} - \mathbf{c})^2 - R^2 = 0$. Solving the quadratic $at^2 + bt + c = 0$: $t = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$, where $a = \mathbf{d} \cdot \mathbf{d}$, $b = 2(\mathbf{o} - \mathbf{c}) \cdot \mathbf{d}$, $c = (\mathbf{o} - \mathbf{c}) \cdot (\mathbf{o} - \mathbf{c}) - R^2$
Ray intersection with
implicit surface
General implicit surface: $f(\mathbf{p}) = 0$
Substitute the ray equation: $f(\mathbf{o} + t\mathbf{d}) = 0$, and solve for
real, positive roots
Ray intersection with
triangle mesh
Why triangle?
Rendering: visibility, shadows lighting ...
Geometry: inside / outside test
How to compute it? Let's break this down:
Simple idea: just intersect ray with each triangle
Simple, but slow
Can have 0, 1 intersections (ignoring multiple intersections)
Triangle is in a plane
Ray-plane intersection
Test if hit point is inside triangle
Plane equation: a plane is defined by a normal vector $\mathbf{N}$ and a point $\mathbf{p}'$ on the plane: $(\mathbf{p} - \mathbf{p}') \cdot \mathbf{N} = 0$. Then we can solve for the intersection:
Set $\mathbf{p} = \mathbf{r}(t) = \mathbf{o} + t\mathbf{d}$ and solve for $t$: $t = \frac{(\mathbf{p}' - \mathbf{o}) \cdot \mathbf{N}}{\mathbf{d} \cdot \mathbf{N}}$
Check: $0 \le t < \infty$
Möller-Trumbore algorithm
A faster approach that gives barycentric coordinates directly: $\mathbf{o} + t\mathbf{d} = (1 - b_1 - b_2)\mathbf{P}_0 + b_1\mathbf{P}_1 + b_2\mathbf{P}_2$, where $(1 - b_1 - b_2, b_1, b_2)$ are barycentric coordinates, and we solve the resulting linear system for $t$, $b_1$, $b_2$ (e.g. with Cramer's rule); see the sketch below
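A sketch of the Möller-Trumbore test (NumPy; my own helper; returns (t, b1, b2) or None, following the equation above):

```python
import numpy as np

def moller_trumbore(o, d, p0, p1, p2, eps=1e-8):
    """Ray o + t*d vs triangle (p0, p1, p2); returns (t, b1, b2) or None if no hit."""
    e1, e2 = p1 - p0, p2 - p0
    s = o - p0
    s1 = np.cross(d, e2)
    s2 = np.cross(s, e1)
    denom = np.dot(s1, e1)
    if abs(denom) < eps:                     # ray parallel to the triangle plane
        return None
    t = np.dot(s2, e2) / denom
    b1 = np.dot(s1, s) / denom
    b2 = np.dot(s2, d) / denom
    if t >= 0 and b1 >= 0 and b2 >= 0 and (1 - b1 - b2) >= 0:
        return t, b1, b2
    return None
```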
Accelerating ray-surface
intersection
Ray tracing - performance
challenges
Simple ray-scene intersection
Exhaustively test ray-intersection with every object
Find the closest hit (with minimum $t$)
Problem:
Naive algorithm = #pixels × #objects × #bounces
Very slow
Bounding volumes
Quick way to avoid intersections: bound complex object with a simple
volume
Object is fully contained in the volume
If it doesn't hit the volume, it doesn't hit the object
So test the bounding volumes first, then test the objects
Ray-intersection with box
Box is the intersection of 3 pairs of slabs. Specifically, we often
use an axis-aligned bounding box (AABB), i.e. any side of the bounding
box is along either x, y, or z axis
Ray intersection with
axis-aligned box
2D example (3D is the same): Compute intersections with slabs and
take intersection of intervals
A box (3D) = three pairs of infinitely large slabs
Key ideas
The ray enters the box only when it enters all pairs of slabs
The ray exits the box as long as it exits any pair of slabs
For each pair, calculate the $t_{min}$ and $t_{max}$ (negative values are fine)
For the 3D box, $t_{enter} = \max\{t_{min}\}$ and $t_{exit} = \min\{t_{max}\}$
If $t_{enter} < t_{exit}$, we know that the ray stays inside the box for a while, so they must intersect
However, a ray is not a line
Should check whether $t$ is negative for physical correctness
What if $t_{exit} < 0$?
The box is "behind" the ray - no intersection
What if $t_{exit} \ge 0$ and $t_{enter} < 0$?
Then the ray's origin is inside the box - there is an intersection
In summary, the ray and AABB intersect iff $t_{enter} < t_{exit}$ and $t_{exit} \ge 0$
Why axis-aligned?
For a general surface, the computation is complex. But, for example, for slabs perpendicular to the x-axis, we have $t = \frac{p'_x - o_x}{d_x}$, a single subtraction and division per slab (see the sketch below)
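A sketch of the slab test summarized above (NumPy; my own helper; it assumes no zero direction components to keep the division simple):

```python
import numpy as np

def hit_aabb(o, d, box_min, box_max):
    """Slab test: True iff the ray o + t*d (t >= 0) intersects the axis-aligned box."""
    t0 = (box_min - o) / d                   # per-axis slab crossing times (unordered)
    t1 = (box_max - o) / d
    t_min = np.minimum(t0, t1)               # per-axis entry time
    t_max = np.maximum(t0, t1)               # per-axis exit time
    t_enter = t_min.max()                    # enter the box when entering ALL slabs
    t_exit = t_max.min()                     # exit the box when exiting ANY slab
    return t_enter < t_exit and t_exit >= 0

print(hit_aabb(np.array([0.0, 0.0, -5.0]), np.array([0.0, 0.0, 1.0]),
               np.array([-1.0, -1.0, -1.0]), np.array([1.0, 1.0, 1.0])))
```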
Uniform spatial partitions
(Grids)
Preprocess - build acceleration grid
Find bounding box
Create grid
Store each object in overlapping cells
Ray-scene intersection
Step through grid in ray traversal order
For each grid cell:
Test intersection with all objects stored at that cell
Grid resolution?
One cell: no speedup
Too many cells: Inefficiency due to extraneous grid traversal
Heuristic:
Number of cells = C * number of objects
C ≈ 27 in 3D
Spatial partitions
KD-Tree pre-processing
Data structure for KD-Trees
Internal nodes store
Split axis: x-, y-, or z- axis
Split position: coordinate of split plane along axis
Children: pointers to child nodes
No objects are stored in internal nodes, only in leaves
Leaf nodes store
List of objects
Traversing a KD-Tree
Go through the tree and check whether the ray intersects each internal node's region. If it does, go down the tree; at a leaf node, check the ray against all of the objects stored in it
Heuristic #1: Always choose the longest axis in the node
Heuristic #2: Split node at location of median object
Data structure for BVHs
Internal nodes store
Bounding box
Children: pointers to child nodes
Leaf nodes store
Bounding box
List of objects
Nodes represent subset of primitives in scene
All objects in subtree
Pseudo code
```
Intersect(Ray ray, BVH node) {
    if (ray misses node.bbox) return;
    if (node is a leaf node)
        test intersection with all objs;
        return closest intersection;
    hit1 = Intersect(ray, node.child1);
    hit2 = Intersect(ray, node.child2);
    return the closer of hit1, hit2;
}
```
Spatial vs Object partitions
Spatial partition (e.g. KD-tree)
Partition space into non-overlapping regions
An object can be contained in multiple regions
Object partition (e.g. BVH)
Partition set of objects into disjoint subsets
Bounding boxes for each set may overlap in space
8. Basic radiometry
Motivation why we study
radiometry
Observation
In the Blinn-Phong model, we might use, for example, the light intensity $I$. But what is it? What is its unit?
Do you think Whitted style ray tracing gives correct results?
All the answers can be found in radiometry
Radiometry
Measurement system and units for illumination
Accurately measure the spatial properties of light
New terms: Radiant flux, intensity, irradiance, radiance
Perform lighting calculations in a physically correct manner
Radiant energy and flux
(power)
Def: Radiant energy is the energy of electromagnetic radiation. It is measured in joules and denoted by the symbol $Q$ [J]
Def: Radiant flux (power) is the energy emitted, reflected, transmitted or received per unit time: $\Phi = \frac{dQ}{dt}$ [W] [lm]
Flux can also be thought of as the number of photons flowing through a sensor per unit time
Important light
measurements of interest
Light emitted from a source: Radiant intensity
Light falling on a surface: Irradiance
Light traveling along a ray: Radiance
Radiant intensity
Def: The radiant (luminous) intensity is the power per unit solid angle emitted by a point light source: $I(\omega) = \frac{d\Phi}{d\omega}$ [W/sr] [cd]
Angles and solid angles
Angle: ratio of subtended arc length on a circle to the radius, $\theta = \frac{l}{r}$. A circle has $2\pi$ radians
Solid angle: ratio of subtended area on a sphere to the radius squared, $\Omega = \frac{A}{r^2}$. A sphere has $4\pi$ steradians
Differential solid angle: $d\omega = \frac{dA}{r^2} = \sin\theta\, d\theta\, d\phi$
Sphere: $\Omega = \int_{S^2} d\omega = 4\pi$
Isotropic point source: $\Phi = \int_{S^2} I\, d\omega = 4\pi I$, so $I = \frac{\Phi}{4\pi}$
Irradiance
Def: The irradiance is the power per (perpendicular / projected) unit area incident on a surface point: $E(\mathbf{x}) = \frac{d\Phi(\mathbf{x})}{dA}$ [W/m²] [lux]
Lambert's cosine law: irradiance at a surface is proportional to the cosine of the angle between the light direction and the surface normal (this is exactly how Lambert's cosine law is defined)
Irradiance gets lower the further away from the light source: for a point source, $E = \frac{\Phi}{4\pi r^2}$
Radiance
Radiance is the fundamental field quantity that describes the
distribution of light in an environment
Radiance is the quantity associated with a ray
Rendering is all about computing radiance
Def: The radiance (luminance) is the power emitted, reflected, transmitted or received by a surface, per unit solid angle, per projected unit area: $L(\mathbf{p}, \omega) = \frac{d^2\Phi(\mathbf{p}, \omega)}{d\omega\, dA\, \cos\theta}$. Here $\cos\theta$ accounts for the projected surface area
Recall:
Irradiance: power per projected unit area
Intensity: power per solid angle
So:
Radiance: Irradiance per solid angle
Radiance: Intensity per projected unit area
Incident radiance
Incident radiance is the irradiance per unit solid angle arriving at
the surface, i.e. it is the light arriving at the surface along a given
ray (point on surface and incident direction)
Exiting radiance
Exiting surface radiance is the intensity per unit projected area
leaving the surface, e.g. for an area light it is the light emitted
along a given ray (point on surface and exit direction)
Irradiance vs. Radiance
Irradiance: total power received by area $dA$: $E(\mathbf{p}) = \int_{H^2} L_i(\mathbf{p}, \omega) \cos\theta\, d\omega$
Radiance: power received by area $dA$ from "direction" $d\omega$: $L_i(\mathbf{p}, \omega)$
Here $H^2$ is the unit hemisphere
Bidirectional
Reflectance Distribution Function (BRDF)
Reflection at a point
Radiance from direction $\omega_i$ turns into the power $E$ that $dA$ receives; that power then becomes the radiance sent toward any other direction $\omega_o$
Differential irradiance incoming: $dE(\omega_i) = L(\omega_i)\cos\theta_i\, d\omega_i$
Differential radiance exiting (due to $dE(\omega_i)$): $dL_r(\omega_r)$
BRDF
The BRDF represents how much light is reflected into each outgoing direction $\omega_r$ from each incoming direction $\omega_i$: $f_r(\omega_i \to \omega_r) = \frac{dL_r(\omega_r)}{dE_i(\omega_i)} = \frac{dL_r(\omega_r)}{L_i(\omega_i)\cos\theta_i\, d\omega_i}$ [1/sr]
The reflection equation: $L_r(\mathbf{p}, \omega_r) = \int_{H^2} f_r(\mathbf{p}, \omega_i \to \omega_r)\, L_i(\mathbf{p}, \omega_i)\, \cos\theta_i\, d\omega_i$
Challenge: recursive equation
Reflected radiance depends on incoming radiance
But incoming radiance depends on reflected radiance (at another
point in the scene)
The rendering equation
Re-write the reflection equation by adding an emission term to make it general, in case the object is itself a light source: $L_o(\mathbf{p}, \omega_o) = L_e(\mathbf{p}, \omega_o) + \int_{\Omega^+} L_i(\mathbf{p}, \omega_i)\, f_r(\mathbf{p}, \omega_i, \omega_o)\, (\mathbf{n} \cdot \omega_i)\, d\omega_i$
As an integral equation, the rendering equation is a Fredholm integral equation of the second kind, with canonical form $l(u) = e(u) + \int l(v)\, K(u, v)\, dv$, or as a linear operator equation: $L = E + KL$
Ray tracing and extensions
A general class of numerical Monte Carlo methods
Approximate the set of all paths of light in the scene. Expanding the operator equation $L = E + KL$ as a (binomial-theorem-style) series: $L = E + KE + K^2E + K^3E + \dots$ Here:
$E$: emission directly from light sources
$KE$: direct illumination on surfaces
$K^2E$: indirect illumination (one-bounce indirect)
$K^3E$: indirect illumination (two-bounce indirect)
Shading in rasterization considers only the first two terms, $E + KE$
Monte Carlo path tracing
Monte Carlo Integration
We want to solve an integral, but it can be too difficult to solve analytically. Let's define the Monte Carlo estimator for the definite integral $\int_a^b f(x)\,dx$ of a given function $f(x)$:
Random variable: $X_i \sim p(x)$
Monte Carlo estimator: $F_N = \frac{1}{N} \sum_{i=1}^{N} \frac{f(X_i)}{p(X_i)}$
For example, for a uniform random variable $X_i \sim p(x) = \frac{1}{b - a}$: $F_N = \frac{b - a}{N} \sum_{i=1}^{N} f(X_i)$. Some notes (see the sketch below):
The more samples, the less variance
Sample on $x$, integrate on $x$
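A sketch of the uniform estimator above applied to a concrete integral ($\int_0^\pi \sin x\, dx = 2$; my own example, not from the course):

```python
import random, math

def mc_integrate(f, a, b, n=100_000):
    """Uniform Monte Carlo estimator: (b - a) / N * sum f(X_i), X_i ~ U(a, b)."""
    total = 0.0
    for _ in range(n):
        x = random.uniform(a, b)             # sample on x
        total += f(x)
    return (b - a) * total / n

print(mc_integrate(math.sin, 0.0, math.pi))  # should be close to 2
```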
Path tracing
Path tracing can be almost 100% correct, a.k.a PHOTO-REALISTIC
Whitted-style ray tracing
Always perform specular reflections / refractions
Stop bouncing at diffuse surfaces
These simplifications are not always reasonable
But the rendering equation is correct
However it involves:
Solving an integral over the hemisphere
Recursive execution
A simple Monte Carlo
solution
Suppose we want to render one pixel (point) for direct illumination only. Then the rendering equation is just an integration over directions, so we can solve it using Monte Carlo integration.
What's our $f(x)$? The integrand $L_i(p, \omega_i)\, f_r(p, \omega_i, \omega_o)\, (\mathbf{n} \cdot \omega_i)$
What's our pdf? Assume uniform sampling of the hemisphere: $p(\omega_i) = \frac{1}{2\pi}$
Which gives $L_o(p, \omega_o) \approx \frac{1}{N} \sum_{i=1}^{N} \frac{L_i(p, \omega_i)\, f_r(p, \omega_i, \omega_o)\, (\mathbf{n} \cdot \omega_i)}{p(\omega_i)}$, which can be written in pseudocode:
```pseudocode
shade(p, wo)
    Randomly choose N directions wi ~ pdf
    Lo = 0.0
    For each wi
        Trace a ray r(p, wi)
        If ray r hit the light
            Lo += (1 / N) * L_i * f_r * cosine / pdf(wi)
    Return Lo
```
Introducing global
illumination
One more step forward: What if a ray hits an object? It is also a
light source then. Which will change the previous code to:
```pseudocode
shade(p, wo)
    Randomly choose N directions wi ~ pdf
    Lo = 0.0
    For each wi
        Trace a ray r(p, wi)
        If ray r hit the light
            Lo += (1 / N) * L_i * f_r * cosine / pdf(wi)
        Else If ray r hit an object at q
            Lo += (1 / N) * shade(q, -wi) * f_r * cosine / pdf(wi)
    Return Lo
```
Problems with this kind
of path tracing
1. Explosion of the number of rays as the number of bounces goes up
* Solution: Only 1 ray is traced at each shading point. This is then called path tracing
* Trace more paths through each pixel and average them:
```pseudocode
ray_generation(camPos, pixel)
    Uniformly choose N sample positions within the pixel
    pixel_radiance = 0.0
    For each sample in the pixel
        Shoot a ray r(camPos, cam_to_sample)
        If ray r hit the scene at p
            pixel_radiance += 1 / N * shade(p, sample_to_cam)
    Return pixel_radiance
```
2. The recursive algorithm will never stop
* Solution: Russian Roulette (RR)
* Russian Roulette is all about probability.
* With probability $P$ ($0<P<1$), you are fine
* With probability $1-P$, you are not
* Previously, we always shoot a ray at a shading point and get the shading result $L_o$
* Suppose we manually set a probability $P$, shoot a ray and return the shading result divided by $P$: $L_o/P$
* With probability $1-P$, don't shoot a ray and you will get 0
* In this way, you still get an expected value of $L_o$: $$ E = P \times (L_o/P) + (1-P)\times 0 = L_o $$
* Which gives the following pseudo code:
```pseudocode
shade(p, wo)
    Manually specify a probability P_RR
    Randomly select ksi in a uniform dist. in [0, 1]
    If (ksi > P_RR) return 0.0

    Randomly choose ONE direction wi ~ pdf(w)
    Trace a ray r(p, wi)
    If ray r hit the light
        Return L_i * f_r * cosine / pdf(wi) / P_RR
    Else If ray r hit an object at q
        Return shade(q, -wi) * f_r * cosine / pdf(wi) / P_RR
```
Now, the path tracing is correct
However, still not really efficient
Understanding the reason for the inefficiency: perhaps only 1 of many rays ends up hitting the light, so a lot of rays are "wasted" if we uniformly sample the hemisphere at the shading point
Sampling the light
Monte Carlo methods allow any sampling method, so we can sample the light directly (therefore no rays are "wasted")
Assume uniform sampling on the light: pdf $= \frac{1}{A}$ (because $\int \text{pdf}\, dA = 1$). But the rendering equation integrates over solid angle
Recall: for Monte Carlo integration we sample on $x$ and integrate on $x$, so we need to rewrite the rendering equation as an integral over $dA$; we need the relationship between $d\omega$ and $dA$
Projected area on the unit sphere: $d\omega = \frac{dA\, \cos\theta'}{\|x' - x\|^2}$. Here $\theta'$ is the angle between the normal of the light and the vector from $x'$ to $x$
Then the rendering equation becomes $L_o(x, \omega_o) = \int_A L_i(x, \omega_i)\, f_r(x, \omega_i, \omega_o)\, \frac{\cos\theta\, \cos\theta'}{\|x' - x\|^2}\, dA$, which gives the pseudocode:
```pseudocode
shade(p, wo)
    # Contribution from the light source
    L_dir = 0.0
    Uniformly sample the light at xp (pdf_light = 1 / A)
    Shoot a ray from p to xp
    If the ray is not blocked in the middle
        L_dir = L_i * f_r * cos(theta) * cos(thetap) / |xp - p|^2 / pdf_light

    # Contribution from other reflectors
    L_indir = 0.0
    Test Russian Roulette with probability P_RR
    Uniformly sample the hemisphere toward wi (pdf_hemi = 1 / 2pi)
    Trace a ray r(p, wi)
    If ray r hit a non-emitting object at q
        L_indir = shade(q, -wi) * f_r * cos(theta) / pdf_hemi / P_RR

    Return L_dir + L_indir
```
Now, Path tracing is finally done
9. Materials and Appearances
What is material in
computer graphics?
Material == BRDF
Diffuse / Lambertian material:
Light is equally reflected in each output direction. Suppose the incident lighting is uniform and nothing is absorbed; then incoming and outgoing radiance are equal, which gives $f_r = \frac{\rho}{\pi}$, where $\rho$ is the albedo (color), $0 \le \rho \le 1$
Some materials
Glossy material (BRDF)
Ideal reflective / refractive material (BSDF)
Isotropic / Anisotropic materials (BRDFs)
Key: directionality of underlying surface
Anisotropic BRDF: Brushed metal
Fresnel reflection / term
Reflectance depends on the incident angle (and the polarization of light). This term differs from material to material. The exact formula is complex, and to be fully accurate you need to consider polarization.
Schlick's approximation: $R(\theta) = R_0 + (1 - R_0)(1 - \cos\theta)^5$, where $R_0 = \left(\frac{n_1 - n_2}{n_1 + n_2}\right)^2$
Microfacet Material
Microfacet theory
Rough surface
Macroscale: flat & rough
Microscale: bumpy & specular
Microfacet BRDF
Key: the distribution of microfacets' normals
Concentrated <=> Glossy
Spread <=> Diffuse
What kind of microfacets reflect $\omega_i$ to $\omega_o$? Those whose normal equals the half vector $\mathbf{h}$: $f(\mathbf{i}, \mathbf{o}) = \frac{F(\mathbf{i}, \mathbf{h})\, G(\mathbf{i}, \mathbf{o}, \mathbf{h})\, D(\mathbf{h})}{4(\mathbf{n} \cdot \mathbf{i})(\mathbf{n} \cdot \mathbf{o})}$. Here
$\mathbf{h}$: half vector
$F(\mathbf{i}, \mathbf{h})$: Fresnel term
$G(\mathbf{i}, \mathbf{o}, \mathbf{h})$: shadowing-masking term
$D(\mathbf{h})$: distribution of normals
Isotropic / Anisotropic
materials (BRDFs)
Key: directionality of underlying surface
Anisotropic BRDF: For example, brushed metal
Properties of BRDFs
Non-negativity
Linearity
Reciprocity principle
Energy conservation
Isotropic vs. anisotropic
If isotropic: $f_r(\theta_i, \phi_i; \theta_r, \phi_r) = f_r(\theta_i, \theta_r, \phi_r - \phi_i)$
Then, from reciprocity: $f_r(\theta_i, \theta_r, \phi_r - \phi_i) = f_r(\theta_r, \theta_i, \phi_i - \phi_r) = f_r(\theta_i, \theta_r, |\phi_r - \phi_i|)$
Measuring BRDFs
Motivation
Avoid need to develop / derive models
Automatically includes all of the scattering effects present
Can accurately render with real-world materials
Useful for product design, special effects, ...
Theory and practice do not align
Image-based BRDF measurement
A gonioreflectometer, also called a spherical gantry (e.g. the one at UCSD). General approach:
```
foreach outgoing direction wo
    move light to illuminate surface with a thin beam from wo
    foreach incoming direction wi
        move sensor to be at direction wi from surface
        measure incident radiance
```
Improving efficiency
Isotropic surfaces reduce dimensionality from 4D to 3D
Reciprocity reduces number of measurements by half
Clever optical systems...
Challenges in measuring
BRDFs
Accurate measurements at grazing angles
Important due to Fresnel effects
Measuring with dense enough sampling to capture high frequency
specularities
Retro-reflection
Spatially-varying reflectance, ...
Representing measured BRDFs
Desirable qualities
Compact representation
Accurate representation of measured data
Efficient evaluation for arbitrary pairs of directions
Good distributions available for importance sampling
10. Advanced topics in
rendering
Biased vs. unbiased
Monte Carlo estimators
An unbiased Monte Carlo technique does not have any systematic error
The expected value of an unbiased estimator will always be the
correct value, no matter how many samples are used
Otherwise, biased
In one special case, the expected value converges to the correct value as the number of samples goes to infinity -> consistent
Biased == blurry
Consistent == not blurry with infinite number of samples
Bidirectional path tracing
(BDPT)
Recall: a path connects the camera and the light
BDPT
Traces sub-paths from both the camera and the light
Connects the end points from both sub-paths
Suitable if the light transport is complex on the light's side
Unbiased
Difficult to implement & quite slow
Metropolis light transport
(MLT)
A Markov chain Monte Carlo (MCMC) application
Jumping from the current sample to the next with some PDF
Works great with difficult light paths
Unbiased
Difficult to estimate the convergence rate
Does not guarantee equal convergence rate per pixel
So, usually produces "dirty" results
Therefore, usually not used to render animations
Photon mapping
A biased approach & a two-stage method
Very good at handling Specular-Diffuse-Specular (SDS) paths and
generating caustics
Approach (variations apply)
Stage 1 - photon tracing
Emitting photons from the light source, bouncing them around, then
recording photons on diffuse surfaces
Stage 2 - photon collection (final gathering)
Shoot sub-paths from the camera, bouncing them around, until they
hit diffuse surfaces
Calculation - local density estimation
Idea: areas with more photons should be brighter
For each shading point, find the nearest $N$ photons. Take the surface area they cover
Small $N$ -> noisy
Large $N$ -> blurry
Why biased?
Local density estimation: $\frac{dN}{dA} \neq \frac{\Delta N}{\Delta A}$
But in the sense of a limit:
More photons emitted -> the same $N$ photons cover a smaller $\Delta A$ -> $\Delta A$ is closer to $dA$ (so the method is consistent)
Vertex connect and merging
A combination of BDPT and photon mapping
Key idea
Let's not waste the sub-paths in BDPT if their end points cannot be
connected but can be merged
Use photon mapping to handle the merging of nearby "photons"
Instant radiosity (IR)
Sometimes also called many-light approaches
Key idea
Lit surface can be treated as light sources
Approach
Shoot light sub-paths and assume the end point of each sub-path is a
Virtual Point Light (VPL)
Render the scene as usual using these VPLs
Pros: fast and usually gives good results on diffuse scenes
Cons:
Spikes will emerge when VPLs are close to shading points
Cannot handle glossy materials
Advanced appearance modeling
Non-surface models
Participating media
Hair / fur / fiber (BCSDF)
Granular material
Surface models
Translucent material (BSSRDF)
Cloth
Detailed material (non-statistical BRDF)
Procedural appearance
Non-surface models
Participating media: cloud /
mist
At any point as light travels through a participating medium, it
can be (partially) absorbed and scattered
Use phase function to describe the angular distribution of light
scattering at any point within
participating media
Rendering
Randomly choose a direction to bounce
Randomly choose a distance to go straight
At each "shading point", connect to the light
Hair appearance
Marschner model
Glass-like cylinder
3 types of light interactions: R, TT, TRT, where
R denotes reflection
T denotes transmission
Human hair vs animal fur
Cortex
Contains pigments
Absorbs light
Medulla
Complex structure
Scatters light
Animal fur has a much larger medulla than human hair
Cuticle
Covered with scales
Granular material
Lots of small particles
Can we avoid explicit modeling of all granules?
Yes, with procedural definition
Surface models
Translucent material: i.e.
jade
Subsurface scattering
Visual characteristics of many surfaces caused by light exiting the surface at different points than where it enters
Violates a fundamental assumption of BRDF
Scattering functions
BSSRDF
Generalization of BRDF
Exitant radiance at one point due to incident differential
irradiance at another point
Generalization of rendering equation
Integrating over all points on the surface and all directions
Dipole approximation
Approximate light diffusion by introducing two point sources
Cloth
A collection of twisted fibers
Two levels of twist
Cloth includes several yarns
A yarn includes several plies
A ply includes several fibers
Render as surface
Given the weaving pattern, calculate the overall behavior
Render using a BRDF
Render as participating media
Properties of individual fibers and their distribution ->
scattering parameters
Render as a participating medium
Render as actual fibers
Render every fiber explicitly
Too complex
Detailed appearance
Renders can look unrealistic because they look too perfect; the real world is more complex
Add details
Wave optics
Procedural appearance
Can we define details without textures?
Yes, compute a noise function on the fly
3D noise -> Internal structure if cut or broken
Thresholding (noise -> binary noise)
11. Cameras, Lenses and Light
Fields
Camera
What's happening inside the
camera?
Pinholes & lenses form image on sensor
Shutter exposes sensor for precise duration
Sensor accumulates irradiance during exposure
Why not sensors without
lenses?
Each sensor point would integrate light from all points on the
object, so all pixel values would be similar, i.e. the sensor records
irradiance
Field of view (FOV)
For a fixed sensor size, decreasing the focal length increases the
field of view
Focal length v. Field of
view
For historical reasons, it is common to refer to angular field of
view by focal length of a lens used on a 35mm-format film (36x24
mm)
Examples of focal lengths on 35mm format
17mm is wide angle
50mm is a "normal" lens
200mm is telephoto lens
Careful! When we say current cell phones have approximately 28mm
"equivalent" focal length, this uses the above convention
Sensor sizes
Maintain FOV on smaller
sensor?
To maintain FOV, decrease focal length of lens in proportion to width
/ height of sensor
Exposure
Exposure $H$ = time $\times$ irradiance ($T \times E$)
Exposure time ($T$)
Controlled by shutter
Irradiance ($E$)
Power of light falling on a unit area of the sensor
Controlled by lens aperture and focal length
Exposure controls in
photography
Aperture size
Change the f-stop by opening / closing the aperture (if camera has
iris control)
F-number (F-stop): Exposure level
Written as FN or F/N. N is the f-number
Informal understanding: the inverse-diameter of a round
aperture
Formal: the f-number of a lens is defined as the focal length
divided by the diameter of the aperture
Shutter speed
Change the duration the sensor pixels integrate light
Side effect of shutter speed
Motion blur: handshake, subject movement
Doubling shutter time doubles motion blur
Motion blur is not always bad
Rolling shutter: different parts of photo taken at different
times
ISO gain
Change the amplification (analog and/or digital) between sensor
values and digital image values
Film: trade sensitivity for grain
Digital: trade sensitivity for noise
Multiply signal before analog-to-digital conversion
Linear effect (ISO 200 needs half the light as ISO 100)
Constant exposure:
F-stop vs. Shutter speed
If the exposure is too bright/dark, may need to adjust f-stop and /
or shutter up / down
Photographers must trade off depth of field and motion blur for
moving subjects
Thin lens approximation
Real lens designs are highly complex
Ideal thin lens - focal
point
All parallel rays entering a lens pass through its focal point
All rays through a focal point will be in parallel after passing the
lens
Focal length can be arbitrarily changed (in reality, yes with
several lenses)
The thin lens equation: $\frac{1}{f} = \frac{1}{z_i} + \frac{1}{z_o}$
Ray tracing ideal thin
lenses
Ray tracing for defocus blur
(One possible) setup:
Choose sensor size, lens focal length and aperture size
Choose depth of subject of interest z
Rendering:
For each pixel $x'$ on the sensor (actually, the film)
Sample random points $x''$ on the lens plane
You know where the ray passing through the lens will go (the in-focus point $x'''$ on the plane of focus, found using the thin lens formula), so estimate the radiance along the ray $x'' \to x'''$
Depth of field
Set the circle of confusion as the maximum permissible blur spot on the image plane that will appear sharp under final viewing conditions
Circle of confusion for
depth of field
I.e. depth range in a scene where the corresponding CoC is considered
small enough
Light field / lumigraph
The plenoptic function
$P(\theta, \phi)$ is the intensity of light
Seen from a single viewpoint
At a single time
Averaged over the wavelengths of the visible spectrum
The full version $P(\theta, \phi, \lambda, t, V_x, V_y, V_z)$ is the intensity of light
Seen from ANY viewpoint
Over time
As a function of wavelength
This is the holographic movie
Can reconstruct every possible view, at every moment, from every
position, at every wavelength
Contains every photograph, every movie, everything that anyone has
ever seen. It completely captures our visual reality
Only plenoptic surface
The surface of a cube holds all the radiance information due to the
enclosed object
12. Color and Perception
Physical basis of color
The fundamental components
of light
Newton showed sunlight can be subdivided into a rainbow with a
prism
Resulting light cannot be further subdivided with a second
prism
Spectral power distribution
(SPD)
Salient property in measuring light
The amount of light present at each wavelength
Units:
Radiometric units / nanometer (e.g. watts / nm)
Can also be unit-less
Often use "relative units" scaled to maximum wavelength for
comparison across wavelengths when absolute units are not important
What is color?
Color is a phenomenon of human perception, it is not a universal
property of light
Different wavelengths of light are not "colors"
Biological basis of color
Retinal
photoreceptor cells: Rods and Cones
Rods are primary receptors in very low light ("scotopic" conditions),
e.g. dim moonlight
~ 120 million rods in eye
Perceive only shades of grey, no color
Cones are primary receptors in typical light levels ("photopic")
~ 6-7 million cones in eye
Three types of cones, each with different spectral sensitivity
Provide sensation of color
Spectral response of
human cone cells
Three types of cone cells: S, M, and L (corresponding to peak
response at short, medium, and long wavelengths)
Fraction of three cone cell types varies widely
Distribution of cone cells at edge of fovea in 12 different humans
with normal color vision. High variability of percentage of different
cone types
Now we have three detectors (S, M, L cones cells)
The human visual system
Human eye does not measure and brain does not receive information
about each wavelength of light
Rather, the eye "sees" only three response values (S, M, L), and
this is only info available to brain
Metamers
Metamers are two different spectra ($\infty$-dimensional) that project to the same (S, M, L) (3-dimensional) response
These will appear to have the same color to a human
The existence of metamers is critical to color reproduction
Don't have to reproduce the full spectrum of a real world scene
Example: A metamer can reproduce the perceived color of a real-world scene on a display with pixels of only three colors
This is the theory behind color matching
Color reproduction / Matching
Additive color
Given a set of primary lights, each with its own spectral
distribution (e.g. RGB display pixels)
Adjust the brightness of these lights and add them together
The color is now described by the scalar values $(R, G, B)$:
CIE RGB color matching
experiment
Same setup as additive color matching before, but primaries are
monochromatic light (single wavelength): 700nm, 546.1nm, 435.8nm
Graph plots how much of each CIE RGB primary light must be combined
to match a monochromatic light of wavelength given on x-axis
For any spectrum $s(\lambda)$, the perceived color is matched by integrating the spectrum against the CIE RGB color matching functions shown in the graph
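As a worked equation (a sketch of the standard formulation, where $s(\lambda)$ is the input spectrum and $\bar{r}, \bar{g}, \bar{b}$ are the CIE RGB matching functions):

```latex
R = \int_{\lambda} s(\lambda)\,\bar{r}(\lambda)\,\mathrm{d}\lambda, \quad
G = \int_{\lambda} s(\lambda)\,\bar{g}(\lambda)\,\mathrm{d}\lambda, \quad
B = \int_{\lambda} s(\lambda)\,\bar{b}(\lambda)\,\mathrm{d}\lambda
```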
Color spaces
Standard color spaces (sRGB)
Makes a particular monitor RGB standard
Other color devices simulate that monitor by calibration
Widely adopted today
Gamut is limited
A universal color space: CIE
XYZ
Imaginary set of standard color primaries X, Y, Z
Primary colors with these matching functions do not exist
Y is luminance (brightness regardless of color)
Designed such that
Matching functions are strictly positive
Span all observable colors
Separating luminance, Chromaticity
Luminance: Y
Chromaticity: x, y, z, defined as
$x = \frac{X}{X+Y+Z}$, $\quad y = \frac{Y}{X+Y+Z}$, $\quad z = \frac{Z}{X+Y+Z}$
Since $x + y + z = 1$, we only need to record two of the three
Usually choose $x$ and $y$, leading to $(x, y)$ coordinates at a specific brightness $Y$
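A minimal sketch of the XYZ-to-chromaticity split described above (the function name is mine, not from the course):

```python
def xyz_to_xyY(X, Y, Z):
    """Split an XYZ color into chromaticity (x, y) plus luminance Y."""
    s = X + Y + Z
    if s == 0.0:
        return 0.0, 0.0, 0.0   # black: chromaticity is undefined, pick a convention
    x = X / s
    y = Y / s                  # z = 1 - x - y, so it does not need to be stored
    return x, y, Y
```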
The curved boundary
Named spectral locus
Corresponds to monochromatic light (each point representing the pure color of a single wavelength)
Gamut
Different color space gives a different gamut.
Perceptually organized
color spaces
HSV color space
(Hue-Saturation-Value)
Axes correspond to artistic characteristics of color
Widely used in a "color picker"
Hue
the "kind" of color, regardless of attributes
colorimetric correlate: dominant wavelength
artist's correlate: the chosen pigment color
Saturation
the "colorfulness"
colorimetric correlate: purity
artist's correlate: fraction of paint from the colored tube
Lightness (or value)
the overall amount of light
colorimetric correlate: luminance
artist's correlate: tints are lighter, shades are darker
CIELAB space (aka. L* a* b*)
A commonly used color space that strives for perceptual
uniformity
L* is lightness (brightness)
a* and b* are color-opponent pairs
a* is red-green
b* is blue-yellow
Opponent color theory
There is a good neurological basis for the color space dimensions in
CIE LAB
The brain seems to encode color early on using three axes:
White - black, red - green, yellow - blue
The white - black axis is lightness, the others determine hue and
saturation
One piece of evidence: there is a light green, a dark green, a yellowish green, and a bluish green, but you cannot find a reddish green
CMYK: A subtractive color
space
Subtractive color model
The more you mix, the darker it will be
Cyan, magenta, yellow, and key (black)
Widely used in printing
13. Animation
Animation
"Bring things to life"
Communication tool
Aesthetic issues often dominate technical issues
An extension of modeling
Represent scene models as a function of time
Output: sequence of images that when viewed sequentially provide a
sense of motion
Film: 24 frames per second
Video (in general): 30 fps
Virtual reality: 90 fps
History
First film
Originally used as a scientific tool rather than for entertainment
Critical technology that accelerated development of animation
First hand-drawn feature-length (>40 mins) animations: 1937
First digital computer-generated animation: 1963
Early computer animation: 1972
Digital dinosaurs: 1993
First CG feature-length film: 1995
Computer animation becomes quite good: 2009
Keyframe animation
Animator (e.g. lead animator) creates keyframes
Assistant (person or computer) creates in-between frames
Keyframe interpolation
Think of each frame as a vector of parameter values
Keyframe interpolation of each parameter
Linear interpolation usually not good enough
Splines for smooth / controllable interpolation
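A minimal sketch contrasting linear interpolation with a Catmull-Rom spline segment for one keyframed scalar parameter (helper names are mine; the spline takes four consecutive keyframe values):

```python
import bisect

def lerp_keyframes(times, values, t):
    """Piecewise-linear interpolation of keyframed values at time t."""
    i = bisect.bisect_right(times, t) - 1
    i = max(0, min(i, len(times) - 2))
    u = (t - times[i]) / (times[i + 1] - times[i])
    return (1 - u) * values[i] + u * values[i + 1]

def catmull_rom(p0, p1, p2, p3, u):
    """Catmull-Rom spline segment between p1 and p2, with u in [0, 1]."""
    return 0.5 * ((2 * p1)
                  + (-p0 + p2) * u
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * u ** 2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * u ** 3)
```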
Physical simulation
Physical based animation
Generate motion of objects using numerical simulation
Mass
spring system: Example of modeling a dynamic system
A simple idealized spring
Force pulls points together: $f_{a \to b} = k_s (b - a)$ (and $f_{b \to a} = -f_{a \to b}$)
Strength proportional to displacement (Hooke's law)
$k_s$ is a spring coefficient: stiffness
Non-zero length spring
Spring with non-zero rest length: $f_{a \to b} = k_s \frac{b - a}{\lVert b - a \rVert} \left( \lVert b - a \rVert - l \right)$. Here $l$ is the rest length
Problem: infinite oscillation
Introducing energy loss
Simple motion damping: $f = -k_d \dot{x}$
Behaves like viscous drag on motion
Slows down motion in the direction of velocity
$k_d$ is a damping coefficient
Problem: Slows down all motion
Want a rusty spring's oscillations to slow down, but should it also fall to the ground more slowly?
Internal damping for spring
Damp only the internal, spring-driven motion
Viscous drag only on change in spring length
Won't slow group motion for the spring system (e.g. global
translation or rotation of the group)
Note this is only one specific type of damping (a code sketch of these spring forces follows below)
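A minimal sketch of the rest-length spring force plus internal (length-only) damping described above, using numpy; the coefficients and variable names are illustrative:

```python
import numpy as np

def spring_forces(a, b, va, vb, ks, kd, rest_len):
    """Force on particle a from a spring a--b with rest length rest_len,
    plus damping applied only along the spring direction."""
    d = b - a
    length = np.linalg.norm(d)
    dir_ab = d / length                        # unit vector from a to b
    f_spring = ks * (length - rest_len) * dir_ab
    rel_vel = np.dot(vb - va, dir_ab)          # rate of change of spring length
    f_damp = kd * rel_vel * dir_ab             # viscous drag on length change only
    f_a = f_spring + f_damp
    return f_a, -f_a                           # force on a, force on b
```

Because the damping term acts only along the spring axis, a rigid translation of both endpoints produces zero damping force, which is exactly the property described above.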
Structures from springs
Behavior is determined by structure linkages
Sheets
Blocks
Others
Particle systems
Model dynamical systems as collections of large numbers of
particles
Each particle's motion is defined by a set of physical (or
non-physical) forces
Popular technique in graphics and games
Easy to understand, implement
Scalable: fewer particles for speed, more for higher complexity
Challenges
May need many particles (e.g. fluids)
May need acceleration structures (e.g. to find nearest particles for
interactions)
Particle system animations
For each frame in animation
[If needed] create new particles
Calculate forces on each particle
Update each particle's position and velocity
[If needed] remove dead particles
Render particles
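A minimal sketch of one frame of the loop above for particles under gravity with forward Euler (emission and rendering are omitted; the names and the unit-mass assumption are mine):

```python
import numpy as np

GRAVITY = np.array([0.0, -9.8, 0.0])

def step_particles(positions, velocities, lifetimes, dt):
    """One frame: accumulate forces, integrate with forward Euler, cull dead particles."""
    forces = np.tile(GRAVITY, (len(positions), 1))   # only gravity here; add more as needed
    velocities = velocities + dt * forces            # unit mass assumed, so a = f
    positions = positions + dt * velocities
    lifetimes = lifetimes - dt
    alive = lifetimes > 0.0                          # [if needed] remove dead particles
    return positions[alive], velocities[alive], lifetimes[alive]
```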
Particle system forces
Attraction and repulsion forces
Gravity, electromagnetism, ...
Springs, propulsion, ...
Damping forces
Friction, air drag, viscosity, ...
Collisions
Walls, containers, fixed objects, ...
Dynamic objects, character body parts, ...
Simulated flocking as an ODE
Model each bird as a particle
Kinematics
Forward kinematics
Articulated skeleton
Topology (what's connected to what)
Geometric relations from joints
Tree structure (in absence of loops)
Joint types
Pin (1D rotation)
Ball (2D rotation)
Prismatic joint (translation)
Forward kinematics
Animation is described as angle parameter values as a function of
time
Kinematics pros and cons
Strengths
Direct control is convenient
Implementation is straightforward
Weaknesses
Animation may be inconsistent with physics
Time consuming for artists
Inverse kinematics
Direct inverse kinematics: for two-segment arm, can solve for
parameters analytically
Why is the problem hard?
Multiple solutions in configuration space
Solutions may not always exist
Numerical solution to general N-link IK problem
Choose an initial configuration
Define an error metric (e.g. square of distance between goal and
current position)
Compute gradient of error as function of configuration
Apply gradient descent (or Newton's method, or other optimization
procedure)
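A minimal sketch of the recipe above for a planar two-link arm: define the squared-distance error, compute its gradient through the Jacobian, and run plain gradient descent (link lengths, learning rate, and iteration count are illustrative):

```python
import numpy as np

def end_effector(theta, l1=1.0, l2=1.0):
    """Planar two-link arm: end-effector position from joint angles."""
    t1, t2 = theta
    return np.array([l1 * np.cos(t1) + l2 * np.cos(t1 + t2),
                     l1 * np.sin(t1) + l2 * np.sin(t1 + t2)])

def ik_gradient_descent(goal, theta=np.zeros(2), l1=1.0, l2=1.0,
                        lr=0.1, iters=500):
    """Minimize ||p(theta) - goal||^2 over joint angles by gradient descent."""
    for _ in range(iters):
        p = end_effector(theta, l1, l2)
        t1, t12 = theta[0], theta[0] + theta[1]
        # Jacobian of the end-effector position w.r.t. (theta1, theta2)
        J = np.array([[-l1 * np.sin(t1) - l2 * np.sin(t12), -l2 * np.sin(t12)],
                      [ l1 * np.cos(t1) + l2 * np.cos(t12),  l2 * np.cos(t12)]])
        grad = 2.0 * J.T @ (p - goal)   # gradient of the squared-distance error
        theta = theta - lr * grad
    return theta
```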
Rigging
Rigging is a set of higher level controls on a character that allow
more rapid & intuitive modification of pose, deformations,
expressions, etc.
Important
Like strings on a puppet
Captures all meaningful character changes
Varies from character to character
Expensive to create
Manual effort
Requires both artistic and technical training
Blend shapes
Instead of skeleton, interpolate directly between surfaces
E.g. model a collection of facial expressions
Simplest scheme: take linear combination of vertex positions
Spline used to control choice of weights over time
Motion capture
Data-driven approach to creating animation sequences
Record real-world performances (e.g. person executing an
activity)
Extract pose as a function of time from the data collected
Strengths
Can capture large amounts of real data quickly
Realism can be high
Weaknesses
Complex and costly set-ups
Captured animation may not meet artistic needs, requiring
alterations
Single particle simulation
First study motion of a
single particle
To start, assume the motion of the particle is determined by a velocity vector field that is a function of position and time: $v(x, t)$
Computing the position of the particle over time requires solving a first-order ODE: $\frac{dx}{dt} = \dot{x} = v(x, t)$
We can solve the ODE, subject to a given initial particle position $x_0$, by using forward numerical integration
Euler's method
Simple iterative method
Commonly used
Very inaccurate
Most often goes unstable
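A minimal sketch of forward (explicit) Euler for a particle in a velocity field $v(x, t)$; the rotational field here is a made-up example:

```python
import numpy as np

def velocity_field(x, t):
    """Illustrative velocity field: rigid rotation about the origin."""
    return np.array([-x[1], x[0]])

def forward_euler(x0, dt, steps, v=velocity_field):
    """x^{t+dt} = x^t + dt * v(x^t, t), starting from x0."""
    x = np.asarray(x0, dtype=float)
    t = 0.0
    path = [x.copy()]
    for _ in range(steps):
        x = x + dt * v(x, t)
        t += dt
        path.append(x.copy())
    return path
```

For this field the exact trajectory is a circle, but the Euler trajectory spirals outward, which illustrates the error accumulation and instability discussed next.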
Errors of Euler method
With numerical integration, errors accumulate. Euler integration is
particularly bad
Instability of the Euler
method
Inaccuracies increase as the time step increases
Instability is a common, serious problem that can cause a simulation to diverge
Lack of stability is a fundamental problem in simulation, and cannot be ignored
Some methods to combat
instability
Midpoint method / modified Euler
Average velocities at start and endpoint
Adaptive step size
Compare one step and two half-steps, recursively, until error is
acceptable
Implicit methods
Use the velocity at the next time step
Position-based / Verlet integration
Constrain positions and velocities of particles after time step
Midpoint method
Compute Euler step
Compute derivative at midpoint of Euler step
Update position using midpoint derivative
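A minimal sketch of a single midpoint step, where v is the velocity field function:

```python
def midpoint_step(x, t, dt, v):
    """One midpoint (modified Euler) step for dx/dt = v(x, t)."""
    x_mid = x + 0.5 * dt * v(x, t)           # half of an Euler step
    return x + dt * v(x_mid, t + 0.5 * dt)   # full step using the midpoint derivative
```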
Adaptive step size
Technique for choosing step size based on error estimate
Very practical technique
But may need very small steps
Repeat until error is below threshold:
Compute an Euler step of size $T$, giving $x_T$
Compute two Euler steps of size $T/2$, giving $x_{T/2}$
Compute the error $\lVert x_T - x_{T/2} \rVert$
If (error > threshold), reduce step size and try again
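A minimal sketch of the step-size loop above (v is the velocity field; the tolerance and the lower bound on dt are arbitrary):

```python
import numpy as np

def adaptive_euler_step(x, t, dt, v, tol=1e-3):
    """Shrink dt until one full Euler step and two half steps agree."""
    while True:
        x_full = x + dt * v(x, t)                               # one step of size dt
        x_half = x + 0.5 * dt * v(x, t)
        x_two = x_half + 0.5 * dt * v(x_half, t + 0.5 * dt)     # two steps of size dt/2
        error = np.linalg.norm(x_full - x_two)
        if error <= tol or dt < 1e-8:
            return x_two, dt        # accept the finer estimate and the step size used
        dt *= 0.5                   # error too large: reduce step size and try again
```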
Implicit Euler method
Informally called backward methods
Use derivatives in the future, for the current step: $x^{t+\Delta t} = x^t + \Delta t\, \dot{x}^{t+\Delta t}$, $\ \dot{x}^{t+\Delta t} = \dot{x}^t + \Delta t\, \ddot{x}^{t+\Delta t}$
Solve the nonlinear problem for $x^{t+\Delta t}$ and $\dot{x}^{t+\Delta t}$
Use root-finding algorithm, e.g. Newton's method
Offers much better stability
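The general implicit step needs a root-finder as noted above; for the linear test equation $\dot{x} = -k x$ the implicit update can be solved in closed form, which makes the stability contrast easy to see (a sketch, not the general method):

```python
def explicit_euler_decay(x0, k, dt, steps):
    """x' = -k x with forward Euler; diverges once dt > 2 / k."""
    x = x0
    for _ in range(steps):
        x = x + dt * (-k * x)
    return x

def implicit_euler_decay(x0, k, dt, steps):
    """x' = -k x with backward Euler: solve x_new = x + dt * (-k * x_new)."""
    x = x0
    for _ in range(steps):
        x = x / (1.0 + k * dt)     # closed-form solve of the implicit equation
    return x
```

With k = 10 and dt = 0.5, the explicit version blows up (each step multiplies by 1 - k*dt = -4), while the implicit one decays monotonically.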
How to determine / quantify
"stability"
We use the local truncation error (every step) / total accumulated
error (overall)
Absolute values do not matter, but the orders with respect to the step size $h$ do
Implicit Euler has order 1, which means that
Local truncation error: $O(h^2)$
Global truncation error: $O(h)$
Understanding of $O(h)$
If we halve $h$, we can expect the error to halve as well
Runge-Kutta Families
A family of advanced methods for solving ODEs
Especially good at dealing with non-linearity
Its order-four version is the most widely used, a.k.a. RK4
Position-based / Verlet
integration
Idea
After modified Euler forward-step, constrain positions of particles
to prevent divergent, unstable behavior
Use constrained positions to calculate velocity
Both of these ideas will dissipate energy, stabilize
Pros / Cons
Fast and simple
Not physically based, dissipates energy (error)
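A minimal sketch combining a Verlet position update with a simple position constraint (staying above a floor), then re-deriving velocity from positions, as described above; all names and the choice of constraint are illustrative:

```python
import numpy as np

def position_based_step(x, x_prev, dt, accel, floor_y=0.0):
    """Verlet-style step, then constrain the position and recompute velocity."""
    x_new = 2.0 * x - x_prev + accel * dt ** 2   # Verlet position update
    x_new[1] = max(x_new[1], floor_y)            # constrain position (stay above floor)
    velocity = (x_new - x) / dt                  # velocity derived from positions
    return x_new, x, velocity                    # (current, previous, velocity)
```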
Rigid body simulation
Simple case
Similar to simulating a particle
Just consider a few more properties. Here
$X$: positions
$\theta$: rotation angle
$\omega$: angular velocity
$F$: forces
$\tau$: torque
$I$: moment of inertia
A simple position-based
method
Key idea:
Assuming water is composed of small rigid-body spheres
Assuming the water cannot be compressed
So, as long as the density changes somewhere, it should be
"corrected" via changing the position of particles
You need to know the gradient of the density anywhere w.r.t. each
particle's position
Update with gradient descent
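A heavily simplified 1D toy of this idea: estimate per-particle density with a kernel, then run gradient descent on the squared density error with respect to particle positions (finite differences stand in for the analytic gradient; the kernel, learning rate, and names are all illustrative, not the actual method used for water):

```python
import numpy as np

def density(xs, h=0.2):
    """Per-particle density from a Gaussian kernel over all particles (1D toy)."""
    d = xs[:, None] - xs[None, :]
    return np.exp(-(d / h) ** 2).sum(axis=1)

def correct_positions(xs, rho0, lr=1e-3, iters=50, eps=1e-5):
    """Gradient descent on sum_i (rho_i - rho0)^2 w.r.t. particle positions."""
    def loss(x):
        return np.sum((density(x) - rho0) ** 2)

    for _ in range(iters):
        grad = np.zeros_like(xs)
        for i in range(len(xs)):          # finite-difference gradient per particle
            bumped = xs.copy()
            bumped[i] += eps
            grad[i] = (loss(bumped) - loss(xs)) / eps
        xs = xs - lr * grad
    return xs
```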
Eulerian vs. Lagrangian
Two different views of simulating large collections of matter
Lagrangian approach: follow the same particle throughout its movement
Eulerian approach: split the space into a grid of smaller cells, follow how each cell changes over time
Material point method (MPM)
Hybrid, combining Eulerian and Lagrangian views
Lagrangian: consider particles carrying material properties
Eulerian: use a grid to do numerical integration
Interaction: particles transfer properties to the grid, grid
performs update, then interpolate back to particles