Notes for the course GAMES 101 - Modern Computer Graphics. This course
gives a comprehensive introduction to four fundamental areas of
modern computer graphics: 1. rasterization, 2. geometry, 3. ray tracing,
4. animation and simulation.
The course is originally given in Chinese, but the course notes are
available in English.
1. Overview of Computer
Graphics
Why study computer graphics?
Fundamental intellectual
challenges
* Create and interact with realistic virtual worlds
* Requires an understanding of all aspects of the physical world
* New computing methods, displays, technologies
Technical challenges
* Math of projections, curves, surfaces
* Physics of lighting and shading
* Representing / operating shapes in 3D
* Animation / simulation
* 3D graphics software programming and hardware
Course topics
Rasterization
* Project geometry primitives (3D triangles / polygons) onto the screen
* Break projected primitives into fragments (pixels)
* Gold standard in Video Games (Real-time applications)
Curves and meshes
* How to represent geometry in Computer Graphics
Ray tracing
* Shoot rays from the camera through each pixel
* Calculate intersection and shading
* Continue to bounce the rays till they hit light sources
* Gold standard in animations / movies (offline applications)
Animation / simulation
* Key frame Animation
* Mass-spring system
2. Review of Linear Algebra
Graphics' Dependencies
Basic mathematics
Linear algebra, calculus, statistics
Basic physics
Optics, mechanics
Misc
Signal processing
Numerical analysis
A bit of aesthetics
Cross product in graphics
Determine left or right
Determine inside or outside
3. Transformation
Why study transformation
Modeling
Translation
Scaling
Viewing
(3D to 2D) projection
2D transformations
Representing transformations using matrices
Rotation, scale, shear
Scale transform
Non-uniform scale transform
Reflection matrix
Horizontal reflection
Shear matrix
For example, horizontal shift is 0 at $y = 0$ and $a$ at $y = 1$; vertical shift is always 0. The shear matrix is $\begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix}$.
Rotate (about the origin $(0, 0)$, counterclockwise by default): $R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$
Linear
transformation = Matrix multiplication
Homogeneous coordinate
Translation
Why homogeneous coordinates
Translation
cannot be represented in matrix form
But we don't want translation to be a special case
Is there a unified way to represent all transformations (and what is
the cost)?
Solution: Homogenous
coordinates
Add a third coordinate (the $w$-coordinate)
2D point = $(x, y, 1)^T$
2D vector = $(x, y, 0)^T$
The reason why $w = 0$ for a 2D vector is that a vector is invariant under translation
An operation is valid if the $w$-coordinate of the result is 1 or 0
vector + vector = vector
point - point = vector
point + vector = point
point + point = midpoint (see the following definition)
In homogeneous coordinates, $(x, y, w)^T$ represents the 2D point $(x/w, y/w, 1)^T$, $w \neq 0$
Matrix representation of translations: $\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$
Affine transformations
Affine map = linear map + translation: $\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} t_x \\ t_y \end{pmatrix}$
Using homogeneous coordinates: $\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$
Inverse transform
$M^{-1}$ is the inverse of transform $M$ in both the matrix and the geometric sense
Composing transformations
Transform ordering matters
Matrix multiplication is not commutative
Note that matrices are applied right to left
Sequence of affine transforms $A_1, A_2, A_3, \dots$
Compose by matrix multiplication: $A_n(\dots A_2(A_1(\mathbf{x}))) = \mathbf{A}_n \cdots \mathbf{A}_2 \cdot \mathbf{A}_1 \cdot \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$
Very important for performance
Pre-multiply matrices to
obtain a single matrix representing combined transform
Decomposing complex
transforms
How to rotate by an angle $\alpha$ around a given point $c$?
Translate the center to the origin: $T(-c)$
Rotate: $R(\alpha)$
Translate back: $T(c)$
Matrix representation: $M = T(c) \cdot R(\alpha) \cdot T(-c)$ (see the sketch below)
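A minimal NumPy sketch of this composition (my own helper names, not course code): it builds homogeneous matrices $T(c)$, $R(\alpha)$, $T(-c)$ and multiplies them right to left.

```python
import numpy as np

def translate(tx, ty):
    """2D homogeneous translation matrix."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotate(alpha):
    """2D homogeneous rotation about the origin (counterclockwise)."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def rotate_about(alpha, cx, cy):
    """Rotate about point (cx, cy): translate to origin, rotate, translate back."""
    return translate(cx, cy) @ rotate(alpha) @ translate(-cx, -cy)

# Rotating (2, 1) by 90 degrees about (1, 1) gives (1, 2).
p = np.array([2.0, 1.0, 1.0])            # point in homogeneous coordinates
print(rotate_about(np.pi / 2, 1.0, 1.0) @ p)
```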
3D transforms
Use homogeneous coordinates again:
3D point = $(x, y, z, 1)^T$
3D vector = $(x, y, z, 0)^T$
In general, $(x, y, z, w)^T$ is the 3D point $(x/w, y/w, z/w)$ for $w \neq 0$
Use $4 \times 4$ matrices for affine transformations
3D transformations
Scale
Translation
Rotation around x-, y-, or
z-axis
Compose any 3D rotation from $R_x$, $R_y$, $R_z$? $R_{xyz}(\alpha, \beta, \gamma) = R_x(\alpha)\, R_y(\beta)\, R_z(\gamma)$
So-called Euler angles
Often used in flight simulators: roll, pitch, yaw
Rodrigues' rotation formula
Rotation by angle $\alpha$ around axis $\mathbf{n}$: $R(\mathbf{n}, \alpha) = \cos\alpha\, \mathbf{I} + (1 - \cos\alpha)\, \mathbf{n}\mathbf{n}^T + \sin\alpha \begin{pmatrix} 0 & -n_z & n_y \\ n_z & 0 & -n_x \\ -n_y & n_x & 0 \end{pmatrix}$ (see the sketch below)
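A small NumPy sketch of Rodrigues' formula exactly as stated above (my own helper; assumes n is a unit vector):

```python
import numpy as np

def rodrigues(n, alpha):
    """Rotation matrix for angle alpha (radians) about unit axis n."""
    n = np.asarray(n, dtype=float)
    N = np.array([[0.0,  -n[2],  n[1]],   # cross-product (skew-symmetric) matrix
                  [n[2],  0.0,  -n[0]],
                  [-n[1], n[0],  0.0]])
    return (np.cos(alpha) * np.eye(3)
            + (1.0 - np.cos(alpha)) * np.outer(n, n)
            + np.sin(alpha) * N)

# Rotating the x-axis by 90 degrees about the z-axis gives the y-axis.
print(rodrigues([0, 0, 1], np.pi / 2) @ np.array([1.0, 0.0, 0.0]))
```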
View / camera transformation
What is view transformation?
Think about how to take a photo
Find a good place and arrange people (model transformation)
Find a good "angle" to put the camera (view transformation)
Cheese! (projection transformation)
How to perform view
transformation?
Define the camera first
Position $\vec{e}$
Look-at / gaze direction $\hat{g}$
Up direction $\hat{t}$ (assumed perpendicular to the look-at direction)
Key observation
If the camera and all objects move together, the "photo" will be the
same
How about we always transform the camera to
The origin, up at $Y$, looking at $-Z$
And transform the objects along with the camera
Transform the camera by $M_{view}$
So it's located at the origin, up at $Y$, looking at $-Z$
$M_{view}$ in math:
Let's write $M_{view} = R_{view} T_{view}$
$T_{view}$ translates $\vec{e}$ to the origin
$R_{view}$ rotates $\hat{g}$ to $-Z$
Rotates $\hat{t}$ to $Y$
Rotates $\hat{g} \times \hat{t}$ to $X$
Writing $R_{view}$ directly is hard, so consider the inverse rotation $R_{view}^{-1}$: $X$ to $\hat{g} \times \hat{t}$, $Y$ to $\hat{t}$, $Z$ to $-\hat{g}$:
$T_{view} = \begin{pmatrix} 1 & 0 & 0 & -x_e \\ 0 & 1 & 0 & -y_e \\ 0 & 0 & 1 & -z_e \\ 0 & 0 & 0 & 1 \end{pmatrix}$, $R_{view}^{-1} = \begin{pmatrix} x_{\hat{g} \times \hat{t}} & x_t & x_{-g} & 0 \\ y_{\hat{g} \times \hat{t}} & y_t & y_{-g} & 0 \\ z_{\hat{g} \times \hat{t}} & z_t & z_{-g} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$, so $R_{view} = (R_{view}^{-1})^T$ since rotation matrices are orthogonal (a code sketch follows below)
Transform objects together with the camera
Also known as ModelView Transformation
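A sketch of the view matrix built from exactly these pieces (NumPy; my own function name; e is the eye position, g the gaze direction, t the up vector, assumed perpendicular to g):

```python
import numpy as np

def view_matrix(e, g, t):
    """M_view = R_view @ T_view: camera at the origin, looking down -Z, up +Y."""
    g = np.asarray(g, float); g = g / np.linalg.norm(g)      # gaze direction
    t = np.asarray(t, float); t = t / np.linalg.norm(t)      # up direction
    x = np.cross(g, t)                                       # camera x-axis = g x t
    T = np.eye(4); T[:3, 3] = -np.asarray(e, float)          # translate e to the origin
    R = np.eye(4)
    R[0, :3], R[1, :3], R[2, :3] = x, t, -g                  # R_view = (R_view^-1)^T
    return R @ T

# A camera at (0, 0, 5) already looking down -Z gets only a translation.
print(view_matrix([0, 0, 5], [0, 0, -1], [0, 1, 0]))
```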
Projection transformation
Projection in computer graphics
3D to 2D
Orthographic projection
The camera is considered to be located infinitely far away
A simple way of understanding it
Camera located at the origin, looking at $-Z$, up at $Y$
Drop the $Z$ coordinate
Translate and scale the resulting rectangle to $[-1, 1]^2$
In general
We want to map a cuboid $[l, r] \times [b, t] \times [f, n]$ to the "canonical" cube $[-1, 1]^3$
Method: a slightly different order from the "simple way"
Center the cuboid by translating
Scale it into the "canonical" cube
Transformation matrix
Translate (center to origin) first, then scale (length / width / height to 2): $M_{ortho} = \begin{pmatrix} \frac{2}{r - l} & 0 & 0 & 0 \\ 0 & \frac{2}{t - b} & 0 & 0 \\ 0 & 0 & \frac{2}{n - f} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & -\frac{r + l}{2} \\ 0 & 1 & 0 & -\frac{t + b}{2} \\ 0 & 0 & 1 & -\frac{n + f}{2} \\ 0 & 0 & 0 & 1 \end{pmatrix}$ (see the sketch below)
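A sketch of this matrix in NumPy (my own helper; it assumes the looking-down-$-Z$ convention, so $n > f$):

```python
import numpy as np

def ortho(l, r, b, t, f, n):
    """Map [l,r] x [b,t] x [f,n] to the canonical cube [-1,1]^3: translate, then scale."""
    scale = np.diag([2.0 / (r - l), 2.0 / (t - b), 2.0 / (n - f), 1.0])
    trans = np.eye(4)
    trans[:3, 3] = [-(r + l) / 2.0, -(t + b) / 2.0, -(n + f) / 2.0]
    return scale @ trans
```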
Caveat
Looking at / along $-Z$ makes near and far not intuitive ($n > f$)
FYI: that is why OpenGL uses a left-handed coordinate system here
Perspective projection
The camera is considered as a point at a certain distance
Most common in computer graphics, art, visual system
Further objects are smaller
Parallel lines not parallel; converge to single point
Recall a property of homogeneous coordinates: $(x, y, z, 1)$, $(kx, ky, kz, k)$ with $k \neq 0$, and $(xz, yz, z^2, z)$ with $z \neq 0$ all represent the same point $(x, y, z)$ in 3D
How to do perspective
projection
First "squish" the frustum into a cuboid ,
Do orthographic projection (, already known)
In order to find a transformation
Recall the key idea: find the relationship between transformed
point and
the original points :
In homogeneous coordinates,
So the "squish" (persp to ortho) projection does this:
This is already enough to figure out part of $M_{persp \to ortho}$: $M_{persp \to ortho} = \begin{pmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ ? & ? & ? & ? \\ 0 & 0 & 1 & 0 \end{pmatrix}$
Observation: the third row is responsible for $z'$
Any point on the near plane ($z = n$) will not change
So the third row must be of the form $(0, 0, A, B)$ with $An + B = n^2$
Here $A$ and $B$ have nothing to do with $x$ and $y$
Any point's $z$ on the far plane ($z = f$) will not change, which gives $Af + B = f^2$
The two conditions give $A = n + f$ and $B = -nf$
Finally, every entry in $M_{persp \to ortho}$ is filled: $M_{persp \to ortho} = \begin{pmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n + f & -nf \\ 0 & 0 & 1 & 0 \end{pmatrix}$ (see the sketch below)
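Putting the derivation together, a sketch of the "squish" matrix (my own helper; n and f are the signed near/far z values, and the full projection reuses the ortho() helper sketched earlier):

```python
import numpy as np

def persp_to_ortho(n, f):
    """'Squish' matrix derived above: keeps the near plane and the far-plane center fixed."""
    return np.array([[n, 0, 0,     0],
                     [0, n, 0,     0],
                     [0, 0, n + f, -n * f],
                     [0, 0, 1,     0]], dtype=float)

# Full perspective projection (using the ortho() sketch from the orthographic section):
# M_persp = ortho(l, r, b, t, f, n) @ persp_to_ortho(n, f)
```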
4. Rasterization
Perspective projection
What are the near plane's $l$, $r$, $b$, $t$?
If explicitly specified, good
Sometimes people prefer a vertical field-of-view ($fovY$) and an aspect ratio = width / height
Assume symmetry, i.e. $l = -r$, $b = -t$
How to convert from $fovY$ and aspect to $l, r, b, t$? $\tan\frac{fovY}{2} = \frac{t}{|n|}$ and $\text{aspect} = \frac{r}{t}$
What is after MVP?
Model transformation (placing objects)
View transformation (placing camera)
Projection transformation
Orthographic projection (cuboid to "canonical" cube $[-1, 1]^3$)
Perspective projection (frustum to "canonical" cube)
Canonical cube to screen
What is a screen?
An array of pixels
Size of the array: resolution
A typical kind of raster display
Raster == screen in German
Rasterize == drawing onto the screen
Pixel (FYI, short for "picture element")
For now: a pixel is a little square with uniform color
Color is a mixture of (red, green, blue)
Defining the screen space
Slightly different from the "tiger book"
$(0, 0)$ is at the lower-left corner.
The $x$-axis points right, the $y$-axis points up
Pixels' indices are of the form $(x, y)$, where both $x$ and $y$ are integers
Pixels' indices go from $(0, 0)$ to $(width - 1, height - 1)$
Pixel $(x, y)$ is centered at $(x + 0.5, y + 0.5)$
The screen covers the range $(0, 0)$ to $(width, height)$
Irrelevant to $z$
Transform in the $xy$-plane: $[-1, 1]^2$ to $[0, width] \times [0, height]$
Viewport transform matrix: $M_{viewport} = \begin{pmatrix} \frac{width}{2} & 0 & 0 & \frac{width}{2} \\ 0 & \frac{height}{2} & 0 & \frac{height}{2} \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$ (see the sketch below)
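A sketch (my own helper, not course code) that performs the perspective divide and then the viewport mapping above on one clip-space point:

```python
import numpy as np

def to_screen(p_clip, width, height):
    """Perspective divide, then map xy from [-1,1]^2 to pixel coordinates."""
    p_ndc = p_clip[:3] / p_clip[3]                 # divide by w
    x = (p_ndc[0] + 1.0) * 0.5 * width             # [-1,1] -> [0, width]
    y = (p_ndc[1] + 1.0) * 0.5 * height            # [-1,1] -> [0, height]
    return np.array([x, y, p_ndc[2]])              # keep z for the depth buffer
```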
Rasterization: Drawing
to raster displays
Triangles -
Fundamental shape primitives
Why triangles
Most basic polygon
Break up other polygons
Unique properties
Guaranteed to be planar
Well-defined interior
Well-defined method for interpolating values at vertices over
triangle (barycentric interpolation)
What pixel values
approximate a triangle?
Input: position of triangle vertices projected on screen
Output: set of pixel values approximating triangle
A simple approach: sampling
Sampling a function: Evaluating a function at point is sampling. We
can discretize a function by sampling
Sampling is a core idea in graphics. We sample time (1D), area (2D),
direction (2D), volume (3D) ...
Thus define a binary function inside(tri, x, y); here x, y are not necessarily integers:
```
inside(t, x, y) = 1  if point (x, y) is in triangle t
                  0  otherwise
```
Rasterization =
Sampling a 2D indicator function:
```cpp
for (int x = 0; x < xmax; ++x)
    for (int y = 0; y < ymax; ++y)
        image[x][y] = inside(tri, x + 0.5, y + 0.5);
```
Evaluating inside(tri, x, y):
Three cross products should give same sign
Edge cases: you can define the situation by yourself
Do not check all pixels on the screen. Use an axis-aligned
bounding box (AABB)
A faster way: incremental triangle traversal: suitable for thin and
rotated triangles
Will lead to aliasing (Jaggies)
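Before moving on to antialiasing, here is a minimal Python sketch (my own helper names) of the cross-product inside test and the pixel-center sampling loop described above; it counts boundary samples as inside, which is one of the acceptable edge-case choices:

```python
def inside(tri, x, y):
    """1 if (x, y) is inside triangle tri = [(x0,y0), (x1,y1), (x2,y2)], else 0."""
    signs = []
    for i in range(3):
        ax, ay = tri[i]
        bx, by = tri[(i + 1) % 3]
        # z-component of (B - A) x (P - A): its sign says which side of edge AB the point is on
        signs.append((bx - ax) * (y - ay) - (by - ay) * (x - ax))
    return 1 if all(s >= 0 for s in signs) or all(s <= 0 for s in signs) else 0

def rasterize(tri, width, height):
    """Sample each pixel center; returns a binary coverage image."""
    return [[inside(tri, x + 0.5, y + 0.5) for x in range(width)]
            for y in range(height)]
```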
Antialiasing
Sampling
Sampling is ubiquitous in computer graphics
Photograph = Sampling image sensor plane
Video = Sampling time
Sampling
artifacts (errors / mistakes / inaccuracies) in computer graphics
Jaggies (staircase pattern)
This is also an example of "aliasing" - a sampling error
Moiré patterns in imaging
Occur because of skipping odd rows and columns
Wagon Wheel Illusion (False motion)
Problem with human eye - sampling in time slower than movement of an
object
Many more...
Behind the aliasing artifacts
Signals are changing too fast (high frequency), but sampled too
slowly
Antialiasing
idea: Blurring (Pre-Filtering) before sampling
Antialiased edges in rasterized triangle where pixel values take
intermediate values
But why aliasing and
pre-filtering?
Why does undersampling introduce aliasing?
Why does pre-filtering followed by sampling achieve antialiasing?
Fundamental reasons
Frequency domain
Fourier transform
Represent a function as a weighted sum of sines and cosines
Fourier transform decomposes a signal into frequencies
A function in the spatial domain is Fourier transformed into a new
function in the frequency domain
From a function in frequency domain, we can see that:
For low-frequency signal: sampled adequately for reasonable
reconstruction
A high-frequency signal is insufficiently sampled: the reconstruction incorrectly appears to come from a low-frequency signal
Undersampling creates frequency aliases
Two frequencies that are indistinguishable at a given sampling rate are called "aliases"
Filtering
= Getting rid of certain frequency contents. Also, Filtering =
Convolution (=Averaging)
Convolution
Point-wise average on a sliding box
Convolution theorem
Convolution in the spatial domain is equal to multiplication in the
frequency domain, and vice versa
Option 1:
Filter by convolution in the spatial domain
Option 2:
Transform to frequency domain (Fourier transform)
Multiply by Fourier transform of convolution kernel
Transform back to spatial domain (inverse Fourier)
Box filter
Box function = "Low pass" filter
Wider filter kernel = Lower Frequencies
Sampling = Repeating
frequency contents
Aliasing = Mixed frequency contents
Dense sampling = the repeated frequency contents do not overlap
Sparse sampling = the repeated frequency contents overlap, which gives aliasing
How can we reduce aliasing
error?
Option 1: Increase sampling rate
Essentially increasing the distance between replicas in the Fourier
domain
Higher resolution displays, sensors, framebuffers
But: costly & may need very high resolution
Option 2: Antialiasing
Making Fourier contents "narrower" before repeating
i.e. Filtering out high frequencies before sampling
Limiting, then repeating
A practical pre-filter
A 1 pixel-width box filter (low pass, blurring)
Antialiasing by computing average pixel value
In rasterizing one triangle, the average value of inside(triangle, x, y) over a pixel's area equals the fraction of the pixel covered by the triangle
Antialiasing by supersampling (MSAA)
Supersampling
Approximate the effect of the 1-pixel box filter by sampling multiple locations within a pixel and averaging their values (sketched below)
Step 1: take NxN samples within each pixel
Step 2: average the NxN samples "inside" each pixel
No free lunch! What is the cost of MSAA?
More computation
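A sketch of NxN supersampling for one triangle's coverage, reusing the inside() helper from the rasterization sketch above (my own code, not the course's):

```python
def msaa_coverage(tri, x, y, n=2):
    """Average of an n x n grid of samples inside pixel (x, y); value in [0, 1]."""
    total = 0
    for i in range(n):
        for j in range(n):
            sx = x + (i + 0.5) / n          # sample position inside the pixel
            sy = y + (j + 0.5) / n
            total += inside(tri, sx, sy)
    return total / (n * n)
```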
More antialiasing methods: milestones
FXAA (Fast Approximate AA)
TAA (Temporal AA)
Super resolution / super sampling
From low resolution to high resolution
Essentially still "not enough samples" problem
DLSS (Deep Learning Super Sampling)
Visibility / occlusion
Painter's algorithm
Inspired by how painters paint: Paint from back to front, overwrite
in the framebuffer
Requires sorting in depth ($O(n \log n)$ for $n$ triangles)
Can have unresolvable depth order (ordering in a cycle)
Not very good, thus we use Z-buffer algorithm
Z-buffering
Idea
Store current min. z-value for each sample (pixel)
Needs an additional buffer for depth values
Frame buffer stores color values
Depth buffer (z-buffer) stores depth
Important: For simplicity we suppose that $z$ is always positive
Smaller $z$ is closer, larger $z$ is farther
The Z-Buffer algorithm
Initialize depth buffer to $\infty$
During rasterization:
```cpp
for (each triangle T)
    for (each sample (x, y, z) in T)
        if (z < zbuffer[x, y])           // closest sample so far
            framebuffer[x, y] = rgb;     // update color
            zbuffer[x, y] = z;           // update depth
        else
            ;                            // do nothing, this sample is occluded
```
Z-Buffer complexity
Complexity: $O(n)$ for $n$ triangles (assuming constant coverage per triangle)
Assuming that each triangle is not too big
How is it possible to sort
triangles in linear time?
Because we did not sort them, just find the minimum
Drawing triangles in different orders?
Working fine
Most important visibility algorithm
Implemented in hardware for all GPUs
5. Shading
Definition
In the Merriam-Webster dictionary
The darkening or coloring of an illustration or diagram with
parallel lines or a block of color
In this course
The process of applying a material to an object
Properties of shading
Shading is local:
Compute light reflected toward the camera at a specific shading point
Shading will ignore all other objects
Thus no shadows will be generated
Shading != shadow
Inputs:
Viewer direction: $\hat{v}$
Surface normal: $\hat{n}$
Light direction: $\hat{l}$ (for each of many lights)
Surface parameters: color, shininess, ...
$\hat{v}$, $\hat{n}$, $\hat{l}$ are all unit vectors
A simple
shading model (Blinn-Phong reflectance model)
Perceptual observations
and modeling
Specular highlights
Intensity depends on view direction
Bright near mirror reflection direction
$\hat{v}$ close to the mirror direction ⟺ half vector $\hat{h}$ near the normal $\hat{n}$, where $\hat{h} = \frac{\hat{v} + \hat{l}}{\|\hat{v} + \hat{l}\|}$. Measure "near" by the dot product of unit vectors:
$L_s = k_s\, (I / r^2)\, \max(0, \hat{n} \cdot \hat{h})^p$
Here
$L_s$: specularly reflected light
$k_s$: specular coefficient
$p$: the plain cosine decreases too slowly, so increasing $p$ narrows the reflection lobe
Usually $p$ is set to 100-200 in the Blinn-Phong model
Diffuse reflection
Light is scattered uniformly in all directions
Surface color is the same for all viewing directions
But how much light (energy) is received?
Lambert's cosine law
In general, light per unit area is proportional to $\cos\theta = \hat{l} \cdot \hat{n}$
Lambertian (diffuse) shading: shading independent of view direction
$L_d = k_d\, (I / r^2)\, \max(0, \hat{n} \cdot \hat{l})$
Here:
$L_d$: diffusely reflected light
$k_d$: diffuse coefficient (color)
$I / r^2$: energy arriving at the shading point
$\max(0, \hat{n} \cdot \hat{l})$: energy received by the shading point
Ambient lighting
Shading that does not depend on anything: $L_a = k_a I_a$
Add a constant color to account for disregarded illumination and fill in black shadows
The Blinn-Phong model sums the three terms: $L = L_a + L_d + L_s$ (see the sketch below)
Best to get vertex normals from the underlying geometry
e.g. consider a sphere
Otherwise have to infer vertex normals from triangle faces
Simple scheme: average surrounding face normals
Shade each pixel (Phong shading)
Interpolate normal vectors across each triangle
Compute full shading model at each pixel
Note: Phong shading is a shading frequency, not the Blinn-Phong reflectance model
Defining per-pixel normal
vectors
Barycentric interpolation of vertex normals
Don't forget to normalize the interpolated directions
Graphics (Real-time
rendering) pipeline
Application => Input is vertices in 3D
Vertex processing => Vertices positioned in screen space
Triangle processing => Triangle positioned in screen space
Rasterization => Fragments (one per covered sample)
Fragment processing => Shaded fragments
Framebuffer operations => Output: image (pixels)
Display
Graphics pipeline
implementation: GPUs
Specialized processors for executing graphics pipeline
computations
Heterogeneous, Multi-core processor
Modern GPUs offer 2 - 4
Tera-FLOPs of performance for executing vertex and fragment shader
programs
Shader programs
Program vertex and fragment processing stages
Describe operation on a single vertex (or fragment)
Shader function executes once per fragment
Outputs color of surface at the current fragment's screen sample
position
For example a GLSL fragment shader program. This shader performs
a texture lookup to obtain the surface's material color at this point,
then performs a diffuse lighting calculation
```glsl
uniform sampler2D myTexture;  // program parameter
uniform vec3 lightDir;        // program parameter
varying vec2 uv;              // per fragment value (interp. by rasterizer)
varying vec3 norm;            // per fragment value (interp. by rasterizer)

void diffuseShader() {
    vec3 kd;
    kd = texture2D(myTexture, uv).rgb;            // material color from texture
    kd *= clamp(dot(-lightDir, norm), 0.0, 1.0);  // Lambertian shading model
    gl_FragColor = vec4(kd, 1.0);                 // output fragment color
}
```
Goal: Highly complex
3D scenes in Realtime
100's of thousands to millions of triangles in a scene
Complex vertex and fragment shader computations
High resolution (2 - 4 megapixel + supersampling)
30 - 60 frames per second (even higher for VR)
Texture mapping
Surfaces are 2D
Surface lives in 3D world space
Every 3D surface point also has a place where it goes in the 2D
image (texture)
Visualization of texture coordinates
Each triangle vertex is assigned a texture coordinate
Textures can be used multiple times (tiled)
Interpolation
across triangles: Barycentric coordinates
Interpolation across
triangles
Why do we want to interpolate?
Specify values at vertices
Obtain smoothly varying values across triangles
What do we want to interpolate?
Texture coordinates, colors, normal vectors, ...
How do we do interpolation?
Barycentric coordinates
Barycentric coordinates
A coordinate system for triangles: $(\alpha, \beta, \gamma)$
Every point on the surface of this triangle can be computed as $(x, y) = \alpha A + \beta B + \gamma C$, with $\alpha + \beta + \gamma = 1$
Here $A$, $B$, $C$ are the vertices of the triangle. The point is inside the triangle if all three coordinates are non-negative
Geometric viewpoint - proportional areas; for a point in the triangle: $\alpha = \frac{A_A}{A_A + A_B + A_C}$, $\beta = \frac{A_B}{A_A + A_B + A_C}$, $\gamma = \frac{A_C}{A_A + A_B + A_C}$, where $A_A$ is the area of the sub-triangle opposite vertex $A$ (and similarly for $A_B$, $A_C$)
What is the barycentric coordinate of the centroid? $(\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$
There is also an explicit formula for any point, which can be found in all kinds of references
Using Barycentric
coordinates
Linearly interpolate values at vertices: $V = \alpha V_A + \beta V_B + \gamma V_C$
This gives the value $V$ inside a triangle with values $V_A$, $V_B$, $V_C$ at its vertices
$V$ can be positions, texture coordinates, colors, normals, depth, material attributes, ...
However, barycentric coordinates are not invariant under projection (see the sketch below)
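A sketch of computing $(\alpha, \beta, \gamma)$ from signed sub-triangle areas and using them to interpolate a per-vertex attribute (2D screen space only, so it is subject to the projection caveat above; my own helper names):

```python
def barycentric(p, a, b, c):
    """(alpha, beta, gamma) of 2D point p w.r.t. triangle (a, b, c), via signed areas."""
    def cross(o, u, v):                      # 2x the signed area of triangle (o, u, v)
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    area = cross(a, b, c)
    alpha = cross(p, b, c) / area            # sub-area opposite vertex a
    beta = cross(p, c, a) / area             # sub-area opposite vertex b
    gamma = 1.0 - alpha - beta
    return alpha, beta, gamma

def interpolate(p, tri, values):
    """Interpolate per-vertex values (colors, uvs, depth, ...) at point p."""
    alpha, beta, gamma = barycentric(p, *tri)
    return alpha * values[0] + beta * values[1] + gamma * values[2]
```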
Applying textures
Simple texture mapping:
diffuse color
```
for each rasterized screen sample (x, y):
    (u, v) = evaluate texture coordinate at (x, y)
    texcolor = texture.sample(u, v)
    set sample's color to texcolor
```
Texture
magnification (What if the texture is too small?)
Easy case
Generally don't want this - insufficient texture resolution
A pixel on a texture - a texel
Bilinear interpolation
Want to sample texture value f(x, y) at a non-integer
position
Take the 4 nearest sample locations, with texture values $u_{00}, u_{10}, u_{01}, u_{11}$, and fractional offsets $(s, t)$ to these locations
Linear interpolation (1D): $\text{lerp}(x, v_0, v_1) = v_0 + x(v_1 - v_0)$
Two helper lerps (horizontal): $u_0 = \text{lerp}(s, u_{00}, u_{10})$, $u_1 = \text{lerp}(s, u_{01}, u_{11})$
Final vertical lerp, to get the result: $f(x, y) = \text{lerp}(t, u_0, u_1)$ (see the sketch below)
Bilinear interpolation usually gives pretty good results at
reasonable costs
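A sketch of bilinear texture sampling following the lerps above (my own helper; the texture is a 2D list indexed tex[row][col], assumed at least 2x2, with coordinates inside the texture):

```python
def lerp(x, v0, v1):
    """1D linear interpolation."""
    return v0 + x * (v1 - v0)

def sample_bilinear(tex, u, v):
    """Sample texture tex at non-integer (u, v) with bilinear filtering."""
    h, w = len(tex), len(tex[0])
    x0 = min(max(int(u - 0.5), 0), w - 2)   # left texel of the 2x2 neighborhood
    y0 = min(max(int(v - 0.5), 0), h - 2)   # top texel of the 2x2 neighborhood
    s = u - 0.5 - x0                        # fractional offsets
    t = v - 0.5 - y0
    u0 = lerp(s, tex[y0][x0],     tex[y0][x0 + 1])      # horizontal lerps
    u1 = lerp(s, tex[y0 + 1][x0], tex[y0 + 1][x0 + 1])
    return lerp(t, u0, u1)                              # final vertical lerp
```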
Hard case: What if
the texture is too large?
Point sampling texture
Upsampling: magnification
Downsampling: minification
Will supersampling do antialiasing?
Yes, high quality but costly
When highly minified, many texels in pixel footprint
Signal frequency too large in a pixel
Need even higher sampling frequency
Let's understand this problem in another way
What if we don't sample
Just need to get the average value within a range
Point query vs range query
Mipmap
Allowing (fast, approximate, square) range queries
"Mip" comes from the Latin "multum in parvo", meaning a multitiude
in a small space
Mip hierarchy
Storage overhead: only $\frac{1}{3}$ more than the original picture (the whole pyramid totals $\frac{4}{3}$ of the original)
Trilinear interpolation
Bilinear lookups on the two nearest mipmap levels, then linear interpolation between them based on the continuous mipmap level $D$
Mipmap limitations
Overblur
Partially solved by anisotropic filtering (ripmaps)
Better for rectangular footprints, but not for irregular ones
Storage overhead: up to 3 times the original picture
Solved, more slowly, by EWA filtering
Use multiple lookups
Weighted average
Mipmap hierarchy still helps
Can handle irregular footprints
Many, many uses for texturing
In modern GPUs, texture = memory + range query (filtering)
General method to bring data to fragment calculations
Many applications
Environment lighting
Store microgeometry
Procedural textures
Solid modeling
Volume rendering
...
Cube map
A vector maps to cube point along that direction. The cube is
textured with 6 squares
Gives much less distortion
Need dir -> face computation
Textures can affect shading
Textures don't
have to represent only colors
What if it stores the height / normal?
Bump / normal mapping
Fake the detailed geometry
Bump mapping
Normal mapping
Adding surface detail without adding more triangles
Perturb surface normal per pixel (for shading computations
only)
"Height shift" per texel defined by a texture
How to modify vector?
How to perturb the normal (in flatland)
Original surface normal n(p) = (0,1)
Derivative at p is
dp = c * [h(p+1) - h(p)]
Perturbed normal is then
n(p) = (-dp, 1).normalized()
How to perturb the normal in 3D
Original surface normal n(p) = (0, 0, 1)
Derivatives at p are:
dp/du = c1 * [h(u + 1) - h(u)]
dp/dv = c2 * [h(v + 1) - h(v)]
The perturbed normal is
n = (-dp/du, -dp/dv, 1).normalized()
Note that this is in the local (tangent) coordinate frame; it still has to be transformed to world space (see the sketch below)
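A sketch of the 3D perturbation above in the local frame, from a scalar height map h (my own helper; c1 and c2 are user-chosen bump scales, and (u, v) is assumed to be an interior texel):

```python
import numpy as np

def perturbed_normal(h, u, v, c1=1.0, c2=1.0):
    """Local-frame normal at texel (u, v) of height map h[v][u]; original normal is (0, 0, 1)."""
    dpdu = c1 * (h[v][u + 1] - h[v][u])      # finite-difference derivative along u
    dpdv = c2 * (h[v + 1][u] - h[v][u])      # finite-difference derivative along v
    n = np.array([-dpdu, -dpdv, 1.0])
    return n / np.linalg.norm(n)             # still needs a tangent-to-world transform
```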
Displacement
mapping - a more advanced approach
Uses the same texture as in bump mapping
But here, we actually move the vertices
Requires many more triangles, and thus more resources
3D Procedural noise + Solid
modeling
For example, Perlin noise
Provide precomputed shading
For example, ambient occlusion texture map
Shadows
How to draw shadows using rasterization?
Shadow mapping!
Shadow mapping
An image-space algorithm
No knowledge of scene's geometry during shadow computation
Must deal with aliasing artifacts
Key idea:
The points that are NOT in shadow must be seen both by the light and
by the camera
Pass 1: Render from light
Depth image from light source
Pass 2A: project to light
Project visible points in the eye view back to the light source
If the reprojected depths from the light and the eye match, the point is visible to the light
If the reprojected depths from the light and the eye are not the same, the point is blocked (in shadow)
Well known rendering technique
Basic shadowing technique for early animations and in EVERY 3D video
game
Problems with shadow maps
Hard shadows (point lights only)
Quality depends on shadow map resolution (a general problem with image-based techniques)
Involves equality comparisons of floating-point depth values, which means issues of scale, bias, and tolerance
Hard shadows vs. soft shadows
0 or 1 shadows
Non-binary shadows
6. Geometry
Many ways to represent
geometry
Implicit geometry
For example:
Algebraic surface
Level sets
Distance functions
...
Based on classifying points
Points satisfy some specified relationship
E.g. sphere: all points in 3D where $x^2 + y^2 + z^2 = 1$
More generally: $f(x, y, z) = 0$
Sampling can be hard
What points lie on $f(x, y, z) = 0$?
Inside / outside test is easy: just check the sign of $f(x, y, z)$
Pros:
Compact description (e.g. a function)
Certain queries easy (inside object, distance to surface)
Good for ray-to-surface intersection
For simple shapes, exact description / no sampling error
Easy to handle changes in topology (e.g. fluid)
Cons:
Difficult to model complex shapes
More
implicit representations in computer graphics
Algebraic surfaces (Implicit)
Surface is the zero set of a polynomial in $x$, $y$, $z$
More complex shapes?
Constructive solid geometry (CSG)
Combine implicit geometry via Boolean operations
Distance functions
Instead of Booleans, gradually blend surfaces together using
distance functions:
Giving minimum distance (could be signed distance) from anywhere to
object
Level set methods
Closed-form equations have a hard time describing complex shapes
Alternative: store a grid of values approximating the function
Surface is found where interpolated values equal zero
Provides much more explicit control over shape (like a texture)
Level sets encode, e.g., constant tissue density
Fractals
Exhibit self-similarity, detail at all scales
"Language" for describing natural phenomena
Hard to control shape
Explicit geometry
For example:
Point cloud
Polygon mesh
Subdivision, NURBS
...
All points are given directly or via parameter mapping
Generally:
Sampling is easy
What points lie on this surface? Just plug in values
Inside / Outside test is hard
More
explicit representations in computer graphics
Point cloud
Easiest representation: list of points (x, y, z)
Easily represent any kind of geometry
Useful for LARGE datasets (>> 1 point/pixel)
Often converted into polygon mesh
Difficult to draw in undersampled regions
Polygon mesh
Store vertices & polygons (often triangles or quads)
Easier to do processing / simulation, adaptive sampling
The Wavefront Object file (.obj) format
Commonly used in graphics research
Just a text file that specifies vertices, normals, texture
coordinates and their connectivity
No "best"
representation - geometry is hard
Curves
Bézier curves
Defining a cubic Bézier curve with tangents
Evaluating
Bézier curves (de Casteljau algorithm)
Consider three points (quadratic Bézier)
Repeat recursively
Cubic Bézier curve - de Casteljau
Four input points in total
Same recursive line interpolations
Algebraic formula
de Casteljau algorithm gives a pyramid of coefficients
Example: quadratic Bézier curve from three points
General algebraic formula
Bernstein form of a Bézier curve of order $n$: $\mathbf{b}^n(t) = \sum_{j=0}^{n} \mathbf{b}_j\, B_j^n(t)$, where
$\mathbf{b}^n(t)$ is the Bézier curve of order $n$ (vector polynomial of degree $n$)
$\mathbf{b}_j$ are the Bézier control points (vectors in $\mathbb{R}^N$)
$B_j^n(t)$ is the Bernstein polynomial (scalar polynomial of degree $n$), which is:
$B_j^n(t) = \binom{n}{j} t^j (1 - t)^{n - j}$ (see the sketch below)
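A sketch of the de Casteljau algorithm for a Bézier curve of any order (pure Python; points as (x, y) tuples; my own helper name):

```python
def de_casteljau(points, t):
    """Evaluate a Bezier curve at parameter t by repeated linear interpolation."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        # lerp each consecutive pair of points, shrinking the list by one per pass
        pts = [((1 - t) * p[0] + t * q[0],
                (1 - t) * p[1] + t * q[1])
               for p, q in zip(pts, pts[1:])]
    return pts[0]

# Cubic example: the endpoints are interpolated, as stated in the properties below.
ctrl = [(0, 0), (1, 2), (3, 2), (4, 0)]
print(de_casteljau(ctrl, 0.0), de_casteljau(ctrl, 0.5), de_casteljau(ctrl, 1.0))
```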
Properties of Bézier curves
Interpolates endpoints
For cubic Bézier: $\mathbf{b}(0) = \mathbf{b}_0$; $\mathbf{b}(1) = \mathbf{b}_3$
Tangent to end segments
Cubic case: $\mathbf{b}'(0) = 3(\mathbf{b}_1 - \mathbf{b}_0)$; $\mathbf{b}'(1) = 3(\mathbf{b}_3 - \mathbf{b}_2)$
Affine transformation property
Transform curve by transforming control points
Convex hull property
Curve is within convex hull of control points
Piecewise Bézier curves
Higher-order Bézier curves?
Very hard to control
Piecewise Bézier curves
Instead, chain many low-order Bézier curves. Piecewise cubic Bézier is the most common technique, and it is widely used
Continuity (for adjacent segments with control points $\mathbf{a}_0 \dots \mathbf{a}_n$ and $\mathbf{b}_0 \dots \mathbf{b}_n$):
$C^0$ continuity: $\mathbf{a}_n = \mathbf{b}_0$
$C^1$ continuity: $\mathbf{a}_n = \mathbf{b}_0 = \frac{1}{2}(\mathbf{a}_{n-1} + \mathbf{b}_1)$
Other types of splines
Spline
A continuous curve constructed so as to pass through a given set of
points and have a certain number of continuous derivatives
In short, a curve under control
B-splines
Short for basis splines
Require more information than Bézier curves
Satisfy all important properties that Bézier curves have (i.e.
superset)
Bézier surfaces
Extend Bézier curves to surfaces
Bicubic Bézier surface patch
$4 \times 4$ control points form a surface
Evaluating Bézier surfaces
Evaluating surface position for parameters $(u, v)$
For a bi-cubic Bézier surface patch
Input: $4 \times 4$ control points
Output: a 2D surface parameterized by $(u, v)$ in $[0, 1]^2$
Method: separable 1D de Casteljau algorithm
Use de Casteljau to evaluate the point at $u$ on each of the 4 Bézier curves in $u$; this gives 4 control points for the "moving" Bézier curve
Use 1D de Casteljau to evaluate the point at $v$ on the "moving" curve
Mesh operations: Geometry
processing
Mesh subdivision
Upsampling: Increase resolution
Mesh simplification
Downsampling: Decrease resolution; try to preserve shape /
appearance
Mesh regularization
Same number of triangles: modify sample distribution to improve
quality
Mesh subdivision algorithm
Loop subdivision
Common subdivision rule for triangle meshes
First, create more triangles (vertices)
Split each triangle into four
Second, tune their positions
Assign new vertex positions according to weights
New / old vertices updated differently
Update
For new vertices: update to
3/8 * (A + B) + 1/8 * (C + D), where A, B are the endpoints of the edge the new vertex lies on, and C, D are the opposite vertices of the two triangles sharing that edge
For old vertices: update to
(1 - n*u) * original_position + u * neighbor_position_sum, where n is the vertex's degree, u = 3/16 if n = 3, and u = 3 / (8n) otherwise
Catmull-Clark
subdivision (General mesh)
Quad face: face with # edges == 4
Non-quad face: face with # edges != 4
Extraordinary vertex: vertex with degree != 4
Each subdivision step:
Add vertex in each face
Add midpoint on each edge
Connect all new vertices
After one subdivision
All non-quad faces disappear, while each non-quad face adds an extraordinary vertex
Catmull-Clark vertex update rules
Separated into face point, edge point, vertex point
Mesh simplification
algorithm
Goal: reduce number of mesh elements while maintaining the overall
shape
Collapsing an edge
Suppose we simplify a mesh using edge collapsing
Quadric error metrics
How much geometric error is introduced by simplification?
Not a good idea to perform local averaging of vertices
Quadric error: new vertex should minimize its sum of square distance
(L2 distance) to previously related triangle planes
Which edges? Assign score with quadric error metric
Approximate distance to surface as sum of distances to planes
containing triangles
Iteratively collapse edge with smallest score
Greedy algorithm gives great results
7. Ray tracing
Why ray tracing?
Rasterization couldn't handle global effects well
(Soft) shadows
And especially when the light bounces more than once
Rasterization is fast, but quality is relatively low
Ray tracing is accurate, but is very slow
Rasterization: real-time, ray tracing: offline
~10K CPU core hours to render one frame in production
Basic ray tracing algorithms
Light rays
Three ideas about light rays
Light travels in straight lines (though this is wrong)
Light rays do not "collide" with each other if they cross (though
this is still wrong)
Light rays travel from the light sources to the eye (but the physics
is invariant under path reversal - reciprocity)
"And if you gaze long into an abyss, the abyss also gazes into
you"
Ray casting
Generate an image by casting one ray per pixel
Check for shadows by sending a ray to the light
Recursive (Whitted-style)
ray tracing
"An improved illumination model for shaded display". Simulate how
light bounces around
Ray-surface intersection
Ray equation
A ray is defined by its origin and a direction vector: $\mathbf{r}(t) = \mathbf{o} + t\mathbf{d}$, $0 \le t < \infty$. Here
$\mathbf{r}(t)$: a point along the ray
$t$: "time"
$\mathbf{o}$: origin
$\mathbf{d}$: (normalized) direction
Ray intersection with sphere
Sphere: $(\mathbf{p} - \mathbf{c})^2 - R^2 = 0$
The intersection must satisfy both the ray equation and the sphere equation. So: $(\mathbf{o} + t\mathbf{d} - \mathbf{c})^2 - R^2 = 0$. Solving the quadratic $at^2 + bt + c = 0$: $t = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$, where $a = \mathbf{d} \cdot \mathbf{d}$, $b = 2(\mathbf{o} - \mathbf{c}) \cdot \mathbf{d}$, $c = (\mathbf{o} - \mathbf{c}) \cdot (\mathbf{o} - \mathbf{c}) - R^2$
Ray intersection with
implicit surface
General implicit surface: $f(\mathbf{p}) = 0$
Substitute the ray equation: $f(\mathbf{o} + t\mathbf{d}) = 0$, and solve for
real, positive roots
Ray intersection with
triangle mesh
Why triangle?
Rendering: visibility, shadows lighting ...
Geometry: inside / outside test
How to compute it? Let's break this down:
Simple idea: just intersect ray with each triangle
Simple, but slow
Can have 0, 1 intersections (ignoring multiple intersections)
Triangle is in a plane
Ray-plane intersection
Test if hit point is inside triangle
Plane equation: a plane is defined by a normal vector $\mathbf{N}$ and a point $\mathbf{p}'$ on the plane: $(\mathbf{p} - \mathbf{p}') \cdot \mathbf{N} = 0$. Then we can solve for the intersection:
Set $\mathbf{p} = \mathbf{r}(t) = \mathbf{o} + t\mathbf{d}$ and solve for $t$: $t = \frac{(\mathbf{p}' - \mathbf{o}) \cdot \mathbf{N}}{\mathbf{d} \cdot \mathbf{N}}$
Check: $0 \le t < \infty$
Möller-Trumbore algorithm
A faster approach that gives barycentric coordinates directly: $\mathbf{o} + t\mathbf{d} = (1 - b_1 - b_2)\mathbf{P}_0 + b_1\mathbf{P}_1 + b_2\mathbf{P}_2$, where $(1 - b_1 - b_2, b_1, b_2)$ are barycentric coordinates, and we solve the resulting linear system for $t$, $b_1$, $b_2$ (e.g. with Cramer's rule); see the sketch below
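A sketch of the Möller-Trumbore test (NumPy; my own helper; returns (t, b1, b2) or None, following the equation above):

```python
import numpy as np

def moller_trumbore(o, d, p0, p1, p2, eps=1e-8):
    """Ray o + t*d vs triangle (p0, p1, p2); returns (t, b1, b2) or None if no hit."""
    e1, e2 = p1 - p0, p2 - p0
    s = o - p0
    s1 = np.cross(d, e2)
    s2 = np.cross(s, e1)
    denom = np.dot(s1, e1)
    if abs(denom) < eps:                     # ray parallel to the triangle plane
        return None
    t = np.dot(s2, e2) / denom
    b1 = np.dot(s1, s) / denom
    b2 = np.dot(s2, d) / denom
    if t >= 0 and b1 >= 0 and b2 >= 0 and (1 - b1 - b2) >= 0:
        return t, b1, b2
    return None
```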
Accelerating ray-surface
intersection
Ray tracing - performance
challenges
Simple ray-scene intersection
Exhaustively test ray-intersection with every object
Find the closest hit (with minimum $t$)
Problem:
Naive algorithm = #pixels × #objects × #bounces
Very slow
Bounding volumes
Quick way to avoid intersections: bound complex object with a simple
volume
Object is fully contained in the volume
If it doesn't hit the volume, it doesn't hit the object
So test the bounding volumes first, then test the objects
Ray-intersection with box
Box is the intersection of 3 pairs of slabs. Specifically, we often
use an axis-aligned bounding box (AABB), i.e. any side of the bounding
box is along either x, y, or z axis
Ray intersection with
axis-aligned box
2D example (3D is the same): Compute intersections with slabs and
take intersection of intervals
A box (3D) = three pairs of infinitely large slabs
Key ideas
The ray enters the box only when it enters all pairs of slabs
The ray exits the box as long as it exits any pair of slabs
For each pair, calculate the $t_{min}$ and $t_{max}$ (negative values are fine)
For the 3D box, $t_{enter} = \max\{t_{min}\}$ and $t_{exit} = \min\{t_{max}\}$
If $t_{enter} < t_{exit}$, we know that the ray stays inside the box for a while, so they must intersect
However, a ray is not a line
Should check whether $t$ is negative for physical correctness
What if $t_{exit} < 0$?
The box is "behind" the ray - no intersection
What if $t_{exit} \ge 0$ and $t_{enter} < 0$?
Then the ray's origin is inside the box - there is an intersection
In summary, the ray and AABB intersect iff $t_{enter} < t_{exit}$ and $t_{exit} \ge 0$
Why axis-aligned?
For a general surface, the computation is complex. But, for example, for slabs perpendicular to the x-axis, we have $t = \frac{p'_x - o_x}{d_x}$, a single subtraction and division per slab (see the sketch below)
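A sketch of the slab test summarized above (NumPy; my own helper; it assumes no zero direction components to keep the division simple):

```python
import numpy as np

def hit_aabb(o, d, box_min, box_max):
    """Slab test: True iff the ray o + t*d (t >= 0) intersects the axis-aligned box."""
    t0 = (box_min - o) / d                   # per-axis slab crossing times (unordered)
    t1 = (box_max - o) / d
    t_min = np.minimum(t0, t1)               # per-axis entry time
    t_max = np.maximum(t0, t1)               # per-axis exit time
    t_enter = t_min.max()                    # enter the box when entering ALL slabs
    t_exit = t_max.min()                     # exit the box when exiting ANY slab
    return t_enter < t_exit and t_exit >= 0

print(hit_aabb(np.array([0.0, 0.0, -5.0]), np.array([0.0, 0.0, 1.0]),
               np.array([-1.0, -1.0, -1.0]), np.array([1.0, 1.0, 1.0])))
```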
Uniform spatial partitions
(Grids)
Preprocess - build acceleration grid
Find bounding box
Create grid
Store each object in overlapping cells
Ray-scene intersection
Step through grid in ray traversal order
For each grid cell:
Test intersection with all objects stored at that cell
Grid resolution?
One cell: no speedup
Too many cells: Inefficiency due to extraneous grid traversal
Heuristic:
Number of cells = C * number of objects
C ≈ 27 in 3D
Spatial partitions
KD-Tree pre-processing
Data structure for KD-Trees
Internal nodes store
Split axis: x-, y-, or z- axis
Split position: coordinate of split plane along axis
Children: pointers to child nodes
No objects are stored in internal nodes, only in leaves
Leaf nodes store
List of objects
Traversing a KD-Tree
Go through the tree and check whether the ray intersects each internal node's region. If it does, go down the tree; at a leaf node, check the ray against all of the objects stored in it
Heuristic #1: Always choose the longest axis in the node
Heuristic #2: Split node at location of median object
Data structure for BVHs
Internal nodes store
Bounding box
Children: pointers to child nodes
Leaf nodes store
Bounding box
List of objects
Nodes represent subset of primitives in scene
All objects in subtree
Pseudo code
```
Intersect(Ray ray, BVH node) {
    if (ray misses node.bbox) return;
    if (node is a leaf node)
        test intersection with all objs;
        return closest intersection;
    hit1 = Intersect(ray, node.child1);
    hit2 = Intersect(ray, node.child2);
    return the closer of hit1, hit2;
}
```
Spatial vs Object partitions
Spatial partition (e.g. KD-tree)
Partition space into non-overlapping regions
An object can be contained in multiple regions
Object partition (e.g. BVH)
Partition set of objects into disjoint subsets
Bounding boxes for each set may overlap in space
8. Basic radiometry
Motivation why we study
radiometry
Observation
In the Blinn-Phong model, we might use, for example, the light intensity $I$. But what is it? What is its unit?
Do you think Whitted style ray tracing gives correct results?
All the answers can be found in radiometry
Radiometry
Measurement system and units for illumination
Accurately measure the spatial properties of light
New terms: Radiant flux, intensity, irradiance, radiance
Perform lighting calculations in a physically correct manner
Radiant energy and flux
(power)
Def: Radiant energy is the energy of electromagnetic radiation. It is measured in joules and denoted by the symbol $Q$ [J]
Def: Radiant flux (power) is the energy emitted, reflected, transmitted or received per unit time: $\Phi = \frac{dQ}{dt}$ [W] [lm]
Flux can also be thought of as the number of photons flowing through a sensor per unit time
Important light
measurements of interest
Light emitted from a source: Radiant intensity
Light falling on a surface: Irradiance
Light traveling along a ray: Radiance
Radiant intensity
Def: The radiant (luminous) intensity is the power per unit solid angle emitted by a point light source: $I(\omega) = \frac{d\Phi}{d\omega}$ [W/sr] [cd]
Angles and solid angles
Angle: ratio of subtended arc length on a circle to the radius, $\theta = \frac{l}{r}$. A circle has $2\pi$ radians
Solid angle: ratio of subtended area on a sphere to the radius squared, $\Omega = \frac{A}{r^2}$. A sphere has $4\pi$ steradians
Differential solid angle: $d\omega = \frac{dA}{r^2} = \sin\theta\, d\theta\, d\phi$
Sphere: $\Omega = \int_{S^2} d\omega = 4\pi$
Isotropic point source: $\Phi = \int_{S^2} I\, d\omega = 4\pi I$, so $I = \frac{\Phi}{4\pi}$
Irradiance
Def: The irradiance is the power per (perpendicular / projected) unit area incident on a surface point: $E(\mathbf{x}) = \frac{d\Phi(\mathbf{x})}{dA}$ [W/m²] [lux]
Lambert's cosine law: irradiance at a surface is proportional to the cosine of the angle between the light direction and the surface normal (this is exactly how Lambert's cosine law is defined)
Irradiance gets lower the further away from the light source: for a point source, $E = \frac{\Phi}{4\pi r^2}$
Radiance
Radiance is the fundamental field quantity that describes the
distribution of light in an environment
Radiance is the quantity associated with a ray
Rendering is all about computing radiance
Def: The radiance (luminance) is the power emitted, reflected, transmitted or received by a surface, per unit solid angle, per projected unit area: $L(\mathbf{p}, \omega) = \frac{d^2\Phi(\mathbf{p}, \omega)}{d\omega\, dA\, \cos\theta}$. Here $\cos\theta$ accounts for the projected surface area
Recall:
Irradiance: power per projected unit area
Intensity: power per solid angle
So:
Radiance: Irradiance per solid angle
Radiance: Intensity per projected unit area
Incident radiance
Incident radiance is the irradiance per unit solid angle arriving at
the surface, i.e. it is the light arriving at the surface along a given
ray (point on surface and incident direction)
Exiting radiance
Exiting surface radiance is the intensity per unit projected area
leaving the surface, e.g. for an area light it is the light emitted
along a given ray (point on surface and exit direction)
Irradiance vs. Radiance
Irradiance: total power received by area $dA$: $E(\mathbf{p}) = \int_{H^2} L_i(\mathbf{p}, \omega) \cos\theta\, d\omega$
Radiance: power received by area $dA$ from "direction" $d\omega$: $L_i(\mathbf{p}, \omega)$
Here $H^2$ is the unit hemisphere
Bidirectional
Reflectance Distribution Function (BRDF)
Reflection at a point
Radiance from direction $\omega_i$ turns into the power $E$ that $dA$ receives; that power then becomes the radiance sent toward any other direction $\omega_o$
Differential irradiance incoming: $dE(\omega_i) = L(\omega_i)\cos\theta_i\, d\omega_i$
Differential radiance exiting (due to $dE(\omega_i)$): $dL_r(\omega_r)$
BRDF
The BRDF represents how much light is reflected into each outgoing direction $\omega_r$ from each incoming direction $\omega_i$: $f_r(\omega_i \to \omega_r) = \frac{dL_r(\omega_r)}{dE_i(\omega_i)} = \frac{dL_r(\omega_r)}{L_i(\omega_i)\cos\theta_i\, d\omega_i}$ [1/sr]
The reflection equation: $L_r(\mathbf{p}, \omega_r) = \int_{H^2} f_r(\mathbf{p}, \omega_i \to \omega_r)\, L_i(\mathbf{p}, \omega_i)\, \cos\theta_i\, d\omega_i$
Challenge: recursive equation
Reflected radiance depends on incoming radiance
But incoming radiance depends on reflected radiance (at another
point in the scene)
The rendering equation
Re-write the reflection equation by adding an emission term to make it general, in case the object is itself a light source: $L_o(\mathbf{p}, \omega_o) = L_e(\mathbf{p}, \omega_o) + \int_{\Omega^+} L_i(\mathbf{p}, \omega_i)\, f_r(\mathbf{p}, \omega_i, \omega_o)\, (\mathbf{n} \cdot \omega_i)\, d\omega_i$
As an integral equation, the rendering equation is a Fredholm integral equation of the second kind, with canonical form $l(u) = e(u) + \int l(v)\, K(u, v)\, dv$, or as a linear operator equation: $L = E + KL$
Ray tracing and extensions
A general class of numerical Monte Carlo methods
Approximate the set of all paths of light in the scene. Expanding the operator equation $L = E + KL$ as a (binomial-theorem-style) series: $L = E + KE + K^2E + K^3E + \dots$ Here:
$E$: emission directly from light sources
$KE$: direct illumination on surfaces
$K^2E$: indirect illumination (one-bounce indirect)
$K^3E$: indirect illumination (two-bounce indirect)
Shading in rasterization considers only the first two terms, $E + KE$
Monte Carlo path tracing
Monte Carlo Integration
We want to solve an integral, but it can be too difficult to solve analytically. Let's define the Monte Carlo estimator for the definite integral $\int_a^b f(x)\,dx$ of a given function $f(x)$:
Random variable: $X_i \sim p(x)$
Monte Carlo estimator: $F_N = \frac{1}{N} \sum_{i=1}^{N} \frac{f(X_i)}{p(X_i)}$
For example, for a uniform random variable $X_i \sim p(x) = \frac{1}{b - a}$: $F_N = \frac{b - a}{N} \sum_{i=1}^{N} f(X_i)$. Some notes (see the sketch below):
The more samples, the less variance
Sample on $x$, integrate on $x$
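A sketch of the uniform estimator above applied to a concrete integral ($\int_0^\pi \sin x\, dx = 2$; my own example, not from the course):

```python
import random, math

def mc_integrate(f, a, b, n=100_000):
    """Uniform Monte Carlo estimator: (b - a) / N * sum f(X_i), X_i ~ U(a, b)."""
    total = 0.0
    for _ in range(n):
        x = random.uniform(a, b)             # sample on x
        total += f(x)
    return (b - a) * total / n

print(mc_integrate(math.sin, 0.0, math.pi))  # should be close to 2
```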
Path tracing
Path tracing can be almost 100% correct, a.k.a PHOTO-REALISTIC
Whitted-style ray tracing
Always perform specular reflections / refractions
Stop bouncing at diffuse surfaces
These simplifications are not always reasonable
But the rendering equation is correct
However it involves:
Solving an integral over the hemisphere
Recursive execution
A simple Monte Carlo
solution
Suppose we want to render one pixel (point) for direct illumination only. Then the rendering equation is just an integration over directions, so we can solve it using Monte Carlo integration.
What's our $f(x)$? The integrand $L_i(p, \omega_i)\, f_r(p, \omega_i, \omega_o)\, (\mathbf{n} \cdot \omega_i)$
What's our pdf? Assume uniform sampling of the hemisphere: $p(\omega_i) = \frac{1}{2\pi}$
Which gives $L_o(p, \omega_o) \approx \frac{1}{N} \sum_{i=1}^{N} \frac{L_i(p, \omega_i)\, f_r(p, \omega_i, \omega_o)\, (\mathbf{n} \cdot \omega_i)}{p(\omega_i)}$, which can be written in pseudocode:
```pseudocode
shade(p, wo)
    Randomly choose N directions wi ~ pdf
    Lo = 0.0
    For each wi
        Trace a ray r(p, wi)
        If ray r hit the light
            Lo += (1 / N) * L_i * f_r * cosine / pdf(wi)
    Return Lo
```
Introducing global
illumination
One more step forward: What if a ray hits an object? It is also a
light source then. Which will change the previous code to:
```pseudocode
shade(p, wo)
    Randomly choose N directions wi ~ pdf
    Lo = 0.0
    For each wi
        Trace a ray r(p, wi)
        If ray r hit the light
            Lo += (1 / N) * L_i * f_r * cosine / pdf(wi)
        Else If ray r hit an object at q
            Lo += (1 / N) * shade(q, -wi) * f_r * cosine / pdf(wi)
    Return Lo
```
Problems with this kind
of path tracing
1. Explosion of the number of rays as the number of bounces goes up
* Solution: Only 1 ray is traced at each shading point. This is then called path tracing
* Trace more paths through each pixel and average them:
```pseudocode
ray_generation(camPos, pixel)
    Uniformly choose N sample positions within the pixel
    pixel_radiance = 0.0
    For each sample in the pixel
        Shoot a ray r(camPos, cam_to_sample)
        If ray r hit the scene at p
            pixel_radiance += 1 / N * shade(p, sample_to_cam)
    Return pixel_radiance
```
2. The recursive algorithm will never stop
* Solution: Russian Roulette (RR)
* Russian Roulette is all about probability.
* With probability $P$ ($0<P<1$), you are fine
* With probability $1-P$, you are not
* Previously, we always shoot a ray at a shading point and get the shading result $L_o$
* Suppose we manually set a probability $P$, shoot a ray and return the shading result divided by $P$: $L_o/P$
* With probability $1-P$, don't shoot a ray and you will get 0
* In this way, you still get an expected value of $L_o$: $$ E = P \times (L_o/P) + (1-P)\times 0 = L_o $$
* Which gives the following pseudo code:
```pseudocode
shade(p, wo)
    Manually specify a probability P_RR
    Randomly select ksi in a uniform dist. in [0, 1]
    If (ksi > P_RR) return 0.0

    Randomly choose ONE direction wi ~ pdf(w)
    Trace a ray r(p, wi)
    If ray r hit the light
        Return L_i * f_r * cosine / pdf(wi) / P_RR
    Else If ray r hit an object at q
        Return shade(q, -wi) * f_r * cosine / pdf(wi) / P_RR
```
Now, the path tracing is correct
However, still not really efficient
Understanding the reason for the inefficiency: perhaps only 1 of many rays ends up hitting the light, so a lot of rays are "wasted" if we uniformly sample the hemisphere at the shading point
Sampling the light
Monte Carlo methods allow any sampling method, so we can sample the light directly (therefore no rays are "wasted")
Assume uniform sampling on the light: pdf $= \frac{1}{A}$ (because $\int \text{pdf}\, dA = 1$). But the rendering equation integrates over solid angle
Recall: for Monte Carlo integration we sample on $x$ and integrate on $x$, so we need to rewrite the rendering equation as an integral over $dA$; we need the relationship between $d\omega$ and $dA$
Projected area on the unit sphere: $d\omega = \frac{dA\, \cos\theta'}{\|x' - x\|^2}$. Here $\theta'$ is the angle between the normal of the light and the vector from $x'$ to $x$
Then the rendering equation becomes $L_o(x, \omega_o) = \int_A L_i(x, \omega_i)\, f_r(x, \omega_i, \omega_o)\, \frac{\cos\theta\, \cos\theta'}{\|x' - x\|^2}\, dA$, which gives the pseudocode:
```pseudocode
shade(p, wo)
    # Contribution from the light source
    L_dir = 0.0
    Uniformly sample the light at xp (pdf_light = 1 / A)
    Shoot a ray from p to xp
    If the ray is not blocked in the middle
        L_dir = L_i * f_r * cos(theta) * cos(thetap) / |xp - p|^2 / pdf_light

    # Contribution from other reflectors
    L_indir = 0.0
    Test Russian Roulette with probability P_RR
    Uniformly sample the hemisphere toward wi (pdf_hemi = 1 / 2pi)
    Trace a ray r(p, wi)
    If ray r hit a non-emitting object at q
        L_indir = shade(q, -wi) * f_r * cos(theta) / pdf_hemi / P_RR

    Return L_dir + L_indir
```
Now, Path tracing is finally done
9. Materials and Appearances
What is material in
computer graphics?
Material == BRDF
Diffuse / Lambertian material:
Light is equally reflected in each output direction. Suppose the incident lighting is uniform and nothing is absorbed; then incoming and outgoing radiance are equal, which gives $f_r = \frac{\rho}{\pi}$, where $\rho$ is the albedo (color), $0 \le \rho \le 1$
Some materials
Glossy material (BRDF)
Ideal reflective / refractive material (BSDF)
Isotropic / Anisotropic materials (BRDFs)
Key: directionality of underlying surface
Anisotropic BRDF: Brushed metal
Fresnel reflection / term
Reflectance depends on the incident angle (and the polarization of light). This term differs from material to material. The exact formula is complex, and to be fully accurate you need to consider polarization.
Schlick's approximation: $R(\theta) = R_0 + (1 - R_0)(1 - \cos\theta)^5$, where $R_0 = \left(\frac{n_1 - n_2}{n_1 + n_2}\right)^2$
Microfacet Material
Microfacet theory
Rough surface
Macroscale: flat & rough
Microscale: bumpy & specular
Microfacet BRDF
Key: the distribution of microfacets' normals
Concentrated <=> Glossy
Spread <=> Diffuse
What kind of microfacets reflect $\omega_i$ to $\omega_o$? Those whose normal equals the half vector $\mathbf{h}$: $f(\mathbf{i}, \mathbf{o}) = \frac{F(\mathbf{i}, \mathbf{h})\, G(\mathbf{i}, \mathbf{o}, \mathbf{h})\, D(\mathbf{h})}{4(\mathbf{n} \cdot \mathbf{i})(\mathbf{n} \cdot \mathbf{o})}$. Here
$\mathbf{h}$: half vector
$F(\mathbf{i}, \mathbf{h})$: Fresnel term
$G(\mathbf{i}, \mathbf{o}, \mathbf{h})$: shadowing-masking term
$D(\mathbf{h})$: distribution of normals
Isotropic / Anisotropic
materials (BRDFs)
Key: directionality of underlying surface
Anisotropic BRDF: For example, brushed metal
Properties of BRDFs
Non-negativity
Linearity
Reciprocity principle
Energy conservation
Isotropic vs. anisotropic
If isotropic: $f_r(\theta_i, \phi_i; \theta_r, \phi_r) = f_r(\theta_i, \theta_r, \phi_r - \phi_i)$
Then, from reciprocity: $f_r(\theta_i, \theta_r, \phi_r - \phi_i) = f_r(\theta_r, \theta_i, \phi_i - \phi_r) = f_r(\theta_i, \theta_r, |\phi_r - \phi_i|)$
Measuring BRDFs
Motivation
Avoid need to develop / derive models
Automatically includes all of the scattering effects present
Can accurately render with real-world materials
Useful for product design, special effects, ...
Theory and practice do not align
Image-based BRDF measurement
A gonioreflectometer, also called a spherical gantry (e.g. the one at UCSD). General approach:
```
foreach outgoing direction wo
    move light to illuminate surface with a thin beam from wo
    foreach incoming direction wi
        move sensor to be at direction wi from surface
        measure incident radiance
```
Improving efficiency
Isotropic surfaces reduce dimensionality from 4D to 3D
Reciprocity reduces number of measurements by half
Clever optical systems...
Challenges in measuring
BRDFs
Accurate measurements at grazing angles
Important due to Fresnel effects
Measuring with dense enough sampling to capture high frequency
specularities
Retro-reflection
Spatially-varying reflectance, ...
Representing measured BRDFs
Desirable qualities
Compact representation
Accurate representation of measured data
Efficient evaluation for arbitrary pairs of directions
Good distributions available for importance sampling
10. Advanced topics in
rendering
Biased vs. unbiased
Monte Carlo estimators
An unbiased Monte Carlo technique does not have any systematic error
The expected value of an unbiased estimator will always be the
correct value, no matter how many samples are used
Otherwise, biased
In one special case, the expected value converges to the correct value as the number of samples goes to infinity -> consistent
Biased == blurry
Consistent == not blurry with infinite number of samples
Bidirectional path tracing
(BDPT)
Recall: a path connects the camera and the light
BDPT
Traces sub-paths from both the camera and the light
Connects the end points from both sub-paths
Suitable if the light transport is complex on the light's side
Unbiased
Difficult to implement & quite slow
Metropolis light transport
(MLT)
A Markov chain Monte Carlo (MCMC) application
Jumping from the current sample to the next with some PDF
Works great with difficult light paths
Unbiased
Difficult to estimate the convergence rate
Does not guarantee equal convergence rate per pixel
So, usually produces "dirty" results
Therefore, usually not used to render animations
Photon mapping
A biased approach & a two-stage method
Very good at handling Specular-Diffuse-Specular (SDS) paths and
generating caustics
Approach (variations apply)
Stage 1 - photon tracing
Emitting photons from the light source, bouncing them around, then
recording photons on diffuse surfaces
Stage 2 - photon collection (final gathering)
Shoot sub-paths from the camera, bouncing them around, until they
hit diffuse surfaces
Calculation - local density estimation
Idea: areas with more photons should be brighter
For each shading point, find the nearest $N$ photons. Take the surface area they cover
Small $N$ -> noisy
Large $N$ -> blurry
Why biased?
Local density estimation: $\frac{dN}{dA} \neq \frac{\Delta N}{\Delta A}$
But in the sense of a limit:
More photons emitted -> the same $N$ photons cover a smaller $\Delta A$ -> $\Delta A$ is closer to $dA$ (so the method is consistent)
Vertex connect and merging
A combination of BDPT and photon mapping
Key idea
Let's not waste the sub-paths in BDPT if their end points cannot be
connected but can be merged
Use photon mapping to handle the merging of nearby "photons"
Instant radiosity (IR)
Sometimes also called many-light approaches
Key idea
Lit surface can be treated as light sources
Approach
Shoot light sub-paths and assume the end point of each sub-path is a
Virtual Point Light (VPL)
Render the scene as usual using these VPLs
Pros: fast and usually gives good results on diffuse scenes
Cons:
Spikes will emerge when VPLs are close to shading points
Cannot handle glossy materials
Advanced appearance modeling
Non-surface models
Participating media
Hair / fur / fiber (BCSDF)
Granular material
Surface models
Translucent material (BSSRDF)
Cloth
Detailed material (non-statistical BRDF)
Procedural appearance
Non-surface models
Participating media: cloud /
mist
At any point as light travels through a participating medium, it
can be (partially) absorbed and scattered
Use phase function to describe the angular distribution of light
scattering at any point within
participating media
Rendering
Randomly choose a direction to bounce
Randomly choose a distance to go straight
At each "shading point", connect to the light
Hair appearance
Marschner model
Glass-like cylinder
3 types of light interactions: R, TT, TRT, where
R denotes reflection
T denotes transmission
Human hair vs animal fur
Cortex
Contains pigments
Absorbs light
Medulla
Complex structure
Scatters light
Animal fur has a much larger medulla than human hair
Cuticle
Covered with scales
Granular material
Lots of small particles
Can we avoid explicit modeling of all granules?
Yes, with procedural definition
Surface models
Translucent material: i.e.
jade
Subsurface scattering
Visual characteristics of many surfaces caused by light exiting the surface at different points than where it enters
Violates a fundamental assumption of BRDF
Scattering functions
BSSRDF
Generalization of BRDF
Exitant radiance at one point due to incident differential
irradiance at another point
Generalization of rendering equation
Integrating over all points on the surface and all directions
Dipole approximation
Approximate light diffusion by introducing two point sources
Cloth
A collection of twisted fibers
Two levels of twist
Cloth includes several yarns
A yarn includes several plies
A ply includes several fibers
Render as surface
Given the weaving pattern, calculate the overall behavior
Render using a BRDF
Render as participating media
Properties of individual fibers and their distribution ->
scattering parameters
Render as a participating medium
Render as actual fibers
Render every fiber explicitly
Too complex
Detailed appearance
Renders can look unrealistic because they look too perfect; the real world is more complex
Add details
Wave optics
Procedural appearance
Can we define details without textures?
Yes, compute a noise function on the fly
3D noise -> Internal structure if cut or broken
Thresholding (noise -> binary noise)
11. Cameras, Lenses and Light
Fields
Camera
What's happening inside the
camera?
Pinholes & lenses form image on sensor
Shutter exposes sensor for precise duration
Sensor accumulates irradiance during exposure
Why not sensors without
lenses?
Each sensor point would integrate light from all points on the
object, so all pixel values would be similar, i.e. the sensor records
irradiance
Field of view (FOV)
For a fixed sensor size, decreasing the focal length increases the
field of view
Focal length v. Field of
view
For historical reasons, it is common to refer to angular field of
view by focal length of a lens used on a 35mm-format film (36x24
mm)
Examples of focal lengths on 35mm format
17mm is wide angle
50mm is a "normal" lens
200mm is telephoto lens
Careful! When we say current cell phones have approximately 28mm
"equivalent" focal length, this uses the above convention
Sensor sizes
Maintain FOV on smaller
sensor?
To maintain FOV, decrease focal length of lens in proportion to width
/ height of sensor
Exposure
Exposure $H$ = time $\times$ irradiance ($T \times E$)
Exposure time ($T$)
Controlled by shutter
Irradiance ($E$)
Power of light falling on a unit area of the sensor
Controlled by lens aperture and focal length
Exposure controls in
photography
Aperture size
Change the f-stop by opening / closing the aperture (if camera has
iris control)
F-number (F-stop): Exposure level
Written as FN or F/N. N is the f-number
Informal understanding: the inverse-diameter of a round
aperture
Formal: the f-number of a lens is defined as the focal length
divided by the diameter of the aperture
Shutter speed
Change the duration the sensor pixels integrate light
Side effect of shutter speed
Motion blur: handshake, subject movement
Doubling shutter time doubles motion blur
Motion blur is not always bad
Rolling shutter: different parts of photo taken at different
times
ISO gain
Change the amplification (analog and/or digital) between sensor
values and digital image values
Film: trade sensitivity for grain
Digital: trade sensitivity for noise
Multiply signal before analog-to-digital conversion
Linear effect (ISO 200 needs half the light as ISO 100)
Constant exposure:
F-stop vs. Shutter speed
If the exposure is too bright/dark, may need to adjust f-stop and /
or shutter up / down
Photographers must trade off depth of field and motion blur for
moving subjects
Thin lens approximation
Real lens designs are highly complex
Ideal thin lens - focal
point
All parallel rays entering a lens pass through its focal point
All rays through a focal point will be in parallel after passing the
lens
Focal length can be arbitrarily changed (in reality, yes with
several lenses)
The thin lens equation: $\frac{1}{f} = \frac{1}{z_i} + \frac{1}{z_o}$
Ray tracing ideal thin
lenses
Ray tracing for defocus blur
(One possible) setup:
Choose sensor size, lens focal length and aperture size
Choose depth of subject of interest z
Rendering:
For each pixel $x'$ on the sensor (actually, the film)
Sample random points $x''$ on the lens plane
You know where the ray passing through the lens will go (the in-focus point $x'''$ on the plane of focus, found using the thin lens formula), so estimate the radiance along the ray $x'' \to x'''$
Depth of field
Set the circle of confusion as the maximum permissible blur spot on the image plane that will appear sharp under final viewing conditions
Circle of confusion for
depth of field
I.e. depth range in a scene where the corresponding CoC is considered
small enough
Light field / lumigraph
The plenoptic function
$P(\theta, \phi)$ is the intensity of light
Seen from a single viewpoint
At a single time
Averaged over the wavelengths of the visible spectrum
The full version $P(\theta, \phi, \lambda, t, V_x, V_y, V_z)$ is the intensity of light
Seen from ANY viewpoint
Over time
As a function of wavelength
This is the holographic movie
Can reconstruct every possible view, at every moment, from every
position, at every wavelength
Contains every photograph, every movie, everything that anyone has
ever seen. It completely captures our visual reality
Only plenoptic surface
The surface of a cube holds all the radiance information due to the
enclosed object
12. Color and Perception
Physical basis of color
The fundamental components
of light
Newton showed sunlight can be subdivided into a rainbow with a
prism
Resulting light cannot be further subdivided with a second
prism
Spectral power distribution
(SPD)
Salient property in measuring light
The amount of light present at each wavelength
Units:
Radiometric units / nanometer (e.g. watts / nm)
Can also be unit-less
Often use "relative units" scaled to maximum wavelength for
comparison across wavelengths when absolute units are not important
What is color?
Color is a phenomenon of human perception, it is not a universal
property of light
Different wavelengths of light are not "colors"
Biological basis of color
Retinal
photoreceptor cells: Rods and Cones
Rods are primary receptors in very low light ("scotopic" conditions),
e.g. dim moonlight
~ 120 million rods in eye
Perceive only shades of grey, no color
Cones are primary receptors in typical light levels ("photopic")
~ 6-7 million cones in eye
Three types of cones, each with different spectral sensitivity
Provide sensation of color
Spectral response of
human cone cells
Three types of cone cells: S, M, and L (corresponding to peak
response at short, medium, and long wavelengths)
Fraction of three cone cell types varies widely
Distribution of cone cells at edge of fovea in 12 different humans
with normal color vision. High variability of percentage of different
cone types
Now we have three detectors (S, M, L cones cells)
The human visual system
Human eye does not measure and brain does not receive information
about each wavelength of light
Rather, the eye "sees" only three response values (S, M, L), and
this is only info available to brain
Metamers
Metamers are two different spectra ($\infty$-dimensional) that project to the same (S, M, L) (3-dimensional) response
These will appear to have the same color to a human
The existence of metamers is critical to color reproduction
Don't have to reproduce the full spectrum of a real world scene
Example: A metamer can reproduce the perceived color of a real-world scene on a display with pixels of only three colors
This is the theory behind color matching
Color reproduction / Matching
Additive color
Given a set of primary lights, each with its own spectral
distribution (e.g. RGB display pixels)
Adjust the brightness of these lights and add them together
The color is now described by the scalar values $(R, G, B)$:
CIE RGB color matching
experiment
Same setup as additive color matching before, but primaries are
monochromatic light (single wavelength): 700nm, 546.1nm, 435.8nm
Graph plots how much of each CIE RGB primary light must be combined
to match a monochromatic light of wavelength given on x-axis
For any spectrum $s(\lambda)$, the perceived color is matched by integrating the spectrum against the CIE RGB color matching functions shown in the graph
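As a worked equation (a sketch of the standard formulation, where $s(\lambda)$ is the input spectrum and $\bar{r}, \bar{g}, \bar{b}$ are the CIE RGB matching functions):

```latex
R = \int_{\lambda} s(\lambda)\,\bar{r}(\lambda)\,\mathrm{d}\lambda, \quad
G = \int_{\lambda} s(\lambda)\,\bar{g}(\lambda)\,\mathrm{d}\lambda, \quad
B = \int_{\lambda} s(\lambda)\,\bar{b}(\lambda)\,\mathrm{d}\lambda
```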
Color spaces
Standard color spaces (sRGB)
Makes a particular monitor RGB standard
Other color devices simulate that monitor by calibration
Widely adopted today
Gamut is limited
A universal color space: CIE
XYZ
Imaginary set of standard color primaries X, Y, Z
Primary colors with these matching functions do not exist
Y is luminance (brightness regardless of color)
Designed such that
Matching functions are strictly positive
Span all observable colors
Separating luminance, Chromaticity
Luminance: Y
Chromaticity: x, y, z, defined as
$x = \frac{X}{X+Y+Z}$, $\quad y = \frac{Y}{X+Y+Z}$, $\quad z = \frac{Z}{X+Y+Z}$
Since $x + y + z = 1$, we only need to record two of the three
Usually choose $x$ and $y$, leading to $(x, y)$ coordinates at a specific brightness $Y$
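A minimal sketch of the XYZ-to-chromaticity split described above (the function name is mine, not from the course):

```python
def xyz_to_xyY(X, Y, Z):
    """Split an XYZ color into chromaticity (x, y) plus luminance Y."""
    s = X + Y + Z
    if s == 0.0:
        return 0.0, 0.0, 0.0   # black: chromaticity is undefined, pick a convention
    x = X / s
    y = Y / s                  # z = 1 - x - y, so it does not need to be stored
    return x, y, Y
```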
The curved boundary
Named spectral locus
Corresponds to monochromatic light (each point representing the pure color of a single wavelength)
Gamut
Different color space gives a different gamut.
Perceptually organized
color spaces
HSV color space
(Hue-Saturation-Value)
Axes correspond to artistic characteristics of color
Widely used in a "color picker"
Hue
the "kind" of color, regardless of attributes
colorimetric correlate: dominant wavelength
artist's correlate: the chosen pigment color
Saturation
the "colorfulness"
colorimetric correlate: purity
artist's correlate: fraction of paint from the colored tube
Lightness (or value)
the overall amount of light
colorimetric correlate: luminance
artist's correlate: tints are lighter, shades are darker
CIELAB space (aka. L* a* b*)
A commonly used color space that strives for perceptual
uniformity
L* is lightness (brightness)
a* and b* are color-opponent pairs
a* is red-green
b* is blue-yellow
Opponent color theory
There is a good neurological basis for the color space dimensions in
CIE LAB
The brain seems to encode color early on using three axes:
White - black, red - green, yellow - blue
The white - black axis is lightness, the others determine hue and
saturation
One piece of evidence: there is a light green, a dark green, a yellowish green, and a bluish green, but you cannot find a reddish green
CMYK: A subtractive color
space
Subtractive color model
The more you mix, the darker it will be
Cyan, magenta, yellow, and key (black)
Widely used in printing
13. Animation
Animation
"Bring things to life"
Communication tool
Aesthetic issues often dominate technical issues
An extension of modeling
Represent scene models as a function of time
Output: sequence of images that when viewed sequentially provide a
sense of motion
Film: 24 frames per second
Video (in general): 30 fps
Virtual reality: 90 fps
History
First film
Originally used as a scientific tool rather than for entertainment
Critical technology that accelerated development of animation
First hand-drawn feature-length (>40 mins) animations: 1937
First digital computer-generated animation: 1963
Early computer animation: 1972
Digital dinosaurs: 1993
First CG feature-length film: 1995
Computer animation becomes quite good: 2009
Keyframe animation
Animator (e.g. lead animator) creates keyframes
Assistant (person or computer) creates in-between frames
Keyframe interpolation
Think of each frame as a vector of parameter values
Keyframe interpolation of each parameter
Linear interpolation usually not good enough
Splines for smooth / controllable interpolation
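A minimal sketch contrasting linear interpolation with a Catmull-Rom spline segment for one keyframed scalar parameter (helper names are mine; the spline takes four consecutive keyframe values):

```python
import bisect

def lerp_keyframes(times, values, t):
    """Piecewise-linear interpolation of keyframed values at time t."""
    i = bisect.bisect_right(times, t) - 1
    i = max(0, min(i, len(times) - 2))
    u = (t - times[i]) / (times[i + 1] - times[i])
    return (1 - u) * values[i] + u * values[i + 1]

def catmull_rom(p0, p1, p2, p3, u):
    """Catmull-Rom spline segment between p1 and p2, with u in [0, 1]."""
    return 0.5 * ((2 * p1)
                  + (-p0 + p2) * u
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * u ** 2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * u ** 3)
```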
Physical simulation
Physical based animation
Generate motion of objects using numerical simulation
Mass
spring system: Example of modeling a dynamic system
A simple idealized spring
Force pulls points together: $f_{a \to b} = k_s (b - a)$ (and $f_{b \to a} = -f_{a \to b}$)
Strength proportional to displacement (Hooke's law)
$k_s$ is a spring coefficient: stiffness
Non-zero length spring
Spring with non-zero rest length: $f_{a \to b} = k_s \frac{b - a}{\lVert b - a \rVert} \left( \lVert b - a \rVert - l \right)$. Here $l$ is the rest length
Problem: infinite oscillation
Introducing energy loss
Simple motion damping: $f = -k_d \dot{x}$
Behaves like viscous drag on motion
Slows down motion in the direction of velocity
$k_d$ is a damping coefficient
Problem: Slows down all motion
Want a rusty spring's oscillations to slow down, but should it also fall to the ground more slowly?
Internal damping for spring
Damp only the internal, spring-driven motion
Viscous drag only on change in spring length
Won't slow group motion for the spring system (e.g. global
translation or rotation of the group)
Note this is only one specific type of damping (a code sketch of these spring forces follows below)
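A minimal sketch of the rest-length spring force plus internal (length-only) damping described above, using numpy; the coefficients and variable names are illustrative:

```python
import numpy as np

def spring_forces(a, b, va, vb, ks, kd, rest_len):
    """Force on particle a from a spring a--b with rest length rest_len,
    plus damping applied only along the spring direction."""
    d = b - a
    length = np.linalg.norm(d)
    dir_ab = d / length                        # unit vector from a to b
    f_spring = ks * (length - rest_len) * dir_ab
    rel_vel = np.dot(vb - va, dir_ab)          # rate of change of spring length
    f_damp = kd * rel_vel * dir_ab             # viscous drag on length change only
    f_a = f_spring + f_damp
    return f_a, -f_a                           # force on a, force on b
```

Because the damping term acts only along the spring axis, a rigid translation of both endpoints produces zero damping force, which is exactly the property described above.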
Structures from springs
Behavior is determined by structure linkages
Sheets
Blocks
Others
Particle systems
Model dynamical systems as collections of large numbers of
particles
Each particle's motion is defined by a set of physical (or
non-physical) forces
Popular technique in graphics and games
Easy to understand, implement
Scalable: fewer particles for speed, more for higher complexity
Challenges
May need many particles (e.g. fluids)
May need acceleration structures (e.g. to find nearest particles for
interactions)
Particle system animations
For each frame in animation
[If needed] create new particles
Calculate forces on each particle
Update each particle's position and velocity
[If needed] remove dead particles
Render particles
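A minimal sketch of one frame of the loop above for particles under gravity with forward Euler (emission and rendering are omitted; the names and the unit-mass assumption are mine):

```python
import numpy as np

GRAVITY = np.array([0.0, -9.8, 0.0])

def step_particles(positions, velocities, lifetimes, dt):
    """One frame: accumulate forces, integrate with forward Euler, cull dead particles."""
    forces = np.tile(GRAVITY, (len(positions), 1))   # only gravity here; add more as needed
    velocities = velocities + dt * forces            # unit mass assumed, so a = f
    positions = positions + dt * velocities
    lifetimes = lifetimes - dt
    alive = lifetimes > 0.0                          # [if needed] remove dead particles
    return positions[alive], velocities[alive], lifetimes[alive]
```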
Particle system forces
Attraction and repulsion forces
Gravity, electromagnetism, ...
Springs, propulsion, ...
Damping forces
Friction, air drag, viscosity, ...
Collisions
Walls, containers, fixed objects, ...
Dynamic objects, character body parts, ...
Simulated flocking as an ODE
Model each bird as a particle
Kinematics
Forward kinematics
Articulated skeleton
Topology (what's connected to what)
Geometric relations from joints
Tree structure (in absence of loops)
Joint types
Pin (1D rotation)
Ball (2D rotation)
Prismatic joint (translation)
Forward kinematics
Animation is described as angle parameter values as a function of
time
Kinematics pros and cons
Strengths
Direct control is convenient
Implementation is straightforward
Weaknesses
Animation may be inconsistent with physics
Time consuming for artists
Inverse kinematics
Direct inverse kinematics: for two-segment arm, can solve for
parameters analytically
Why is the problem hard?
Multiple solutions in configuration space
Solutions may not always exist
Numerical solution to general N-link IK problem
Choose an initial configuration
Define an error metric (e.g. square of distance between goal and
current position)
Compute gradient of error as function of configuration
Apply gradient descent (or Newton's method, or other optimization
procedure)
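A minimal sketch of the recipe above for a planar two-link arm: define the squared-distance error, compute its gradient through the Jacobian, and run plain gradient descent (link lengths, learning rate, and iteration count are illustrative):

```python
import numpy as np

def end_effector(theta, l1=1.0, l2=1.0):
    """Planar two-link arm: end-effector position from joint angles."""
    t1, t2 = theta
    return np.array([l1 * np.cos(t1) + l2 * np.cos(t1 + t2),
                     l1 * np.sin(t1) + l2 * np.sin(t1 + t2)])

def ik_gradient_descent(goal, theta=np.zeros(2), l1=1.0, l2=1.0,
                        lr=0.1, iters=500):
    """Minimize ||p(theta) - goal||^2 over joint angles by gradient descent."""
    for _ in range(iters):
        p = end_effector(theta, l1, l2)
        t1, t12 = theta[0], theta[0] + theta[1]
        # Jacobian of the end-effector position w.r.t. (theta1, theta2)
        J = np.array([[-l1 * np.sin(t1) - l2 * np.sin(t12), -l2 * np.sin(t12)],
                      [ l1 * np.cos(t1) + l2 * np.cos(t12),  l2 * np.cos(t12)]])
        grad = 2.0 * J.T @ (p - goal)   # gradient of the squared-distance error
        theta = theta - lr * grad
    return theta
```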
Rigging
Rigging is a set of higher level controls on a character that allow
more rapid & intuitive modification of pose, deformations,
expressions, etc.
Important
Like strings on a puppet
Captures all meaningful character changes
Varies from character to character
Expensive to create
Manual effort
Requires both artistic and technical training
Blend shapes
Instead of skeleton, interpolate directly between surfaces
E.g. model a collection of facial expressions
Simplest scheme: take linear combination of vertex positions
Spline used to control choice of weights over time
Motion capture
Data-driven approach to creating animation sequences
Record real-world performances (e.g. person executing an
activity)
Extract pose as a function of time from the data collected
Strengths
Can capture large amounts of real data quickly
Realism can be high
Weaknesses
Complex and costly set-ups
Captured animation may not meet artistic needs, requiring
alterations
Single particle simulation
First study motion of a
single particle
To start, assume the motion of the particle is determined by a velocity vector field that is a function of position and time: $v(x, t)$
Computing the position of the particle over time requires solving a first-order ODE: $\frac{dx}{dt} = \dot{x} = v(x, t)$
We can solve the ODE, subject to a given initial particle position $x_0$, by using forward numerical integration
Euler's method
Simple iterative method
Commonly used
Very inaccurate
Most often goes unstable
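A minimal sketch of forward (explicit) Euler for a particle in a velocity field $v(x, t)$; the rotational field here is a made-up example:

```python
import numpy as np

def velocity_field(x, t):
    """Illustrative velocity field: rigid rotation about the origin."""
    return np.array([-x[1], x[0]])

def forward_euler(x0, dt, steps, v=velocity_field):
    """x^{t+dt} = x^t + dt * v(x^t, t), starting from x0."""
    x = np.asarray(x0, dtype=float)
    t = 0.0
    path = [x.copy()]
    for _ in range(steps):
        x = x + dt * v(x, t)
        t += dt
        path.append(x.copy())
    return path
```

For this field the exact trajectory is a circle, but the Euler trajectory spirals outward, which illustrates the error accumulation and instability discussed next.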
Errors of Euler method
With numerical integration, errors accumulate. Euler integration is
particularly bad
Instability of the Euler
method
Inaccuracies increase as the time step increases
Instability is a common, serious problem that can cause a simulation to diverge
Lack of stability is a fundamental problem in simulation, and cannot be ignored
Some methods to combat
instability
Midpoint method / modified Euler
Average velocities at start and endpoint
Adaptive step size
Compare one step and two half-steps, recursively, until error is
acceptable
Implicit methods
Use the velocity at the next time step
Position-based / Verlet integration
Constrain positions and velocities of particles after time step
Midpoint method
Compute Euler step
Compute derivative at midpoint of Euler step
Update position using midpoint derivative
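A minimal sketch of a single midpoint step, where v is the velocity field function:

```python
def midpoint_step(x, t, dt, v):
    """One midpoint (modified Euler) step for dx/dt = v(x, t)."""
    x_mid = x + 0.5 * dt * v(x, t)           # half of an Euler step
    return x + dt * v(x_mid, t + 0.5 * dt)   # full step using the midpoint derivative
```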
Adaptive step size
Technique for choosing step size based on error estimate
Very practical technique
But may need very small steps
Repeat until error is below threshold:
Compute an Euler step of size $T$, giving $x_T$
Compute two Euler steps of size $T/2$, giving $x_{T/2}$
Compute the error $\lVert x_T - x_{T/2} \rVert$
If (error > threshold), reduce step size and try again
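A minimal sketch of the step-size loop above (v is the velocity field; the tolerance and the lower bound on dt are arbitrary):

```python
import numpy as np

def adaptive_euler_step(x, t, dt, v, tol=1e-3):
    """Shrink dt until one full Euler step and two half steps agree."""
    while True:
        x_full = x + dt * v(x, t)                               # one step of size dt
        x_half = x + 0.5 * dt * v(x, t)
        x_two = x_half + 0.5 * dt * v(x_half, t + 0.5 * dt)     # two steps of size dt/2
        error = np.linalg.norm(x_full - x_two)
        if error <= tol or dt < 1e-8:
            return x_two, dt        # accept the finer estimate and the step size used
        dt *= 0.5                   # error too large: reduce step size and try again
```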
Implicit Euler method
Informally called backward methods
Use derivatives in the future, for the current step: $x^{t+\Delta t} = x^t + \Delta t\, \dot{x}^{t+\Delta t}$, $\ \dot{x}^{t+\Delta t} = \dot{x}^t + \Delta t\, \ddot{x}^{t+\Delta t}$
Solve the nonlinear problem for $x^{t+\Delta t}$ and $\dot{x}^{t+\Delta t}$
Use root-finding algorithm, e.g. Newton's method
Offers much better stability
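The general implicit step needs a root-finder as noted above; for the linear test equation $\dot{x} = -k x$ the implicit update can be solved in closed form, which makes the stability contrast easy to see (a sketch, not the general method):

```python
def explicit_euler_decay(x0, k, dt, steps):
    """x' = -k x with forward Euler; diverges once dt > 2 / k."""
    x = x0
    for _ in range(steps):
        x = x + dt * (-k * x)
    return x

def implicit_euler_decay(x0, k, dt, steps):
    """x' = -k x with backward Euler: solve x_new = x + dt * (-k * x_new)."""
    x = x0
    for _ in range(steps):
        x = x / (1.0 + k * dt)     # closed-form solve of the implicit equation
    return x
```

With k = 10 and dt = 0.5, the explicit version blows up (each step multiplies by 1 - k*dt = -4), while the implicit one decays monotonically.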
How to determine / quantify
"stability"
We use the local truncation error (every step) / total accumulated
error (overall)
Absolute values do not matter, but the orders with respect to the step size $h$ do
Implicit Euler has order 1, which means that
Local truncation error: $O(h^2)$
Global truncation error: $O(h)$
Understanding of $O(h)$
If we halve $h$, we can expect the error to halve as well
Runge-Kutta Families
A family of advanced methods for solving ODEs
Especially good at dealing with non-linearity
Its order-four version is the most widely used, a.k.a. RK4
Position-based / Verlet
integration
Idea
After modified Euler forward-step, constrain positions of particles
to prevent divergent, unstable behavior
Use constrained positions to calculate velocity
Both of these ideas will dissipate energy, stabilize
Pros / Cons
Fast and simple
Not physically based, dissipates energy (error)
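A minimal sketch combining a Verlet position update with a simple position constraint (staying above a floor), then re-deriving velocity from positions, as described above; all names and the choice of constraint are illustrative:

```python
import numpy as np

def position_based_step(x, x_prev, dt, accel, floor_y=0.0):
    """Verlet-style step, then constrain the position and recompute velocity."""
    x_new = 2.0 * x - x_prev + accel * dt ** 2   # Verlet position update
    x_new[1] = max(x_new[1], floor_y)            # constrain position (stay above floor)
    velocity = (x_new - x) / dt                  # velocity derived from positions
    return x_new, x, velocity                    # (current, previous, velocity)
```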
Rigid body simulation
Simple case
Similar to simulating a particle
Just consider a few more properties. Here
$X$: positions
$\theta$: rotation angle
$\omega$: angular velocity
$F$: forces
$\tau$: torque
$I$: moment of inertia
A simple position-based
method
Key idea:
Assuming water is composed of small rigid-body spheres
Assuming the water cannot be compressed
So, as long as the density changes somewhere, it should be
"corrected" via changing the position of particles
You need to know the gradient of the density anywhere w.r.t. each
particle's position
Update with gradient descent
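A heavily simplified 1D toy of this idea: estimate per-particle density with a kernel, then run gradient descent on the squared density error with respect to particle positions (finite differences stand in for the analytic gradient; the kernel, learning rate, and names are all illustrative, not the actual method used for water):

```python
import numpy as np

def density(xs, h=0.2):
    """Per-particle density from a Gaussian kernel over all particles (1D toy)."""
    d = xs[:, None] - xs[None, :]
    return np.exp(-(d / h) ** 2).sum(axis=1)

def correct_positions(xs, rho0, lr=1e-3, iters=50, eps=1e-5):
    """Gradient descent on sum_i (rho_i - rho0)^2 w.r.t. particle positions."""
    def loss(x):
        return np.sum((density(x) - rho0) ** 2)

    for _ in range(iters):
        grad = np.zeros_like(xs)
        for i in range(len(xs)):          # finite-difference gradient per particle
            bumped = xs.copy()
            bumped[i] += eps
            grad[i] = (loss(bumped) - loss(xs)) / eps
        xs = xs - lr * grad
    return xs
```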
Eulerian vs. Lagrangian
Two different views of simulating large collections of matter
Lagrangian approach: follow the same particle throughout its movement
Eulerian approach: split the space into a grid of smaller cells, follow how each cell changes over time
Material point method (MPM)
Hybrid, combining Eulerian and Lagrangian views
Lagrangian: consider particles carrying material properties
Eulerian: use a grid to do numerical integration
Interaction: particles transfer properties to the grid, grid
performs update, then interpolate back to particles