An Brief Introduction to Amortized Analysis

Posted on 2021-11-16 Edited on 2022-08-10 Word count in article: 5k Reading time ≈ 5 mins.

Sometimes we want to design data structures that trade per-operation efficiency for overall efficiency. To analyze efficiency of these data structures, we will need to use amortized analysis.

Examples

The two-stack queue

Maintain an In stack and an Out stack.
To enqueue an element, push it onto the In stack.
- Each enqueue takes time .
  - Just push an item onto the In stack.
To dequeue an element:
- If the Out stack is nonempty, pop it.
- If the Out stack is empty, pop lements from the In stack, pushing them into the Out stack. Then dequeue as usual.
- Dequeues can vary in their runtime.
  - Could be if the Out stack isn't empty.
  - Could be if the Out stack is empty.
  - It's wrong, but useful, to pretend the cost of a dequeue is .
    - Some operations take more time than this.
    - However, if we pretend each operation takes time , then the sum of all the costs never underestimates the total.
Intuition: We only do expensive dequeues after a long run of cheap enqueues.
- Provided we move all items from In to Out at once, and provided that enqueue items accumulate slowly, this is a fast strategy.
Key fact: Any series of operations on a two-stack queue will take time .
- Each item is pushed into at most two stacks and popped from at most two stacks.
- Adding up the work done per element across all operations, we can do at most work.

Dynamic arrays

A dynamic array is the most common way to implement a list of values.
Maintain an array slihtly biggerthan the one you need. When you run out of space, double the array size and copy the elements over.
Most appends to a dynamic array take time .
Infrequently, we do work to copy all elements from the old array to a new one.
Claim: The cost of doing appends to an initially empty dynamic array is always .
- The array doubles at sizes , etc.
- The very last doubling is at the largest power of two less than . This is at most .
- Total work done across all doubling is at most
- It's wrong, but useful, to pretend that the cost of an append is .

Building B-Trees

You are given a sorted list of values and a value of .
What's the most efficient way to construct a B-tree of order holding these values?
- Idea 1: Insertthe items into an empty B-tree in sorted order.
  - Cost: , due to the top-down search.
- Idea 2: Since all insertions will happen at the rightmost leaf, store a pointer to that leaf. Add new values by appending to this leaf, then doing any necessary splits.
  - Cost: The cost of an insert varies based on the shape of the tree.
    - If no splits are required, the cost is .
    - If one split is required, the cost is .
    - If we have to split all the way up, the cost is
  - Using our worst-case cost across inserts gives a runtime bound of
  - Claim: The cost of inserts is always
  - If is wrong, but useful, to pretend that the cost of an insert is
    - Some operations take more time than this
    - However, pretending each insert takes time never underestimates the total amount of work done across all operations.

Amortized analysis

The setup

We now have three examples of data strucutres where
- individual operations may be slow, but
- any series of operations is fast
Giving weak upper bounds on the cost of each operation is not useful for making predictions.
How can we clearly communicate when a situation like this one exists?

Key idea

Backcharge expensive operations to cheaper ones.
- For the real costs, most operations are fast, but we can't get a nice upper bound on any one operation cost.
- For the amortized costs, each operation is still reasonably fast, and all of them are nicely bounded from above.
Assign each operation a (fake) cost called its amortized cost such that, for any series of operations performed, the following is true:
Key intuition: amortized costs shift work backwards from expensive operations onto cheaper ones.

Formalizing this idea

Assigning amortized costs

The approach we've taken so far for assigning amortized costs is called an aggregate analysis.
- Directly compute the maximum possible work done across any sequence of operations, then divide that by the number of operations.
This approach works well here, but it doesn't scale well to more complex data structures.
- What if different operations contribute to / clean up messes in different ways?
- What if it's not clear what sequence is the worst-case sequence of operations?
In practice, we tend to use a different strategy called the potential method to assign amortized costs.

Potential functions

To assign amortized costs, we'll need to measure how "messy" the data structure is.
For each data structure, we define a potential function such that
- is small when the data strucuture is "clean"
- is large when the data structure is "messy"
Once we've chosen a potential function , we define the amortized cost of an operation to be:

where is the difference between just after the operation finishes and just before the operation started:
Intuitively:
- if increases, the data structure got "messier", and the amortized cost is higher than the real cost.
- If decreases, the data strucutre got "cleaner", and the amortized cost is lower than the real cost.
Let's make two assumptions:
- is always nonnegative.
- Thus, .
For example, for two-stack queues:
- height of In stack.
- Theorem: The amortized cost of anny enqueue or dequeue operation on a two-stack queue is .
For example, for dynamic arrays
- Goal: Choose potential function such that the amortized cost of an append is .
- Intuition: should measure how "messy" the data strucuture is.
  - Having lots of free slots means there's very little mess.
  - Having few free slots means there's a lot of mess.
- We can come up with
- Theorem: The amortized cost of an append to a dynamic array is .
For example, building B-trees.
- What is the actual cost of appending an element?
  - Suppose that we perform splits at layers in the tree.
  - Each split takes time to copy and move keys around.
  - Total cost: .
- Goal: Pick a potential function so that we can offset this cost and make each append cost amortized .
- Let be the number of keys on the right spine
- Each split moves (roughly) half the keys from the split node into a node off the right spine.
- Change in potential per split: .
- Net .