Introduction to Binomial Heap

Binomial heap is a useful data structure. This article will introduce some basic properties of binomial heap.

Binomial Heaps

Priority queues

  • A Priority queue is a data structure that supports these operations:
    • pq.enqueue(v, k), which enqueues element v with key k
    • pq.find-min(), which returns the element with the least key
    • pg.extract-min(), which removes and returns the element with the least key
  • Priority queues in practice
    • They're useful as building blocks in a bunch of algorithms
    • Many graph algorithms directly rely priority queues supporting extra operations:
      • meld(pq1, pq2): Destroy pq1 and pq2 and combine their elements into a single priority queue. (MSTs via Cheriton-Tarjan)
      • pq.decrease-key(v, k'): Given a pointer to element v already in the queue, lower its key to have new value k'. (Shortest paths via Dijkstra, global min-cut via Stoer-Wagner)
      • pq.add-to-all(dk): Add dk to the keys of each element in the priority queue, typically used with meld. (Optimum branchings via Chu-Edmonds-Liu)

Binary heaps

  • Priority queues are frequently implemented as binary heaps
    • enqueue and extract-min run in time ; find-min runs in time
  • These heaps are surprisingly fast in practice. It's tough to beat their performance.
    • -ary heaps can outperform binary heaps for a well-tuned value of , and otherwise only these sequence heap is known to specifically outperform this family.
    • In that case, why do we need other heaps?

Meldable Priority Queues

  • A priority queue supporting the meld operation is called a meldable priority queue.
  • Efficiently meldable queues
    • Standard binary heaps do not efficiently support meld
    • Intuition: Binary heaps are complete binary trees, and two complete binary cannot easily be linked to one another.
    • A different intuition
  • Building a priority queue
    • Idea: Store elements in "packets" whose sizes are powers of two and meld by "adding" groups of packets.
    • What properties must our packets have?
      • Sizes must be powers of two.
      • Can efficiently fuse packets of the same size.
        • As long as the packets provide access to the minimum, we can execute find-min in time
      • Can efficiently find the minimum element of each packet.
  • Inserting into the queue
    • If we can efficiently meld two priority queues, we can efficiently enqueue elements to the queue.
    • Idea: Meld together the queue and a new queue with a single packet.
      • Time required: fuses.
  • Deleting the minimum
    • After losing an element, the packet will not necessarily hold a number of elements that is a power of two.
    • If we have a packet with elements in it and remove a single element, we are left with remaining elements.
    • Idea: "Fracture" the packet into smaller packets, then add them back in.
      • We can extract-min by fracturing the packet containing the minimum and adding the fragments back in.
      • Runtime is fuses in meld, plus fracture cost.
  • What properties must our packets have?
    • Size is a power of two.
    • Can efficiently fuse packets of the same size.
    • Can efficiently find the minimum element of each packet.
    • Ca efficiently "fracture" a packet of nodes into packets of nodes.

Binomial trees

  • A binomial tree of order is a type of tree recursively defined as follows:
    • A binomial tree of order is a single node whose children are binomial trees of order .
  • Theorem: A binomial tree of order has exactly nodes.
    • Proof: Induction on . Assume that binomial trees of orders have nodes. The number of nodes in an order- binomial tree is So the claim holds for as well.
  • A heap-ordered binomial tree is a binomial tree whose nodes obey the heap property: all nodes are less than or equal to their descendants.
    • We will use heap-ordered binomial trees to implement our "packets"
    • Make the binomial tree with the larger root the first child of the tree with the smaller root.
  • A binomial heap is a collection of binomial trees stored in ascending order of size.
    • Operations defined as follows:
      • meld(pq1, pq2): Use addition to combine all the trees
        • Fuses trees. Cost: . Here assume one binomial heap has nodes, the other .
      • pq.enqueue(v, k): Meld pq and a singleton heap of (v, k)
        • Total time:
      • pq.find-min(): Find the minimum of all tree roots.
        • Total time:
      • pq.extract-min(): Find the min, delete the tree root, then meld together the queue and the exposed children.
        • Total time: .
  • Theorem: No comparison-based priority queue structure can have enqueue and extract-min each take time .
    • Proof: Suppose these operations each take time . Then we could sort elements by perform enqueues and then extract-mins in time . This is impossible with comparison-based algorithms.
    • We can't make both enqueue and extract-min run in time
    • However, we could conceivably make one of them faster.
    • Which one should we prioritize?
      • Probably enqueue since we aren't guaranteed to have to remove all added items.
      • Goal: make enqueue take time
      • The enqueue operation is implemented in terms of meld

Thinking with amortization: Lazy Melding

  • Consider the following lazy melding approach:
    • To meld together two binomial heaps, just combine the two sets of trees together.
  • Intuition: Why do any work to organize keys if we're not going to do an extract-min?
  • If we store our list of trees as circularly, doubly-linked lists, we can concatenate tree lists in time .
    • Cost of a meld:
    • Cost of an enqueue:

Lazy binomial heap

  • A lazy binomial heap is a binomial heap, modified as follows
    • The meld operation is lazy. It just combines the two groups of trees together.
    • After doing an extract-min, we do a coalesce to combine together trees until there's at most one tree of each order.
  • Intuitively, we'd expect this to amortize away nicely, since the "mess" left by meld gets cleaned up later on by a future extract-min.

Coalescing trees

  • The coalesce step repeatedly combines trees together until there's at most one tree of each order.
  • Observation: This would be a lot easier to do if all the trees were sorted by size.
  • We can sort our group of trees by size in time using a standard sorting algorithm.
  • Better idea: All the sizes are small integers. Use counting sort.
  • A faster implementation of coalesce:
    • Distribute the trees into an array of buckets big enough to hold trees of orders .
    • Start at bucket 0. While there's two or more trees in the buckets, fuse them and place the result one bucket higher.
  • Analyzing coalesce
    • Claim: Coalescing a group of trees takes time .
      • Time to create the array of buckets: .
      • Time to distribute trees into buckets: .
      • Time to fuse trees:
        • Number of fuses is , since each fuse decreases the number of trees by one. Cost per fuse is .
        • Need to iterate across buckets.
      • Total work done:
        • In the worst case, this is
    • But these are worst-case time bounds. Intuitively, things should nicely amortize away.
      • The number of trees grows slowly (one per enqueue)
      • The number of trees drops quickly (at most one tree per order) after an extract-min
      • Amortize analyzing: Set to the number of trees in the lazy binomial heap.

Analyzing extract-min

  • Suppose we perform an extract-min on a binomial heap with trees in it.
  • Initially, we expose the children of the minimum element. This increases the number of trees to .
  • The runtime for coalescing these trees is .
  • When we're done merging, there will be trees remaining, so
  • Amortized cost is:

Operation costs

  • Eager binomial heap:
    • enqueue:
    • meld:
    • find-min:
    • extract-min:
  • Lazy binomial heap:
    • enqueue:
    • meld:
    • find-min:
    • extract-min: (amortized)