Data Structures and Algorithms

  1. Introduction: What should we measure when we compare algorithms?
  2. Algorithmic Efficiency (Time Complexity)
  3. Orders of Efficiency
  4. Big-Oh Analysis of Loops
  5. Definition of Big-Oh
  6. Definition of Omega and Theta

Introduction: What should we measure when we compare algorithms?

Our first concern in analyzing problems and developing solutions is simply to ensure that we have a correct solution for the correct problem.

Another concern is to ensure that the solution is a maintainable one (readable, modifiable, etc.).

Yet another major concern is the efficiency of our solution, in terms of time and resource requirements.

It is this third issue that we will address now.

There are two theoretical issues we need to deal with when analyzing problems and solutions:

  1. How can we predict the running time of our solution compared to other possible solutions?
  2. How does the predicted running time of our solution compare to the difficulty of the problem itself, i.e. to the best possible solution?

Program efficiency

Measurements for different kinds of problem


Orders of efficiency

Some common orders
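From slowest- to fastest-growing, the orders that come up most often are constant O(1), logarithmic O(lg n), linear O(n), O(n lg n), quadratic O(n^2), cubic O(n^3), and exponential O(2^n).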

Big-O for searching and sorting
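For instance, linear search of an array is O(n) while binary search of a sorted array is O(lg n); the simple quadratic sorts (insertion, selection, bubble) are O(n^2), while mergesort and heapsort are O(n lg n).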


Calculating big-O: loops and statements

The next challenge is to take one of our own code segments and try to determine what complexity order it has.

Some basic guidelines:
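The usual rules of thumb: the costs of consecutive statements add; the cost of a loop is the cost of its body multiplied by the number of iterations, so nested loops multiply; and once a total has been found, constant factors and lower-order terms are dropped to leave just the dominant term.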

Some examples:

Example calculations
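To illustrate how these rules apply in practice, here is a small hypothetical C++ sketch (the function names are just for illustration); the comments note the order each function works out to:

    #include <cstddef>
    #include <vector>

    // One pass over the data: the loop body runs n times, so this is O(n).
    long sumAll(const std::vector<int>& a) {
        long sum = 0;
        for (std::size_t i = 0; i < a.size(); i++)
            sum += a[i];
        return sum;
    }

    // Nested loops over the same data: the inner body runs n * n times, so this is O(n^2).
    long countEqualPairs(const std::vector<int>& a) {
        long count = 0;
        for (std::size_t i = 0; i < a.size(); i++)
            for (std::size_t j = 0; j < a.size(); j++)
                if (a[i] == a[j]) count++;
        return count;
    }

    // The loop variable is halved each pass (n, n/2, n/4, ..., 1),
    // so the body runs about lg(n) times: O(lg n).
    int halvingSteps(int n) {
        int steps = 0;
        for (int i = n; i > 0; i /= 2)
            steps++;
        return steps;
    }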

Comparing Algorithms and Orders

When we determine the running times of two algorithms as functions f(n) and g(n), we will usually want to show that one algorithm is better (or worse, or no better, etc.) than the other.

We will do this by grouping the functions into orders, where a "larger" order contains all the functions in all smaller orders. Most of the proofs we look at regarding running time involve trying to prove whether a given function is inside or outside a particular order.

If we show one function is in the order of another then we know the first function is "no worse" than the second. If we then go on to show the second function is NOT in the order of the first function then we have effectively proved the first function is superior.

Definition of Big-Oh: A function f(n) is said to be in the Big-Oh of another function g(n) if the following holds:
∃ c, n_0 | ∀ n ≥ n_0, f(n) ≤ cg(n)

Notation: f(n) ∈ O(g(n))

Statement: f(n) is in the Big-Oh of g(n)

I.e., if we're using f(n) and g(n) as the running time of two algorithms on a data set of size n, then if we look at large enough data sets we find that at worst f(n) is within a constant factor of g(n) -- it may be many many times faster, but at worst it's "not much slower".

We use this definition of asymptotic complexity to group functions in terms of "equivalent efficiency".

See the proof technique notes for more detailed discussion of proofs and proof techniques.
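For instance (a representative example, with one workable choice of constants), we can show 3n + 2 ∈ O(n):

  \[
    3n + 2 \;\le\; 3n + 2n \;=\; 5n \quad \text{for all } n \ge 1,
  \]

so taking c = 5 and n_0 = 1 satisfies the definition, and hence 3n + 2 ∈ O(n).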

More General Proofs

Equivalence of functions: we say two functions, f(n) and g(n), are asymptotically equivalent iff f(n) ∈ O(g(n)) AND g(n) ∈ O(f(n)). Proving this simply involves doing a proof once in each direction, i.e.
(i) prove f(n) ∈ O(g(n))


(ii) prove g(n) ∈ O(f(n))

Prove one order is a subset of another: we say O(f(n)) ⊆ O(g(n)) iff every function in O(f(n)) is also in O(g(n)). Stated another way, there are no functions in O(f(n)) that are not in O(g(n)).

For example, prove O(n^2) ⊆ O(n^3):
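A sketch of the argument: suppose f(n) ∈ O(n^2), so there are constants c and n_0 with f(n) ≤ c n^2 for all n ≥ n_0. Then

  \[
    f(n) \;\le\; c\,n^2 \;\le\; c\,n^3 \quad \text{for all } n \ge \max(n_0, 1),
  \]

so the same constant c (with threshold max(n_0, 1)) shows f(n) ∈ O(n^3); since f was an arbitrary member of O(n^2), we get O(n^2) ⊆ O(n^3).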

Prove two orders are equivalent: to show O(f(n)) == O(g(n)) we can again do two subproofs:
(i) prove O(f(n)) ⊆ O(g(n))


(ii) prove O(g(n)) ⊆ O(f(n))

Prove O(n^k) ⊆ O(n^(k+1)) for n, k > 1

Prove O(n^k) ≠ O(n^(k+1))

Prove p(N) ∈ O(N^k) where p(N) = a_k N^k + a_(k-1) N^(k-1) + ... + a_1 N + a_0
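One standard way to prove this (sketched here for N ≥ 1):

  \[
    |p(N)| \;\le\; |a_k|N^k + |a_{k-1}|N^{k-1} + \cdots + |a_0|
           \;\le\; \bigl(|a_k| + |a_{k-1}| + \cdots + |a_0|\bigr) N^k
    \quad \text{for all } N \ge 1,
  \]

so taking c = |a_k| + |a_(k-1)| + ... + |a_0| and n_0 = 1 satisfies the big-Oh definition.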

Note that this is the justification for dropping all "lesser degree" terms and scalars when looking at the order of a polynomial.


Analysing Recursion

We often need to analyze the efficiency of recursive solutions to a problem, and this analysis can be considerably more challenging.

Often the easiest way to express the running time of the overall algorithm is as a recurrence: the running time for a problem of size N written in terms of the running time of the recursive calls on smaller subproblems.

For instance, suppose we have a divide and conquer algorithm where the problem is split in two at each stage, the two subproblems are called recursively, and then the results are glued back together. If the "gluing" takes time h(N), then we could express the overall run time as f(N) = 2 * f(N/2) + h(N).

While there are many means of carrying out such analysis, for the moment we will just consider one of them: the substitution method.

The substitution method: this method is handy if we have a gut feeling for what the running time should be, but need some way to prove it. Essentially it involves substituting our guess into the formula, and using a proof by induction to show it is correct.

For example: suppose we are analysing an algorithm such as mergesort, where at each level the problem is divided into two equal parts, each of which is solved recursively, and a linear time process is followed to glue the two pieces together again. Problems of size 1 have constant time solutions, i.e. f(1) = 1, but for larger values of N we have the following recurrence: f(N) = 2 * f( ⌊ N/2 ⌋ ) + N

Now, if we make an educated guess that the complexity is O( N lg(N)) (based on the fact that it is a divide and conquer algorithm) we can try to come up with an inductive proof that this is actually the correct answer.

For our problem, we need to prove f(N) ≤ c N lg(N) for some constant c and ∀ N > n_0.
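A sketch of the inductive step (the standard substitution argument; requiring c ≥ 1 is one workable choice): assuming the bound holds for all smaller sizes, in particular f(⌊N/2⌋) ≤ c ⌊N/2⌋ lg(⌊N/2⌋), we get

  \begin{align*}
    f(N) &= 2\,f(\lfloor N/2 \rfloor) + N \\
         &\le 2\,c\,\lfloor N/2 \rfloor \lg(\lfloor N/2 \rfloor) + N \\
         &\le c\,N \lg(N/2) + N \\
         &= c\,N \lg N \;-\; c\,N \;+\; N \\
         &\le c\,N \lg N \quad \text{whenever } c \ge 1 .
  \end{align*}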

Thus, using proof by induction, we have f(N) ∈ O(N lg(N)).

Other Complexity Bounds

Earlier we defined f(n) ∈ O(g(n)) iff ∃ c, n_0 | ∀ n ≥ n_0, f(n) ≤ cg(n), thus providing an effective upper bound on the running time f(n).

We can also provide lower bounds using a similar definition: f(n) ∈ Ω(g(n)) iff ∃ c, n_0 | ∀ n ≥ n_0, f(n) ≥ cg(n)

Ideally, we can prove the same upper and lower bounds: f(n) ∈ Θ(g(n)) iff f(n) ∈ O(g(n)) and f(n) ∈ Ω(g(n))
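For example, mergesort's worst-case running time is both O(N lg(N)) and Ω(N lg(N)), so it is Θ(N lg(N)).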

Usually we use big-O notation when we want to discuss the worst-case or average case complexity of a particular algorithm, and we use big-Omega notation when we want to discuss the fundamental difficulty of a particular problem (e.g. we might show that to guarantee we completely sort n elements requires Ω(n lg(n)) comparisons between elements - regardless of what sorting algorithm is used).


This document is licensed under Creative Commons copyright "by-nc": anyone may use it for non-commercial purposes with attribution; derived works allowed, with attribution. Original author David Wessels, derivation by Gara Pruesse.