Another concern is to ensure that the solution is a maintainable one (readable, modifiable, etc.).
Yet another major concern is the efficiency of our solution, in terms of time and resource requirements.
It is this third issue that we will address now.
There are two theoretical issues we need to deal with when analyzing problems and solutions: determining the efficiency of a particular algorithm, and determining the inherent difficulty of the problem itself. The latter is sometimes also used to judge how close we are to an optimal algorithm for a particular problem. It is difficult, if not impossible, to compare algorithms across different platforms, and the second issue is extremely difficult to deal with unless we make some simplifying assumptions, which are the next topic of discussion.
For example, if we are writing sorting algorithms, the time typically depends on the number of elements to be sorted.
Thus if we somehow knew the number of operations required to sort a list of N elements was (N^2 + 37N - 6), we would use this formula to describe the efficiency of our algorithm rather than stating the running time for some specific value of N.
Suppose we have two sorting algorithms available: one based on bubblesort that takes 2N^2 + 7 steps, and one based on mergesort that takes 50 N log2(N) steps.
Suppose N is 64: the bubblesort-based algorithm would take 8199 steps, while the mergesort-based algorithm would take 19200 steps, so the bubblesort one would be preferable
Now suppose N is 1024: the bubblesort-based algorithm would take 2097159 steps, while the mergesort-based algorithm would take 512000 steps, so the mergesort one would be preferable
Thus whenever you choose an algorithm based on efficiency analysis it is important to know something about the expected size of your real data sets!
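To make the comparison concrete, here is a minimal sketch (using the hypothetical step counts 2N^2 + 7 and 50 N log2(N) from the example above) that prints both counts for a few values of N, making it easy to see where the crossover occurs:

#include <iostream>
#include <cmath>
using namespace std;

int main() {
   // hypothetical step counts from the example above
   long long sizes[] = {64, 128, 256, 512, 1024};
   for (long long N : sizes) {
      double bubbleSteps = 2.0 * N * N + 7;       // bubblesort-based algorithm
      double mergeSteps  = 50.0 * N * log2(N);    // mergesort-based algorithm
      cout << "N = " << N
           << "   bubblesort: " << bubbleSteps
           << "   mergesort: " << mergeSteps << endl;
   }
   return 0;
}

For these particular formulas the crossover falls between N = 128 and N = 256: below that the bubblesort-based algorithm uses fewer steps, above it the mergesort-based one does.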
For instance, we'll say 3N^2 is the same order as 9N^2, which is the same order as 0.2N^2, which is the same order as 77N^2, which is the same order as 12N^2 + 99, etc.
To describe all the functions which are of the same order, we determine the core underlying function (e.g. f(N) = N^2) and use the notation O(f(N)) to describe the set of all functions in the same order.
Again, the set of functions described by O(N^2) includes 3N^2, and 77N^2, and 66N^2 + 17, etc.
say O(c * N * N) == O(N * N), for any constant c
For example, the order O(N^2) includes every function of the form c*N^2 + b*N + a, as long as a, b, and c are constants (i.e. as long as they do not depend on the value of N).
To see why this simplification is generally accepted, observe the following:
N      | N^2   | N^2 + 120N + 10000
-------+-------+--------------------
100    | 10^4  | 3.200 * 10^4
1000   | 10^6  | 1.130 * 10^6
10000  | 10^8  | 1.012 * 10^8
100000 | 10^10 | 1.001 * 10^10

Note that for large values of N the difference that the constants make (and the difference the "lesser degree" terms make) is trivial when comparing algorithms of different orders.
N     | log2(N) | N log2(N) | N^2       | N^3
------+---------+-----------+-----------+---------------
2     | 1       | 2         | 4         | 8
8     | 3       | 24        | 64        | 512
32    | 5       | 160       | 1024      | 32768
128   | 7       | 896       | 16384     | 2097152
512   | 9       | 4608      | 262144    | 134217728
1024  | 10      | 10240     | 1048576   | 1073741824
16384 | 14      | 229376    | 268435456 | 4.398 * 10^12
Algorithm         | Best-case | Average-case | Worst-case
------------------+-----------+--------------+-----------
Sequential search | O(1)      | O(N)         | O(N)
Binary search     | O(1)      | O(logN)      | O(logN)
------------------+-----------+--------------+-----------
Selection sort    | O(N^2)    | O(N^2)       | O(N^2)
Bubblesort        | O(N)      | O(N^2)       | O(N^2)
Insertion sort    | O(N)      | O(N^2)       | O(N^2)
Quicksort         | O(NlogN)  | O(NlogN)     | O(N^2)

O(1) means at most some constant number of operations are performed, e.g. solving the problem in <= 3 steps.
Some basic guidelines: a simple statement takes a constant number of operations; the cost of a sequence of statements is the sum of the costs of its parts; the cost of a loop is the number of iterations multiplied by the cost of the loop body; and constant factors and lower-order terms are dropped when stating the final order.
Some examples:
for (int m = 0; m < M; m++) {
   cout << "m is " << m;
   for (int n = 0; n < N; n++) {
      sum = sum + n;
      foo = foo * n + m;
   }
   cout << " sum and foo are ";
   cout << sum << " " << foo << endl;
}

The code fragment performs O(M * N) operations: the outer loop executes M times, and each of those iterations does a constant amount of work plus the N constant-time iterations of the inner loop.
for (int i = 6; i < (2*M); i++) {
   foo[i] = 3 * i * i;
   foo[i-6] = i;
}

The loop body is constant time and the loop runs roughly 2M - 6 times, so this fragment is O(M).
for (int i = 0; i < M; i++) {
   cout << i;
}
for (int j = 0; j < N; j++) {
   cin >> arr[j];
}

The two loops run one after the other, performing M + N constant-time iterations in total, so this fragment is O(M + N).
for (int i = 0; i < 600; i++) {
   count = 0;
   while (count < M) {
      cout << i * M << endl;
      count++;
   }
}

The outer loop runs a constant 600 times and the inner loop runs M times on each pass, so the fragment performs about 600 * M operations, i.e. it is O(M).
for (int i = 0; i < (M/2); i++) {
   for (int j = 0; j < N; j += 3) {
      cout << i * j << endl;
   }
}

There are roughly (M/2) * (N/3) constant-time iterations in total, so this fragment is O(M * N).
When we determine the running time of two algorithms as some functions, f(n) and g(n), we will usually want to show that one algorithm is better (or worse, or no better, etc) than another.
We will do this by grouping the functions into orders, where a "larger" order contains all the functions in all smaller orders. Most of the proofs we look at regarding running time involve trying to prove whether a given function is inside or outside a particular order.
If we show one function is in the order of another then we know the first function is "no worse" than the second. If we then go on to show the second function is NOT in the order of the first function then we have effectively proved the first function is superior.
Definition of Big-Oh: A function f(n) is said to be in the Big-Oh of another function g(n) if the following holds: there exist constants c and n0 such that f(n) ≤ cg(n) for all n ≥ n0.
Notation: f(n) ∈ O(g(n))
Statement: f(n) is in the Big-Oh of g(n)
I.e., if we're using f(n) and g(n) as the running time of two algorithms on a data set of size n, then if we look at large enough data sets we find that at worst f(n) is within a constant factor of g(n) -- it may be many many times faster, but at worst it's "not much slower".
We use this definition of asymptotic complexity to group functions in terms of "equivalent efficiency".
See the proof technique notes for more detailed discussion of proofs and proof techniques.
For example, suppose we want to prove that f(n) = 6n^2 is in O(g(n)) where g(n) = n^3. This is equivalent to trying to show that the function f(n) is in the order of g(n), i.e. that there exists some constant C such that f(n) ≤ Cg(n) for all sufficiently large values of n.
In this case, that means showing there exist constants n0 and C such that for all n ≥ n0, 6n^2 ≤ Cn^3.
If we make an inspired guess to pick n0 = 6 and C = 1, then
we have the claim 6n^2 ≤ n^3 for all n ≥ 6,
dividing both sides by n^2 gives 6 ≤ n for all n ≥ 6,
which is of course true.
Thus we have found values for C and n0 that make our statement true, hence f(n) ∈ O(g(n)).
Now suppose we want to prove that f(n) = 2n^2 + 4 is in O(g(n)) where g(n) = n^2. Let C = 4 and n0 = 2 (lucky guess again) and we will again see this gives a true statement.
i.e. our claim is 2n^2 + 4 ≤ 4n^2 for all n ≥ 2, and if we subtract 2n^2 from both sides we get 4 ≤ 2n^2 for all n ≥ 2, which we again see is true.
Thus f(n) ∈ O(g(n))
We can show that f(N) is slower than g(N), i.e. that f(N) grows asymptotically faster than g(N) (so f(N) is not in O(g(N))), using a proof by contradiction:
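For instance, a sketch of such an argument, showing that N^3 is not in O(N^2):

\noindent\textbf{Claim:} $N^3 \notin O(N^2)$.
Suppose, for contradiction, that $N^3 \in O(N^2)$: then there exist constants $C$ and $n_0$ such that
\[
  N^3 \le C\,N^2 \quad \text{for all } N \ge n_0 .
\]
Dividing both sides by $N^2$ gives $N \le C$ for all $N \ge n_0$, which fails for any $N$ larger than both $C$ and $n_0$, a contradiction. Hence no such constants exist, and $N^3 \notin O(N^2)$.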
Equivalence of functions: we say two functions, f(n) and g(n),
are asymptotically equivalent iff
f(n) ∈ O(g(n)) AND g(n) ∈ O(f(n)).
Proving this simply involves doing a proof once in each direction, i.e.
(i) prove f(n) ∈ O(g(n))
(ii) prove g(n) ∈ O(f(n))
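For example, a sketch showing that the (illustrative) pair f(n) = 3n^2 + 2n and g(n) = n^2 are asymptotically equivalent:

(i) $f(n) \in O(g(n))$: with $C = 5$ and $n_0 = 1$,
\[
  3n^2 + 2n \le 3n^2 + 2n^2 = 5n^2 \quad \text{for all } n \ge 1 .
\]
(ii) $g(n) \in O(f(n))$: with $C = 1$ and $n_0 = 1$,
\[
  n^2 \le 3n^2 + 2n \quad \text{for all } n \ge 1 .
\]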
Prove one order is a subset of another: we say O(f(n)) ⊆ O(g(n)) iff every function in O(f(n)) is also in O(g(n)). Stated another way, there are no functions in O(f(n)) that are not in O(g(n)).
For example, prove O(n^2) ⊆ O(n^3):
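One way this proof can go (a sketch following directly from the definition):

Take any $f(n) \in O(n^2)$: then there exist constants $C$ and $n_0$ such that
\[
  f(n) \le C\,n^2 \quad \text{for all } n \ge n_0 .
\]
Since $n^2 \le n^3$ for all $n \ge 1$, the same constant also works for $n^3$:
\[
  f(n) \le C\,n^3 \quad \text{for all } n \ge \max(n_0, 1) ,
\]
so $f(n) \in O(n^3)$. Every member of $O(n^2)$ is therefore also a member of $O(n^3)$.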
Prove two orders are equivalent: to show O(f(n)) == O(g(n)) we can again
do two subproofs:
(i) prove O(f(n)) ⊆ O(g(n))
Prove O(n^k) ⊆ O(n^(k+1)) for k > 1
Prove O(n^k) ≠ O(n^(k+1))
Prove p(N) ∈ O(N^k) where p(N) = a_k N^k + a_(k-1) N^(k-1) + ... + a_1 N + a_0
Note that this is the justification for dropping all "lesser degree" terms and scalars when looking at the order of a polynomial.
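A sketch of the proof for the polynomial case (assuming, as stated, that the coefficients a_i are constants):

For $N \ge 1$ every power $N^i$ with $i \le k$ satisfies $N^i \le N^k$, so
\[
  |p(N)| \le \bigl(|a_k| + |a_{k-1}| + \dots + |a_1| + |a_0|\bigr) N^k
  \quad \text{for all } N \ge 1 .
\]
Taking $C = |a_k| + \dots + |a_0|$ and $n_0 = 1$ satisfies the Big-Oh definition, so $p(N) \in O(N^k)$; this is exactly why the lesser-degree terms and scalar factors can be dropped.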
We often need to analyze the efficiency of recursive solutions to a problem, which can be considerably more difficult than analyzing straight-line or iterative code.
Often the easiest way to express the running time of the overall algorithm is as a recurrence: the running time on a problem of size N written in terms of the running time on the smaller subproblems handled by the recursive calls.
For instance, suppose we have a divide and conquer algorithm where the problem is split in two at each stage, the two subproblems are called recursively, and then the results are glued back together. If the "gluing" takes time h(N), then we could express the overall run time as f(N) = 2 * f(N/2) + h(N).
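As a concrete illustration (a sketch of a generic mergesort-style routine, with a hypothetical function name solve), the two recursive calls and the linear-time merge below correspond directly to the terms of f(N) = 2 * f(N/2) + h(N):

#include <vector>
#include <iostream>
using namespace std;

// Divide-and-conquer sketch whose running time satisfies
//    f(N) = 2 * f(N/2) + h(N)
// where h(N) is the linear-time "gluing" (merge) step.
void solve(vector<int>& data, int low, int high) {   // works on data[low..high-1]
   if (high - low <= 1) return;                      // base case: constant time
   int mid = (low + high) / 2;
   solve(data, low, mid);                            // first recursive call:  f(N/2)
   solve(data, mid, high);                           // second recursive call: f(N/2)
   vector<int> merged;                               // glue step h(N): merge the halves, O(N)
   int i = low, j = mid;
   while (i < mid && j < high)
      merged.push_back(data[i] <= data[j] ? data[i++] : data[j++]);
   while (i < mid)  merged.push_back(data[i++]);
   while (j < high) merged.push_back(data[j++]);
   for (int k = 0; k < (int)merged.size(); k++)
      data[low + k] = merged[k];
}

int main() {
   vector<int> v = {5, 2, 9, 1, 7, 3};
   solve(v, 0, (int)v.size());
   for (int x : v) cout << x << " ";                 // prints 1 2 3 5 7 9
   cout << endl;
   return 0;
}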
While there are many means of carrying out such analysis, for the moment we will just consider one of them: the substitution method.
The substitution method: this method is handy if we have a gut feeling for what the running time should be, but need some way to prove it. Essentially it involves substituting our guess into the formula, and using a proof by induction to show it is correct.
For example: suppose we are analysing an algorithm such as mergesort, where at each level the problem is divided into two equal parts, each of which is solved recursively, and a linear time process is followed to glue the two pieces together again. Problems of size 1 have constant time solutions, i.e. f(1) = 1, but for larger values of N we have the following recurrence: f(N) = 2 * f( ⌊ N/2 ⌋ ) + N
Now, if we make an educated guess that the complexity is O( N lg(N)) (based on the fact that it is a divide and conquer algorithm) we can try to come up with an inductive proof that this is actually the correct answer.
For our problem, we need to prove f(N) ≤ c N lg(N) for some constant c and ∀ N > n0
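One way the inductive step of this substitution proof can go (a sketch, assuming the bound f(⌊N/2⌋) ≤ c ⌊N/2⌋ lg(⌊N/2⌋) already holds for the smaller subproblem):

\begin{align*}
f(N) &= 2\,f(\lfloor N/2 \rfloor) + N
       \le 2c\,\lfloor N/2 \rfloor \lg(\lfloor N/2 \rfloor) + N \\
     &\le 2c\,(N/2)\lg(N/2) + N
       = cN(\lg N - 1) + N \\
     &= cN\lg N - cN + N
       \le cN\lg N \qquad \text{for any } c \ge 1 ,
\end{align*}

so the guess also holds for N itself; the base cases (small N, where lg N is 0) are handled by choosing n0 and c large enough to cover them directly.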
Earlier we defined f(n) ∈ O(g(n)) iff ∃ c, n0 | ∀ n ≥ n0 f(n) ≤ cg(n), thus providing an effective upper bound on the running time of f(n).
We can also provide lower bounds using a similar definition: f(n) ∈ Ω(g(n)) iff ∃ c, n0 | ∀ n ≥ n0 f(n) ≥ cg(n)
Ideally, we can prove the same upper and lower bounds: f(n) ∈ Θ(g(n)) iff f(n) ∈ O(g(n)) and f(n) ∈ Ω(g(n))
Usually we use big-O notation when we want to discuss the worst-case or
average case complexity of a particular algorithm, and we use big-Omega
notation when we want to discuss the fundamental difficulty of a particular
problem (e.g. we might show that to guarantee we completely sort n elements
requires Ω(n lg(n)) comparisons between elements - regardless of what
sorting algorithm is used).