Another concern is to ensure that the solution is a maintainable one (readable, modifiable, etc.).
Yet another major concern is the efficiency of our solution, in terms of time and resource requirements.
It is this third issue that we will address now.
There are two theoretical issues we need to deal with when analyzing problems and solutions: determining the efficiency of a particular algorithm, and determining the inherent difficulty of the problem itself. The latter is sometimes also used to judge how close we are to an optimal algorithm for a particular problem. It is difficult, if not impossible, to compare algorithms across different platforms, and the second issue is extremely difficult to deal with unless we make some simplifying assumptions, which are the next topic of discussion.
For example, if we are writing sorting algorithms, the time typically depends on the number of elements to be sorted.
Thus if we somehow knew the number of operations required to sort a list of N elements was (N^2 + 37N - 6), we would use this formula to describe the efficiency of our algorithm rather than stating the running time for some specific value of N.
Suppose we have two sorting algorithms available: one based on bubblesort that takes 2N^2 + 7 steps, and one based on mergesort that takes 50 N log2(N) steps.
Suppose N is 64: the bubblesort-based algorithm would take 8199 steps, while the mergesort-based algorithm would take 19200 steps, so the bubblesort one would be preferable
Now suppose N is 1024: the bubblesort-based algorithm would take 2097159 steps, while the mergesort-based algorithm would take 512000 steps, so the mergesort one would be preferable
Thus whenever you choose an algorithm based on efficiency analysis it is important to know something about the expected size of your real data sets!
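To make the comparison concrete, here is a minimal sketch (using the hypothetical step counts 2N^2 + 7 and 50 N log2(N) from the example above) that prints both counts for a few values of N, making it easy to see where the crossover occurs:

#include <iostream>
#include <cmath>
using namespace std;

int main() {
   // hypothetical step counts from the example above
   long long sizes[] = {64, 128, 256, 512, 1024};
   for (long long N : sizes) {
      double bubbleSteps = 2.0 * N * N + 7;       // bubblesort-based algorithm
      double mergeSteps  = 50.0 * N * log2(N);    // mergesort-based algorithm
      cout << "N = " << N
           << "   bubblesort: " << bubbleSteps
           << "   mergesort: " << mergeSteps << endl;
   }
   return 0;
}

For these particular formulas the crossover falls between N = 128 and N = 256: below that the bubblesort-based algorithm uses fewer steps, above it the mergesort-based one does.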
For instance, we'll say 3N^2 is the same order as 9N^2, which is the same order as 0.2N^2, which is the same order as 77N^2, which is the same order as 12N^2 + 99, etc.
To describe all the functions which are of the same order, we determine the core underlying function (e.g. f(N) = N^2) and use the notation O(f(N)) to describe the set of all functions in the same order.
Again, the set of functions described by O(N^2) includes 3N^2, and 77N^2, and 66N^2 + 17, etc.
say O(c * N * N) == O(N * N), for any constant c
For example, the order O(N^2) includes every function of the form c*N^2 + b*N + a, as long as a, b, and c are constants (i.e. as long as they do not depend on the value of N).
To see why this simplification is generally accepted, observe the following:
N      | N^2   | N^2 + 120N + 10000
-------+-------+--------------------
100    | 10^4  | 3.200 * 10^4
1000   | 10^6  | 1.130 * 10^6
10000  | 10^8  | 1.012 * 10^8
100000 | 10^10 | 1.001 * 10^10

Note that for large values of N the difference that the constants make (and the difference the "lesser degree" terms make) is trivial when comparing algorithms of different orders.
N     | log2(N) | N log2(N) | N^2       | N^3
------+---------+-----------+-----------+---------------
2     | 1       | 2         | 4         | 8
8     | 3       | 24        | 64        | 512
32    | 5       | 160       | 1024      | 32768
128   | 7       | 896       | 16384     | 2097152
512   | 9       | 4608      | 262144    | 134217728
1024  | 10      | 10240     | 1048576   | 1073741824
16384 | 14      | 229376    | 268435456 | 4.398 * 10^12
Algorithm         | Best-case | Average-case | Worst-case
------------------+-----------+--------------+-----------
Sequential search | O(1)      | O(N)         | O(N)
Binary search     | O(1)      | O(logN)      | O(logN)
------------------+-----------+--------------+-----------
Selection sort    | O(N^2)    | O(N^2)       | O(N^2)
Bubblesort        | O(N)      | O(N^2)       | O(N^2)
Insertion sort    | O(N)      | O(N^2)       | O(N^2)
Quicksort         | O(NlogN)  | O(NlogN)     | O(N^2)

O(1) means at most some constant number of operations are performed, e.g. solving the problem in <= 3 steps.
Some basic guidelines: a simple statement takes a constant number of operations; the cost of a sequence of statements is the sum of the costs of its parts; the cost of a loop is the number of iterations multiplied by the cost of the loop body; and constant factors and lower-order terms are dropped when stating the final order.
Some examples:
for (int m = 0; m < M; m++) {
   cout << "m is " << m;
   for (int n = 0; n < N; n++) {
      sum = sum + n;
      foo = foo * n + m;
   }
   cout << " sum and foo are ";
   cout << sum << " " << foo << endl;
}

The code fragment performs O(M * N) operations: the outer loop executes M times, and each of those iterations does a constant amount of work plus the N constant-time iterations of the inner loop.
for (int i = 6; i < (2*M); i++) {
   foo[i] = 3 * i * i;
   foo[i-6] = i;
}

The loop body is constant time and the loop runs roughly 2M - 6 times, so this fragment is O(M).
for (int i = 0; i < M; i++) {
   cout << i;
}
for (int j = 0; j < N; j++) {
   cin >> arr[j];
}

The two loops run one after the other, performing M + N constant-time iterations in total, so this fragment is O(M + N).
for (int i = 0; i < 600; i++) {
   count = 0;
   while (count < M) {
      cout << i * M << endl;
      count++;
   }
}

The outer loop runs a constant 600 times and the inner loop runs M times on each pass, so the fragment performs about 600 * M operations, i.e. it is O(M).
for (int i = 0; i < (M/2); i++) {
   for (int j = 0; j < N; j += 3) {
      cout << i * j << endl;
   }
}

There are roughly (M/2) * (N/3) constant-time iterations in total, so this fragment is O(M * N).
When we determine the running time of two algorithms as some functions, f(n) and g(n), we will usually want to show that one algorithm is better (or worse, or no better, etc) than another.
We will do this by grouping the functions into orders, where a "larger" order contains all the functions in all smaller orders. Most of the proofs we look at regarding running time involve trying to prove whether a given function is inside or outside a particular order.
If we show one function is in the order of another then we know the first function is "no worse" than the second. If we then go on to show the second function is NOT in the order of the first function then we have effectively proved the first function is superior.
Definition of Big-Oh: A function f(n) is said to be in the Big-Oh of another function g(n) if the following holds: there exist constants c and n0 such that f(n) ≤ cg(n) for all n ≥ n0.
Notation: f(n) ∈ O(g(n))
Statement: f(n) is in the Big-Oh of g(n)
I.e., if we're using f(n) and g(n) as the running time of two algorithms on a data set of size n, then if we look at large enough data sets we find that at worst f(n) is within a constant factor of g(n) -- it may be many many times faster, but at worst it's "not much slower".
We use this definition of asymptotic complexity to group functions in terms of "equivalent efficiency".
See the proof technique notes for more detailed discussion of proofs and proof techniques.
For example, suppose we want to prove that f(n) = 6n^2 is in O(g(n)) where g(n) = n^3. This is equivalent to trying to show that the function f(n) is in the order of g(n), i.e. that there exists some constant C such that f(n) ≤ Cg(n) for all sufficiently large values of n.
In this case, that means showing there exist constants n0 and C such that for all n ≥ n0, 6n^2 ≤ Cn^3.
If we make an inspired guess to pick n0 = 6 and C = 1, then
we have the claim 6n^2 ≤ n^3 for all n ≥ 6,
dividing both sides by n^2 gives 6 ≤ n for all n ≥ 6,
which is of course true.
Thus we have found values for C and n0 that make our statement true, hence f(n) ∈ O(g(n)).
Now suppose we want to prove that f(n) = 2n^2 + 4 is in O(g(n)) where g(n) = n^2. Let C = 4 and n0 = 2 (lucky guess again) and we will again see this gives a true statement.
i.e. our claim is 2n^2 + 4 ≤ 4n^2 for all n ≥ 2, and if we subtract 2n^2 from both sides we get 4 ≤ 2n^2 for all n ≥ 2, which we again see is true.
Thus f(n) ∈ O(g(n))
We can show that f(N) is slower than g(N), i.e. that f(N) grows asymptotically faster than g(N) (so f(N) is not in O(g(N))), using a proof by contradiction:
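For instance, a sketch of such an argument, showing that N^3 is not in O(N^2):

\noindent\textbf{Claim:} $N^3 \notin O(N^2)$.
Suppose, for contradiction, that $N^3 \in O(N^2)$: then there exist constants $C$ and $n_0$ such that
\[
  N^3 \le C\,N^2 \quad \text{for all } N \ge n_0 .
\]
Dividing both sides by $N^2$ gives $N \le C$ for all $N \ge n_0$, which fails for any $N$ larger than both $C$ and $n_0$, a contradiction. Hence no such constants exist, and $N^3 \notin O(N^2)$.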
Equivalence of functions: we say two functions, f(n) and g(n),
are asymptotically equivalent iff
f(n) ∈ O(g(n)) AND g(n) ∈ O(f(n)).
Proving this simply involves doing a proof once in each direction, i.e.
(i) prove f(n) ∈ O(g(n))
(ii) prove g(n) ∈ O(f(n))
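For example, a sketch showing that the (illustrative) pair f(n) = 3n^2 + 2n and g(n) = n^2 are asymptotically equivalent:

(i) $f(n) \in O(g(n))$: with $C = 5$ and $n_0 = 1$,
\[
  3n^2 + 2n \le 3n^2 + 2n^2 = 5n^2 \quad \text{for all } n \ge 1 .
\]
(ii) $g(n) \in O(f(n))$: with $C = 1$ and $n_0 = 1$,
\[
  n^2 \le 3n^2 + 2n \quad \text{for all } n \ge 1 .
\]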
Prove one order is a subset of another: we say O(f(n)) ⊆ O(g(n)) iff every function in O(f(n)) is also in O(g(n)). Stated another way, there are no functions in O(f(n)) that are not in O(g(n)).
For example, prove O(n^2) ⊆ O(n^3):
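One way this proof can go (a sketch following directly from the definition):

Take any $f(n) \in O(n^2)$: then there exist constants $C$ and $n_0$ such that
\[
  f(n) \le C\,n^2 \quad \text{for all } n \ge n_0 .
\]
Since $n^2 \le n^3$ for all $n \ge 1$, the same constant also works for $n^3$:
\[
  f(n) \le C\,n^3 \quad \text{for all } n \ge \max(n_0, 1) ,
\]
so $f(n) \in O(n^3)$. Every member of $O(n^2)$ is therefore also a member of $O(n^3)$.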
Prove two orders are equivalent: to show O(f(n)) == O(g(n)) we can again
do two subproofs:
(i) prove O(f(n)) ⊆ O(g(n))
Prove O(n^k) ⊆ O(n^(k+1)) for k > 1
Prove O(n^k) ≠ O(n^(k+1))
Prove p(N) ∈ O(N^k) where p(N) = a_k N^k + a_(k-1) N^(k-1) + ... + a_1 N + a_0
Note that this is the justification for dropping all "lesser degree" terms and scalars when looking at the order of a polynomial.
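A sketch of the proof for the polynomial case (assuming, as stated, that the coefficients a_i are constants):

For $N \ge 1$ every power $N^i$ with $i \le k$ satisfies $N^i \le N^k$, so
\[
  |p(N)| \le \bigl(|a_k| + |a_{k-1}| + \dots + |a_1| + |a_0|\bigr) N^k
  \quad \text{for all } N \ge 1 .
\]
Taking $C = |a_k| + \dots + |a_0|$ and $n_0 = 1$ satisfies the Big-Oh definition, so $p(N) \in O(N^k)$; this is exactly why the lesser-degree terms and scalar factors can be dropped.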
We often need to analyze the efficiency of recursive solutions to a problem, which can be considerably more difficult than analyzing straight-line or iterative code.
Often the easiest way to express the running time of the overall algorithm is as a recurrence: the running time on a problem of size N written in terms of the running time on the smaller subproblems handled by the recursive calls.
For instance, suppose we have a divide and conquer algorithm where the problem is split in two at each stage, the two subproblems are called recursively, and then the results are glued back together. If the "gluing" takes time h(N), then we could express the overall run time as f(N) = 2 * f(N/2) + h(N).
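As a concrete illustration (a sketch of a generic mergesort-style routine, with a hypothetical function name solve), the two recursive calls and the linear-time merge below correspond directly to the terms of f(N) = 2 * f(N/2) + h(N):

#include <vector>
#include <iostream>
using namespace std;

// Divide-and-conquer sketch whose running time satisfies
//    f(N) = 2 * f(N/2) + h(N)
// where h(N) is the linear-time "gluing" (merge) step.
void solve(vector<int>& data, int low, int high) {   // works on data[low..high-1]
   if (high - low <= 1) return;                      // base case: constant time
   int mid = (low + high) / 2;
   solve(data, low, mid);                            // first recursive call:  f(N/2)
   solve(data, mid, high);                           // second recursive call: f(N/2)
   vector<int> merged;                               // glue step h(N): merge the halves, O(N)
   int i = low, j = mid;
   while (i < mid && j < high)
      merged.push_back(data[i] <= data[j] ? data[i++] : data[j++]);
   while (i < mid)  merged.push_back(data[i++]);
   while (j < high) merged.push_back(data[j++]);
   for (int k = 0; k < (int)merged.size(); k++)
      data[low + k] = merged[k];
}

int main() {
   vector<int> v = {5, 2, 9, 1, 7, 3};
   solve(v, 0, (int)v.size());
   for (int x : v) cout << x << " ";                 // prints 1 2 3 5 7 9
   cout << endl;
   return 0;
}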
While there are many means of carrying out such analysis, for the moment we will just consider one of them: the substitution method.
The substitution method: this method is handy if we have a gut feeling for what the running time should be, but need some way to prove it. Essentially it involves substituting our guess into the formula, and using a proof by induction to show it is correct.
For example: suppose we are analysing an algorithm such as mergesort, where at each level the problem is divided into two equal parts, each of which is solved recursively, and a linear time process is followed to glue the two pieces together again. Problems of size 1 have constant time solutions, i.e. f(1) = 1, but for larger values of N we have the following recurrence: f(N) = 2 * f( ⌊ N/2 ⌋ ) + N
Now, if we make an educated guess that the complexity is O( N lg(N)) (based on the fact that it is a divide and conquer algorithm) we can try to come up with an inductive proof that this is actually the correct answer.
For our problem, we need to prove f(N) ≤ c N lg(N) for some constant c and ∀ N > n0
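One way the inductive step of this substitution proof can go (a sketch, assuming the bound f(⌊N/2⌋) ≤ c ⌊N/2⌋ lg(⌊N/2⌋) already holds for the smaller subproblem):

\begin{align*}
f(N) &= 2\,f(\lfloor N/2 \rfloor) + N
       \le 2c\,\lfloor N/2 \rfloor \lg(\lfloor N/2 \rfloor) + N \\
     &\le 2c\,(N/2)\lg(N/2) + N
       = cN(\lg N - 1) + N \\
     &= cN\lg N - cN + N
       \le cN\lg N \qquad \text{for any } c \ge 1 ,
\end{align*}

so the guess also holds for N itself; the base cases (small N, where lg N is 0) are handled by choosing n0 and c large enough to cover them directly.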
Earlier we defined f(n) ∈ O(g(n)) iff ∃ c, n0 | ∀ n ≥ n0 f(n) ≤ cg(n), thus providing an effective upper bound on the running time of f(n).
We can also provide lower bounds using a similar definition: f(n) ∈ Ω(g(n)) iff ∃ c, n0 | ∀ n ≥ n0 f(n) ≥ cg(n)
Ideally, we can prove the same upper and lower bounds: f(n) ∈ Θ(g(n)) iff f(n) ∈ O(g(n)) and f(n) ∈ Ω(g(n))
Usually we use big-O notation when we want to discuss the worst-case or
average case complexity of a particular algorithm, and we use big-Omega
notation when we want to discuss the fundamental difficulty of a particular
problem (e.g. we might show that to guarantee we completely sort n elements
requires Ω(n lg(n)) comparisons between elements - regardless of what
sorting algorithm is used).