For each proof, we begin with a set of known facts and rules and a claim we wish to prove or disprove.
The most common facts we typically begin with (for proofs about complexities and efficiency) are the standard definitions of the complexity classes, e.g. for big-O:

f(N) ∈ O(g(N)) if and only if ∃ c, n0 | f(N) ≤ c g(N) ∀ N ≥ n0

Notation: ∃ means there exists, ∀ means for all, | means such that.

Thus the "standard definition" above would read: f(N) is in O(g(N)) if and only if there exist c and n0 such that, for all N greater than or equal to n0, f(N) is less than or equal to c times g(N). Essentially this is saying that the complexity of f(N) is not more than a constant factor worse than g(N) once we get into large enough values of N.
The biggest decision in trying to develop a proof is (usually) "which proof technique should be used?". Unfortunately, determining in advance which technique will be best is something of an art, requiring a mix of insight and experience.
There are a great many proof techniques, each of which is applicable in a variety of cases. For this course (and this lab) we will focus on just a handful.
In a constructive proof we repeatedly apply the known rules to the known facts to prove the truth of new facts, until eventually we arrive at a proof of the target claim.
While this technique often results in the clearest, most intuitive proofs, it relies on our ability to recognize ways to combine existing facts and rules in a sequence that will (hopefully) produce the desired result.
For instance: if f(N) = 1000N and g(N) = 0.001N², then prove f(N) ∈ O(g(N)).
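One way the constructive argument could proceed (the constants chosen here are just one workable choice): by the definition above we need c and n0 such that 1000N ≤ c (0.001N²) for all N ≥ n0. Trying c = 1, the inequality 1000N ≤ 0.001N² holds exactly when 1000 ≤ 0.001N, i.e. when N ≥ 1000000. So c = 1 and n0 = 1000000 satisfy the definition, proving f(N) ∈ O(g(N)).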
Proof by contradiction attempts to prove that something is true essentially by proving it is impossible for it to be false.
This is done by assuming the opposite of the claim and then proving that some logical impossibility arises.
For instance: prove N³ ∉ O(N²).
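A sketch of how the contradiction could be derived: assume the opposite of the claim, i.e. N³ ∈ O(N²). Then by the definition above, ∃ c, n0 | N³ ≤ c N² ∀ N ≥ n0. Dividing both sides by N² gives N ≤ c for all N ≥ n0. But N grows without bound, so any N larger than both n0 and c violates the inequality. That is a logical impossibility, so our assumption was false and N³ ∉ O(N²).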
While disproving something isn't really a proof technique, it is worthy of some separate discussion.
When asked to disprove something we are trying to prove that the claim is not valid, and often this is simply a matter of finding an instance or fact that contradicts the claim. For instance, if asked to disprove the claim "all dogs are brown" it would suffice to find a single non-brown dog.
In the context of complexity arguments, the claims are more likely to be along the lines of: Disprove the claim that O(f(N)) ⊆ O(g(N)).
The claim states that O(f(N)) is a subset of O(g(N)), i.e. it claims every function in O(f(N)) is also in O(g(N)).
To disprove this claim, it would be sufficient to find a single function in O(f(N)) that is NOT in O(g(N)).
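For a concrete (hypothetical) instance, suppose the claim is that O(N²) ⊆ O(N). The function N² is certainly in O(N²) (take c = 1, n0 = 1), but N² ∉ O(N), which can be shown by contradiction exactly as above. So N² is a function in O(N²) that is not in O(N), and this single counterexample disproves the claim.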
Suppose we wanted to prove that, for all positive integer values of N, N² < (N+1)².
We could attempt to prove this one case at a time, e.g. 1² < 2², then 2² < 3², then 3² < 4², and so on, but since there are infinitely many cases we could never complete the proof this way.
Instead, we can prove a starting point (e.g. N = 1 in the example above) and come up with general proofs that take us from any one case to the next.
E.g. in the example above, we would try to create a proof that, if the theory held for N = 1, ... , k then we could prove it also held for N = (k+1).
If our rule is valid then, given enough time, we could carry out the proof for any specific value N = k+1 in k+1 individual stages:
- start with N = 1 (our base case, for which we have already proven the claim is true)
- apply our rule starting from N = 1, which proves the N = 2 case
- apply our rule starting from N = 2, which proves the N = 3 case
- etc., etc.
This provides a valid mechanism for establishing the truth of the claim for any larger value of N, as the sketch below illustrates.
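To illustrate what such a rule could look like for this claim (a sketch): suppose the claim holds for N = k, i.e. k² < (k+1)². For N = k+1 we need (k+1)² < (k+2)². Expanding, (k+2)² = (k+1)² + 2(k+1) + 1, and since 2(k+1) + 1 > 0 for every positive k, it follows that (k+1)² < (k+2)². (In this simple example the step happens to hold without even using the assumption, but it still demonstrates the case-to-case mechanism.)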
A complete inductive proof is therefore laid out in three parts:
- proofs of one or more specific starting values of n (called the base case(s))
- an assumption that the claim holds for all values of n from the base cases through to a variable value, k (in fact, assuming it for that entire range rather than just for n = k is termed strong induction)
- a proof that, given that assumption, the claim also holds for n = k + 1
Again, these three parts are sufficient for a valid proof because, given sufficient time, we could use them to build a constructive proof of the property for any specific value of n.
Example: prove that 1 + 2 + 3 + ... + n = n(n+1)/2, for all positive integers n.
Our inductive proof would be laid out as follows:
Base case (n = 1): the claim states that 1 = 1(1+1)/2. Checking: 1(1+1)/2 = 1(2)/2 = 2/2 = 1, thus the claim is true for the base case.

Induction hypothesis: assume that, for every n in the range 1..k, the sum of 1 through n is n(n+1)/2.

Inductive step: prove that 1 + 2 + ... + k + (k+1) = (k+1)((k+1)+1)/2.

1 + 2 + ... + k + (k+1)
= (1 + 2 + ... + k) + (k+1)
= k(k+1)/2 + (k+1)        [by the induction hypothesis]
= (k+1)(k/2 + 1)
= (k+1)((k+2)/2)
= (k+1)(k+2)/2
= (k+1)((k+1)+1)/2
which proves our target!
NOTE: in an inductive proof it is very important to point out clearly exactly where and how the induction hypothesis is being applied.
We will often be called upon to evaluate the efficiency of a recursive algorithm.
Sometimes we will be able to use constructive reasoning and proofs to conclusively demonstrate the efficiency, but in many cases such proofs are difficult to arrive at.
There are several other approaches to solving recurrences, but for now we'll focus on a single one: substitution.
In this method, we guess what the correct running time will be, and then use proof by induction to show that our guess was correct.
Needless to say, this only works if we guessed correctly, so it is most useful when we have a "gut feel" for what the running time is. Of course, we can make very pessimistic guesses initially (which should be easy to prove correct) and gradually refine them to be more and more accurate, until eventually we hit a point where we no longer succeed in finding a proof. (Which doesn't mean there isn't a proof; it could mean the proof has just eluded us so far.)
Example: suppose the running time of an algorithm is expressed by the recurrence f(N) = 2 f(N/2) + N (with f(1) being some constant).
This looks like a divide-and-conquer type of efficiency, where the running time for a problem of size N is based on some constant times the running time for half the problem, plus a linear stage to "put the pieces together".
Based on that observation, we guess that the running time is O(N lg(N)).
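Before attempting the proof, a quick numerical sanity check can build confidence in the guess. The sketch below assumes the recurrence f(N) = 2 f(N/2) + N from above, with an illustrative base case f(1) = 1 and an illustrative constant c = 2; it restricts N to powers of two so that N/2 is always an integer.

import math

# Evaluate the assumed recurrence f(N) = 2 f(N/2) + N, with f(1) = 1.
# N is restricted to powers of two so that N/2 is always an integer.
def f(n):
    if n == 1:
        return 1
    return 2 * f(n // 2) + n

# Check the guess f(N) <= c * N * lg(N) for N = 2, 4, 8, ..., 2^20.
c = 2
for k in range(1, 21):
    n = 2 ** k
    assert f(n) <= c * n * math.log2(n), f"guess fails at N = {n}"
print("f(N) <= 2 N lg(N) held for every tested power of two")

Of course, a check over finitely many values is only evidence; the inductive proof is what actually establishes the bound.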
Now we try to use an inductive proof to show that f(N) ∈ O(N lg(N)), i.e. ∃ c, n0 | f(N) ≤ c N lg(N) ∀ N ≥ n0.
.... TO BE COMPLETED ....
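A sketch of how that inductive argument would typically proceed (assuming the recurrence f(N) = 2 f(N/2) + N from above, and glossing over whether N/2 is an integer): assume as the induction hypothesis that f(M) ≤ c M lg(M) for every M smaller than N, in particular for M = N/2. Then

f(N) = 2 f(N/2) + N
     ≤ 2 c (N/2) lg(N/2) + N      [by the induction hypothesis]
     = c N (lg(N) - 1) + N
     = c N lg(N) - c N + N
     ≤ c N lg(N)                   [whenever c ≥ 1]

so any constant c ≥ 1, together with an n0 large enough to cover the base case(s), satisfies the definition.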
Problem: in fact, there is a weakness in this proof, in that it assumes N/2 is an integer value. A similar but more thorough approach is taken in the example at the end of the course notes on substitution proofs.