Algorithm Analysis
The performance of a program usually depends on:
- Data structures and algorithms;
- Characteristics of input data;
- Computer hardware, including but not limited to CPU, memory and disk;
- Programming language used to develop the program;
- Compiler/interpreter;
- Software environment; and
- Network communication protocols and connections.
An algorithm is a step-by-step procedure for solving a problem
in a finite amount of time. Time efficiency of an algorithm indicates
how fast an algorithm runs; space efficiency deals with the extra
space the algorithm requires. Other analyses include correctness,
robustness, and maintainability.
The running time of an algorithm typically grows with the input
size. Average-case time efficiency is often difficult to determine,
so we usually focus on worst-case analysis: it is easier to carry
out, and it matters most to applications.
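For example, linear search makes the gap between cases concrete.
Below is a minimal Python sketch (the function name find and the
comments are ours, for illustration):

    def find(items, target):
        """Linear search: return the index of target in items, or -1."""
        for i, value in enumerate(items):
            if value == target:
                return i      # best case: target is first, 1 comparison
        return -1             # worst case: target absent, all n items compared

The best case ends after one comparison, the worst case after n; the
average case depends on where (and whether) the target appears, which is
exactly the input-distribution assumption that makes average-case
analysis hard.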
We can use experiments to measure an algorithm's running time as
a function of the input size (a minimal sketch follows this list).
But there are limitations with this method:
- The algorithm needs to be implemented first, which may be
difficult and costly;
- The measured results may not be comprehensive enough to
capture the characteristics of all types of inputs;
- The results are only valid for the specific hardware and
software environment measured.
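Here is the promised timing sketch (it assumes the find function from
the earlier example; time.perf_counter is Python's high-resolution
wall-clock timer):

    import time

    def measure(func, make_input, sizes):
        # Time func on inputs of increasing size. The numbers hold only
        # for this machine, this interpreter, and the current system load.
        for n in sizes:
            data = make_input(n)
            start = time.perf_counter()
            func(data, -1)        # -1 never occurs in the data: worst case
            print(f"n = {n:>9}: {time.perf_counter() - start:.6f} s")

    measure(find, lambda n: list(range(n)), [10_000, 100_000, 1_000_000])

The third limitation shows up immediately: rerunning the same script on
different hardware, or even on a busy machine, changes every number.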
The preferred way is to analyze the running time of an algorithm
theoretically, characterizing the running time as a function of
the input size, N. This method allows us to evaluate the
time efficiency of an algorithm independently of the hardware
and software environment.
General steps of asymptotic analysis of an algorithm:
- Define the problem size (input size), N;
- Count the number of primitive operations as a function of N;
- Determine the growth rate of the function.
It is obvious that almost all algorithms take longer to run
on larger inputs. We usually use a parameter, N, to indicate
the input size of an algorithm. Sometimes several parameters
combined together are used to define the input size.
Primitive Operations include:
- assigning a value to a variable;
- calling a method;
- performing an arithmetic operation;
- comparing two numbers;
- accessing an array element;
- following a pointer or a reference;
- returning from a method.
Treating all primitive operations the same and ignoring differences
in the hardware and software environment may affect the estimate of
the running time of an algorithm by a constant factor, but
it does not alter the growth rate of the running-time function.
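As an example, count the primitive operations of finding the maximum of
an array (a sketch; the exact tally depends on what one counts as
primitive, but the growth rate does not):

    def array_max(items):                 # input size n = len(items), n >= 1
        current = items[0]                # 1 array access + 1 assignment
        for i in range(1, len(items)):    # about n assignments/tests of i
            if items[i] > current:        # n - 1 accesses and comparisons
                current = items[i]        # at most n - 1 assignments (worst case)
        return current                    # 1 return

Summing the counts gives at most c1*n + c2 primitive operations in the
worst case for some constants c1 and c2, so the running time grows
linearly with n no matter what the constants turn out to be.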
For large input size N, it is the function's order of growth that
matters:
N    | log N | N    | N log N | N^2   | N^3   | 2^N  | N!
-----|-------|------|---------|-------|-------|------|------
10   | 3     | 10   | 33      | 100   | 1K    | 1K   | 3.6M
32   | 5     | 32   | 160     | 1K    | 32K   | 4B   | large
64   | 6     | 64   | 384     | 4K    | 256K  | 16BB | large
10^3 | 10    | 10^3 | 10^4    | 10^6  | 10^9  | -    | -
10^6 | 20    | 10^6 | 2×10^7  | 10^12 | 10^18 | -    | -
(K = thousand, M = million, B = billion, BB = billion billion.)
As a comparison, 16 billion billion seconds is 543 billion years.
The Earth's age is about 4.5 billion years, and the universe's age
is about 14 billion years.
Asymptotic Notations
- The upper bound Ο definition:
Let f(n) and g(n) be functions mapping non-negative integers to
real numbers. f(n) is Ο(g(n)) if there exists a real constant
c > 0 and an integer constant n0 >= 1, such that
f(n) <= cg(n) for every integer n >= n0.
In other words, the growth rate of f(n) is no more than that of g(n).
- Ω and Θ:
If g(n) is Ο(f(n)), then f(n) is Ω(g(n)).
If f(n) is Ο(g(n)) and Ω(g(n)), then
f(n) is Θ(g(n)).
- ο and ω:
f(n) is ο(g(n)) if for every real constant
c > 0, there exists an integer constant n0 >= 1, such that
f(n) <= cg(n) for every integer n >= n0.
In other words, the growth rate of f(n) is less than that of g(n).
If g(n) is ο(f(n)), then f(n) is ω(g(n)).
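Examples: f(n) = 3n + 5 is Ο(n), since taking c = 4 and n0 = 5 gives
3n + 5 <= 4n for every n >= 5; n is then Ω(3n + 5), and because 3n + 5
is both Ο(n) and Ω(n), it is Θ(n). Similarly, n is ο(n^2): for any
c > 0, n <= c*n^2 whenever n >= 1/c; consequently n^2 is ω(n).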
The Ο notation denotes a class of functions; the growth rates
of all functions in the same class are the same. The Ο families
provide us a convenient way to analyze algorithms because
they let us focus on the big picture rather than the details.
Ο Rules
- If f(n) is a polynomial of degree d, then f(n) is Ο(n^d).
I.e., drop the lower-order terms and the constant factors.
- Use the smallest possible class of functions.
For example, 2n is Ο(n^2), but we usually say 2n is Ο(n).
- Use the simplest expression of the class.
Ο(3n^2) should be written as Ο(n^2).
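Applying these rules: f(n) = 5n^3 + 2n^2 + 10 is a polynomial of
degree 3, so f(n) is Ο(n^3); we would not write Ο(5n^3), nor the
technically true but needlessly loose Ο(n^4).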
Rules for Ο Analysis of the running time of an algorithm:
- Sequential Composition:
S_1; S_2; ...; S_k
T_S = Ο(max(T_1, T_2, ..., T_k))
- Iteration:
for i from 1 to k { Exp }
T_I = Ο(max(k, T_Exp * k))
- Conditional Execution:
if cond then Exp_1 else Exp_2
T_C = Ο(max(T_cond, T_Exp1, T_Exp2))
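Putting the three rules together on a small, hypothetical snippet:

    def tally(items):          # n = len(items)
        total = 0              # S1: Ο(1)
        for x in items:        # iteration: n passes over the body
            if x > 0:          # conditional: cond is Ο(1)
                total += x     # then-branch: Ο(1)
            else:
                total -= x     # else-branch: Ο(1)
        return total           # S3: Ο(1)

By the conditional rule the loop body is Ο(1); by the iteration rule
the loop is Ο(max(n, 1 * n)) = Ο(n); by sequential composition the
whole function is Ο(max(1, n, 1)) = Ο(n).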
Ο Analysis for recursive algorithms:
- Write a recurrence equation for the running-time function;
- Solve the recurrence equation;
- Classify the result into a Θ(f(n)) family.
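For example, recursive binary search over a sorted list (a standard
sketch; the names are ours):

    def binary_search(items, target, lo, hi):
        # Search sorted items[lo:hi]. Recurrence: T(n) = T(n/2) + c, T(1) = c'.
        if lo >= hi:
            return -1                    # base case: constant work
        mid = (lo + hi) // 2             # constant work per call
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            return binary_search(items, target, mid + 1, hi)  # size n/2
        else:
            return binary_search(items, target, lo, mid)      # size n/2

Unrolling T(n) = T(n/2) + c gives T(n) = c*log2(n) + c', so binary
search falls in the Θ(log n) family. (Call it as
binary_search(data, target, 0, len(data)).)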
The Master Theorem:
T(n) = aT(n/b) + f(n) if n >= d, with T(n) constant for n < d.
- If there is a small constant ε > 0 such that f(n)
is Ο(n^(log_b a - ε)),
then T(n) is Θ(n^(log_b a)).
- If there is a constant k >= 0 such that f(n) is
Θ(n^(log_b a) log^k n),
then T(n) is Θ(n^(log_b a) log^(k+1) n).
- If there are small constants ε > 0 and δ < 1
such that f(n) is Ω(n^(log_b a + ε))
and a*f(n/b) <= δ*f(n) for n >= d, then
T(n) is Θ(f(n)).
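Worked examples of the three cases:
- Merge sort: T(n) = 2T(n/2) + n. Here n^(log_2 2) = n and
f(n) = n is Θ(n log^0 n), so the second case (k = 0) gives Θ(n log n).
- T(n) = 4T(n/2) + n. Here n^(log_2 4) = n^2 and f(n) = n is
Ο(n^(2 - ε)) with ε = 1, so the first case gives Θ(n^2).
- T(n) = T(n/2) + n. Here n^(log_2 1) = 1, f(n) = n is Ω(n^(0 + ε))
with ε = 1, and a*f(n/b) = n/2 = (1/2)*f(n), so the third case
(δ = 1/2) gives Θ(n).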