|
Advertisement |
Algorithms : The backbone of programming
Posted On July 26, 2010 by Rose Mary filed under Programming
HTML clipboard
Algorithms have many applications in computer science. The subject of algorithms is a powerful lens through which to view the field of computer science in general. Algorithmic ideas do not just provide solutions to well-posed problems; they form the language that lets you cleanly express the underlying questions. An algorithm is any set of detailed instructions which results in a predictable end-state from a known beginning. Algorithms are only as good as the instructions given, however, and the result will be incorrect if the algorithm is not properly defined. This article covers the some hidden facts about the algorithms.
Introduction
Each algorithm is a list of well-defined instructions for completing a task. Starting from an initial state, the instructions describe a computation that proceeds through a well-defined series of successive states, eventually terminating in a final ending state. The transition from one state to the next is not necessarily deterministic; some algorithms, known as randomized algorithms, incorporate randomness.
An algorithm can be given in many ways. For example, it can be written down in English (or French, or any other ``natural'' language). However, we are interested in algorithms which have been precisely specified using an appropriate mathematical formalism--such as a programming language.
In order to learn more about an algorithm, we can ``analyze'' it. By this we mean to study the specification of the algorithm and to draw conclusions about how the implementation of that algorithm--the program--will perform in general. But what can we analyze? We can
· Determine the running time of a program as a function of its inputs;
· Determine the total or maximum memory space needed for program data;
- Determine the total size of the program code;
- Determine whether the program correctly computes the desired result;
- Determine the complexity of the program--e.g., how easy is it to read, understand, and modify; and,
- Determine the robustness of the program--e.g., how well does it deal with unexpected or erroneous inputs?
In this text, we are concerned primarily with the running time. We also consider the memory space needed to execute the program. There are many factors that affect the running time of a program. Among these are the algorithm itself, the input data, and the computer system used to run the program. The performance of a computer is determined by
- The hardware:
- Processor used (type and speed),
- Memory available (cache and ram), and
- Disk available;
- The programming language in which the algorithm is specified;
- The language compiler/interpreter used; and
- The computer operating system software.
Algorithmic is a branch of computer science that consists of designing and analyzing computer algorithms
- The “design” pertain to
- The description of algorithm at an abstract level by means of a pseudo language, and
- Proof of correctness that is, the algorithm solves the given problem in all cases.
- The “analysis” deals with performance evaluation (complexity analysis).
Basic Classification
The algorithms can be divided as follows:
- Recursion or iteration: A recursive algorithm is one that invokes (makes reference to) itself repeatedly until a certain condition matches, which is a method common to functional programming. Iterative algorithms use repetitive constructs like loops and sometimes additional data structures like stacks to solve the given problems. Some problems are naturally suited for one implementation or the other. For example, towers of Hanoi is well understood in recursive implementation. Every recursive version has an equivalent (but possibly more or less complex) iterative version, and vice versa.
- Logical: An algorithm may be viewed as controlled logical deduction. This notion may be expressed as: Algorithm = logic + control. The logic component expresses the axioms that may be used in the computation and the control component determines the way in which deduction is applied to the axioms. This is the basis for the logic programming paradigm. In pure logic programming languages the control component is fixed and algorithms are specified by supplying only the logic component. The appeal of this approach is the elegant semantics: a change in the axioms has a well defined change in the algorithm.
- Serial or parallel or distributed: Algorithms are usually discussed with the assumption that computers execute one instruction of an algorithm at a time. Those computers are sometimes called serial computers. An algorithm designed for such an environment is called a serial algorithm, as opposed to parallel algorithms or distributed algorithms. Parallel algorithms take advantage of computer architectures where several processors can work on a problem at the same time, whereas distributed algorithms utilize multiple machines connected with a network. Parallel or distributed algorithms divide the problem into more symmetrical or asymmetrical subproblems and collect the results back together. The resource consumption in such algorithms is not only processor cycles on each processor but also the communication overhead between the processors. Sorting algorithms can be parallelized efficiently, but their communication overhead is expensive. Iterative algorithms are generally parallelizable. Some problems have no parallel algorithms, and are called inherently serial problems.
- Deterministic or non-deterministic: Deterministic algorithms solve the problem with exact decision at every step of the algorithm whereas non-deterministic algorithms solve problems via guessing although typical guesses are made more accurate through the use of heuristics.
- Exact or approximate: While many algorithms reach an exact solution, approximation algorithms seek an approximation that is close to the true solution. Approximation may use either a deterministic or a random strategy. Such algorithms have practical value for many hard problems.
Classes of Algorithms
There are common classes of algorithms given below:
¨ Divide and Conquer Algorithms: A divide and conquer algorithm is similar to a branch and bound algorithm, except it uses the backtracking method of recurring in tandem with dividing a problem into subproblems.
¨ Greedy Algorithms: Greedy algorithms attempt not only to find a solution, but to find the ideal solution to any given problem.
¨ Dynamic Programming Algorithms: This class remembers older results and attempts to use this to speed the process of finding new results.
¨ Brute Force Algorithms: The brute force approach starts at some random point and iterates through every possibility until it finds the solution.
¨ Randomized Algorithms: This class includes any algorithm that uses a random number at any point during its process.
¨ Branch and Bound Algorithms: Branch and bound algorithms form a tree of subproblems to the primary problem, following each branch until it is either solved or lumped in with another branch.
¨ Simple Recursive Algorithms: This type of algorithm goes for a direct solution immediately, then backtracks to find a simpler solution.
¨ Backtracking Algorithms: Backtracking algorithms test for a solution, if one is found the algorithm has solved, if not it recurs once and tests again, continuing until a solution is found.
Algorithm's Performance
Two important ways to characterize the effectiveness of an algorithm are its space complexity and time complexity. Time complexity of an algorithm concerns determining an expression of the number of steps needed as a function of the problem size. Since the step count measure is somewhat coarse, one does not aim at obtaining an exact step count. Instead, one attempts only to get asymptotic bounds on the step count. Asymptotic analysis makes use of the O (Big Oh) notation. Two other notational constructs used by computer scientists in the analysis of algorithms are Θ (Big Theta) notation and Ω (Big Omega) notation.
The performance evaluation of an algorithm is obtained by totaling the number of occurrences of each operation when running the algorithm. The performance of an algorithm is evaluated as a function of the input size n and is to be considered modulo a multiplicative constant.
The following notations are commonly use notations in performance analysis and used to characterize the complexity of an algorithm.

O-Notation (Upper Bound)
This notation gives an upper bound for a function to within a constant factor. We write f(n) = O(g(n)) if there are positive constants n0 and c such that to the right of n0, the value of f(n) always lies on or below cg(n).
Big O from Code
It is often easy to determine Big O directly from code:
- Elementary operations such as + or = are O(1).
- The Big O for a sequence of statements is the max of the Big O of the statements.
- The Big O for an if statement is the max of the Big O of the test, then statement, and else statement.
- The Big O for a loop is the loop count times the Big O of the contents.
- Big O for a recursive call that cuts the problem size in half is log(n).
Beware the Bermuda Triangle
Some gym teachers punish misbehaving students by making them run 50 yards. However, the way in which they must do it is to run 1 yard, then come back, then run 2 yards, then come back, ...

How many total yards does a student run?
&sumi=1n i = n * (n + 1) / 2 = O(n2)
for ( i = 0; i < n; i++ )
for ( j = 0; j <= i; j++ )
sum += a[i][j];
Rule: If a computation of O(i) is inside a loop where i is growing up to n, the total computation is O(n2).
Big O for Arrays
Arrays are random access, which means that access to any element has the same cost.
- a[i] is O(1).
- Allocating a new array (not initializing the contents) could be essentially O(1). However, since Java is type-safe, the new array must be initialized for its contents type, so creation of a new size n array costs O(n).
If the O(n) cost of array creation is amortized over the n elements, the creation cost per element is O(1).
- Arrays are rigid: they cannot be expanded. One must either:
- Get a new larger array and copy the old contents into it, O(n)
- Make the array over-sized, wasting some space.
- If an array is larger than main memory, the cost to access an element can be much larger due to disk paging.
Average Case vs. Worst Case
Sometimes we will be interested in several performance values:
- Average Case: a random or average input
- Typical Case: it might typically be the case that the input is ``nearly sorted''.
- Worst Case: to guarantee performance even in the worst case of input data.
The worst-case figure gives the strongest guarantee, and it may be easier to analyze.
Ω-Notation (Lower Bound)
This notation gives a lower bound for a function to within a constant factor. We write f(n) = Ω(g(n)) if there are positive constants n0 and c such that to the right of n0, the value of f(n) always lies on or above cg(n).

Algorithm Analysis
The complexity of an algorithm is a function g(n) that gives the upper bound of the number of operation (or running time) performed by an algorithm when the input size is n.
There are two interpretations of upper bound.
Worst-case Complexity
The running time for any given size input will be lower than the upper bound except possibly for some values of the input where the maximum is reached.
Average-case Complexity
The running time for any given size input will be the average number of operations over all problem instances for a given size.
Because, it is quite difficult to estimate the statistical behavior of the input, most of the time we content ourselves to a worst case behavior. Most of the time, the complexity of g(n) is approximated by its family o(f(n)) where f(n) is one of the following functions. n (linear complexity), log n (logarithmic complexity), na where a≥2 (polynomial complexity), an (exponential complexity).
Optimality
Once the complexity of an algorithm has been estimated, the question arises whether this algorithm is optimal. An algorithm for a given problem is optimal if its complexity reaches the lower bound over all the algorithms solving this problem. For example, any algorithm solving “the intersection of n segments” problem will execute at least n2 operations in the worst case even if it does nothing but print the output. This is abbreviated by saying that the problem has Ω(n2) complexity. If one finds an O(n2) algorithm that solve this problem, it will be optimal and of complexity Θ(n2).
Reduction
Another technique for estimating the complexity of a problem is the transformation of problems, also called problem reduction. As an example, suppose we know a lower bound for a problem A, and that we would like to estimate a lower bound for a problem B. If we can transform A into B by a transformation step whose cost is less than that for solving A, then B has the same bound as A.
The Convex hull problem nicely illustrates "reduction" technique. A lower bound of Convex-hull problem established by reducing the sorting problem (complexity: Θ(nlogn)) to the Convex hull problem.
Divide-and-Conquer Algorithms
Divide-and-conquer is a top-down technique for designing algorithms that consists of dividing the problem into smaller subproblems hoping that the solutions of the subproblems are easier to find and then composing the partial solutions into the solution of the original problem.
Divide-and-conquer paradigm consists of following major phases:
* Breaking the problem into several sub-problems that are similar to the original problem but smaller in size,
* Solve the sub-problem recursively (successively and independently), and then
* Combine these solutions to subproblems to create a solution to the original problem.
Greedy Algorithms
Greedy algorithms are simple and straightforward. They are shortsighted in their approach in the sense that they take decisions on the basis of information at hand without worrying about the effect these decisions may have in the future. They are easy to invent, easy to implement and most of the time quite efficient. Many problems cannot be solved correctly by greedy approach. Greedy algorithms are used to solve optimization problems.
Greedy Approach
Greedy Algorithm works by making the decision that seems most promising at any moment; it never reconsiders this decision, whatever situation may arise later.
As an example consider the problem of "Making Change".
Coins available are:
* dollars (100 cents)
* quarters (25 cents)
* dimes (10 cents)
* nickels (5 cents)
* pennies (1 cent)
Problem Make a change of a given amount using the smallest possible number of coins.
Informal Algorithm
· Start with nothing.
· at every stage without passing the given amount.
o add the largest to the coins already chosen.
Formal Algorithm
Make change for n units using the least possible number of coins.
MAKE-CHANGE (n)
C ← {100, 25, 10, 5, 1} // constant.
Sol ← {}; // set that will hold the solution set.
Sum ← 0 sum of item in solution set
WHILE sum not = n
x = largest item in set C such that sum + x ≤ n
IF no such item THEN
RETURN "No Solution"
S ← S {value of x}
sum ← sum + x
RETURN S
Example Make a change for 2.89 (289 cents) here n = 2.89 and the solution contains 2 dollars, 3 quarters, 1 dime and 4 pennies. The algorithm is greedy because at every stage it chooses the largest coin without worrying about the consequences. Moreover, it never changes its mind in the sense that once a coin has been included in the solution set, it remains there.
Characteristics and Features of Problems solved by Greedy Algorithms
To construct the solution in an optimal way. Algorithm maintains two sets. One contains chosen items and the other contains rejected items.
The greedy algorithm consists of four (4) function.
1. A function that checks whether chosen set of items provide a solution.
2. A function that checks the feasibility of a set.
3. The selection function tells which of the candidates is the most promising.
4. An objective function, which does not appear explicitly, gives the value of a solution.
Structure Greedy Algorithm
* Initially the set of chosen items is empty i.e., solution set.
* At each step
o item will be added in a solution set by using selection function.
o IF the set would no longer be feasible
§ reject items under consideration (and is never consider again).
o ELSE IF set is still feasible THEN
§ add the current item.
Dynamic Programming Algorithms
Dynamic programming is a stage-wise search method suitable for optimization problems whose solutions may be viewed as the result of a sequence of decisions. The most attractive property of this strategy is that during the search for a solution it avoids full enumeration by pruning early partial decision solutions that cannot possibly lead to optimal solution. In many practical situations, this strategy hits the optimal solution in a polynomial number of decision steps. However, in the worst case, such a strategy may end up performing full enumeration.
Dynamic programming takes advantage of the duplication and arrange to solve each subproblem only once, saving the solution (in table or something) for later use. The underlying idea of dynamic programming is: avoid calculating the same stuff twice, usually by keeping a table of known results of subproblems. Unlike divide-and-conquer, which solves the subproblems top-down, a dynamic programming is a bottom-up technique.
Bottom-up means
i. Start with the smallest subproblems.
ii. Combining theirs solutions obtain the solutions to subproblems of increasing size.
iii. Until arrive at the solution of the original problem.
Genetic Algorithms
Genetic Algorithms are a way of solving problems by mimicking the same processes nature uses. They use the same combination of selection, recombination and mutation to evolve a solution to a problem.
Before you can use a genetic algorithm to solve a problem, a way must be found of encoding any potential solution to the problem. This could be as a string of real numbers or, as is more typically the case, a binary bit string. I will refer to this bit string from now on as the chromosome. A typical chromosome may look like this:
10010101110101001010011101101110111111101
At the beginning of a run of a genetic algorithm a large population of m chromosomes is created. Each one, when decoded will represent a different solution to the problem at hand. Let's say there are N chromosomes in the initial population. Then, the following steps are repeated until a solution is found :
· Test each chromosome to see how good it is at solving the problem at hand and assign a fitness score accordingly. The fitness score is a measure of how good that chromosome is at solving the problem to hand.
· Select two members from the current population. The chance of being selected is proportional to the chromosomes fitness. Roulette wheel selection is a commonly used method.
· Dependent on the crossover rate crossover the bits from each chosen chromosome at a randomly chosen point.
· Step through the chosen chromosomes bits and flip dependent on the mutation rate.
· Repeat step 2, 3, 4 until a new population of N members has been created.
This article covers only the basic concepts .For core technical aspects & consultancy you can consult the following references or e-mail at : skphind@yahoo.co.uk
References
* Introduction to Algorithms, Thomas H. Cormen, Charles E. Leiserson, Ronal L. Rivest, Clifford Stein, PHI Pub.
* Computer Algorithms by Horowitz E., Sahni S., Rajasekaran S., Galgotia Pub.
* Introduction to the Design and Analysis of Algorithms A Strategic Approach, R.C.T. Lee, S.S. Tseng, R.C. Chang & Y.T.Tsai, TMH Pub.
* A. Aho, J. Hopcroft, J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley Pub.
* G. Brassard, P. Bratley. Fundamentals of Algorithmics. Prentice-Hall Pub.
* T. Cormen, C. Leiserson, R. Rivest. Introduction to Algorithms. McGraw-Hill Pub.
* U. Manber. Introduction to Algorithms -- A Creative Approach. Addison Wessley Pub.
* D. Kozen. The Design and Analysis of Algorithms. Springer-Verlag Pub.






Anil Sharma commented, on July 28, 2010 at 6:40 p.m.:
I HAVE NOT RECEIVED THE JULY COPY. MY SUBSCRIPTION NO IS 5235.