
Algorithms and Data Structures Part 7: Trees and Graphs (Wikipedia Book 2014)

By Wikipedians

Editors: Reiner Creutzburg, Jenny Knackmuß

PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sun, 22 Dec 2013 19:38:59 UTC

Contents

Trees
    Tree (graph theory)
    Binary tree
    Binary search tree
    Infix notation
    Complete graph
    Polish notation
    Reverse Polish notation
    Self-balancing binary search tree
    AVL tree
    B-tree
    Heap (data structure)
    Fibonacci heap
    Spanning tree

Graphs
    Graph (mathematics)
    Graph theory
    Glossary of graph theory
    Directed graph
    Floyd–Warshall algorithm
    Shortest path problem
    Depth-first search
    Backtracking
    Topological sorting
    Dijkstra's algorithm
    Greedy algorithm
    Travelling salesman problem

References
    Article Sources and Contributors

Trees

Tree (graph theory)

[Infobox: a labeled tree with 6 vertices and 5 edges. Vertices: v; edges: v − 1; chromatic number: 2 if v > 1.]

In mathematics, and more specifically in graph theory, a tree is an undirected graph in which any two vertices are connected by exactly one simple path. In other words, any connected graph without simple cycles is a tree. A forest is a disjoint union of trees. The various kinds of data structures referred to as trees in computer science are equivalent as undirected graphs to trees in graph theory, although such data structures are generally rooted trees, thus in fact being directed graphs, and may also have additional ordering of branches. The term "tree" was coined in 1857 by the British mathematician Arthur Cayley.

Definitions

A tree is an undirected simple graph G that satisfies any of the following equivalent conditions:
• G is connected and has no cycles.
• G has no cycles, and a simple cycle is formed if any edge is added to G.
• G is connected, but is not connected if any single edge is removed from G.
• G is connected and the 3-vertex complete graph is not a minor of G.
• Any two vertices in G can be connected by a unique simple path.

If G has finitely many vertices, say n of them, then the above statements are also equivalent to any of the following conditions:
• G is connected and has n − 1 edges.
• G has no simple cycles and has n − 1 edges.
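The finite-graph conditions above translate directly into code. The following is a minimal Python sketch (the function name `is_tree` is our own, not from the text) that tests the criterion "connected and has n − 1 edges" using a breadth-first search:

```python
from collections import defaultdict, deque

def is_tree(n, edges):
    """Check whether an undirected graph on vertices 0..n-1 is a tree,
    using the condition 'connected with n - 1 edges'."""
    if len(edges) != n - 1:
        return False
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Breadth-first search from vertex 0 to test connectivity.
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == n

print(is_tree(4, [(0, 1), (1, 2), (1, 3)]))  # a star on 4 vertices: True
print(is_tree(4, [(0, 1), (1, 2), (2, 0)]))  # a triangle plus an isolated vertex: False
```

Either of the other finite conditions (e.g. "no simple cycles and n − 1 edges") could be checked instead; they are equivalent.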

As elsewhere in graph theory, the order-zero graph (graph with no vertices) is generally excluded from consideration: while it is vacuously connected as a graph (any two vertices can be connected by a path), it is not 0-connected (or even (−1)-connected) in algebraic topology, unlike non-empty trees, and it violates the "one more node than edges" relation.

A leaf is a vertex of degree 1. An internal vertex is a vertex of degree at least 2. An irreducible (or series-reduced) tree is a tree in which there is no vertex of degree 2.

A forest is an undirected graph, all of whose connected components are trees; in other words, the graph consists of a disjoint union of trees. Equivalently, a forest is an undirected cycle-free graph. As special cases, an empty graph, a single tree, and the discrete graph on a set of vertices (that is, the graph with these vertices that has no edges) are all examples of forests. The term hedge sometimes refers to an ordered sequence of trees.

A polytree (also known as an oriented tree or singly connected network) is a directed acyclic graph (DAG) whose underlying undirected graph is a tree. In other words, if we replace its arcs with edges, we obtain an undirected graph that is both connected and acyclic.

A directed tree is a directed graph which would be a tree if the directions on the edges were ignored. Some authors restrict the phrase to the case where the edges are all directed towards a particular vertex, or all directed away from a particular vertex (see arborescence).

A tree is called a rooted tree if one vertex has been designated the root, in which case the edges have a natural orientation, towards or away from the root. The tree-order is the partial ordering on the vertices of a tree with u ≤ v if and only if the unique path from the root to v passes through u. A rooted tree which is a subgraph of some graph G is a normal tree if the ends of every edge in G are comparable in this tree-order whenever those ends are vertices of the tree (Diestel 2005, p. 15). Rooted trees, often with additional structure such as an ordering of the neighbors at each vertex, are a key data structure in computer science; see tree data structure. In a context where trees are supposed to have a root, a tree without any designated root is called a free tree.

In a rooted tree, the parent of a vertex is the vertex connected to it on the path to the root; every vertex except the root has a unique parent. A child of a vertex v is a vertex of which v is the parent.

A labeled tree is a tree in which each vertex is given a unique label. The vertices of a labeled tree on n vertices are typically given the labels 1, 2, ..., n. A recursive tree is a labeled rooted tree where the vertex labels respect the tree order (i.e., if u < v for two vertices u and v, then the label of u is smaller than the label of v).

An n-ary tree is a rooted tree for which each vertex has at most n children. 2-ary trees are sometimes called binary trees, while 3-ary trees are sometimes called ternary trees.

A terminal vertex of a tree is a vertex of degree 1. In a rooted tree, the leaves are all terminal vertices; additionally, the root, if not a leaf itself, is a terminal vertex if it has precisely one child.

Plane tree

An ordered tree (or plane tree) is a rooted tree for which an ordering is specified for the children of each vertex. It is called a "plane tree" because an ordering of the children is equivalent to an embedding of the tree in the plane, with the root at the top and the children of each vertex lower than that vertex. Given an embedding of a rooted tree in the plane, if one fixes a direction of children, say left to right, then the embedding gives an ordering of the children. Conversely, given an ordered tree, and conventionally drawing the root at the top, the child nodes can be drawn left-to-right, yielding an essentially unique planar embedding. A leaf in a rooted tree is a vertex of degree 1 that is not the root.


Example

The labeled tree shown in the figure above has 6 vertices and 6 − 1 = 5 edges. The unique simple path connecting the vertices 2 and 6 is 2-4-5-6.

Facts

• Every tree is a bipartite graph and a median graph. Every tree with only countably many vertices is a planar graph.
• Every connected graph G admits a spanning tree, which is a tree that contains every vertex of G and whose edges are edges of G.
• Every connected graph with only countably many vertices admits a normal spanning tree (Diestel 2005, Prop. 8.2.4).
• There exist connected graphs with uncountably many vertices which do not admit a normal spanning tree (Diestel 2005, Prop. 8.5.2).
• Every finite tree with n vertices, with n > 1, has at least two terminal vertices (leaves). This minimal number of terminal vertices is characteristic of path graphs; the maximal number, n − 1, is attained by star graphs.
• For any three vertices in a tree, the three paths between them have exactly one vertex in common.

Enumeration

Labeled trees

Cayley's formula states that there are n^(n−2) trees on n labeled vertices. It can be proved by first showing that the number of trees with vertices 1, 2, ..., n, of degrees d1, d2, ..., dn respectively, is the multinomial coefficient

    (n − 2)! / [(d1 − 1)! (d2 − 1)! ⋯ (dn − 1)!]

An alternative proof uses Prüfer sequences. Cayley's formula is the special case of complete graphs in a more general problem of counting spanning trees in an undirected graph, which is addressed by the matrix tree theorem. The similar problem of counting all the subtrees regardless of size has been shown to be #P-complete in the general case (Jerrum (1994)).
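Prüfer sequences give a concrete bijection behind Cayley's formula. The Python sketch below (our own illustration, not from the text) decodes a Prüfer sequence into its unique labeled tree, then brute-counts the trees for n = 4 by decoding all 4^(4−2) = 16 sequences:

```python
from itertools import product

def prufer_to_tree(seq):
    """Decode a Prüfer sequence over labels 1..n (n = len(seq) + 2)
    into the edge list of the unique labeled tree it encodes."""
    n = len(seq) + 2
    # Each label's degree is 1 plus the number of times it appears.
    degree = {v: 1 for v in range(1, n + 1)}
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        # Join v to the smallest-labeled remaining leaf.
        leaf = min(u for u in degree if degree[u] == 1)
        edges.append((leaf, v))
        del degree[leaf]
        degree[v] -= 1
    # Exactly two vertices of degree 1 remain; join them.
    u, w = sorted(degree)
    edges.append((u, w))
    return edges

n = 4
trees = {frozenset(frozenset(e) for e in prufer_to_tree(list(s)))
         for s in product(range(1, n + 1), repeat=n - 2)}
print(len(trees))  # 16 == 4 ** 2, matching Cayley's formula
```

Because the correspondence is a bijection, distinct sequences always decode to distinct trees, which is exactly why the count comes out to n^(n−2).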

Unlabeled trees

Counting the number of unlabeled free trees is a harder problem. No closed formula for the number t(n) of trees with n vertices up to graph isomorphism is known. The first few values of t(n) are:

    1, 1, 1, 1, 2, 3, 6, 11, 23, 47, 106, 235, 551, 1301, 3159, ... (sequence A000055 in OEIS).

Otter (1948) proved the asymptotic estimate

    t(n) ~ C α^n n^(−5/2) as n → ∞,

with the values C and α known to be approximately 0.534949606... and 2.95576528565... (sequence A051491 in OEIS), respectively. (Here, f ~ g means that f/g → 1 as n → ∞.) This is a consequence of his asymptotic estimate for the number r(n) of unlabeled rooted trees with n vertices:

    r(n) ~ D α^n n^(−3/2) as n → ∞,

with D approximately 0.43992401257... and the same α as above (cf. Knuth (1997), Chap. 2.3.4.4 and Flajolet & Sedgewick (2009), Chap. VII.5). The first few values of r(n) are:

    1, 1, 2, 4, 9, 20, 48, 115, 286, 719, 1842, 4766, 12486, 32973, ...

Types of trees A star graph is a tree which consists of a single internal vertex (and n − 1 leaves). In other words, a star graph of order n is a tree of order n with as many leaves as possible. Its diameter is at most 2. A tree with two leaves (the fewest possible) is a path graph; a forest in which all components are isolated nodes and path graphs is called a linear forest. If all vertices in a tree are within distance one of a central path subgraph, then the tree is a caterpillar tree. If all vertices are within distance two of a central path subgraph, then the tree is a lobster.

Notes

• Cayley (1857) "On the theory of the analytical forms called trees," Philosophical Magazine, 4th series, 13: 172–176 (http://books.google.com/books?id=MlEEAAAAYAAJ&pg=PA172). However, it should be mentioned that in 1847, K.G.C. von Staudt, in his book Geometrie der Lage (Nürnberg, Germany: Bauer und Raspe, 1847), presented a proof of Euler's polyhedron theorem which relies on trees on pages 20–21 (http://books.google.com/books?id=MzQAAAAAQAAJ&pg=PA20). Also in 1847, the German physicist Gustav Kirchhoff investigated electrical circuits and found a relation between the number (n) of wires/resistors (branches), the number (m) of junctions (vertices), and the number (μ) of loops (faces) in the circuit. He proved the relation via an argument relying on trees. See: Kirchhoff, G. R. (1847) "Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird" (On the solution of equations to which one is led by the investigation of the linear distribution of galvanic currents), Annalen der Physik und Chemie, 72 (12): 497–508 (http://books.google.com/books?id=gx4AAAAAMAAJ&pg=PA497).



Further reading

• Diestel, Reinhard (2005), Graph Theory (3rd ed.), Berlin, New York: Springer-Verlag, ISBN 978-3-540-26183-4 (http://diestel-graph-theory.com/index.html).
• Flajolet, Philippe; Sedgewick, Robert (2009), Analytic Combinatorics, Cambridge University Press, ISBN 978-0-521-89806-5.
• Hazewinkel, Michiel, ed. (2001), "Tree", Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4 (http://www.encyclopediaofmath.org/index.php?title=p/t094060).
• Knuth, Donald E. (November 14, 1997), The Art of Computer Programming, Volume 1: Fundamental Algorithms (3rd ed.), Addison-Wesley Professional.
• Jerrum, Mark (1994), "Counting trees in a graph is #P-complete", Information Processing Letters 51 (3): 111–116, doi:10.1016/0020-0190(94)00085-9, ISSN 0020-0190.
• Otter, Richard (1948), "The Number of Trees", Annals of Mathematics, Second Series 49 (3): 583–599, doi:10.2307/1969046, JSTOR 1969046.

Binary tree

In computer science, a binary tree is a tree in which each node has at most two child nodes (denoted the left child and the right child). Nodes with children are referred to as parent nodes, and child nodes may contain references to their parents. Following this convention, ancestral relationships can be defined: one node may be, for example, an ancestor, a descendant, or a great-grandchild of another node. The root node is an ancestor of all nodes of the tree, and any node in the tree can be reached from the root node. A tree that has no nodes other than the root node is called a null tree. In a binary tree, the degree of every node is at most two, and a tree with n nodes has exactly n − 1 branches (edges).

A simple binary tree of size 9 and height 3, with a root node whose value is 2. The above tree is unbalanced and not sorted.

Binary trees are used to implement binary search trees and binary heaps, and are applied for efficient searching and sorting. A binary tree is a special case of a k-ary tree with k = 2.

Definitions for rooted trees

• A directed edge refers to the link from the parent to the child (the arrows in the picture of the tree).
• The root node of a tree is the node with no parents. There is at most one root node in a rooted tree.
• A leaf node has no children.
• The depth (or height) of a tree is the length of the path from the root to the deepest node in the tree. A (rooted) tree with only one node (the root) has a depth of zero.
• Siblings are those nodes that share the same parent node.
• A node p is an ancestor of a node q if p exists on the path from the root node to node q. The node q is then termed a descendant of p.
• The size of a node is the number of descendants it has, including itself.
• The in-degree of a node is the number of edges arriving at that node.
• The out-degree of a node is the number of edges leaving that node.
• The root is the only node in a tree with an in-degree of 0.
• All the leaf nodes have an out-degree of 0.

Types of binary trees

• A rooted binary tree is a tree with a root node in which every node has at most two children.
• A full binary tree (sometimes proper binary tree or 2-tree or strictly binary tree) is a tree in which every node other than the leaves has two children; or, perhaps more clearly, every node in the tree has exactly (strictly) 0 or 2 children. Sometimes a full tree is ambiguously defined as a perfect tree (see next). Physicists define a binary tree to mean a full binary tree.
• A perfect binary tree is a full binary tree in which all leaves are at the same depth or same level, and in which every parent has two children. (This is ambiguously also called a complete binary tree (see next).) An example of a perfect binary tree is the ancestry chart of a person to a given depth, as each person has exactly two biological parents (one mother and one father); note that this reverses the usual parent/child tree convention, and these trees go in the opposite direction from usual (root at bottom).
• A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible. A tree is called an almost complete binary tree or nearly complete binary tree if the exception holds, i.e. the last level is not completely filled. This type of tree is used as a specialized data structure called a heap.
• An infinite complete binary tree is a tree with a countably infinite number of levels, in which every node has two children, so that there are 2^d nodes at level d. The set of all nodes is countably infinite, but the set of all infinite paths from the root is uncountable: it has the cardinality of the continuum. These paths correspond by an order-preserving bijection to the points of the Cantor set, or (through the example of the Stern–Brocot tree) to the set of positive irrational numbers.
• A balanced binary tree is commonly defined as a binary tree in which the depth of the left and right subtrees of every node differ by 1 or less, although in general it is a binary tree where no leaf is much farther away from the root than any other leaf. (Different balancing schemes allow different definitions of "much farther".) Binary trees that are balanced according to this definition have a predictable depth (how many nodes are traversed from the root to a leaf, root counting as node 0 and subsequent as 1, 2, ..., depth). This depth is equal to the integer part of log2(n), where n is the number of nodes in the balanced tree. Example 1: a balanced tree with 1 node has depth ⌊log2(1)⌋ = 0. Example 2: a balanced tree with 3 nodes has depth ⌊log2(3)⌋ = 1. Example 3: a balanced tree with 5 nodes has depth ⌊log2(5)⌋ = 2.
• A degenerate tree is a tree where for each parent node, there is only one associated child node. This means that in a performance measurement, the tree will behave like a linked list data structure.

[Figure captions: "Tree rotations are very common internal operations on self-balancing binary trees." and "An ancestry chart which maps to a perfect depth-4 binary tree."]

Binary tree Note that this terminology often varies in the literature, especially with respect to the meaning of "complete" and "full".

Properties of binary trees

• The number of nodes n in a perfect binary tree can be found using the formula n = 2^(h+1) − 1, where h is the depth of the tree.
• The number of nodes n in a binary tree of height h is at least n = h + 1 and at most n = 2^(h+1) − 1, where h is the depth of the tree.
• The number of leaf nodes l in a perfect binary tree can be found using the formula l = 2^h, where h is the depth of the tree.
• The number of nodes n in a perfect binary tree can also be found using the formula n = 2l − 1, where l is the number of leaf nodes in the tree.
• The number of null links (i.e., absent children of nodes) in a complete binary tree of n nodes is n + 1.
• The number of internal nodes (i.e., non-leaf nodes, n − l) in a complete binary tree of n nodes is ⌊n/2⌋.
• For any non-empty binary tree with n0 leaf nodes and n2 nodes of degree 2, n0 = n2 + 1.
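As a quick sanity check, the perfect-tree formulas above can be verified numerically for small heights (a throwaway Python sketch of ours, not from the text):

```python
# Numerically check the perfect-binary-tree formulas for small heights h.
for h in range(5):
    n = 2 ** (h + 1) - 1        # total number of nodes
    leaves = 2 ** h             # number of leaf nodes
    internal = n - leaves       # non-leaf nodes
    assert n == 2 * leaves - 1  # n = 2l - 1
    assert internal == n // 2   # ⌊n/2⌋ internal nodes
# e.g. h = 3 gives n = 15, leaves = 8, internal = 7
```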

Common operations

There are a variety of different operations that can be performed on binary trees. Some are mutator operations, while others simply return useful information about the tree.

Insertion

Nodes can be inserted into binary trees in between two other nodes or added after an external node. In binary trees, a node that is inserted is specified as to which child it is.

External nodes
Say that the external node being added onto is node A. To add a new node after node A, A assigns the new node as one of its children and the new node assigns node A as its parent.

Internal nodes
Insertion on internal nodes is slightly more complex than on external nodes. Say that the internal node is node A and that node B is the child of A. (If the insertion is to insert a right child, then B is the right child of A, and similarly with a left child insertion.) A assigns its child to the new node and the new node assigns its parent to A. Then the new node assigns its child to B and B assigns its parent as the new node.

[Figure: The process of inserting a node into a binary tree.]
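The two insertion cases described above can be sketched in Python; the `Node` class and function names here are our own illustration, not from the text:

```python
class Node:
    """Minimal binary-tree node with a parent pointer."""
    def __init__(self, value):
        self.value = value
        self.left = self.right = self.parent = None

def insert_external(a, new, side):
    # Node A adopts the new node as its (previously absent) child.
    setattr(a, side, new)
    new.parent = a

def insert_internal(a, new, side):
    # B is the existing same-side child of A, displaced downward.
    b = getattr(a, side)
    setattr(a, side, new)   # A's child becomes the new node
    new.parent = a
    setattr(new, side, b)   # the new node adopts B
    if b is not None:
        b.parent = new

root = Node(1)
insert_external(root, Node(2), 'left')
insert_internal(root, Node(3), 'left')   # 3 now sits between 1 and 2
assert root.left.value == 3 and root.left.left.value == 2
```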


Deletion

Deletion is the process whereby a node is removed from the tree. Only certain nodes in a binary tree can be removed unambiguously.

Node with zero or one children
Say that the node to delete is node A. If a node has no children (external node), deletion is accomplished by setting the child of A's parent to null. If it has one child, set the parent of A's child to A's parent and set the child of A's parent to A's child.

[Figure: The process of deleting an internal node in a binary tree.]

Node with two children
In a binary tree, a node with two children cannot be deleted unambiguously. However, in certain binary trees (including binary search trees) these nodes can be deleted, though with a rearrangement of the tree structure.
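The zero-or-one-child deletion described above amounts to splicing the node's child into its parent's slot. A Python sketch (names are ours; it assumes the deleted node is not the root, so it has a parent):

```python
class Node:
    def __init__(self, value, parent=None):
        self.value, self.parent = value, parent
        self.left = self.right = None

def delete(node):
    # The at-most-one child of the deleted node (None if external).
    child = node.left if node.left is not None else node.right
    parent = node.parent
    if child is not None:
        child.parent = parent   # the child's new parent is A's parent
    if parent.left is node:     # A's parent now points past A, at A's child
        parent.left = child
    else:
        parent.right = child

# Chain 1 -> 2 -> 3; deleting node 2 splices 3 directly under 1.
root = Node(1)
a = Node(2, parent=root); root.left = a
b = Node(3, parent=a); a.left = b
delete(a)
assert root.left is b and b.parent is root
```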


Type theory

In type theory, a binary tree with nodes of type A is defined inductively as TA = μα. 1 + A × α × α.

Definition in graph theory

For each binary tree data structure, there is an equivalent rooted binary tree in graph theory. Graph theorists use the following definition: a binary tree is a connected acyclic graph such that the degree of each vertex is no more than three. It can be shown that in any binary tree of two or more nodes, there are exactly two more nodes of degree one than there are of degree three, but there can be any number of nodes of degree two. A rooted binary tree is such a graph that has one of its vertices of degree no more than two singled out as the root. With the root thus chosen, each vertex will have a uniquely defined parent, and up to two children; however, so far there is insufficient information to distinguish a left or right child. If we drop the connectedness requirement, allowing multiple connected components in the graph, we call such a structure a forest.

Another way of defining binary trees is a recursive definition on directed graphs. A binary tree is either:
• a single vertex, or
• a graph formed by taking two binary trees, adding a vertex, and adding an edge directed from the new vertex to the root of each binary tree.

This also does not establish the order of children, but does fix a specific root node.

Combinatorics

In combinatorics one considers the problem of counting the number of full binary trees of a given size. Here the trees have no values attached to their nodes (this would just multiply the number of possible trees by an easily determined factor), and trees are distinguished only by their structure; however, the left and right child of any node are distinguished (if they are different trees, then interchanging them will produce a tree distinct from the original one). The size of the tree is taken to be the number n of internal nodes (those with two children); the other nodes are leaf nodes and there are n + 1 of them. The number of such binary trees of size n is equal to the number of ways of fully parenthesizing a string of n + 1 symbols (representing leaves) separated by n binary operators (representing internal nodes), so as to determine the argument subexpressions of each operator. For instance for n = 3 one has to parenthesize a string like x*x*x*x, which is possible in five ways:

    ((x*x)*x)*x,  (x*(x*x))*x,  (x*x)*(x*x),  x*((x*x)*x),  x*(x*(x*x)).

The correspondence to binary trees should be obvious, and the addition of redundant parentheses (around an already parenthesized expression or around the full expression) is disallowed (or at least not counted as producing a new possibility). There is a unique binary tree of size 0 (consisting of a single leaf), and any other binary tree is characterized by the pair of its left and right children; if these have sizes i and j respectively, the full tree has size i + j + 1. Therefore the number Cn of binary trees of size n has the following recursive description: C0 = 1, and Cn is the sum of Ci·Cj over all pairs with i + j = n − 1, for any positive integer n. It follows that

    Cn = (1/(n + 1)) · C(2n, n)

is the Catalan number of index n (here C(2n, n) denotes the central binomial coefficient).

The above parenthesized strings should not be confused with the set of words of length 2n in the Dyck language, which consist only of parentheses in such a way that they are properly balanced. The number of such strings satisfies the same recursive description (each Dyck word of length 2n is determined by the Dyck subword enclosed by the initial '(' and its matching ')' together with the Dyck subword remaining after that closing parenthesis, whose lengths 2i and 2j satisfy i + j + 1 = n); this number is therefore also the Catalan number Cn. So, continuing the example with n = 3, there are also five Dyck words of length 6:

    ()()(),  ()(()),  (())(),  (()()),  ((())).


These Dyck words do not correspond in an obvious way to binary trees. A bijective correspondence can nevertheless be defined as follows: enclose the Dyck word in an extra pair of parentheses, so that the result can be interpreted as a Lisp list expression (with the empty list () as the only occurring atom); then the dotted-pair expression for that proper list is a fully parenthesized expression (with NIL as symbol and '.' as operator) describing the corresponding binary tree (which is in fact the internal representation of the proper list). The ability to represent binary trees as strings of symbols and parentheses implies that binary trees can represent the elements of a free magma on a singleton set.

Methods for storing binary trees

Binary trees can be constructed from programming language primitives in several ways.

Nodes and references

In a language with records and references, binary trees are typically constructed by having a tree node structure which contains some data and references to its left child and its right child. Sometimes it also contains a reference to its unique parent. If a node has fewer than two children, some of the child pointers may be set to a special null value, or to a special sentinel node. In languages with tagged unions such as ML, a tree node is often a tagged union of two types of nodes, one of which is a 3-tuple of data, left child, and right child, and the other of which is a "leaf" node, which contains no data and functions much like the null value in a language with pointers.

Arrays

Binary trees can also be stored in breadth-first order as an implicit data structure in arrays, and if the tree is a complete binary tree, this method wastes no space. In this compact arrangement, if a node has an index i, its children are found at indices 2i + 1 (for the left child) and 2i + 2 (for the right), while its parent (if any) is found at index ⌊(i − 1)/2⌋ (assuming the root has index zero). This method benefits from more compact storage and better locality of reference, particularly during a preorder traversal. However, it is expensive to grow and wastes space proportional to 2^h − n for a tree of depth h with n nodes. This method of storage is often used for binary heaps; there, no space is wasted because nodes are added in breadth-first order.
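The index arithmetic of this breadth-first array layout can be sketched as follows (zero-based indexing, as in the text; the letters are placeholder payloads):

```python
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2
def parent(i): return (i - 1) // 2     # the root (i = 0) has no parent

# Level-order layout of a complete binary tree:
#         a
#       b   c
#      d e f g
tree = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
assert tree[left(0)] == 'b' and tree[right(0)] == 'c'
assert tree[left(1)] == 'd' and tree[right(2)] == 'g'
assert parent(5) == 2                  # 'f' hangs off 'c'
```

The same arithmetic is what binary heaps use to walk up and down the tree without any pointers.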


Encodings

Succinct encodings

A succinct data structure is one which occupies close to the minimum possible space, as established by information-theoretical lower bounds. The number of different binary trees on n nodes is Cn, the nth Catalan number (assuming we view trees with identical structure as identical). For large n, Cn is about 4^n; thus we need at least log2(4^n) = 2n bits to encode a tree. A succinct binary tree therefore would occupy 2n + o(n) bits.

One simple representation which meets this bound is to visit the nodes of the tree in preorder, outputting "1" for an internal node and "0" for a leaf. If the tree contains data, we can simply simultaneously store it in a consecutive array in preorder. This function accomplishes this:

    function EncodeSuccinct(node n, bitstring structure, array data) {
        if n = nil then
            append 0 to structure;
        else
            append 1 to structure;
            append n.data to data;
            EncodeSuccinct(n.left, structure, data);
            EncodeSuccinct(n.right, structure, data);
    }

The string structure has only 2n + 1 bits in the end, where n is the number of (internal) nodes; we don't even have to store its length. To show that no information is lost, we can convert the output back to the original tree like this:

    function DecodeSuccinct(bitstring structure, array data) {
        remove first bit of structure and put it in b
        if b = 1 then
            create a new node n
            remove first element of data and put it in n.data
            n.left = DecodeSuccinct(structure, data)
            n.right = DecodeSuccinct(structure, data)
            return n
        else
            return nil
    }

More sophisticated succinct representations allow not only compact storage of trees but even useful operations on those trees directly while they're still in their succinct form.
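For illustration, here is a runnable Python transcription of the pseudocode above, with nodes represented as (value, left, right) tuples and nil as None (this representation is our choice, not the text's):

```python
def encode_succinct(node, structure, data):
    """Preorder walk: append 1 (and the payload) for a node, 0 for nil."""
    if node is None:
        structure.append(0)
    else:
        value, left, right = node
        structure.append(1)
        data.append(value)
        encode_succinct(left, structure, data)
        encode_succinct(right, structure, data)

def decode_succinct(structure, data):
    """Rebuild the tree by consuming the two streams in the same order."""
    if next(structure) == 1:
        value = next(data)
        left = decode_succinct(structure, data)
        right = decode_succinct(structure, data)
        return (value, left, right)
    return None

tree = (2, (7, None, (6, None, None)), (5, None, None))
bits, payload = [], []
encode_succinct(tree, bits, payload)
assert bits == [1, 1, 0, 1, 0, 0, 1, 0, 0]   # 2n + 1 = 9 bits for n = 4 nodes
assert decode_succinct(iter(bits), iter(payload)) == tree
```

The round trip confirms that the bitstring plus the preorder payload array carry the full structure, matching the 2n + 1 bound from the text.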

Encoding general trees as binary trees

There is a one-to-one mapping between general ordered trees and binary trees, which in particular is used by Lisp to represent general ordered trees as binary trees. To convert a general ordered tree to a binary tree, we only need to represent the general tree in left-child, right-sibling form. The result will automatically be a binary tree, if viewed from a different perspective. Each node N in the ordered tree corresponds to a node N' in the binary tree; the left child of N' is the node corresponding to the first child of N, and the right child of N' is the node corresponding to N's next sibling, that is, the next node in order among the children of the parent of N. This binary tree representation of a general order tree is sometimes also referred to as a left-child right-sibling binary tree (LCRS tree), or a doubly chained tree, or a Filial-Heir chain.

Binary tree

12

One way of thinking about this is that each node's children are in a linked list, chained together with their right fields, and the node only has a pointer to the beginning or head of this list, through its left field. For example, in the tree on the left, A has the 6 children {B,C,D,E,F,G}. It can be converted into the binary tree on the right.

The binary tree can be thought of as the original tree tilted sideways, with the black left edges representing first child and the blue right edges representing next sibling. The leaves of the tree on the left would be written in Lisp as: (((N O) I J) C D ((P) (Q)) F (M)) which would be implemented in memory as the binary tree on the right, without any letters on those nodes that have a left child.
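The left-child/right-sibling conversion just described can be sketched in Python; the nested-tuple representations, (label, [children...]) for the ordered tree and (label, left, right) for the binary tree, are our own choice for the illustration:

```python
def to_lcrs(label, children):
    """Convert an ordered tree (label, [child trees...]) into an LCRS
    binary tree (label, left, right): left = first child,
    right = next sibling."""
    def convert(siblings):
        if not siblings:
            return None
        lbl, kids = siblings[0]
        # left edge -> first child, right edge -> next sibling
        return (lbl, convert(kids), convert(siblings[1:]))
    return (label, convert(children), None)   # the root has no sibling

# A with ordered children B, C, D; C itself has children E, F.
lcrs = to_lcrs('A', [('B', []), ('C', [('E', []), ('F', [])]), ('D', [])])
assert lcrs == ('A',
                ('B', None,
                 ('C', ('E', None, ('F', None, None)),
                  ('D', None, None))),
                None)
```

Reading the result, B's right pointer chains to its sibling C, then D, exactly the linked-list-of-children picture from the text.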

Notes

• Unitary Symmetry, James D. Louck, World Scientific Pub., 2008.
• Aaron M. Tenenbaum, et al. Data Structures Using C, Prentice Hall, 1990. ISBN 0-13-199746-7.
• Paul E. Black (ed.), entry for data structure in Dictionary of Algorithms and Data Structures. U.S. National Institute of Standards and Technology. 15 December 2004. Online version (http://xw2k.nist.gov/dads//HTML/balancedtree.html). Accessed 2010-12-19.
• http://theory.csail.mit.edu/classes/6.897/spring03/scribe_notes/L12/lecture12.pdf

References

• Donald Knuth. The Art of Computer Programming, Volume 1: Fundamental Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89683-4. Section 2.3, especially subsections 2.3.1–2.3.2 (pp. 318–348).
• Kenneth A. Berman, Jerome L. Paul. Algorithms: Parallel, Sequential and Distributed. Course Technology, 2005. ISBN 0-534-42057-5. Chapter 4 (pp. 113–166).

External links

• Gamedev.net introduction on binary trees (http://www.gamedev.net/page/resources/_/technical/general-programming/trees-part-2-binary-trees-r1433)
• Binary Tree Proof by Induction (http://www.brpreiss.com/books/opus4/html/page355.html)
• Balanced binary search tree on array: how to create, bottom-up, an Ahnentafel list, or a balanced binary search tree on array (http://piergiu.wordpress.com/2010/02/21/balanced-binary-search-tree-on-array/)

Binary search tree


Type: Tree

Time complexity in big O notation:
         Average   Worst case
Space    O(n)      O(n)
Search   O(log n)  O(n)
Insert   O(log n)  O(n)
Delete   O(log n)  O(n)

In computer science, a binary search tree (BST), sometimes also called an ordered or sorted binary tree, is a node-based binary tree data structure which has the following properties:
• The left subtree of a node contains only nodes with keys less than the node's key.
• The right subtree of a node contains only nodes with keys greater than the node's key.
• The left and right subtree each must also be a binary search tree.
• There must be no duplicate nodes.

Generally, the information represented by each node is a record rather than a single data element. However, for sequencing purposes, nodes are compared according to their keys rather than any part of their associated records.

A binary search tree of size 9 and depth 3, with root 8 and leaves 1, 4, 7 and 13

The major advantage of binary search trees over other data structures is that the related sorting algorithms and search algorithms such as in-order traversal can be very efficient. Binary search trees are a fundamental data structure used to construct more abstract data structures such as sets, multisets, and associative arrays.

Binary-search-tree property

Let x be a node in a binary search tree. If y is a node in the left subtree of x, then y.key < x.key. If y is a node in the right subtree of x, then y.key > x.key.
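The property above can be checked mechanically by carrying lower and upper bounds down the tree. This is a minimal sketch; the Node class and the attribute names (key, left, right) are assumptions for illustration.

```python
# Verify the binary-search-tree property by passing down the open interval
# (lo, hi) that every key in the current subtree must lie in.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def is_bst(node, lo=None, hi=None):
    """Return True iff the subtree rooted at node satisfies the BST property."""
    if node is None:
        return True
    if lo is not None and node.key <= lo:
        return False
    if hi is not None and node.key >= hi:
        return False
    # every key in the left subtree must be < node.key,
    # every key in the right subtree must be > node.key
    return (is_bst(node.left, lo, node.key) and
            is_bst(node.right, node.key, hi))
```

Note that checking only each parent against its immediate children is not enough; the interval bounds are what rule out, say, a key of 9 hiding in the left subtree of an 8.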

Operations

Operations, such as find, on a binary search tree require comparisons between nodes. These comparisons are made with calls to a comparator, which is a subroutine that computes the total order (linear order) on any two keys. This comparator can be explicitly or implicitly defined, depending on the language in which the binary search tree was implemented. A common comparator is the less-than function, for example a < b, where a and b are the keys of two nodes a and b in a binary search tree.


Searching

Searching a binary search tree for a specific key can be a recursive or an iterative process.

We begin by examining the root node. If the tree is null, the key we are searching for does not exist in the tree. Otherwise, if the key equals that of the root, the search is successful and we return the node. If the key is less than that of the root, we search the left subtree. Similarly, if the key is greater than that of the root, we search the right subtree. This process is repeated until the key is found or the remaining subtree is null. If the searched key is not found before a null subtree is reached, then the item must not be present in the tree.

This is easily expressed as a recursive algorithm:

function Find-recursive(key, node):  // call initially with node = root
    if node = Null or node.key = key then
        return node
    else if key < node.key then
        return Find-recursive(key, node.left)
    else
        return Find-recursive(key, node.right)

The same algorithm can be implemented iteratively:

function Find(key, root):
    current-node := root
    while current-node is not Null do
        if current-node.key = key then
            return current-node
        else if key < current-node.key then
            current-node := current-node.left
        else
            current-node := current-node.right
    return Null

Because in the worst case this algorithm must search from the root of the tree to the leaf farthest from the root, the search operation takes time proportional to the tree's height (see tree terminology). On average, binary search trees with n nodes have O(log n) height. However, in the worst case, binary search trees can have O(n) height, when the unbalanced tree resembles a linked list (degenerate tree).
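The iterative pseudocode above translates directly into runnable Python. The Node class and attribute names (key, left, right) below are illustrative assumptions.

```python
# Iterative BST search: descend left or right depending on the comparison,
# returning the matching node or None if a null subtree is reached.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def find(root, key):
    current = root
    while current is not None:
        if key == current.key:
            return current           # successful search
        elif key < current.key:
            current = current.left   # key is smaller: go left
        else:
            current = current.right  # key is larger: go right
    return None                      # reached a null subtree: key absent
```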

Insertion

Insertion begins as a search would begin; if the key is not equal to that of the root, we search the left or right subtrees as before. Eventually, we will reach an external node and add the new key-value pair (here encoded as a record 'newNode') as its right or left child, depending on the node's key. In other words, we examine the root and recursively insert the new node to the left subtree if its key is less than that of the root, or the right subtree if its key is greater than or equal to the root.

Here's how a typical binary search tree insertion might be performed in a non-empty tree in C++:

void insert(Node* node, int value) {
    if (value < node->key) {
        if (node->leftChild == NULL)
            node->leftChild = new Node(value);
        else
            insert(node->leftChild, value);
    } else {
        if (node->rightChild == NULL)
            node->rightChild = new Node(value);
        else
            insert(node->rightChild, value);
    }
}

The above destructive procedural variant modifies the tree in place. It uses only constant heap space (and the iterative version uses constant stack space as well), but the prior version of the tree is lost. Alternatively, as in the following Python example, we can reconstruct all ancestors of the inserted node; any reference to the original tree root remains valid, making the tree a persistent data structure:

def binary_tree_insert(node, key, value):
    if node is None:
        return TreeNode(None, key, value, None)
    if key == node.key:
        return TreeNode(node.left, key, value, node.right)
    if key < node.key:
        return TreeNode(binary_tree_insert(node.left, key, value),
                        node.key, node.value, node.right)
    else:
        return TreeNode(node.left, node.key, node.value,
                        binary_tree_insert(node.right, key, value))

The part that is rebuilt uses O(log n) space in the average case and O(n) in the worst case (see big-O notation). In either version, this operation requires time proportional to the height of the tree in the worst case, which is O(log n) time in the average case over all trees, but O(n) time in the worst case.

Another way to explain insertion is that in order to insert a new node in the tree, its key is first compared with that of the root. If its key is less than the root's, it is then compared with the key of the root's left child. If its key is greater, it is compared with the root's right child. This process continues, until the new node is compared with a leaf node, and then it is added as this node's right or left child, depending on its key.

There are other ways of inserting nodes into a binary tree, but this is the only way of inserting nodes at the leaves and at the same time preserving the BST structure.

Deletion

There are three possible cases to consider:
• Deleting a leaf (node with no children): Deleting a leaf is easy, as we can simply remove it from the tree.
• Deleting a node with one child: Remove the node and replace it with its child.
• Deleting a node with two children: Call the node to be deleted N. Do not delete N. Instead, choose either its in-order successor node or its in-order predecessor node, R. Replace the value of N with the value of R, then delete R.

Broadly speaking, nodes with children are harder to delete. As with all binary trees, a node's in-order successor is its right subtree's left-most child, and a node's in-order predecessor is the left subtree's right-most child. In either case, this node will have zero or one children. Delete it according to one of the two simpler cases above.


Deleting a node with two children from a binary search tree. First the rightmost node in the left subtree, the inorder predecessor 6, is identified. Its value is copied into the node being deleted. The inorder predecessor can then be easily deleted because it has at most one child. The same method works symmetrically using the inorder successor labelled 9.

Consistently using the in-order successor or the in-order predecessor for every instance of the two-child case can lead to an unbalanced tree, so some implementations select one or the other at different times.

Runtime analysis: Although this operation does not always traverse the tree down to a leaf, this is always a possibility; thus in the worst case it requires time proportional to the height of the tree. It does not require more even when the node has two children, since it still follows a single path and does not visit any node twice.

def find_min(self):
    # Gets minimum node (leftmost leaf) in a subtree
    current_node = self
    while current_node.left_child:
        current_node = current_node.left_child
    return current_node

def replace_node_in_parent(self, new_value=None):
    if self.parent:
        if self == self.parent.left_child:
            self.parent.left_child = new_value
        else:
            self.parent.right_child = new_value
    if new_value:
        new_value.parent = self.parent

def binary_tree_delete(self, key):
    if key < self.key:
        self.left_child.binary_tree_delete(key)
    elif key > self.key:
        self.right_child.binary_tree_delete(key)
    else:  # delete the key here
        if self.left_child and self.right_child:  # if both children are present
            successor = self.right_child.find_min()
            self.key = successor.key
            successor.binary_tree_delete(successor.key)
        elif self.left_child:  # if the node has only a *left* child
            self.replace_node_in_parent(self.left_child)
        elif self.right_child:  # if the node has only a *right* child
            self.replace_node_in_parent(self.right_child)
        else:  # this node has no children
            self.replace_node_in_parent(None)

Traversal

Once the binary search tree has been created, its elements can be retrieved in-order by recursively traversing the left subtree of the root node, accessing the node itself, then recursively traversing the right subtree of the node, continuing this pattern with each node in the tree as it's recursively accessed. As with all binary trees, one may conduct a pre-order traversal or a post-order traversal, but neither is likely to be useful for binary search trees. An in-order traversal of a binary search tree will always result in a sorted list of node items (numbers, strings or other comparable items).

The code for in-order traversal in Python is given below. It will call callback for every node in the tree.

def traverse_binary_tree(node, callback):
    if node is None:
        return
    traverse_binary_tree(node.leftChild, callback)
    callback(node.value)
    traverse_binary_tree(node.rightChild, callback)

Traversal requires O(n) time, since it must visit every node; as any traversal must visit all n nodes, this is asymptotically optimal.

Sort

A binary search tree can be used to implement a simple but efficient sorting algorithm. Similar to heapsort, we insert all the values we wish to sort into a new ordered data structure, in this case a binary search tree, and then traverse it in order, building our result:

def build_binary_tree(values):
    tree = None
    for v in values:
        tree = binary_tree_insert(tree, v)
    return tree

def get_inorder_traversal(root):
    '''
    Returns a list containing all the values in the tree, starting at *root*.
    Traverses the tree in-order (leftChild, root, rightChild).
    '''
    result = []
    traverse_binary_tree(root, lambda element: result.append(element))
    return result

The worst-case time of build_binary_tree is O(n²): if you feed it a sorted list of values, it chains them into a linked list with no left subtrees. For example, build_binary_tree([1, 2, 3, 4, 5]) yields the tree (1 (2 (3 (4 (5))))).

There are several schemes for overcoming this flaw with simple binary trees; the most common is the self-balancing binary search tree. If this same procedure is done using such a tree, the overall worst-case time is O(n log n), which is asymptotically optimal for a comparison sort. In practice, the poor cache performance and added overhead in time and space for a tree-based sort (particularly for node allocation) make it inferior to other asymptotically optimal sorts such as heapsort for static list sorting. On the other hand, it is one of the most efficient methods of incremental sorting, adding items to a list over time while keeping the list sorted at all times.
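The tree sort described in this section can be packaged as one self-contained sketch. The section's binary_tree_insert stores key-value pairs; the simplified key-only variant below is an illustrative assumption, not the text's exact code.

```python
# Self-contained tree sort sketch: build a BST, then read it back in-order.
# Duplicate keys are sent to the right subtree, as in the section's C++ insert.

class TreeNode:
    def __init__(self, left, key, right):
        self.left, self.key, self.right = left, key, right

def insert(node, key):
    """Persistent insert: rebuilds the path from the root to the new leaf."""
    if node is None:
        return TreeNode(None, key, None)
    if key < node.key:
        return TreeNode(insert(node.left, key), node.key, node.right)
    else:
        return TreeNode(node.left, node.key, insert(node.right, key))

def tree_sort(values):
    tree = None
    for v in values:              # build: O(n log n) average, O(n^2) worst case
        tree = insert(tree, v)
    result = []
    def inorder(node):            # in-order traversal yields sorted output
        if node is not None:
            inorder(node.left)
            result.append(node.key)
            inorder(node.right)
    inorder(tree)
    return result
```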

Types

There are many types of binary search trees. AVL trees and red-black trees are both forms of self-balancing binary search trees. A splay tree is a binary search tree that automatically moves frequently accessed elements nearer to the root. In a treap (tree heap), each node also holds a (randomly chosen) priority and the parent node has higher priority than its children. Tango trees are trees optimized for fast searches.

Two other titles describing binary search trees are that of a complete and a degenerate tree. A complete tree is a tree with n levels, where for each level d ≤ n − 1, the number of existing nodes at level d is equal to 2^d.

Glossary of graph theory

Connectivity

By convention, the complete graph Kn has connectivity n − 1; and a disconnected graph has connectivity 0. In network theory, a giant component is a connected subgraph that contains a majority of the entire graph's nodes.

A bridge, or cut edge or isthmus, is an edge whose removal disconnects a graph. (For example, all the edges in a tree are bridges.) A cut vertex is an analogous vertex (see above). A disconnecting set is a set of edges whose removal increases the number of components. An edge cut is the set of all edges which have one vertex in some proper vertex subset S and the other vertex in V(G)\S. Edges of K3 form a disconnecting set but not an edge cut. Any two edges of K3 form a minimal disconnecting set as well as an edge cut. An edge cut is necessarily a disconnecting set; and a minimal disconnecting set of a nonempty graph is necessarily an edge cut. A bond is a minimal (but not necessarily minimum), nonempty set of edges whose removal disconnects a graph.

A graph is k-edge-connected if any subgraph formed by removing any k − 1 edges is still connected. The edge connectivity κ′(G) of a graph G is the minimum number of edges needed to disconnect G. One well-known result is that κ(G) ≤ κ′(G) ≤ δ(G).

A component is a maximally connected subgraph. A block is either a maximally 2-connected subgraph, a bridge (together with its vertices), or an isolated vertex. A biconnected component is a 2-connected component.
An articulation point (also known as a separating vertex) of a graph is a vertex whose removal from the graph increases its number of connected components. A biconnected component can be defined as a subgraph induced by a maximal set of nodes that has no separating vertex.

Glossary of graph theory

Distance

The distance dG(u, v) between two (not necessarily distinct) vertices u and v in a graph G is the length of a shortest path between them. The subscript G is usually dropped when there is no danger of confusion. When u and v are identical, their distance is 0. When u and v are unreachable from each other, their distance is defined to be infinity ∞.

The eccentricity εG(v) of a vertex v in a graph G is the maximum distance from v to any other vertex. The diameter diam(G) of a graph G is the maximum eccentricity over all vertices in a graph; and the radius rad(G), the minimum. When G has two or more components, diam(G) and rad(G) are defined to be infinity ∞. Trivially, diam(G) ≤ 2 rad(G). Vertices with maximum eccentricity are called peripheral vertices. Vertices of minimum eccentricity form the center. A tree has at most two center vertices.

The Wiener index of a vertex v in a graph G, denoted by WG(v), is the sum of distances between v and all others. The Wiener index of a graph G, denoted by W(G), is the sum of distances over all pairs of vertices. An undirected graph's Wiener polynomial is defined to be Σ q^d(u,v) over all unordered pairs of vertices u and v. The Wiener index and Wiener polynomial are of particular interest to mathematical chemists.

The k-th power Gk of a graph G is a supergraph formed by adding an edge between all pairs of vertices of G with distance at most k. A second power of a graph is also called a square. A k-spanner is a spanning subgraph S in which every two vertices are at most k times as far apart in S as in G. The number k is the dilation. k-spanners are used for studying geometric network optimization.
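For an unweighted graph, the distances above can be computed by breadth-first search, from which eccentricity, diameter and radius follow. A minimal sketch, assuming the graph is given as an adjacency dict (an assumed representation, not from the text):

```python
# Distance, eccentricity, diameter and radius of an unweighted graph via BFS.

from collections import deque
import math

def distances_from(graph, source):
    """BFS distances from source; unreachable vertices get infinity."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if dist[w] == math.inf:      # first time we reach w
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricity(graph, v):
    return max(distances_from(graph, v).values())

def diameter(graph):
    return max(eccentricity(graph, v) for v in graph)

def radius(graph):
    return min(eccentricity(graph, v) for v in graph)
```

On the path a–b–c–d, for instance, the diameter is 3 and the radius is 2, consistent with diam(G) ≤ 2 rad(G).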

Genus

A crossing is a pair of intersecting edges. A graph is embeddable on a surface if its vertices and edges can be arranged on it without any crossing. The genus of a graph is the lowest genus of any surface on which the graph can embed.

A planar graph is one which can be drawn on the (Euclidean) plane without any crossing; and a plane graph, one which is drawn in such fashion. In other words, a planar graph is a graph of genus 0. The example graph is planar; the complete graph on n vertices, for n > 4, is not planar. Also, a tree is necessarily a planar graph.

When a graph is drawn without any crossing, any cycle that surrounds a region without any edges reaching from the cycle into the region forms a face. Two faces on a plane graph are adjacent if they share a common edge. A dual, or planar dual when the context needs to be clarified, G* of a plane graph G is a graph whose vertices represent the faces, including any outer face, of G and are adjacent in G* if and only if their corresponding faces are adjacent in G. The dual of a planar graph is always a planar pseudograph (e.g. consider the dual of a triangle). In the familiar case of a 3-connected simple planar graph G (isomorphic to a convex polyhedron P), the dual G* is also a 3-connected simple planar graph (and isomorphic to the dual polyhedron P*).

Furthermore, since we can establish a sense of "inside" and "outside" on a plane, we can identify an "outermost" region that contains the entire graph if the graph does not cover the entire plane. Such an outermost region is called the outer face. An outerplanar graph is one which can be drawn in the planar fashion such that its vertices are all adjacent to the outer face; and an outerplane graph, one which is drawn in such fashion.

The minimum number of crossings that must appear when a graph is drawn on a plane is called the crossing number. The minimum number of planar graphs needed to cover a graph is the thickness of the graph.


Weighted graphs and networks

A weighted graph associates a label (weight) with every edge in the graph. Weights are usually real numbers. They may be restricted to rational numbers or integers. Certain algorithms require further restrictions on weights; for instance, Dijkstra's algorithm works properly only for nonnegative weights. The weight of a path or the weight of a tree in a weighted graph is the sum of the weights of the selected edges. Sometimes a non-edge is labeled by a special weight representing infinity. Sometimes the word cost is used instead of weight. When stated without any qualification, a graph is always assumed to be unweighted.

In some writing on graph theory the term network is a synonym for a weighted graph. A network may be directed or undirected, and it may contain special vertices (nodes), such as a source or a sink. The classical network problems include:
• minimum cost spanning tree,
• shortest paths,
• maximal flow (and the max-flow min-cut theorem).
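The shortest-paths problem mentioned above is the classic setting for Dijkstra's algorithm. A minimal heap-based sketch, assuming the weighted graph is an adjacency dict mapping each vertex to a list of (neighbour, weight) pairs (an assumed representation); it relies on the weights being nonnegative:

```python
# Dijkstra's algorithm with a binary heap: repeatedly settle the unvisited
# vertex with the smallest tentative path weight. Correct only when all
# edge weights are nonnegative.

import heapq

def dijkstra(graph, source):
    """Return a dict of shortest-path weights from source to reachable vertices."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                      # stale heap entry; u already settled
        for v, w in graph[u]:
            nd = d + w                    # weight of the path through u
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```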

Direction

A directed arc, or directed edge, is an ordered pair of endvertices that can be represented graphically as an arrow drawn between the endvertices. In such an ordered pair the first vertex is called the initial vertex or tail; the second one is called the terminal vertex or head (because it appears at the arrow head). An undirected edge disregards any sense of direction and treats both endvertices interchangeably. A loop in a digraph, however, keeps a sense of direction and treats both head and tail identically. A set of arcs are multiple, or parallel, if they share the same head and the same tail. A pair of arcs are anti-parallel if one's head/tail is the other's tail/head.

A digraph, or directed graph, is analogous to an undirected graph except that it contains only arcs. A mixed graph may contain both directed and undirected edges; it generalizes both directed and undirected graphs. When stated without any qualification, a graph is almost always assumed to be undirected. A digraph is called simple if it has no loops and at most one arc between any pair of vertices. When stated without any qualification, a digraph is usually assumed to be simple. A quiver is a directed graph which is specifically allowed, but not required, to have loops and more than one arc between any pair of vertices.

In a digraph Γ, we distinguish the out-degree dΓ+(v), the number of edges leaving a vertex v, and the in-degree dΓ−(v), the number of edges entering a vertex v. The degree dΓ(v) of a vertex v is equal to the sum of its out- and in-degrees. When the context is clear, the subscript Γ can be dropped. Maximum and minimum out-degrees are denoted by Δ+(Γ) and δ+(Γ); and maximum and minimum in-degrees, Δ−(Γ) and δ−(Γ).

An out-neighborhood, or successor set, N+Γ(v) of a vertex v is the set of heads of arcs going from v. Likewise, an in-neighborhood, or predecessor set, N−Γ(v) of a vertex v is the set of tails of arcs going into v.
A source is a vertex with 0 in-degree; and a sink, 0 out-degree. A vertex v dominates another vertex u if there is an arc from v to u. A vertex subset S is out-dominating if every vertex not in S is dominated by some vertex in S; and in-dominating if every vertex in S is dominated by some vertex not in S.

A kernel in a (possibly directed) graph G is an independent set S such that every vertex in V(G) \ S dominates some vertex in S. In undirected graphs, kernels are maximal independent sets. A digraph is kernel perfect if every induced sub-digraph has a kernel.

An Eulerian digraph is a digraph with equal in- and out-degrees at every vertex. The zweieck of an undirected edge uv is the pair of diedges (u, v) and (v, u), which together form a simple dicircuit.

An orientation is an assignment of directions to the edges of an undirected or partially directed graph. When stated without any qualification, it is usually assumed that all undirected edges are replaced by a directed one in an orientation. Also, the underlying graph is usually assumed to be undirected and simple.

A tournament is a digraph in which each pair of vertices is connected by exactly one arc. In other words, it is an oriented complete graph.

A directed path, or just a path when the context is clear, is an oriented simple path such that all arcs go in the same direction, meaning all internal vertices have in- and out-degree 1. A vertex v is reachable from another vertex u if there is a directed path that starts from u and ends at v. Note that in general the condition that u is reachable from v does not imply that v is also reachable from u. If v is reachable from u, then u is a predecessor of v and v is a successor of u. If there is an arc from u to v, then u is a direct predecessor of v, and v is a direct successor of u.

A digraph is strongly connected if every vertex is reachable from every other following the directions of the arcs. By contrast, a digraph is weakly connected if its underlying undirected graph is connected. A weakly connected graph can be thought of as a digraph in which every vertex is "reachable" from every other, but not necessarily following the directions of the arcs. A strong orientation is an orientation that produces a strongly connected digraph.

A directed cycle, or just a cycle when the context is clear, is an oriented simple cycle such that all arcs go in the same direction, meaning all vertices have in- and out-degree 1. A digraph is acyclic if it does not contain any directed cycle. A finite, acyclic digraph with no isolated vertices necessarily contains at least one source and at least one sink.

An arborescence, or out-tree or branching, is an oriented tree in which all vertices are reachable from a single vertex. Likewise, an in-tree is an oriented tree in which a single vertex is reachable from every other one.

Directed acyclic graphs

The partial order structure of directed acyclic graphs (or DAGs) gives them their own terminology. If there is a directed edge from u to v, then we say u is a parent of v and v is a child of u. If there is a directed path from u to v, we say u is an ancestor of v and v is a descendant of u.

The moral graph of a DAG is the undirected graph created by adding an (undirected) edge between all parents of the same node (sometimes called marrying), and then replacing all directed edges by undirected edges. A DAG is perfect if, for each node, the set of parents is complete (i.e. no new edges need to be added when forming the moral graph).

Colouring

Vertices in graphs can be given colours to identify or label them. Although they may actually be rendered in diagrams in different colours, working mathematicians generally pencil in numbers or letters (usually numbers) to represent the colours.

Given a graph G = (V, E), a k-colouring of G is a map ϕ : V → {1, ..., k} with the property that (u, v) ∈ E ⇒ ϕ(u) ≠ ϕ(v). In other words, every vertex is assigned a colour with the condition that adjacent vertices cannot be assigned the same colour. The chromatic number χ(G) is the smallest k for which G has a k-colouring. Given a graph and a colouring, the colour classes of the graph are the sets of vertices given the same colour.

This graph is an example of a 4-critical graph. Its chromatic number is 4 but all of its proper subgraphs have a chromatic number less than 4. This graph is also planar

A graph is called k-critical if its chromatic number is k but all of its proper subgraphs have chromatic number less than k. An odd cycle is 3-critical, and the complete graph on k vertices is k-critical.
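A proper colouring (though not necessarily one using χ(G) colours) can be produced by a simple greedy procedure: visit the vertices in some order and give each the smallest colour unused by its already-coloured neighbours. A minimal sketch, assuming an adjacency-dict representation:

```python
# Greedy vertex colouring: each vertex gets the smallest colour not already
# used by its coloured neighbours. Always proper; the number of colours used
# depends on the vertex order and may exceed the chromatic number.

def greedy_colouring(graph):
    """graph: adjacency dict {v: iterable of neighbours}. Returns {v: colour}."""
    colour = {}
    for v in graph:                       # visiting order affects colour count
        used = {colour[w] for w in graph[v] if w in colour}
        c = 0
        while c in used:                  # smallest colour absent among neighbours
            c += 1
        colour[v] = c
    return colour
```

On the triangle K3 this necessarily uses three colours, matching χ(K3) = 3; on a path it uses at most two.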


Various

A graph invariant is a property of a graph G, usually a number or a polynomial, that depends only on the isomorphism class of G. Examples are the order, genus, chromatic number, and chromatic polynomial of a graph.

References
 Bondy, J.A.; Murty, U.S.R., Graph Theory, p. 298.
 Bollobás, Béla, Modern Graph Theory, p. 298.

• Bollobás, Béla (1998). Modern Graph Theory. New York: Springer-Verlag. ISBN 0-387-98488-7. [Packed with advanced topics followed by a historical overview at the end of each chapter.]
• Diestel, Reinhard (2005). Graph Theory (http://www.math.uni-hamburg.de/home/diestel/books/graph.theory/) (3rd ed.). Graduate Texts in Mathematics, vol. 173, Springer-Verlag. ISBN 3-540-26182-6. [Standard textbook, most basic material and some deeper results, exercises of various difficulty and notes at the end of each chapter; known for being quasi error-free.]
• West, Douglas B. (2001). Introduction to Graph Theory (2nd ed.). Upper Saddle River: Prentice Hall. ISBN 0-13-014400-2. [Tons of illustrations, references, and exercises. The most complete introductory guide to the subject.]
• Weisstein, Eric W., "Graph (http://mathworld.wolfram.com/Graph.html)", MathWorld.
• Zaslavsky, Thomas. Glossary of signed and gain graphs and allied areas. Electronic Journal of Combinatorics, Dynamic Surveys in Combinatorics, # DS 8. http://www.combinatorics.org/Surveys/

Directed graph

In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph, or set of nodes connected by edges, where the edges have a direction associated with them. In formal terms, a digraph is a pair G = (V, A) (sometimes G = (V, E)) of:
• a set V, whose elements are called vertices or nodes,
• a set A of ordered pairs of vertices, called arcs, directed edges, or arrows (and sometimes simply edges, with the corresponding set named E instead of A).

It differs from an ordinary or undirected graph, in that the latter is defined in terms of unordered pairs of vertices, which are usually called edges.

A directed graph.

Sometimes a digraph is called a simple digraph to distinguish it from a directed multigraph, in which the arcs constitute a multiset, rather than a set, of ordered pairs of vertices. Also, in a simple digraph loops are disallowed. (A loop is an arc that pairs a vertex to itself.) On the other hand, some texts allow loops, multiple arcs, or both in a digraph.


Basic terminology

An arc (x, y) is considered to be directed from x to y; y is called the head and x is called the tail of the arc; y is said to be a direct successor of x, and x is said to be a direct predecessor of y. If a path made up of one or more successive arcs leads from x to y, then y is said to be a successor of x, and x is said to be a predecessor of y. The arc (y, x) is called the arc (x, y) inverted.

An orientation of a simple undirected graph is obtained by assigning a direction to each edge. Any directed graph constructed this way is called an "oriented graph". A directed graph is an oriented simple graph if and only if it has neither self-loops nor 2-cycles.

A weighted digraph is a digraph with weights assigned to its arcs, similar to a weighted graph. In the context of graph theory a digraph with weighted edges is called a network.

The adjacency matrix of a digraph (with loops and multiple arcs) is the integer-valued matrix with rows and columns corresponding to the nodes, where a nondiagonal entry is the number of arcs from node i to node j, and the diagonal entry is the number of loops at node i. The adjacency matrix of a digraph is unique up to identical permutation of rows and columns. Another matrix representation for a digraph is its incidence matrix.

See Glossary of graph theory#Direction for more definitions.

Indegree and outdegree

For a node, the number of head endpoints adjacent to a node is called the indegree of the node, and the number of tail endpoints adjacent to a node is its outdegree (called "branching factor" in trees). The indegree is denoted deg−(v) and the outdegree deg+(v). A vertex with deg−(v) = 0 is called a source, as it is the origin of each of its incident edges. Similarly, a vertex with deg+(v) = 0 is called a sink.

The degree sum formula states that, for a directed graph,

Σ_{v ∈ V} deg+(v) = Σ_{v ∈ V} deg−(v) = |A|.

If for every node v ∈ V, deg+(v) = deg−(v), the graph is called a balanced digraph.

A digraph with vertices labeled (indegree, outdegree)
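The degree sum formula is easy to check computationally. A small sketch, assuming the digraph is given as a list of (tail, head) arc pairs (an assumed representation):

```python
# Compute in- and out-degrees of a digraph given as (tail, head) arc pairs,
# and let the degree sum formula fall out: each arc contributes exactly one
# out-degree (at its tail) and one in-degree (at its head).

from collections import Counter

def degrees(arcs):
    outdeg, indeg = Counter(), Counter()
    for tail, head in arcs:
        outdeg[tail] += 1  # the arc leaves its tail
        indeg[head] += 1   # the arc enters its head
    return indeg, outdeg
```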

Digraph connectivity

A digraph G is called weakly connected (or just connected) if the undirected underlying graph obtained by replacing all directed edges of G with undirected edges is a connected graph. A digraph is strongly connected or strong if it contains a directed path from u to v and a directed path from v to u for every pair of vertices u, v. The strong components are the maximal strongly connected subgraphs.

Classes of digraphs

A directed graph G is called symmetric if, for every arc that belongs to G, the corresponding reversed arc also belongs to G. A symmetric, loopless directed graph is equivalent to an undirected graph with the edges replaced by pairs of inverse arcs; thus the number of edges is half the number of arcs.


An acyclic directed graph, acyclic digraph, or directed acyclic graph is a directed graph with no directed cycles. Special cases of acyclic directed graphs include multitrees (graphs in which no two directed paths from a single starting node meet back at the same ending node), oriented trees or polytrees (digraphs formed by orienting the edges of undirected acyclic graphs), and rooted trees (oriented trees in which all edges of the underlying undirected tree are directed away from the roots).


A simple acyclic directed graph
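Whether a digraph is acyclic can be tested with Kahn's algorithm (a standard technique, named here as an addition to the text): repeatedly remove sources; a directed cycle exists exactly when some vertices can never be removed. A sketch, assuming an adjacency-dict representation:

```python
# Acyclicity test via Kahn's algorithm: peel off sources (in-degree 0)
# one at a time. If every vertex is eventually removed, the digraph is
# acyclic; vertices on a directed cycle never reach in-degree 0.

from collections import deque

def is_acyclic(graph):
    """graph: adjacency dict {u: list of successors}."""
    indeg = {v: 0 for v in graph}
    for u in graph:
        for v in graph[u]:
            indeg[v] += 1
    queue = deque(v for v, d in indeg.items() if d == 0)  # current sources
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in graph[u]:
            indeg[v] -= 1
            if indeg[v] == 0:            # v has become a source
                queue.append(v)
    return removed == len(graph)
```

The order in which vertices are removed is a topological sorting of the DAG, which is why the same loop underlies topological-sort routines.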

A tournament is an oriented graph obtained by choosing a direction for each edge in an undirected complete graph. In the theory of Lie groups, a quiver Q is a directed graph serving as the domain of, and thus characterizing the shape of, a representation V defined as a functor, specifically an object of the functor category FinVctKF(Q) where F(Q) is the free category on Q consisting of paths in Q and FinVctK is the category of finite dimensional vector spaces over a field K. Representations of a quiver label its vertices with vector spaces and its edges (and hence paths) compatibly with linear transformations between them, and transform via natural transformations.

A tournament on 4 vertices


References • Bang-Jensen, Jørgen; Gutin, Gregory (2000), Digraphs: Theory, Algorithms and Applications (http://www.cs. rhul.ac.uk/books/dbook/), Springer, ISBN 1-85233-268-9 (the corrected 1st edition of 2007 is now freely available on the authors' site; the 2nd edition appeared in 2009 ISBN 1-84800-997-6). • Bondy, John Adrian; Murty, U. S. R. (1976), Graph Theory with Applications (http://www.ecp6.jussieu.fr/ pageperso/bondy/books/gtwa/gtwa.html), North-Holland, ISBN 0-444-19451-7. • Diestel, Reinhard (2005), Graph Theory (http://www.math.uni-hamburg.de/home/diestel/books/graph. theory/) (3rd ed.), Springer, ISBN 3-540-26182-6 (the electronic 3rd edition is freely available on author's site). • Harary, Frank; Norman, Robert Z.; Cartwright, Dorwin (1965), Structural Models: An Introduction to the Theory of Directed Graphs, New York: Wiley. • Number of directed graphs (or digraphs) with n nodes. (http://oeis.org/A000273)


Adjacency matrix In mathematics and computer science, an adjacency matrix is a means of representing which vertices (or nodes) of a graph are adjacent to which other vertices. Another matrix representation for a graph is the incidence matrix. Specifically, the adjacency matrix of a finite graph G on n vertices is the n × n matrix where the non-diagonal entry aij is the number of edges from vertex i to vertex j, and the diagonal entry aii, depending on the convention, is either once or twice the number of edges (loops) from vertex i to itself. Undirected graphs often use the latter convention of counting loops twice, whereas directed graphs typically use the former convention. There exists a unique adjacency matrix for each isomorphism class of graphs (up to permuting rows and columns), and it is not the adjacency matrix of any other isomorphism class of graphs. In the special case of a finite simple graph, the adjacency matrix is a (0,1)-matrix with zeros on its diagonal. If the graph is undirected, the adjacency matrix is symmetric. The relationship between a graph and the eigenvalues and eigenvectors of its adjacency matrix is studied in spectral graph theory.
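The definition above can be made concrete with a short sketch in plain Python (the function name, vertex numbering, and edge-list input format are ours, not from the article):

```python
def adjacency_matrix(n, edges, directed=False):
    """Build the n x n adjacency matrix of a graph given as an edge list.

    Vertices are numbered 0..n-1. For an undirected graph each edge
    {i, j} contributes to both a[i][j] and a[j][i], so the result is
    symmetric. (Loops are counted once here; the convention that counts
    each loop twice on the diagonal would add 2 instead.)
    """
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] += 1
        if not directed and i != j:
            a[j][i] += 1
    return a

# A 4-cycle: the resulting matrix is symmetric with zeros on the diagonal.
m = adjacency_matrix(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
```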

Examples

The convention followed here is that an adjacent edge counts 1 in the matrix for an undirected graph.

• Labeled graph: coordinates are 1–6.
• The Nauru graph: coordinates are 0–23; white fields are zeros, colored fields are ones.
• Directed Cayley graph of S4: as the graph is directed, the matrix is not symmetric.

• The adjacency matrix of a complete graph contains all ones except along the diagonal where there are only zeros. • The adjacency matrix of an empty graph is a zero matrix.


The adjacency matrix A of a bipartite graph whose two parts have r and s vertices has the form

    A = [ 0_(r×r)   B       ]
        [ B^T       0_(s×s) ]

where B is an r × s matrix and 0 represents the zero matrix. Clearly, the matrix B uniquely represents the bipartite graph; it is sometimes called the biadjacency matrix. Formally, let G = (U, V, E) be a bipartite graph with parts U = {u_1, ..., u_r} and V = {v_1, ..., v_s}. The biadjacency matrix is the r × s 0–1 matrix B in which b_(i,j) = 1 iff (u_i, v_j) ∈ E. If G is a bipartite multigraph or weighted graph, then the elements b_(i,j) are taken to be the number of edges between the vertices or the weight of the edge (u_i, v_j), respectively.

Properties

The adjacency matrix of an undirected simple graph is symmetric, and therefore has a complete set of real eigenvalues and an orthogonal eigenvector basis. The set of eigenvalues of a graph is the spectrum of the graph.

Suppose two directed or undirected graphs G1 and G2 with adjacency matrices A1 and A2 are given. G1 and G2 are isomorphic if and only if there exists a permutation matrix P such that

    P A1 P^(-1) = A2.

In particular, A1 and A2 are similar and therefore have the same minimal polynomial, characteristic polynomial, eigenvalues, determinant and trace. These can therefore serve as isomorphism invariants of graphs. However, two graphs may possess the same set of eigenvalues but not be isomorphic.

If A is the adjacency matrix of the directed or undirected graph G, then the matrix Aⁿ (i.e., the matrix product of n copies of A) has an interesting interpretation: the entry in row i and column j gives the number of (directed or undirected) walks of length n from vertex i to vertex j. This implies, for example, that the number of triangles in an undirected graph G is exactly the trace of A³ divided by 6. The main diagonal of every adjacency matrix corresponding to a graph without loops has all zero entries. Note that here 'loops' means, for example, A→A, not 'cycles' such as A→B→A.

For d-regular graphs, d is an eigenvalue of A for the all-ones vector (1, ..., 1), and G is connected if and only if the multiplicity of the eigenvalue d is 1. It can be shown that −d is also an eigenvalue of A if G is a connected bipartite graph. The above are results of the Perron–Frobenius theorem.
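The walk-counting interpretation of matrix powers can be checked directly. A small pure-Python sketch (the matrix-multiplication helper and the choice of example graph are ours):

```python
def mat_mult(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Adjacency matrix of the triangle graph K3.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]

A2 = mat_mult(A, A)
A3 = mat_mult(A2, A)

# (A^3)[i][j] counts walks of length 3 from i to j, so the trace of A^3
# divided by 6 gives the number of triangles (here: exactly one).
triangles = sum(A3[i][i] for i in range(3)) // 6
```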

Variations An (a, b, c)-adjacency matrix A of a simple graph has Aij = a if ij is an edge, b if it is not, and c on the diagonal. The Seidel adjacency matrix is a (−1,1,0)-adjacency matrix. This matrix is used in studying strongly regular graphs and two-graphs. The distance matrix has in position (i,j) the distance between vertices vi and vj . The distance is the length of a shortest path connecting the vertices. Unless lengths of edges are explicitly provided, the length of a path is the number of edges in it. The distance matrix resembles a high power of the adjacency matrix, but instead of telling only whether or not two vertices are connected (i.e., the connection matrix, which contains boolean values), it gives the exact distance between them.


Data structures

For use as a data structure, the main alternative to the adjacency matrix is the adjacency list. Because each entry in the adjacency matrix requires only one bit, it can be represented in a very compact way, occupying only |V|²/8 bytes of contiguous space, where |V| is the number of vertices. Besides avoiding wasted space, this compactness encourages locality of reference.

However, for a sparse graph, adjacency lists require less storage space, because they do not waste any space to represent edges that are not present. Using a naïve array implementation on a 32-bit computer, an adjacency list for an undirected graph requires about 8|E| bytes of storage, where |E| is the number of edges. Noting that an undirected simple graph can have at most |V|²/2 edges, allowing loops, we can let d = |E|/(|V|²/2) denote the density of the graph. Then 8|E| > |V|²/8, i.e. the adjacency list representation occupies more space, precisely when d > 1/64. Thus a graph must be sparse indeed to justify an adjacency list representation.

Besides the space tradeoff, the different data structures also facilitate different operations. Finding all vertices adjacent to a given vertex in an adjacency list is as simple as reading the list. With an adjacency matrix, an entire row must instead be scanned, which takes O(n) time. Whether there is an edge between two given vertices can be determined at once with an adjacency matrix, while requiring time proportional to the minimum degree of the two vertices with the adjacency list.
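The operational trade-off described above can be sketched as follows (illustrative Python; the example graph and helper names are ours):

```python
# The same 4-vertex path graph (edges 0-1, 1-2, 2-3) in both
# representations.
matrix = [[0, 1, 0, 0],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [0, 0, 1, 0]]
adj_list = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

def neighbors_matrix(m, v):
    """O(n): an entire row of the matrix must be scanned."""
    return [u for u, bit in enumerate(m[v]) if bit]

def has_edge_matrix(m, u, v):
    """O(1): a single cell lookup answers the edge query."""
    return m[u][v] == 1

def has_edge_list(adj, u, v):
    """O(deg(u)): the neighbor list of u must be scanned."""
    return v in adj[u]
```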

References  Godsil, Chris; Royle, Gordon Algebraic Graph Theory, Springer (2001), ISBN 0-387-95241-1, p.164

Further reading • Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Section 22.1: Representations of graphs". Introduction to Algorithms (Second ed.). MIT Press and McGraw-Hill. pp. 527–531. ISBN 0-262-03293-7. • Godsil, Chris; Royle, Gordon (2001). Algebraic Graph Theory. New York: Springer. ISBN 0-387-95241-1.

External links

• Fluffschack (http://www.x2d.org/java/projects/fluffschack.jnlp) — an educational Java web start game demonstrating the relationship between adjacency matrices and graphs.
• Open Data Structures - Section 12.1 - AdjacencyMatrix: Representing a Graph by a Matrix (http://opendatastructures.org/versions/edition-0.1e/ods-java/12_1_AdjacencyMatrix_Repres.html)
• McKay, Brendan. "Description of graph6 and sparse6 encodings" (http://cs.anu.edu.au/~bdm/data/formats.txt)
• Café math: Adjacency Matrices of Graphs (http://www.cafemath.fr/mathblog/article.php?page=GoodWillHunting.php): application of adjacency matrices to the computation of generating series of walks.


Floyd–Warshall algorithm

Class: All-pairs shortest path problem (for weighted graphs)
Data structure: Graph
Worst case performance: Θ(|V|³)
Best case performance: Θ(|V|³)
Worst case space complexity: Θ(|V|²)


In computer science, the Floyd–Warshall algorithm (also known as Floyd's algorithm, the Roy–Warshall algorithm, the Roy–Floyd algorithm, or the WFI algorithm) is a graph analysis algorithm for finding shortest paths in a weighted graph with positive or negative edge weights (but with no negative cycles, see below) and also for finding the transitive closure of a relation. A single execution of the algorithm will find the lengths (summed weights) of the shortest paths between all pairs of vertices, though it does not return details of the paths themselves. The Floyd–Warshall algorithm was published in its currently recognized form by Robert Floyd in 1962. However, it is essentially the same as algorithms previously published by Bernard Roy in 1959 and also by Stephen Warshall in 1962 for finding the transitive closure of a graph. The modern formulation of Warshall's algorithm as three nested for-loops was first described by Peter Ingerman, also in 1962. The algorithm is an example of dynamic programming.

Algorithm

The Floyd–Warshall algorithm compares all possible paths through the graph between each pair of vertices. It is able to do this with Θ(|V|³) comparisons in a graph. This is remarkable considering that there may be up to Ω(|V|²) edges in the graph, and every combination of edges is tested. It does so by incrementally improving an estimate on the shortest path between two vertices, until the estimate is optimal.

Consider a graph G with vertices V numbered 1 through N. Further consider a function shortestPath(i, j, k) that returns the shortest possible path from i to j using vertices only from the set {1, 2, ..., k} as intermediate points along the way. Now, given this function, our goal is to find the shortest path from each i to each j using only vertices 1 to k + 1.

For each of these pairs of vertices, the true shortest path could be either (1) a path that only uses vertices in the set {1, ..., k} or (2) a path that goes from i to k + 1 and then from k + 1 to j. We know that the best path from i to j that only uses vertices 1 through k is defined by shortestPath(i, j, k), and it is clear that if there were a better path from i to k + 1 to j, then the length of this path would be the concatenation of the shortest path from i to k + 1 (using vertices in {1, ..., k}) and the shortest path from k + 1 to j (also using vertices in {1, ..., k}).

If w(i, j) is the weight of the edge between vertices i and j, we can define shortestPath(i, j, k + 1) in terms of the following recursive formula: the base case is

    shortestPath(i, j, 0) = w(i, j)

and the recursive case is

    shortestPath(i, j, k + 1) = min(shortestPath(i, j, k), shortestPath(i, k + 1, k) + shortestPath(k + 1, j, k)).

This formula is the heart of the Floyd–Warshall algorithm. The algorithm works by first computing shortestPath(i, j, k) for all (i, j) pairs for k = 1, then k = 2, etc. This process continues until k = n, and we have found the shortest path for all (i, j) pairs using any intermediate vertices. Pseudocode for this basic version follows:

let dist be a |V| × |V| array of minimum distances initialized to ∞ (infinity)
for each vertex v
    dist[v][v] ← 0
for each edge (u,v)
    dist[u][v] ← w(u,v)  // the weight of the edge (u,v)
for k from 1 to |V|
    for i from 1 to |V|
        for j from 1 to |V|
            if dist[i][j] > dist[i][k] + dist[k][j]
                dist[i][j] ← dist[i][k] + dist[k][j]
            end if
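The pseudocode translates almost directly into Python. In this illustrative sketch (the function name, 0-indexed vertices, and edge-list input format are our choices) the example graph reuses the shape of the four-vertex example discussed in this article, renumbered 0–3:

```python
from math import inf

def floyd_warshall(n, edges):
    """All-pairs shortest path lengths, following the pseudocode above.

    n is the number of vertices (numbered 0..n-1); edges is a list of
    (u, v, weight) triples describing a directed graph.
    """
    dist = [[inf] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
    for u, v, w in edges:
        dist[u][v] = w
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][j] > dist[i][k] + dist[k][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Article vertices 1..4 renumbered 0..3: edges 2→1 (4), 1→3 (-2),
# 3→4 (2), 4→2 (-1), 2→3 (3).
d = floyd_warshall(4, [(1, 0, 4), (0, 2, -2), (2, 3, 2), (3, 1, -1), (1, 2, 3)])
```

Here d[3][2] comes out as 1, matching the path 4→2→1→3 of total weight −1 + 4 − 2 from the example.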

Example The algorithm above is executed on the graph on the left below:

Prior to the first iteration of the outer loop, labeled k=0 above, the only known paths correspond to the single edges in the graph. At k=1, paths that go through the vertex 1 are found: in particular, the path 2→1→3 is found, replacing the path 2→3 which has fewer edges but is longer. At k=2, paths going through the vertices {1,2} are found. The red and blue boxes show how the path 4→2→1→3 is assembled from the two known paths 4→2 and 2→1→3 encountered in previous iterations, with 2 in the intersection. The path 4→2→3 is not considered, because 2→1→3 is the shortest path encountered so far from 2 to 3. At k=3, paths going through the vertices {1,2,3} are found. Finally, at k=4, all shortest paths are found.

Behavior with negative cycles A negative cycle is a cycle whose edges sum to a negative value. There is no shortest path between any pair of vertices i, j which form part of a negative cycle, because path-lengths from i to j can be arbitrarily small (negative). For numerically meaningful output, the Floyd–Warshall algorithm assumes that there are no negative cycles. Nevertheless, if there are negative cycles, the Floyd–Warshall algorithm can be used to detect them. The intuition is as follows: • The Floyd–Warshall algorithm iteratively revises path lengths between all pairs of vertices (i, j), including where i = j; • Initially, the length of the path (i,i) is zero; • A path {(i,k), (k,i)} can only improve upon this if it has length less than zero, i.e. denotes a negative cycle; • Thus, after the algorithm, (i,i) will be negative if there exists a negative-length path from i back to i. Hence, to detect negative cycles using the Floyd–Warshall algorithm, one can inspect the diagonal of the path matrix, and the presence of a negative number indicates that the graph contains at least one negative cycle. Obviously, in an undirected graph a negative edge creates a negative cycle (i.e., a closed walk) involving its incident vertices.
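The diagonal check just described can be sketched as a self-contained Python function (names and input format are ours; the core triple loop is the same as in the pseudocode above):

```python
from math import inf

def has_negative_cycle(n, edges):
    """Run Floyd–Warshall, then inspect the diagonal: a negative
    dist[i][i] means some negative-length cycle passes through i."""
    dist = [[inf] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the lightest parallel edge
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return any(dist[i][i] < 0 for i in range(n))

# 0→1 (1), 1→2 (-3), 2→0 (1) is a directed cycle of total weight -1.
```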


Path reconstruction The Floyd–Warshall algorithm typically only provides the lengths of the paths between all pairs of vertices. With simple modifications, it is possible to create a method to reconstruct the actual path between any two endpoint vertices. While one may be inclined to store the actual path from each vertex to each other vertex, this is not necessary, and in fact, is very costly in terms of memory. For each vertex, one need only store the information about the highest index intermediate vertex one must pass through if one wishes to arrive at any given vertex. Therefore, information to reconstruct all paths can be stored in a single |V| × |V| matrix next where next[i][j] represents the highest index vertex one must travel through if one intends to take the shortest path from i to j. To implement this, when a new shortest path is found between two vertices, the matrix containing the paths is updated. The next matrix is updated along with the matrix of minimum distances dist, so at completion both tables are complete and accurate, and any entries which are infinite in the dist table will be null in the next table. The path from i to j is the path from i to next[i][j], followed by the path from next[i][j] to j. These two shorter paths are determined recursively. This modified algorithm runs with the same time and space complexity as the unmodified algorithm. 
let dist be a |V| × |V| array of minimum distances initialized to ∞ (infinity)
let next be a |V| × |V| array of vertex indices initialized to null

procedure FloydWarshallWithPathReconstruction()
    for each vertex v
        dist[v][v] ← 0
    for each edge (u,v)
        dist[u][v] ← w(u,v)  // the weight of the edge (u,v)
    for k from 1 to |V|
        for i from 1 to |V|
            for j from 1 to |V|
                if dist[i][k] + dist[k][j] < dist[i][j] then
                    dist[i][j] ← dist[i][k] + dist[k][j]
                    next[i][j] ← k

function Path(i, j)
    if dist[i][j] = ∞ then
        return "no path"
    var intermediate ← next[i][j]
    if intermediate = null then
        return " "  // the direct edge from i to j gives the shortest path
    else
        return Path(i, intermediate) + intermediate + Path(intermediate, j)
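An illustrative Python rendering of the modified algorithm (the names floyd_warshall_paths and path are ours; here path returns the list of intermediate vertices rather than a concatenated string):

```python
from math import inf

def floyd_warshall_paths(n, edges):
    """Floyd–Warshall with the next matrix of the pseudocode above:
    nxt[i][j] holds the highest-index intermediate vertex on the
    shortest i-j path, or None if the direct edge is shortest."""
    dist = [[inf] * n for _ in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
    for u, v, w in edges:
        dist[u][v] = w
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = k
    return dist, nxt

def path(i, j, dist, nxt):
    """Recursively expand intermediates: returns the list of vertices
    strictly between i and j, or None if no path exists."""
    if dist[i][j] == inf:
        return None
    k = nxt[i][j]
    if k is None:
        return []  # the direct edge from i to j gives the shortest path
    return path(i, k, dist, nxt) + [k] + path(k, j, dist, nxt)

# Three vertices in a line plus a costly shortcut: the shortest 0→2
# route goes through vertex 1.
dist, nxt = floyd_warshall_paths(3, [(0, 1, 1), (1, 2, 1), (0, 2, 5)])
```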


Analysis

Let n be |V|, the number of vertices. To find all n² of shortestPath(i, j, k) (for all i and j) from those of shortestPath(i, j, k−1) requires 2n² operations. Since we begin with shortestPath(i, j, 0) = edgeCost(i, j) and compute the sequence of n matrices shortestPath(i, j, 1), shortestPath(i, j, 2), ..., shortestPath(i, j, n), the total number of operations used is n · 2n² = 2n³. Therefore, the complexity of the algorithm is Θ(n³).

Applications and generalizations

The Floyd–Warshall algorithm can be used to solve the following problems, among others:
• Shortest paths in directed graphs (Floyd's algorithm).
• Transitive closure of directed graphs (Warshall's algorithm). In Warshall's original formulation of the algorithm, the graph is unweighted and represented by a Boolean adjacency matrix. Then the addition operation is replaced by logical conjunction (AND) and the minimum operation by logical disjunction (OR).
• Finding a regular expression denoting the regular language accepted by a finite automaton (Kleene's algorithm).
• Inversion of real matrices (Gauss–Jordan algorithm).
• Optimal routing. In this application one is interested in finding the path with the maximum flow between two vertices. This means that, rather than taking minima as in the pseudocode above, one instead takes maxima. The edge weights represent fixed constraints on flow. Path weights represent bottlenecks; so the addition operation above is replaced by the minimum operation.
• Fast computation of Pathfinder networks.
• Widest paths / maximum bandwidth paths.

Implementations

Implementations are available for many programming languages.

• For C++, in the boost::graph library
• For C#, at QuickGraph
• For Java, in the Apache Commons Graph library
• For JavaScript, at Turb0JS
• For MATLAB, in the Matlab_bgl package
• For Perl, in the Graph module
• For PHP, and for PL/pgSQL, at Microshell
• For Python, in the NetworkX library
• For R, in package e1071
• For Ruby, as a script

References

• http://www.boost.org/libs/graph/doc/
• http://www.codeplex.com/quickgraph
• http://commons.apache.org/sandbox/commons-graph/
• http://www.turb0js.com/a/Floyd%E2%80%93Warshall_algorithm
• http://www.mathworks.com/matlabcentral/fileexchange/10922
• https://metacpan.org/module/Graph
• http://www.microshell.com/programming/computing-degrees-of-separation-in-social-networking/2/
• http://www.microshell.com/programming/floyd-warshal-algorithm-in-postgresql-plpgsql/3/
• http://cran.r-project.org/web/packages/e1071/index.html
• https://github.com/chollier/ruby-floyd


• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.
  • Section 26.2, "The Floyd–Warshall algorithm", pp. 558–565;
  • Section 26.4, "A general framework for solving path problems in directed graphs", pp. 570–576.
• Floyd, Robert W. (June 1962). "Algorithm 97: Shortest Path". Communications of the ACM 5 (6): 345. doi:10.1145/367766.368168.
• Ingerman, Peter Z. (November 1962). "Algorithm 141: Path Matrix". Communications of the ACM 5 (11): 556. doi:10.1145/368996.369016.
• Kleene, S. C. (1956). "Representation of events in nerve nets and finite automata". In C. E. Shannon and J. McCarthy. Automata Studies. Princeton University Press. pp. 3–42.
• Warshall, Stephen (January 1962). "A theorem on Boolean matrices". Journal of the ACM 9 (1): 11–12. doi:10.1145/321105.321107.
• Rosen, Kenneth H. (2003). Discrete Mathematics and Its Applications (5th ed.). Addison Wesley. ISBN 0-07-119881-4.
• Roy, Bernard (1959). "Transitivité et connexité". C. R. Acad. Sci. Paris 249: 216–218.

External links

• Interactive animation of the Floyd–Warshall algorithm (http://www.pms.informatik.uni-muenchen.de/lehre/compgeometry/Gosper/shortest_path/shortest_path.html#visualization)
• The Floyd–Warshall algorithm in C#, as part of QuickGraph (http://quickgraph.codeplex.com/)
• Visualization of Floyd's algorithm (http://students.ceid.upatras.gr/~papagel/english/java_docs/allmin.htm)

Shortest path problem In graph theory, the shortest path problem is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized. This is analogous to the problem of finding the shortest path between two intersections on a road map: the graph's vertices correspond to intersections and the edges correspond to road segments, each weighted by the length of its road segment.

Definition

(6, 4, 5, 1) and (6, 4, 3, 2, 1) are both paths between vertices 6 and 1.

The shortest path problem can be defined for graphs whether undirected, directed, or mixed. It is defined here for undirected graphs; for directed graphs the definition of path requires that consecutive vertices be connected by an appropriate directed edge.


Two vertices are adjacent when they are both incident to a common edge. A path in an undirected graph is a sequence of vertices P = (v_1, v_2, ..., v_n) such that v_i is adjacent to v_(i+1) for 1 ≤ i < n. Such a path P is called a path of length n − 1 from v_1 to v_n. (The v_i are variables; their numbering here relates to their position in the sequence and need not relate to any canonical labeling of the vertices.)

Let e_(i,i+1) be the edge incident to both v_i and v_(i+1). Given a real-valued weight function f : E → R and an undirected (simple) graph G, the shortest path from v to v′ is the path P = (v_1, v_2, ..., v_n) (where v_1 = v and v_n = v′) that, over all possible n, minimizes the sum

    Σ_(i=1..n−1) f(e_(i,i+1)).

Shortest path (A, C, E, D, F) between vertices A and F in the weighted directed graph.

When each edge in the graph has unit weight, or f : E → {1}, this is equivalent to finding the path with fewest edges.

The problem is also sometimes called the single-pair shortest path problem, to distinguish it from the following variations:
• The single-source shortest path problem, in which we have to find shortest paths from a source vertex v to all other vertices in the graph.
• The single-destination shortest path problem, in which we have to find shortest paths from all vertices in the directed graph to a single destination vertex v. This can be reduced to the single-source shortest path problem by reversing the arcs in the directed graph.
• The all-pairs shortest path problem, in which we have to find shortest paths between every pair of vertices v, v′ in the graph.

These generalizations have significantly more efficient algorithms than the simplistic approach of running a single-pair shortest path algorithm on all relevant pairs of vertices.

Algorithms

The most important algorithms for solving this problem are:
• Dijkstra's algorithm solves the single-source shortest path problem.
• Bellman–Ford algorithm solves the single-source problem if edge weights may be negative.
• A* search algorithm solves for single-pair shortest path using heuristics to try to speed up the search.
• Floyd–Warshall algorithm solves all pairs shortest paths.
• Johnson's algorithm solves all pairs shortest paths, and may be faster than Floyd–Warshall on sparse graphs.
• Viterbi algorithm solves the shortest stochastic path problem with an additional probabilistic weight on each node.
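As a concrete illustration of the first entry, here is a minimal sketch of Dijkstra's algorithm with a binary heap (the adjacency-list input format and all names are our choices, not from the article):

```python
import heapq
from math import inf

def dijkstra(adj, source):
    """Single-source shortest path lengths for non-negative edge
    weights. adj maps each vertex to a list of (neighbor, weight)
    pairs; returns a dict of distances from source."""
    dist = {v: inf for v in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; u was already finalized
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# a→b costs 1, a→c costs 4, b→c costs 2: the best route to c is a→b→c.
dist = dijkstra({'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}, 'a')
```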

Additional algorithms and associated evaluations may be found in Cherkassky et al.


Road networks

A road network can be considered as a graph with positive weights. The nodes represent road junctions and each edge of the graph is associated with a road segment between two junctions. The weight of an edge may correspond to the length of the associated road segment, the time needed to traverse the segment, or the cost of traversing the segment. Using directed edges it is also possible to model one-way streets. Such graphs are special in the sense that some edges are more important than others for long-distance travel (e.g. highways). This property has been formalized using the notion of highway dimension. There are a great number of algorithms that exploit this property and are therefore able to compute the shortest path a lot quicker than would be possible on general graphs.

All of these algorithms work in two phases. In the first phase, the graph is preprocessed without knowing the source or target node. This phase may take several days for realistic data and some techniques. The second phase is the query phase. In this phase, source and target node are known. The running time of the second phase is generally less than a second. The idea is that the road network is static, so the preprocessing phase can be done once and used for a large number of queries on the same road network.

The algorithm with the fastest known query time is called hub labeling and is able to compute shortest path on the road networks of Europe or the USA in a fraction of a microsecond. Other techniques that have been used are:
• ALT
• Arc Flags
• Contraction hierarchies
• Transit Node Routing
• Reach-based Pruning
• Labeling

Single-source shortest paths

Directed unweighted graphs

Algorithm                                        Time complexity     Author

(This list is incomplete.)

Directed graphs with nonnegative weights

Algorithm                                        Time complexity     Author
                                                 O(V²EL)             Ford 1956
Bellman–Ford algorithm                           O(VE)               Bellman 1958, Moore 1959
                                                 O(V² log V)         Dantzig 1958, Dantzig 1960, Minty (cf. Pollack & Wiebenson 1960), Whiting & Hillier 1960
Dijkstra's algorithm with list                   O(V²)               Leyzorek et al. 1957, Dijkstra 1959
Dijkstra's algorithm with modified binary heap   O((E + V) log V)    ...
...                                              ...                 ...
Dijkstra's algorithm with Fibonacci heap         O(E + V log V)      Fredman & Tarjan 1984, Fredman & Tarjan 1987
                                                 O(E log log L)      Johnson 1982, Karlsson & Poblete 1983
Gabow's algorithm                                O(E log_(E/V) L)    Gabow 1983b, Gabow 1985b
                                                 O(E + V√(log L))    Ahuja et al. 1990

(This list is incomplete.)

Directed graphs with arbitrary weights

Algorithm                Time complexity    Author
Bellman–Ford algorithm   O(VE)              Bellman 1958, Moore 1959

(This list is incomplete.)

All-pairs shortest paths

The all-pairs shortest paths problem for unweighted directed graphs was introduced by Shimbel (1953), who observed that it could be solved by a linear number of matrix multiplications, taking a total time of O(V⁴). Subsequent algorithms handle edge weights (which may possibly be negative), and are faster. The Floyd–Warshall algorithm takes O(V³) time, and Johnson's algorithm (a combination of the Bellman–Ford and Dijkstra algorithms) takes O(VE + V² log V).

Applications

Shortest path algorithms are applied to automatically find directions between physical locations, such as driving directions on web mapping websites like MapQuest or Google Maps. For this application fast specialized algorithms are available.

If one represents a nondeterministic abstract machine as a graph where vertices describe states and edges describe possible transitions, shortest path algorithms can be used to find an optimal sequence of choices to reach a certain goal state, or to establish lower bounds on the time needed to reach a given state. For example, if vertices represent the states of a puzzle like a Rubik's Cube and each directed edge corresponds to a single move or turn, shortest path algorithms can be used to find a solution that uses the minimum possible number of moves.

In a networking or telecommunications mindset, this shortest path problem is sometimes called the min-delay path problem and is usually tied with a widest path problem. For example, the algorithm may seek the shortest (min-delay) widest path, or the widest shortest (min-delay) path.

A more lighthearted application is the games of "six degrees of separation" that try to find the shortest path in graphs like movie stars appearing in the same film. Other applications, often studied in operations research, include plant and facility layout, robotics, transportation, and VLSI design.

Related problems For shortest path problems in computational geometry, see Euclidean shortest path. The travelling salesman problem is the problem of finding the shortest path that goes through every vertex exactly once, and returns to the start. Unlike the shortest path problem, which can be solved in polynomial time in graphs without negative cycles, the travelling salesman problem is NP-complete and, as such, is believed not to be efficiently solvable for large sets of data (see P = NP problem). The problem of finding the longest path in a graph is also NP-complete.


The Canadian traveller problem and the stochastic shortest path problem are generalizations where either the graph isn't completely known to the mover, changes over time, or where actions (traversals) are probabilistic. The shortest multiple disconnected path is a representation of the primitive path network within the framework of Reptation theory. The widest path problem seeks a path so that the minimum label of any edge is as large as possible.

Linear programming formulation

There is a natural linear programming formulation for the shortest path problem, given below. It is very simple compared to most other uses of linear programs in discrete optimization; however, it illustrates connections to other concepts.

Given a directed graph (V, A) with source node s, target node t, and cost wij for each arc (i, j) in A, consider the program with variables xij:

    minimize   Σ_((i,j) ∈ A) wij xij
    subject to x ≥ 0 and, for all i,
               Σ_j xij − Σ_j xji = 1 if i = s;  −1 if i = t;  0 otherwise.

This LP has the special property that it is integral; more specifically, every basic optimal solution (when one exists) has all variables equal to 0 or 1, and the set of edges whose variables equal 1 form an s-t dipath. See Ahuja et al. for one proof, although the origin of this approach dates back to mid-20th century.

The dual for this linear program is

    maximize   yt − ys
    subject to yj − yi ≤ wij for all arcs (i, j),

and feasible duals correspond to the concept of a consistent heuristic for the A* algorithm for shortest paths. For any feasible dual y the reduced costs wij − yj + yi are nonnegative, and A* essentially runs Dijkstra's algorithm on these reduced costs.

References

• Abraham, Ittai; Fiat, Amos; Goldberg, Andrew V.; Werneck, Renato F. "Highway Dimension, Shortest Paths, and Provably Efficient Algorithms" (http://research.microsoft.com/pubs/115272/soda10.pdf). ACM-SIAM Symposium on Discrete Algorithms, pages 782–793, 2010.
• Abraham, Ittai; Delling, Daniel; Goldberg, Andrew V.; Werneck, Renato F. "A Hub-Based Labeling Algorithm for Shortest Paths on Road Networks" (http://research.microsoft.com/pubs/142356/HL-TR.pdf). Symposium on Experimental Algorithms, pages 230–241, 2011.

• Bellman, Richard (1958). "On a routing problem". Quarterly of Applied Mathematics 16: 87–90. MR 0102435 (http://www.ams.org/mathscinet-getitem?mr=0102435).
• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Single-Source Shortest Paths and All-Pairs Shortest Paths". Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. 580–642. ISBN 0-262-03293-7.
• Dijkstra, E. W. (1959). "A note on two problems in connexion with graphs" (http://www-m3.ma.tum.de/twiki/pub/MN0506/WebHome/dijkstra.pdf). Numerische Mathematik 1: 269–271. doi:10.1007/BF01386390.
• Fredman, Michael Lawrence; Tarjan, Robert E. (1984). "Fibonacci heaps and their uses in improved network optimization algorithms" (http://www.computer.org/portal/web/csdl/doi/10.1109/SFCS.1984.715934). 25th Annual Symposium on Foundations of Computer Science. IEEE. pp. 338–346. doi:10.1109/SFCS.1984.715934. ISBN 0-8186-0591-X.

• Fredman, Michael Lawrence; Tarjan, Robert E. (1987). "Fibonacci heaps and their uses in improved network optimization algorithms" (http://portal.acm.org/citation.cfm?id=28874). Journal of the Association for Computing Machinery 34 (3): 596–615. doi:10.1145/28869.28874.
• Leyzorek, M.; Gray, R. S.; Johnson, A. A.; Ladew, W. C.; Meaker, S. R., Jr.; Petry, R. M.; Seitz, R. N. (1957). Investigation of Model Techniques — First Annual Report — 6 June 1956 — 1 July 1957 — A Study of Model Techniques for Communication Systems. Cleveland, Ohio: Case Institute of Technology.
• Moore, E. F. (1959). "The shortest path through a maze". Proceedings of an International Symposium on the Theory of Switching (Cambridge, Massachusetts, 2–5 April 1957). Cambridge: Harvard University Press. pp. 285–292.
• Shimbel, Alfonso (1953). "Structural parameters of communication networks". Bulletin of Mathematical Biophysics 15 (4): 501–507. doi:10.1007/BF02476438.

Further reading • D. Frigioni; A. Marchetti-Spaccamela and U. Nanni (1998). "Fully dynamic output bounded single source shortest path problem" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.9856). Proc. 7th Annu. ACM-SIAM Symp. Discrete Algorithms. Atlanta, GA. pp. 212–221.

104

105

Breadth-first search

Order in which the nodes are expanded

Class: Search algorithm
Data structure: Graph
Worst-case time: O(|V| + |E|)
Worst-case space: O(|V|)

In graph theory, breadth-first search (BFS) is a strategy for searching in a graph when search is limited to essentially two operations: (a) visit and inspect a node of a graph; (b) gain access to visit the nodes that neighbor the currently visited node. BFS begins at a root node and inspects all the neighboring nodes. Then, for each of those neighbor nodes in turn, it inspects their neighbor nodes that were unvisited, and so on. Compare BFS with the equivalent, but more memory-efficient, iterative deepening depth-first search, and contrast it with depth-first search.

Animated example of a breadth-first search


Algorithm

The breadth-first tree obtained when running BFS on the given map and starting in Frankfurt

An example map of Germany with some connections between cities


The algorithm uses a queue data structure to store intermediate results as it traverses the graph, as follows:

1. Enqueue the root node.
2. Dequeue a node and examine it.
   • If the element sought is found in this node, quit the search and return a result.
   • Otherwise, enqueue any successors (the direct child nodes) that have not yet been discovered.
3. If the queue is empty, every node on the graph has been examined – quit the search and return "not found".
4. If the queue is not empty, repeat from step 2.

Note: Using a stack instead of a queue would turn this algorithm into a depth-first search.

Pseudocode

Input: A graph G and a root v of G

 procedure BFS(G,v) is
     create a queue Q
     create a set V
     enqueue v onto Q
     add v to V
     while Q is not empty loop
         t ← Q.dequeue()
         if t is what we are looking for then
             return t
         end if
         for all edges e in G.adjacentEdges(t) loop
             u ← G.adjacentVertex(t,e)
             if u is not in V then
                 add u to V
                 enqueue u onto Q
             end if
         end loop
     end loop
     return none
 end BFS
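As a minimal Python sketch of the pseudocode above (assuming the graph is given as a dictionary mapping each node to a list of its neighbours; all names here are illustrative):

```python
from collections import deque

def bfs(graph, root, goal):
    """Search for `goal` breadth-first, starting from `root`.

    Returns the goal node if it is reachable, otherwise None.
    """
    queue = deque([root])   # Q: FIFO queue of nodes awaiting examination
    discovered = {root}     # V: set of nodes already discovered
    while queue:
        t = queue.popleft()
        if t == goal:
            return t
        for u in graph[t]:
            if u not in discovered:
                discovered.add(u)
                queue.append(u)
    return None
```

Using a deque gives O(1) enqueue and dequeue; replacing it with a stack (a list popped from the end) would turn this into a depth-first search, as noted above.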


Features

Space complexity

When the number of vertices in the graph is known ahead of time, and additional data structures are used to determine which vertices have already been added to the queue, the space complexity can be expressed as O(|V|), where |V| is the cardinality of the set of vertices. If the graph is represented by an adjacency list it occupies O(|V| + |E|) space in memory, while an adjacency matrix representation occupies O(|V|²).

Time complexity

The time complexity can be expressed as O(|V| + |E|), since every vertex and every edge will be explored in the worst case. Note: O(|E|) may vary between O(1) and O(|V|²), depending on how sparse the input graph is (assuming that the graph is connected).

Applications

Breadth-first search can be used to solve many problems in graph theory, for example:
• Finding all nodes within one connected component
• Copying garbage collection, Cheney's algorithm
• Finding the shortest path between two nodes u and v (with path length measured by number of edges)
• Testing a graph for bipartiteness
• (Reverse) Cuthill–McKee mesh numbering
• Ford–Fulkerson method for computing the maximum flow in a flow network
• Serialization/deserialization of a binary tree vs serialization in sorted order, which allows the tree to be re-constructed in an efficient manner
• Construction of the failure function of the Aho–Corasick pattern matcher

Finding connected components

The set of nodes reached by a BFS (breadth-first search) forms the connected component containing the starting node.

Testing bipartiteness

BFS can be used to test bipartiteness, by starting the search at any vertex and giving alternating labels to the vertices visited during the search. That is, give label 0 to the starting vertex, 1 to all its neighbours, 0 to those neighbours' neighbours, and so on. If at any step a vertex has (visited) neighbours with the same label as itself, then the graph is not bipartite. If the search ends without such a situation occurring, then the graph is bipartite.
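A minimal sketch of this labelling test in Python (assuming an adjacency-list dictionary for a connected undirected graph; the function name is illustrative):

```python
from collections import deque

def is_bipartite(graph, start):
    """BFS from `start`, giving alternating labels 0 and 1.

    Returns False as soon as two adjacent vertices receive the
    same label, True if the whole component is labelled cleanly.
    """
    label = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in label:
                label[w] = 1 - label[v]   # opposite label to v
                queue.append(w)
            elif label[w] == label[v]:    # same label on both ends of an edge
                return False
    return True
```

An even cycle (a square) passes the test, while an odd cycle (a triangle) fails it, as the text predicts.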

References

• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L., p. 590.
• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L., p. 591.
• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L., p. 597.

• Knuth, Donald E. (1997), The Art of Computer Programming, Vol. 1, 3rd ed. (http://www-cs-faculty.stanford.edu/~knuth/taocp.html), Boston: Addison-Wesley, ISBN 0-201-89683-4


Depth-first search


Order in which the nodes are visited

Class: Search algorithm
Data structure: Graph
Worst-case time: O(|V| + |E|) for explicit graphs traversed without repetition; O(b^d) for implicit graphs with branching factor b searched to depth d
Worst-case space: O(|V|) if the entire graph is traversed without repetition; O(longest path length searched) for implicit graphs without elimination of duplicate nodes


Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. One starts at the root (selecting some node as the root in the graph case) and explores as far as possible along each branch before backtracking. A version of depth-first search was investigated in the 19th century by French mathematician Charles Pierre Trémaux as a strategy for solving mazes.

Properties

The time and space analysis of DFS differs according to its application area. In theoretical computer science, DFS is typically used to traverse an entire graph, and takes time O(|V| + |E|), linear in the size of the graph. In these applications it also uses space O(|V|) in the worst case to store the stack of vertices on the current search path as well as the set of already-visited vertices. Thus, in this setting, the time and space bounds are the same as for breadth-first search, and the choice of which of these two algorithms to use depends less on their complexity and more on the different properties of the vertex orderings the two algorithms produce.

For applications of DFS to search problems in artificial intelligence, however, the graph to be searched is often either too large to visit in its entirety or even infinite, and DFS may suffer from non-termination when the length of a path in the search tree is infinite. Therefore, the search is only performed to a limited depth, and due to limited memory availability one typically does not use data structures that keep track of the set of all previously visited vertices. In this case, the time is still linear in the number of expanded vertices and edges (although this number is not the same as the size of the entire graph, because some vertices may be searched more than once and others not at all), but the space complexity of this variant of DFS is only proportional to the depth limit, much smaller than the space needed for searching to the same depth using breadth-first search. For such applications, DFS also lends itself much better to heuristic methods of choosing a likely-looking branch. When an appropriate depth limit is not known a priori, iterative deepening depth-first search applies DFS repeatedly with a sequence of increasing limits; in the artificial intelligence mode of analysis, with a branching factor greater than one, iterative deepening increases the running time by only a constant factor over the case in which the correct depth limit is known, due to the geometric growth of the number of nodes per level.

DFS may also be used to collect a sample of graph nodes. However, incomplete DFS, similarly to incomplete BFS, is biased towards nodes of high degree.

Example

For the following graph:

a depth-first search starting at A, assuming that the left edges in the shown graph are chosen before right edges, and assuming the search remembers previously visited nodes and will not repeat them (since this is a small graph), will visit the nodes in the following order: A, B, D, F, E, C, G. The edges traversed in this search form a Trémaux tree, a structure with important applications in graph theory. Performing the same search without remembering previously visited nodes results in visiting nodes in the order A, B, D, F, E, A, B, D, F, E, etc. forever, caught in the A, B, D, F, E cycle and never reaching C or G. Iterative deepening is one technique to avoid this infinite loop and would reach all nodes.

Output of a depth-first search

A convenient description of a depth-first search of a graph is in terms of a spanning tree of the vertices reached during the search. Based on this spanning tree, the edges of the original graph can be divided into three classes: forward edges, which point from a node of the tree to one of its descendants; back edges, which point from a node to one of its ancestors; and cross edges, which do neither. Sometimes tree edges, edges which belong to the spanning tree itself, are classified separately from forward edges. If the original graph is undirected then all of its edges are tree edges or back edges.

The four types of edges defined by a spanning tree


Vertex orderings

It is also possible to use depth-first search to linearly order the vertices of the original graph (or tree). There are three common ways of doing this:

• A preordering is a list of the vertices in the order that they were first visited by the depth-first search algorithm. This is a compact and natural way of describing the progress of the search, as was done earlier in this article. A preordering of an expression tree is the expression in Polish notation.
• A postordering is a list of the vertices in the order that they were last visited by the algorithm. A postordering of an expression tree is the expression in reverse Polish notation.
• A reverse postordering is the reverse of a postordering, i.e. a list of the vertices in the opposite order of their last visit. Reverse postordering is not the same as preordering.

For example, when searching the directed graph

beginning at node A, one visits the nodes in sequence, to produce lists either A B D B A C A or A C D C A B A (depending upon whether the algorithm chooses to visit B or C first). Note that repeat visits in the form of backtracking to a node, to check if it has still unvisited neighbours, are included here (even if it is found to have none). Thus the possible preorderings are A B D C and A C D B (ordering by a node's leftmost occurrence in the list above), while the possible reverse postorderings are A C B D and A B C D (ordering by a node's rightmost occurrence in the list above).

Reverse postordering produces a topological sorting of any directed acyclic graph. This ordering is also useful in control-flow analysis, as it often represents a natural linearization of the control flow. The graph above might represent the flow of control in a code fragment like

 if (A) then { B } else { C } D

and it is natural to consider this code in the order A B C D or A C B D, but not natural to use the order A B D C or A C D B.
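The three orderings can be sketched in Python using the if/else control-flow graph above (adjacency-list dictionary; names illustrative):

```python
def dfs_orderings(graph, root):
    """Return (preordering, postordering, reverse postordering) of a DFS."""
    pre, post, seen = [], [], set()

    def visit(v):
        seen.add(v)
        pre.append(v)          # first visit: preorder position
        for w in graph[v]:
            if w not in seen:
                visit(w)
        post.append(v)         # last visit: postorder position

    visit(root)
    return pre, post, post[::-1]

# The control-flow graph of: if (A) then { B } else { C } D
flow = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
```

Visiting B before C yields the preordering A B D C and the reverse postordering A C B D, matching the orders derived above; visiting C first gives the other pair.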


Pseudocode

Input: A graph G and a vertex v of G
Output: All vertices reachable from v labeled as discovered

A recursive implementation of DFS:

 procedure DFS(G,v):
     label v as discovered
     for all edges from v to w in G.adjacentEdges(v) do
         if vertex w is not labeled as discovered then
             recursively call DFS(G,w)

A non-recursive implementation of DFS:

 procedure DFS-iterative(G,v):
     let S be a stack
     S.push(v)
     while S is not empty
         v ← S.pop()
         if v is not labeled as discovered:
             label v as discovered
             for all edges from v to w in G.adjacentEdges(v) do
                 S.push(w)

These two variations of DFS visit the neighbors of each vertex in the opposite order from each other: the first neighbor of v visited by the recursive variation is the first one in the list of adjacent edges, while in the iterative variation the first visited neighbor is the last one in the list of adjacent edges. The non-recursive implementation is similar to breadth-first search but differs from it in two ways: it uses a stack instead of a queue, and it delays checking whether a vertex has been discovered until the vertex is popped from the stack rather than making this check before pushing the vertex.
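The two pseudocode variants can be sketched in Python as follows (adjacency-list dictionary; names illustrative):

```python
def dfs_recursive(graph, v, discovered=None):
    """Recursive DFS: visits the neighbours of v in list order."""
    if discovered is None:
        discovered = []
    discovered.append(v)                 # label v as discovered
    for w in graph[v]:
        if w not in discovered:
            dfs_recursive(graph, w, discovered)
    return discovered

def dfs_iterative(graph, v):
    """Stack-based DFS: pushes all neighbours, so it visits them in
    reverse list order, and checks discovery only when popping."""
    discovered, stack = [], [v]
    while stack:
        v = stack.pop()
        if v not in discovered:
            discovered.append(v)         # label v as discovered
            for w in graph[v]:
                stack.append(w)
    return discovered
```

On the graph {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []} the recursive variant visits B first, while the iterative one visits C first, illustrating the opposite neighbour order described above.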

Applications

Algorithms that use depth-first search as a building block include:
• Finding connected components.
• Topological sorting.
• Finding 2-(edge or vertex)-connected components.
• Finding 3-(edge or vertex)-connected components.
• Finding the bridges of a graph.
• Generating words in order to plot the limit set of a group.
• Finding strongly connected components.
• Planarity testing.
• Solving puzzles with only one solution, such as mazes. (DFS can be adapted to find all solutions to a maze by only including nodes on the current path in the visited set.)
• Maze generation may use a randomized depth-first search.
• Finding biconnectivity in graphs.

Randomized algorithm similar to depth-first search used in generating a maze.

Notes

• Charles Pierre Trémaux (1859–1882), École Polytechnique of Paris (X:1876), French engineer of the telegraph. In a public conference, December 2, 2010, by professor Jean Pelletier-Thibert in Académie de Macon (Burgundy, France); abstract published in the academy's annals, March 2011 (ISSN 0980-6032).
• Goodrich and Tamassia; Cormen, Leiserson, Rivest, and Stein.
• Kleinberg and Tardos.

References

• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 22.3: Depth-first search, pp. 540–549.
• Goodrich, Michael T.; Tamassia, Roberto (2001), Algorithm Design: Foundations, Analysis, and Internet Examples, Wiley, ISBN 0-471-38365-1.
• Kleinberg, Jon; Tardos, Éva (2006), Algorithm Design, Addison Wesley, pp. 92–94.
• Knuth, Donald E. (1997), The Art of Computer Programming, Vol. 1, 3rd ed. (http://www-cs-faculty.stanford.edu/~knuth/taocp.html), Boston: Addison-Wesley, ISBN 0-201-89683-4, OCLC 155842391 (http://www.worldcat.org/oclc/155842391).

External links

• Depth-First Explanation and Example (http://www.cse.ohio-state.edu/~gurari/course/cis680/cis680Ch14.html)
• C++ Boost Graph Library: Depth-First Search (http://www.boost.org/libs/graph/doc/depth_first_search.html)
• Depth-First Search Animation (for a directed graph) (http://www.cs.duke.edu/csed/jawaa/DFSanim.html)
• Depth First and Breadth First Search: Explanation and Code (http://www.kirupa.com/developer/actionscript/depth_breadth_search.htm)
• QuickGraph (http://quickgraph.codeplex.com/Wiki/View.aspx?title=Depth First Search Example), depth-first search example for .Net
• Depth-first search algorithm illustrated explanation (Java and C++ implementations) (http://www.algolist.net/Algorithms/Graph_algorithms/Undirected/Depth-first_search)
• YAGSBPL – A template-based C++ library for graph search and planning (http://code.google.com/p/yagsbpl/)


Backtracking



Backtracking is a general algorithm for finding all (or some) solutions to some computational problem, that incrementally builds candidates to the solutions, and abandons each partial candidate c ("backtracks") as soon as it determines that c cannot possibly be completed to a valid solution.

The classic textbook example of the use of backtracking is the eight queens puzzle, which asks for all arrangements of eight chess queens on a standard chessboard so that no queen attacks any other. In the common backtracking approach, the partial candidates are arrangements of k queens in the first k rows of the board, all in different rows and columns. Any partial solution that contains two mutually attacking queens can be abandoned, since it cannot possibly be completed to a valid solution.

Backtracking can be applied only for problems which admit the concept of a "partial candidate solution" and a relatively quick test of whether it can possibly be completed to a valid solution. It is useless, for example, for locating a given value in an unordered table. When it is applicable, however, backtracking is often much faster than brute-force enumeration of all complete candidates, since it can eliminate a large number of candidates with a single test.

Backtracking is an important tool for solving constraint satisfaction problems, such as crosswords, verbal arithmetic, Sudoku, and many other puzzles. It is often the most convenient (if not the most efficient[citation needed]) technique for parsing, for the knapsack problem and other combinatorial optimization problems. It is also the basis of the so-called logic programming languages such as Icon, Planner and Prolog. Backtracking is also utilized in the (diff) difference engine for the MediaWiki software[citation needed].

Backtracking depends on user-given "black box procedures" that define the problem to be solved, the nature of the partial candidates, and how they are extended into complete candidates. It is therefore a metaheuristic rather than a specific algorithm, although, unlike many other metaheuristics, it is guaranteed to find all solutions to a finite problem in a bounded amount of time.
The term "backtrack" was coined by American mathematician D. H. Lehmer in the 1950s. The pioneer string-processing language SNOBOL (1962) may have been the first to provide a built-in general backtracking facility.

Description of the method

The backtracking algorithm enumerates a set of partial candidates that, in principle, could be completed in various ways to give all the possible solutions to the given problem. The completion is done incrementally, by a sequence of candidate extension steps.

Conceptually, the partial candidates are the nodes of a tree structure, the potential search tree. Each partial candidate is the parent of the candidates that differ from it by a single extension step; the leaves of the tree are the partial candidates that cannot be extended any further.

The backtracking algorithm traverses this search tree recursively, from the root down, in depth-first order. At each node c, the algorithm checks whether c can be completed to a valid solution. If it cannot, the whole sub-tree rooted at c is skipped (pruned). Otherwise, the algorithm (1) checks whether c itself is a valid solution, and if so reports it to the user; and (2) recursively enumerates all sub-trees of c. The two tests and the children of each node are defined by user-given procedures.

Therefore, the actual search tree that is traversed by the algorithm is only a part of the potential tree. The total cost of the algorithm is the number of nodes of the actual tree times the cost of obtaining and processing each node. This fact should be considered when choosing the potential search tree and implementing the pruning test.

117

Backtracking

Pseudocode

In order to apply backtracking to a specific class of problems, one must provide the data P for the particular instance of the problem that is to be solved, and six procedural parameters: root, reject, accept, first, next, and output. These procedures should take the instance data P as a parameter and should do the following:

1. root(P): return the partial candidate at the root of the search tree.
2. reject(P,c): return true only if the partial candidate c is not worth completing.
3. accept(P,c): return true if c is a solution of P, and false otherwise.
4. first(P,c): generate the first extension of candidate c.
5. next(P,s): generate the next alternative extension of a candidate, after the extension s.
6. output(P,c): use the solution c of P, as appropriate to the application.

The backtracking algorithm then reduces to the call bt(root(P)), where bt is the following recursive procedure:

 procedure bt(c)
     if reject(P,c) then return
     if accept(P,c) then output(P,c)
     s ← first(P,c)
     while s ≠ Λ do
         bt(s)
         s ← next(P,s)

Usage considerations

The reject procedure should be a boolean-valued function that returns true only if it is certain that no possible extension of c is a valid solution for P. If the procedure cannot reach a definite conclusion, it should return false. An incorrect true result may cause the bt procedure to miss some valid solutions. The procedure may assume that reject(P,t) returned false for every ancestor t of c in the search tree. On the other hand, the efficiency of the backtracking algorithm depends on reject returning true for candidates that are as close to the root as possible. If reject always returns false, the algorithm will still find all solutions, but it will be equivalent to a brute-force search.

The accept procedure should return true if c is a complete and valid solution for the problem instance P, and false otherwise. It may assume that the partial candidate c and all its ancestors in the tree have passed the reject test. Note that the general pseudocode above does not assume that the valid solutions are always leaves of the potential search tree. In other words, it admits the possibility that a valid solution for P can be further extended to yield other valid solutions.

The first and next procedures are used by the backtracking algorithm to enumerate the children of a node c of the tree, that is, the candidates that differ from c by a single extension step. The call first(P,c) should yield the first child of c, in some order; and the call next(P,s) should return the next sibling of node s, in that order. Both functions should return a distinctive "null" candidate, denoted here by Λ, if the requested child does not exist.

Together, the root, first, and next functions define the set of partial candidates and the potential search tree. They should be chosen so that every solution of P occurs somewhere in the tree, and no partial candidate occurs more than once. Moreover, they should admit an efficient and effective reject predicate.

118

Backtracking

119

Early stopping variants

The pseudocode above will call output for all candidates that are a solution to the given instance P. The algorithm is easily modified to stop after finding the first solution, or a specified number of solutions; or after testing a specified number of partial candidates, or after spending a given amount of CPU time.

Examples

Typical examples are:
• Puzzles such as the eight queens puzzle, crosswords, verbal arithmetic, Sudoku, and Peg Solitaire.
• Combinatorial optimization problems such as parsing and the knapsack problem.
• Logic programming languages such as Icon, Planner and Prolog, which use backtracking internally to generate answers.
• The "diff" (version comparing) engine for the MediaWiki software.

Below is an example for the constraint satisfaction problem:

Sudoku puzzle solved by backtracking.

Constraint satisfaction

The general constraint satisfaction problem consists in finding a list of integers x = (x[1], x[2], ..., x[n]), each in some range {1, 2, ..., m}, that satisfies some arbitrary constraint (boolean function) F. For this class of problems, the instance data P would be the integers m and n, and the predicate F.

In a typical backtracking solution to this problem, one could define a partial candidate as a list of integers c = (c[1], c[2], ..., c[k]), for any k between 0 and n, that are to be assigned to the first k variables x[1], x[2], ..., x[k]. The root candidate would then be the empty list (). The first and next procedures would then be

 function first(P,c)
     k ← length(c)
     if k = n then return Λ
     else return (c[1], c[2], ..., c[k], 1)

 function next(P,s)
     k ← length(s)
     if s[k] = m then return Λ
     else return (s[1], s[2], ..., s[k-1], 1 + s[k])

Here length(c) is the number of elements in the list c.

The call reject(P,c) should return true if the constraint F cannot be satisfied by any list of n integers that begins with the k elements of c. For backtracking to be effective, there must be a way to detect this situation, at least for some candidates c, without enumerating all those m^(n−k) n-tuples. For example, if F is the conjunction of several boolean predicates, F = F[1] ∧ F[2] ∧ ... ∧ F[p], and each F[i] depends only on a small subset of the variables x[1], ..., x[n], then the reject procedure could simply check the terms

Backtracking F[i] that depend only on variables x, ..., x[k], and return true if any of those terms returns false. In fact, reject needs only check those terms that do depend on x[k], since the terms that depend only on x, ..., x[k-1] will have been tested further up in the search tree. Assuming that reject is implemented as above, then accept(P,c) needs only check whether c is complete, that is, whether it has n elements. It is generally better to order the list of variables so that it begins with the most critical ones (i.e. the ones with fewest value options, or which have a greater impact on subsequent choices). One could also allow the next function to choose which variable should be assigned when extending a partial candidate, based on the values of the variables already assigned by it. Further improvements can be obtained by the technique of constraint propagation. In addition to retaining minimal recovery values used in backing up, backtracking implementations commonly keep a variable trail, to record value change history. An efficient implementation will avoid creating a variable trail entry between two successive changes when there is no choice point, as the backtracking will erase all of the changes as a single operation. An alternative to the variable trail is to keep a timestamp of when the last change was made to the variable. The timestamp is compared to the timestamp of a choice point. If the choice point has an associated time later than that of the variable, it is unnecessary to revert the variable when the choice point is backtracked, as it was changed before the choice point occurred.

References

• Gilles Brassard; Paul Bratley (1995). Fundamentals of Algorithmics. Prentice-Hall.

External links

• HBmeyer.de (http://www.hbmeyer.de/backtrack/backtren.htm), interactive animation of a backtracking algorithm
• Solving Combinatorial Problems with STL and Backtracking (http://www.drdobbs.com/cpp/solving-combinatorial-problems-with-stl/184401194), article and C++ source code for a generic implementation of backtracking
• Sample Java Code (http://github.com/kapild/Permutations), sample code for backtracking of the 8 queens problem


Topological sorting


In computer science, a topological sort (sometimes abbreviated topsort or toposort) or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering. For instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; in this application, a topological ordering is just a valid sequence for the tasks. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time.

Examples

The canonical application of topological sorting (topological order) is in scheduling a sequence of jobs or tasks based on their dependencies; topological sorting algorithms were first studied in the early 1960s in the context of the PERT technique for scheduling in project management (Jarnagin 1960). The jobs are represented by vertices, and there is an edge from x to y if job x must be completed before job y can be started (for example, when washing clothes, the washing machine must finish before we put the clothes to dry). Then, a topological sort gives an order in which to perform the jobs.

In computer science, applications of this type arise in instruction scheduling, ordering of formula cell evaluation when recomputing formula values in spreadsheets, logic synthesis, determining the order of compilation tasks to perform in makefiles, data serialization, and resolving symbol dependencies in linkers. It is also used to decide in which order to load tables with foreign keys in databases.

The graph shown to the left has many valid topological sorts, including:
• 7, 5, 3, 11, 8, 2, 9, 10 (visual left-to-right, top-to-bottom)
• 3, 5, 7, 8, 11, 2, 9, 10 (smallest-numbered available vertex first)
• 3, 7, 8, 5, 11, 10, 2, 9 (because we can)
• 5, 7, 3, 8, 11, 10, 9, 2 (fewest edges first)
• 7, 5, 11, 3, 10, 8, 9, 2 (largest-numbered available vertex first)
• 7, 5, 11, 2, 3, 8, 9, 10 (attempting top-to-bottom, left-to-right)

Algorithms

The usual algorithms for topological sorting have running time linear in the number of nodes plus the number of edges, O(|V| + |E|). One of these algorithms, first described by Kahn (1962), works by choosing vertices in the same order as the eventual topological sort. First, find a list of "start nodes" which have no incoming edges and insert them into a set S; at least one such node must exist in an acyclic graph. Then:

 L ← Empty list that will contain the sorted elements
 S ← Set of all nodes with no incoming edges
 while S is non-empty do
     remove a node n from S
     add n to tail of L
     for each node m with an edge e from n to m do
         remove edge e from the graph
         if m has no other incoming edges then
             insert m into S
 if graph has edges then
     return error (graph has at least one cycle)
 else
     return L (a topologically sorted order)

If the graph is a DAG, a solution will be contained in the list L (the solution is not necessarily unique). Otherwise, the graph must have at least one cycle and therefore a topological sorting is impossible.

Note that, reflecting the non-uniqueness of the resulting sort, the structure S can be simply a set or a queue or a stack. Depending on the order that nodes n are removed from set S, a different solution is created. A variation of Kahn's algorithm that breaks ties lexicographically forms a key component of the Coffman–Graham algorithm for parallel scheduling and layered graph drawing.

An alternative algorithm for topological sorting is based on depth-first search. The algorithm loops through each node of the graph, in an arbitrary order, initiating a depth-first search that terminates when it hits any node that has already been visited since the beginning of the topological sort:

 L ← Empty list that will contain the sorted nodes
 while there are unmarked nodes do
     select an unmarked node n
     visit(n)

 function visit(node n)
     if n has a temporary mark then stop (not a DAG)
     if n is not marked (i.e. has not been visited yet) then
         mark n temporarily
         for each node m with an edge from n to m do
             visit(m)
         mark n permanently
         add n to head of L

Note that each node n gets prepended to the output list L only after considering all other nodes on which n depends (all ancestral nodes of n in the graph). Specifically, when the algorithm adds node n, we are guaranteed that all nodes on which n depends are already in the output list L: they were added to L either by the preceding recursive call to visit(), or by an earlier call to visit().
Since each edge and node is visited once, the algorithm runs in linear time. This depth-first-search-based algorithm is the one described by Cormen et al. (2001); it seems to have been first described in print by Tarjan (1976).
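As a concrete sketch, Kahn's algorithm above can be written in Python. The dict-of-lists graph representation and the function name are illustrative choices, not part of the original formulation:

```python
from collections import deque

def kahn_topological_sort(graph):
    """Topological sort of a DAG given as {node: [successors]}.

    Returns a list L of nodes in topological order, or raises
    ValueError if the graph has at least one cycle.
    """
    # Count incoming edges for every node.
    indegree = {n: 0 for n in graph}
    for n in graph:
        for m in graph[n]:
            indegree[m] += 1
    # S: all "start nodes" with no incoming edges.
    S = deque(n for n in graph if indegree[n] == 0)
    L = []
    while S:
        n = S.popleft()
        L.append(n)
        for m in graph[n]:          # "remove" each edge n -> m
            indegree[m] -= 1
            if indegree[m] == 0:    # m has no other incoming edges
                S.append(m)
    if len(L) != len(graph):        # edges remain: at least one cycle
        raise ValueError("graph has at least one cycle")
    return L
```

Using a deque for S makes removal O(1); as noted above, a plain set or stack would work equally well and would merely produce a different valid ordering.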


Topological sorting

Complexity

The computational complexity of the problem of computing a topological ordering of a directed acyclic graph is NC²; that is, it can be computed in O(log² n) time on a parallel computer using a polynomial number O(n^k) of processors, for some constant k (Cook 1985).

Uniqueness

If a topological sort has the property that all pairs of consecutive vertices in the sorted order are connected by edges, then these edges form a directed Hamiltonian path in the DAG. If a Hamiltonian path exists, the topological sort order is unique; no other order respects the edges of the path. Conversely, if a topological sort does not form a Hamiltonian path, the DAG will have two or more valid topological orderings, for in this case it is always possible to form a second valid ordering by swapping two consecutive vertices that are not connected by an edge to each other. Therefore, it is possible to test in linear time whether a unique ordering exists, and whether a Hamiltonian path exists, despite the NP-hardness of the Hamiltonian path problem for more general directed graphs (Vernet & Markenzon 1997).
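The consecutive-pairs test described above is straightforward to sketch in code (assuming, for illustration, a graph stored as a dict mapping each vertex to the set of its successors):

```python
def is_unique_topological_order(order, graph):
    """True iff every pair of consecutive vertices in `order` is joined
    by an edge, i.e. the order traces a directed Hamiltonian path.

    order: a valid topological ordering (list of vertices)
    graph: {vertex: set of successor vertices}
    """
    return all(order[i + 1] in graph[order[i]]
               for i in range(len(order) - 1))
```

Combined with any linear-time topological sorting routine, this check yields the linear-time uniqueness test mentioned above.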

Relation to partial orders

Topological orderings are also closely related to the concept of a linear extension of a partial order in mathematics. A partially ordered set is just a set of objects together with a definition of the "≤" inequality relation, satisfying the axioms of reflexivity (x ≤ x), antisymmetry (if x ≤ y and y ≤ x then x = y) and transitivity (if x ≤ y and y ≤ z, then x ≤ z). A total order is a partial order in which, for every two objects x and y in the set, either x ≤ y or y ≤ x. Total orders are familiar in computer science as the comparison operators needed to perform comparison sorting algorithms. For finite sets, total orders may be identified with linear sequences of objects, where the "≤" relation is true whenever the first object precedes the second object in the order; a comparison sorting algorithm may be used to convert a total order into a sequence in this way.

A linear extension of a partial order is a total order that is compatible with it, in the sense that, if x ≤ y in the partial order, then x ≤ y in the total order as well. One can define a partial ordering from any DAG by letting the set of objects be the vertices of the DAG, and defining x ≤ y to be true, for any two vertices x and y, whenever there exists a directed path from x to y; that is, whenever y is reachable from x. With these definitions, a topological ordering of the DAG is the same thing as a linear extension of this partial order.

Conversely, any partial ordering may be defined as the reachability relation in a DAG. One way of doing this is to define a DAG that has a vertex for every object in the partially ordered set, and an edge xy for every pair of objects for which x ≤ y. An alternative way of doing this is to use the transitive reduction of the partial ordering; in general, this produces DAGs with fewer edges, but the reachability relation in these DAGs is still the same partial order.
By using these constructions, one can use topological ordering algorithms to find linear extensions of partial orders.

References

• Cook, Stephen A. (1985), "A Taxonomy of Problems with Fast Parallel Algorithms", Information and Control 64 (1–3): 2–22, doi:10.1016/S0019-9958(85)80041-3.
• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), "Section 22.4: Topological sort", Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill, pp. 549–552, ISBN 0-262-03293-7.
• Jarnagin, M. P. (1960), Automatic machine methods of testing PERT networks for consistency, Technical Memorandum No. K-24/60, Dahlgren, Virginia: U. S. Naval Weapons Laboratory.
• Kahn, Arthur B. (1962), "Topological sorting of large networks", Communications of the ACM 5 (11): 558–562, doi:10.1145/368996.369025.


• Tarjan, Robert E. (1976), "Edge-disjoint spanning trees and depth-first search", Acta Informatica 6 (2): 171–185, doi:10.1007/BF00268499.
• Vernet, Oswaldo; Markenzon, Lilian (1997), "Hamiltonian problems for reducible flowgraphs", Proc. 17th International Conference of the Chilean Computer Science Society (SCCC '97), pp. 264–267, doi:10.1109/SCCC.1997.637099.

External links

• NIST Dictionary of Algorithms and Data Structures: topological sort (http://www.nist.gov/dads/HTML/topologicalSort.html)
• Weisstein, Eric W., "TopologicalSort" (http://mathworld.wolfram.com/TopologicalSort.html), MathWorld.



Dijkstra's algorithm



Dijkstra's algorithm. It picks the unvisited vertex with the lowest distance, calculates the distance through it to each unvisited neighbor, and updates the neighbor's distance if smaller. Mark visited (set to red) when done with neighbors.

Class: Search algorithm
Data structure: Graph
Worst-case performance: O(|E| + |V| log |V|)

Dijkstra's algorithm, conceived by computer scientist Edsger Dijkstra in 1956 and published in 1959, is a graph search algorithm that solves the single-source shortest path problem for a graph with non-negative edge path costs, producing a shortest path tree. This algorithm is often used in routing and as a subroutine in other graph algorithms.

For a given source vertex (node) in the graph, the algorithm finds the path with lowest cost (i.e. the shortest path) between that vertex and every other vertex. It can also be used for finding costs of shortest paths from a single vertex to a single destination vertex by stopping the algorithm once the shortest path to the destination vertex has been determined. For example, if the vertices of the graph represent cities and edge path costs represent driving distances between pairs of cities connected by a direct road, Dijkstra's algorithm can be used to find the shortest route between one city and all other cities. As a result, the shortest-path-first algorithm is widely used in network routing protocols, most notably IS-IS and OSPF (Open Shortest Path First).

Dijkstra's original algorithm does not use a min-priority queue and runs in O(|V|²) (where |V| is the number of vertices). The idea of this algorithm is also given in (Leyzorek et al. 1957). The implementation based on a min-priority queue implemented by a Fibonacci heap, running in O(|E| + |V| log |V|) (where |E| is the number of edges), is due to (Fredman & Tarjan 1984). This is asymptotically the fastest known single-source shortest-path algorithm for arbitrary directed graphs with unbounded non-negative weights.


Algorithm

Let the node at which we are starting be called the initial node. Let the distance of node Y be the distance from the initial node to Y. Dijkstra's algorithm will assign some initial distance values and will try to improve them step by step.
1. Assign to every node a tentative distance value: set it to zero for our initial node and to infinity for all other nodes.
2. Mark all nodes unvisited. Set the initial node as current. Create a set of the unvisited nodes called the unvisited set consisting of all the nodes.
3. For the current node, consider all of its unvisited neighbors and calculate their tentative distances. Compare the newly calculated tentative distance to the currently assigned value and keep the smaller one. For example, if the current node A is marked with a distance of 6, and the edge connecting it with a neighbor B has length 2, then the distance to B (through A) will be 6 + 2 = 8; if B was previously marked with a distance greater than 8, change it to 8, otherwise keep its current value.
4. When we are done considering all of the neighbors of the current node, mark the current node as visited and remove it from the unvisited set. A visited node will never be checked again.

Illustration of Dijkstra's algorithm search for finding path from a start node (lower left, red) to a goal node (upper right, green) in a robot motion planning problem. Open nodes represent the "tentative" set. Filled nodes are visited ones, with color representing the distance: the greener, the farther. Nodes in all the different directions are explored uniformly, appearing as a more-or-less circular wavefront as Dijkstra's algorithm uses a heuristic identically equal to 0.

5. If the destination node has been marked visited (when planning a route between two specific nodes) or if the smallest tentative distance among the nodes in the unvisited set is infinity (when planning a complete traversal; occurs when there is no connection between the initial node and remaining unvisited nodes), then stop. The algorithm has finished.

6. Select the unvisited node that is marked with the smallest tentative distance, and set it as the new "current node" then go back to step 3.

Description

Note: For ease of understanding, this discussion uses the terms intersection, road and map; formally these terms are vertex, edge and graph, respectively.

Suppose you would like to find the shortest path between two intersections on a city map, a starting point and a destination. The order is conceptually simple: to start, mark the distance to every intersection on the map with infinity. This is done not to imply there is an infinite distance, but to note that the intersection has not yet been visited; some variants of this method simply leave the intersection unlabeled.

Now, at each iteration, select a current intersection. For the first iteration, the current intersection will be the starting point and the distance to it (the intersection's label) will be zero. For subsequent iterations (after the first), the current intersection will be the closest unvisited intersection to the starting point; this will be easy to find.

From the current intersection, update the distance to every unvisited intersection that is directly connected to it. This is done by determining the sum of the distance between an unvisited intersection and the value of the current intersection, and relabeling the unvisited intersection with this value if it is less than its current value. In effect, the intersection is relabeled if the path to it through the current intersection is shorter than the previously known paths. To facilitate shortest path identification, in pencil, mark the road with an arrow pointing to the relabeled intersection if you label/relabel it, and erase all others pointing to it.

After you have updated the distances to each neighboring intersection, mark the current intersection as visited and select the unvisited intersection with the lowest distance from the starting point (that is, the lowest label) as the current intersection. Nodes marked as visited are labeled with the shortest path from the starting point to them and will not be revisited or returned to.


Continue this process of updating the neighboring intersections with the shortest distances, then marking the current intersection as visited and moving on to the closest unvisited intersection, until you have marked the destination as visited. Once you have marked the destination as visited (as is the case with any visited intersection) you have determined the shortest path to it from the starting point, and can trace your way back following the arrows in reverse.

Of note is the fact that this algorithm makes no attempt to direct "exploration" towards the destination as one might expect. Rather, the sole consideration in determining the next "current" intersection is its distance from the starting point. This algorithm therefore "expands outward" from the starting point, iteratively considering every node that is closer in terms of shortest path distance until it reaches the destination. When understood in this way, it is clear how the algorithm necessarily finds the shortest path; however, it may also reveal one of the algorithm's weaknesses: its relative slowness in some topologies.

Pseudocode

In the following algorithm, the code u := vertex in Q with smallest dist[] searches for the vertex u in the vertex set Q that has the least dist[u] value. That vertex is removed from the set Q and returned to the user. dist_between(u, v) calculates the length between the two neighbor-nodes u and v. The variable alt on lines 17 & 19 is the length of the path from the root node to the neighbor node v if it were to go through u. If this path is shorter than the current shortest path recorded for v, that current path is replaced with this alt path. The previous array is populated with a pointer to the "next-hop" node on the source graph to get the shortest route to the source.

 1  function Dijkstra(Graph, source):
 2      for each vertex v in Graph:        // Initializations
 3          dist[v] := infinity;           // Mark distances from source to v as not yet computed
 4          visited[v] := false;           // Mark all nodes as unvisited
 5          previous[v] := undefined;      // Previous node in optimal path from source
 6      end for
 7
 8      dist[source] := 0;                 // Distance from source to itself is zero
 9      insert source into Q;              // Start off with the source node
10
11      while Q is not empty:              // The main loop
12          u := vertex in Q with smallest distance in dist[] and has not been visited;  // Source node in first case
13          remove u from Q;
14          visited[u] := true             // Mark this node as visited
15
16          for each neighbor v of u:
17              alt := dist[u] + dist_between(u, v);   // Accumulate shortest dist from source
18              if alt < dist[v]:
19                  dist[v] := alt;        // Keep the shortest dist from src to v
20                  previous[v] := u;
21                  if !visited[v]:
22                      insert v into Q;   // Add unvisited v into the Q to be processed
23                  end if
24              end if
25          end for
26      end while
27      return dist;
28  endfunction

If we are only interested in a shortest path between vertices source and target, we can terminate the search at line 12 if u = target. Now we can read the shortest path from source to target by reverse iteration:

1  S := empty sequence
2  u := target
3  while previous[u] is defined:        // Construct the shortest path with a stack S
4      insert u at the beginning of S   // Push the vertex onto the stack
5      u := previous[u]                 // Traverse from target to source
6  end while

Now sequence S is the list of vertices constituting one of the shortest paths from source to target, or the empty sequence if no path exists.

A more general problem would be to find all the shortest paths between source and target (there might be several different ones of the same length). Then instead of storing only a single node in each entry of previous[] we would store all nodes satisfying the relaxation condition. For example, if both r and source connect to target and both of them lie on different shortest paths through target (because the edge cost is the same in both cases), then we would add both r and source to previous[target]. When the algorithm completes, the previous[] data structure will actually describe a graph that is a subset of the original graph with some edges removed. Its key property will be that if the algorithm was run with some starting node, then every path from that node to any other node in the new graph will be the shortest path between those nodes in the original graph, and all paths of that length from the original graph will be present in the new graph. Then to actually find all these shortest paths between two given nodes we would use a path-finding algorithm on the new graph, such as depth-first search.

Using a priority queue

A min-priority queue is an abstract data structure that provides 3 basic operations: add_with_priority(), decrease_priority() and extract_min(). As mentioned earlier, using such a data structure can lead to faster computing times than using a basic queue. Notably, a Fibonacci heap (Fredman & Tarjan 1984) or a Brodal queue offer optimal implementations for those 3 operations. As the algorithm is slightly different, we mention it here, in pseudocode as well:

 1  function Dijkstra(Graph, source):
 2      dist[source] := 0                  // Initializations
 3      for each vertex v in Graph:
 4          if v ≠ source
 5              dist[v] := infinity        // Unknown distance from source to v
 6              previous[v] := undefined   // Predecessor of v
 7          end if
 8          PQ.add_with_priority(v, dist[v])   // Add every vertex to the queue
 9      end for
10
11      while PQ is not empty:             // The main loop
12          u := PQ.extract_min()          // Remove and return best vertex
13          for each neighbor v of u:      // Where v has not yet been removed from PQ
14              alt := dist[u] + length(u, v)
15              if alt < dist[v]           // Relax the edge (u, v)
16                  dist[v] := alt
17                  previous[v] := u
18                  PQ.decrease_priority(v, alt)
19              end if
20          end for
21      end while
22      return previous[]

It should be noted that other data structures can be used to achieve even faster computing times in practice.
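As an illustration, here is a minimal Python version of the priority-queue variant. Python's standard heapq module has no decrease_priority() operation, so this sketch substitutes the common "lazy deletion" workaround: a relaxed vertex is pushed again with its new distance, and stale queue entries are skipped on extraction.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances and predecessors from source.

    graph: dict mapping each vertex to a list of (neighbor, weight) pairs,
           with non-negative weights.
    Returns (dist, previous) dictionaries.
    """
    dist = {v: float("inf") for v in graph}
    previous = {v: None for v in graph}
    dist[source] = 0
    pq = [(0, source)]                  # min-heap of (distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)        # extract_min()
        if d > dist[u]:
            continue                    # stale entry: skip (lazy decrease-key)
        for v, w in graph[u]:
            alt = d + w
            if alt < dist[v]:           # relax the edge (u, v)
                dist[v] = alt
                previous[v] = u
                heapq.heappush(pq, (alt, v))
    return dist, previous
```

The stale-entry check (d > dist[u]) plays the role of restricting the inner loop to vertices not yet settled, as the pseudocode's comment on line 13 requires.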

Running time

An upper bound of the running time of Dijkstra's algorithm on a graph with |E| edges and |V| vertices can be expressed as a function of |E| and |V| using big-O notation. For any implementation of the vertex set Q, the running time is O(|E| · T_dk + |V| · T_em), where T_dk and T_em are the times needed to perform the decrease-key and extract-minimum operations in set Q, respectively.

The simplest implementation of Dijkstra's algorithm stores the vertices of set Q in an ordinary linked list or array, and extract-minimum is simply a linear search through all vertices in Q. In this case, the running time is O(|V|²).

For sparse graphs, that is, graphs with far fewer than |V|² edges, Dijkstra's algorithm can be implemented more efficiently by storing the graph in the form of adjacency lists and using a self-balancing binary search tree, binary heap, pairing heap, or Fibonacci heap as a priority queue to implement extracting minimum efficiently. With a self-balancing binary search tree or binary heap, the algorithm requires O((|E| + |V|) log |V|) time (which is dominated by O(|E| log |V|), assuming the graph is connected). To avoid an O(|V|) look-up in the decrease-key step on a vanilla binary heap, it is necessary to maintain a supplementary index mapping each vertex to the heap's index (and keep it up to date as the priority queue changes), making each decrease-key take only O(log |V|). The Fibonacci heap improves this to O(|E| + |V| log |V|).

Note that for directed acyclic graphs, it is possible to find shortest paths from a given starting vertex in linear time, by processing the vertices in a topological order, and calculating the path length for each vertex to be the minimum length obtained via any of its incoming edges.
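The linear-time special case for DAGs mentioned above can be sketched by relaxing edges in topological order; the graph representation and function name here are illustrative:

```python
def dag_shortest_paths(graph, source):
    """Single-source shortest paths in a weighted DAG in O(|V| + |E|).

    graph: {node: [(successor, weight), ...]}
    """
    # First compute a topological order (Kahn's algorithm).
    indegree = {n: 0 for n in graph}
    for n in graph:
        for m, _ in graph[n]:
            indegree[m] += 1
    stack = [n for n in graph if indegree[n] == 0]
    order = []
    while stack:
        n = stack.pop()
        order.append(n)
        for m, _ in graph[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                stack.append(m)

    # Then relax each node's outgoing edges in topological order.
    dist = {n: float("inf") for n in graph}
    dist[source] = 0
    for u in order:
        if dist[u] == float("inf"):
            continue                    # unreachable from source
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:   # minimum over incoming edges of v
                dist[v] = dist[u] + w
    return dist
```

Because every edge is relaxed exactly once, after its tail's distance is final, this runs in O(|V| + |E|) and, unlike Dijkstra's algorithm, also tolerates negative edge weights.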

Related problems and algorithms

The functionality of Dijkstra's original algorithm can be extended with a variety of modifications. For example, sometimes it is desirable to present solutions which are less than mathematically optimal. To obtain a ranked list of less-than-optimal solutions, the optimal solution is first calculated. A single edge appearing in the optimal solution is removed from the graph, and the optimum solution to this new graph is calculated. Each edge of the original solution is suppressed in turn and a new shortest path calculated. The secondary solutions are then ranked and presented after the first optimal solution.

Dijkstra's algorithm is usually the working principle behind link-state routing protocols, OSPF and IS-IS being the most common ones.

Unlike Dijkstra's algorithm, the Bellman–Ford algorithm can be used on graphs with negative edge weights, as long as the graph contains no negative cycle reachable from the source vertex s. The presence of such cycles means there is no shortest path, since the total weight becomes lower each time the cycle is traversed. It is possible to adapt Dijkstra's algorithm to handle negative weight edges by combining it with the Bellman–Ford algorithm (to remove negative edges and detect negative cycles); such a combined algorithm is called Johnson's algorithm.


The A* algorithm is a generalization of Dijkstra's algorithm that cuts down on the size of the subgraph that must be explored, if additional information is available that provides a lower bound on the "distance" to the target. This approach can be viewed from the perspective of linear programming: there is a natural linear program for computing shortest paths, and solutions to its dual linear program are feasible if and only if they form a consistent heuristic (speaking roughly, since the sign conventions differ from place to place in the literature). This feasible dual / consistent heuristic defines a non-negative reduced cost and A* is essentially running Dijkstra's algorithm with these reduced costs. If the dual satisfies the weaker condition of admissibility, then A* is instead more akin to the Bellman–Ford algorithm. The process that underlies Dijkstra's algorithm is similar to the greedy process used in Prim's algorithm. Prim's purpose is to find a minimum spanning tree that connects all nodes in the graph; Dijkstra is concerned with only two nodes. Prim's does not evaluate the total weight of the path from the starting node, only the individual path. Breadth-first search can be viewed as a special-case of Dijkstra's algorithm on unweighted graphs, where the priority queue degenerates into a FIFO queue.

Dynamic programming perspective

From a dynamic programming point of view, Dijkstra's algorithm is a successive approximation scheme that solves the dynamic programming functional equation for the shortest path problem by the Reaching method. In fact, Dijkstra's explanation of the logic behind the algorithm, namely

Problem 2. Find the path of minimum total length between two given nodes P and Q.
We use the fact that, if R is a node on the minimal path from P to Q, knowledge of the latter implies the knowledge of the minimal path from P to R.

is a paraphrasing of Bellman's famous Principle of Optimality in the context of the shortest path problem.

Notes

 http://www.boost.org/doc/libs/1_44_0/libs/graph/doc/dag_shortest_paths.html
 Online version of the paper with interactive computational modules (http://www.ifors.ms.unimelb.edu.au/tutorial/dijkstra_new/index.html).

References

• Dijkstra, E. W. (1959). "A note on two problems in connexion with graphs" (http://www-m3.ma.tum.de/twiki/pub/MN0506/WebHome/dijkstra.pdf). Numerische Mathematik 1: 269–271. doi:10.1007/BF01386390.
• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Section 24.3: Dijkstra's algorithm". Introduction to Algorithms (Second ed.). MIT Press and McGraw–Hill. pp. 595–601. ISBN 0-262-03293-7.
• Fredman, Michael Lawrence; Tarjan, Robert E. (1984). "Fibonacci heaps and their uses in improved network optimization algorithms". 25th Annual Symposium on Foundations of Computer Science. IEEE. pp. 338–346. doi:10.1109/SFCS.1984.715934.
• Fredman, Michael Lawrence; Tarjan, Robert E. (1987). "Fibonacci heaps and their uses in improved network optimization algorithms". Journal of the Association for Computing Machinery 34 (3): 596–615. doi:10.1145/28869.28874.
• Zhan, F. Benjamin; Noon, Charles E. (February 1998). "Shortest Path Algorithms: An Evaluation Using Real Road Networks". Transportation Science 32 (1): 65–73. doi:10.1287/trsc.32.1.65.

• Leyzorek, M.; Gray, R. S.; Johnson, A. A.; Ladew, W. C.; Meaker, Jr., S. R.; Petry, R. M.; Seitz, R. N. (1957). Investigation of Model Techniques — First Annual Report — 6 June 1956 — 1 July 1957 — A Study of Model Techniques for Communication Systems. Cleveland, Ohio: Case Institute of Technology.
• Knuth, D. E. (1977). "A Generalization of Dijkstra's Algorithm". Information Processing Letters 6 (1): 1–5.

External links

• C/C++
  • Dijkstra's Algorithm in C++ (https://github.com/xtaci/algorithms/blob/master/include/dijkstra.h)
  • Implementation in Boost C++ library (http://www.boost.org/doc/libs/1_43_0/libs/graph/doc/dijkstra_shortest_paths.html)
  • Dijkstra's Algorithm in C Programming Language (http://www.rawbytes.com/dijkstras-algorithm-in-c/)
• Java
  • Applet by Carla Laffra of Pace University (http://www.dgp.toronto.edu/people/JamesStewart/270/9798s/Laffra/DijkstraApplet.html)
  • Visualization of Dijkstra's Algorithm (http://students.ceid.upatras.gr/~papagel/english/java_docs/minDijk.htm)
  • Shortest Path Problem: Dijkstra's Algorithm (http://www-b2.is.tokushima-u.ac.jp/~ikeda/suuri/dijkstra/Dijkstra.shtml)
  • Dijkstra's Algorithm Applet (http://www.unf.edu/~wkloster/foundations/DijkstraApplet/DijkstraApplet.htm)
  • Open Source Java Graph package with implementation of Dijkstra's Algorithm (http://code.google.com/p/annas/)
  • A Java library for path finding with Dijkstra's Algorithm and example Applet (http://www.stackframe.com/software/PathFinder)
  • Dijkstra's algorithm as bidirectional version in Java (https://github.com/graphhopper/graphhopper/tree/90879ad05c4dfedf0390d44525065f727b043357/core/src/main/java/com/graphhopper/routing)
• C#/.Net
  • Dijkstra's Algorithm in C# (http://www.codeproject.com/KB/recipes/ShortestPathCalculation.aspx)
  • Fast Priority Queue Implementation of Dijkstra's Algorithm in C# (http://www.codeproject.com/KB/recipes/FastHeapDijkstra.aspx)
  • QuickGraph, Graph Data Structures and Algorithms for .NET (http://quickgraph.codeplex.com/)
• Dijkstra's Algorithm Simulation (http://optlab-server.sce.carleton.ca/POAnimations2007/DijkstrasAlgo.html)
• Oral history interview with Edsger W. Dijkstra (http://purl.umn.edu/107247), Charles Babbage Institute, University of Minnesota, Minneapolis.
• Animation of Dijkstra's algorithm (http://www.cs.sunysb.edu/~skiena/combinatorica/animations/dijkstra.html)
• Haskell implementation of Dijkstra's Algorithm (http://bonsaicode.wordpress.com/2011/01/04/programming-praxis-dijkstra's-algorithm/) on Bonsai code
• Implementation in T-SQL (http://hansolav.net/sql/graphs.html)
• A MATLAB program for Dijkstra's algorithm (http://www.mathworks.com/matlabcentral/fileexchange/20025-advanced-dijkstras-minimum-path-algorithm)
• Step through Dijkstra's Algorithm in an online JavaScript Debugger (http://www.turb0js.com/a/Dijkstra's_Algorithm)


Greedy algorithm


A greedy algorithm is an algorithm that follows the problem solving heuristic of making the locally optimal choice at each stage with the hope of finding a global optimum. In many problems a greedy strategy does not in general produce an optimal solution, but nonetheless a greedy heuristic may yield locally optimal solutions that approximate a global optimal solution in a reasonable time.

For example, a greedy strategy for the traveling salesman problem (which is of high computational complexity) is the following heuristic: "At each stage visit an unvisited city nearest to the current city". This heuristic need not find a best solution, but it terminates in a reasonable number of steps; finding an optimal solution typically requires unreasonably many steps. In mathematical optimization, greedy algorithms solve combinatorial problems having the properties of matroids.

Greedy algorithms determine the minimum number of coins to give while making change. These are the steps a human would take to emulate a greedy algorithm to represent 36 cents using only coins with values {1, 5, 10, 20}. The coin of the highest value, less than the remaining change owed, is the local optimum. (Note that in general the change-making problem requires dynamic programming or integer programming to find an optimal solution; however, most currency systems, including the Euro and US Dollar, are special cases where the greedy strategy does find an optimum solution.)

Specifics

In general, greedy algorithms have five components:
1. A candidate set, from which a solution is created
2. A selection function, which chooses the best candidate to be added to the solution
3. A feasibility function, that is used to determine if a candidate can be used to contribute to a solution
4. An objective function, which assigns a value to a solution, or a partial solution, and
5. A solution function, which will indicate when we have discovered a complete solution

Greedy algorithms produce good solutions on some mathematical problems, but not on others. Most problems for which they work will have two properties:

Greedy choice property
We can make whatever choice seems best at the moment and then solve the subproblems that arise later. The choice made by a greedy algorithm may depend on choices made so far but not on future choices or all the solutions to the subproblem. It iteratively makes one greedy choice after another, reducing each given problem into a smaller one. In other words, a greedy algorithm never reconsiders its choices. This is the main difference from dynamic programming, which is exhaustive and is guaranteed to find the solution. After every stage, dynamic programming makes decisions based on all the decisions made in the previous stage, and may reconsider the previous stage's algorithmic path to solution.

Optimal substructure
"A problem exhibits optimal substructure if an optimal solution to the problem contains optimal solutions to the sub-problems."


Cases of failure

Examples of how a greedy algorithm may fail to achieve the optimal solution:

Starting at A, a greedy algorithm will find the local maximum at "m", oblivious of the global maximum at "M".

With a goal of reaching the largest sum, at each step the greedy algorithm will choose what appears to be the optimal immediate choice, so it will choose 12 instead of 3 at the second step, and will not reach the best solution, which contains 99.

For many other problems, greedy algorithms fail to produce the optimal solution, and may even produce the unique worst possible solution. One example is the traveling salesman problem mentioned above: for each number of cities, there is an assignment of distances between the cities for which the nearest neighbor heuristic produces the unique worst possible tour.

Imagine the coin example with only 25-cent, 10-cent, and 4-cent coins. The greedy algorithm would not be able to make change for 41 cents, since after committing to use one 25-cent coin and one 10-cent coin it would be impossible to use 4-cent coins for the balance of 6 cents, whereas a person or a more sophisticated algorithm could make change for 41 cents with one 25-cent coin and four 4-cent coins.
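Both coin examples are easy to reproduce with a short sketch (the function name is illustrative): the greedy rule repeatedly takes the largest coin not exceeding the remaining amount, and reports failure if it gets stuck.

```python
def greedy_change(amount, denominations):
    """Greedily make change: repeatedly take the largest coin that fits.

    Returns the list of coins used, or None if the greedy strategy
    gets stuck before reaching the exact amount.
    """
    coins = []
    for coin in sorted(denominations, reverse=True):
        while coin <= amount:
            coins.append(coin)
            amount -= coin
    return coins if amount == 0 else None
```

For instance, greedy_change(36, [1, 5, 10, 20]) returns [20, 10, 5, 1], while greedy_change(41, [25, 10, 4]) returns None even though 25 + 4 + 4 + 4 + 4 = 41 is achievable.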


Types

Greedy algorithms can be characterized as being 'short sighted', and as 'non-recoverable'. They are ideal only for problems which have 'optimal substructure'. Despite this, greedy algorithms are best suited for simple problems (e.g. giving change). It is important, however, to note that the greedy algorithm can be used as a selection algorithm to prioritize options within a search, or branch and bound algorithm. There are a few variations to the greedy algorithm:
• Pure greedy algorithms
• Orthogonal greedy algorithms
• Relaxed greedy algorithms

Applications

Greedy algorithms mostly (but not always) fail to find the globally optimal solution, because they usually do not operate exhaustively on all the data. They can make commitments to certain choices too early which prevent them from finding the best overall solution later. For example, all known greedy coloring algorithms for the graph coloring problem and all other NP-complete problems do not consistently find optimum solutions. Nevertheless, they are useful because they are quick to think up and often give good approximations to the optimum.

If a greedy algorithm can be proven to yield the global optimum for a given problem class, it typically becomes the method of choice because it is faster than other optimization methods like dynamic programming. Examples of such greedy algorithms are Kruskal's algorithm and Prim's algorithm for finding minimum spanning trees, Dijkstra's algorithm for finding single-source shortest paths, and the algorithm for finding optimum Huffman trees. The theory of matroids, and the more general theory of greedoids, provide whole classes of such algorithms.

Greedy algorithms appear in network routing as well. Using greedy routing, a message is forwarded to the neighboring node which is "closest" to the destination. The notion of a node's location (and hence "closeness") may be determined by its physical location, as in geographic routing used by ad hoc networks. Location may also be an entirely artificial construct as in small world routing and distributed hash table.

Examples

• The activity selection problem is characteristic of this class of problems, where the goal is to pick the maximum number of activities that do not clash with each other.
• In the Macintosh computer game Crystal Quest the objective is to collect crystals, in a fashion similar to the travelling salesman problem. The game has a demo mode, where the game uses a greedy algorithm to go to every crystal. The artificial intelligence does not account for obstacles, so the demo mode often ends quickly.
• Matching pursuit is an example of a greedy algorithm applied to signal approximation.
• A greedy algorithm finds the optimal solution to Malfatti's problem of finding three disjoint circles within a given triangle that maximize the total area of the circles; it is conjectured that the same greedy algorithm is optimal for any number of circles.
• A greedy algorithm is used to construct a Huffman tree during Huffman coding, where it finds an optimal solution.
• In decision tree learning, greedy algorithms are commonly used; however, they are not guaranteed to find the optimal solution.


Notes

• Introduction to Algorithms (Cormen, Leiserson, Rivest, and Stein) 2001, Chapter 16 "Greedy Algorithms".
• (G. Gutin, A. Yeo and A. Zverovich, 2002)

References

• Introduction to Algorithms (Cormen, Leiserson, and Rivest) 1990, Chapter 17 "Greedy Algorithms" p. 329.
• Introduction to Algorithms (Cormen, Leiserson, Rivest, and Stein) 2001, Chapter 16 "Greedy Algorithms".
• G. Gutin, A. Yeo and A. Zverovich, Traveling salesman should not be greedy: domination analysis of greedy-type heuristics for the TSP. Discrete Applied Mathematics 117 (2002), 81–86.
• J. Bang-Jensen, G. Gutin and A. Yeo, When the greedy algorithm fails. Discrete Optimization 1 (2004), 121–127.
• G. Bendall and F. Margot, Greedy Type Resistance of Combinatorial Problems, Discrete Optimization 3 (2006), 288–298.

External links

• Greedy algorithm visualization (http://yuval.bar-or.org/index.php?item=9) A visualization of a greedy solution to the N-Queens puzzle by Yuval Baror.
• Python greedy coin (http://www.oreillynet.com/onlamp/blog/2008/04/python_greedy_coin_changer_alg.html) example by Noah Gift.

Travelling salesman problem

The travelling salesman problem (TSP) asks the following question: Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city? It is an NP-hard problem in combinatorial optimization, important in operations research and theoretical computer science. The problem was first formulated in 1930 and is one of the most intensively studied problems in optimization. It is used as a benchmark for many optimization methods. Even though the problem is computationally difficult, a large number of heuristics and exact methods are known, so that some instances with tens of thousands of cities can be solved.

The TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as DNA sequencing. In these applications, the concept city represents, for example, customers, soldering points, or DNA fragments, and the concept distance represents travelling times or cost, or a similarity measure between DNA fragments. In many applications, additional constraints such as limited resources or time windows make the problem considerably harder. TSP is a special case of the travelling purchaser problem.

In the theory of computational complexity, the decision version of the TSP (where, given a length L, the task is to decide whether the graph has any tour shorter than L) belongs to the class of NP-complete problems. Thus, it is likely that the worst-case running time for any algorithm for the TSP increases superpolynomially (or perhaps exponentially) with the number of cities.


History

The origins of the travelling salesman problem are unclear. A handbook for travelling salesmen from 1832 mentions the problem and includes example tours through Germany and Switzerland, but contains no mathematical treatment.

The travelling salesman problem was defined in the 1800s by the Irish mathematician W. R. Hamilton and by the British mathematician Thomas Kirkman. Hamilton's Icosian Game was a recreational puzzle based on finding a Hamiltonian cycle. The general form of the TSP appears to have been first studied by mathematicians during the 1930s in Vienna and at Harvard, notably by Karl Menger, who defines the problem, considers the obvious brute-force algorithm, and observes the non-optimality of the nearest neighbour heuristic:

We denote by messenger problem (since in practice this question should be solved by each postman, anyway also by many travelers) the task to find, for finitely many points whose pairwise distances are known, the shortest route connecting the points. Of course, this problem is solvable by finitely many trials. Rules which would push the number of trials below the number of permutations of the given points, are not known. The rule that one first should go from the starting point to the closest point, then to the point closest to this, etc., in general does not yield the shortest route.

Hassler Whitney at Princeton University introduced the name travelling salesman problem soon after. In the 1950s and 1960s, the problem became increasingly popular in scientific circles in Europe and the USA. Notable contributions were made by George Dantzig, Delbert Ray Fulkerson and Selmer M. Johnson at the RAND Corporation in Santa Monica, who expressed the problem as an integer linear program and developed the cutting plane method for its solution. With these new methods they solved an instance with 49 cities to optimality by constructing a tour and proving that no other tour could be shorter.
In the following decades, the problem was studied by many researchers from mathematics, computer science, chemistry, physics, and other sciences. Richard M. Karp showed in 1972 that the Hamiltonian cycle problem was NP-complete, which implies the NP-hardness of TSP. This supplied a mathematical explanation for the apparent computational difficulty of finding optimal tours.

Great progress was made in the late 1970s and 1980s, when Grötschel, Padberg, Rinaldi and others managed to exactly solve instances with up to 2392 cities, using cutting planes and branch-and-bound. In the 1990s, Applegate, Bixby, Chvátal, and Cook developed the program Concorde that has been used in many recent record solutions. Gerhard Reinelt published the TSPLIB in 1991, a collection of benchmark instances of varying difficulty, which has been used by many research groups for comparing results. In 2006, Cook and others computed an optimal tour through an 85,900-city instance given by a microchip layout problem, currently the largest solved TSPLIB instance. For many other instances with millions of cities, solutions can be found that are guaranteed to be within 2–3% of an optimal tour.


Description

As a graph problem

TSP can be modelled as an undirected weighted graph, such that cities are the graph's vertices, paths are the graph's edges, and a path's distance is the edge's length. It is a minimization problem starting and finishing at a specified vertex after having visited each other vertex exactly once. Often, the model is a complete graph (i.e. each pair of vertices is connected by an edge). If no path exists between two cities, adding an arbitrarily long edge will complete the graph without affecting the optimal tour.

Asymmetric and symmetric

Symmetric TSP with four cities

In the symmetric TSP, the distance between two cities is the same in each opposite direction, forming an undirected graph. This symmetry halves the number of possible solutions. In the asymmetric TSP, paths may not exist in both directions or the distances might be different, forming a directed graph. Traffic collisions, one-way streets, and airfares for cities with different departure and arrival fees are examples of how this symmetry could break down.

Related problems

• An equivalent formulation in terms of graph theory is: Given a complete weighted graph (where the vertices would represent the cities, the edges would represent the roads, and the weights would be the cost or distance of that road), find a Hamiltonian cycle with the least weight.
• The requirement of returning to the starting city does not change the computational complexity of the problem; see Hamiltonian path problem.
• Another related problem is the bottleneck travelling salesman problem (bottleneck TSP): Find a Hamiltonian cycle in a weighted graph with the minimal weight of the weightiest edge. The problem is of considerable practical importance, apart from evident transportation and logistics areas. A classic example is in printed circuit manufacturing: scheduling of a route of the drill machine to drill holes in a PCB. In robotic machining or drilling applications, the "cities" are parts to machine or holes (of different sizes) to drill, and the "cost of travel" includes time for retooling the robot (single machine job sequencing problem).
• The generalized travelling salesman problem deals with "states" that have (one or more) "cities", and the salesman has to visit exactly one "city" from each "state". It is also known as the "travelling politician problem". One application is encountered in ordering a solution to the cutting stock problem in order to minimise knife changes. Another is concerned with drilling in semiconductor manufacturing; see, e.g., U.S. Patent 7,054,798. Surprisingly, Behzad and Modarres demonstrated that the generalised travelling salesman problem can be transformed into a standard travelling salesman problem with the same number of cities, but a modified distance matrix.
• The sequential ordering problem deals with the problem of visiting a set of cities where precedence relations between the cities exist.
• The travelling purchaser problem deals with a purchaser who is charged with purchasing a set of products. He can purchase these products in several cities, but at different prices, and not all cities offer the same products. The objective is to find a route between a subset of the cities which minimizes total cost (travel cost + purchasing cost) and which enables the purchase of all required products.


Integer linear programming formulation

TSP can be formulated as an integer linear program. Label the cities 0, ..., n. Let x_{ij} equal 1 if the path goes from city i to city j, and 0 otherwise; let u_i (for i = 1, ..., n) be artificial variables; and let c_{ij} be the distance from city i to city j. Then the integer linear programming problem can be written as

  \min \sum_{i=0}^{n} \sum_{j \ne i, j=0}^{n} c_{ij} x_{ij}

subject to

  \sum_{i=0, i \ne j}^{n} x_{ij} = 1,   j = 0, \ldots, n
  \sum_{j=0, j \ne i}^{n} x_{ij} = 1,   i = 0, \ldots, n
  u_i - u_j + n x_{ij} \le n - 1,   1 \le i \ne j \le n
  x_{ij} \in \{0, 1\},   u_i \in \mathbf{Z}

The first set of equalities requires that each city 0, ..., n be arrived at from exactly one other city, and the second set of equalities requires that from each city 1, ..., n there is a departure to exactly one other city. (These constraints together also imply that there is exactly one departure from city 0.) The last constraints enforce that there is only a single tour covering all cities, and not two or more disjointed tours that only collectively cover all cities. To prove this, it is shown below (1) that every feasible solution contains only one closed sequence of cities, and (2) that for every single tour covering all cities, there are values for the dummy variables u_i that satisfy the constraints.

To prove that every feasible solution contains only one closed sequence of cities, it suffices to show that every subtour in a feasible solution passes through city 0 (noting that the equalities ensure there can only be one such tour). For if we sum all the inequalities corresponding to x_{ij} = 1 for any subtour of k steps not passing through city 0, we obtain

  nk \le (n - 1) k,

which is a contradiction.

It now must be shown that for every single tour covering all cities, there are values for the dummy variables u_i that satisfy the constraints. Without loss of generality, define the tour as originating (and ending) at city 0. Choose u_i = t if city i is visited in step t (i, t = 1, 2, ..., n). Then

  u_i - u_j \le n - 1,

since u_i can be no greater than n and u_j can be no less than 1; hence the constraints are satisfied whenever x_{ij} = 0. For x_{ij} = 1, we have

  u_i - u_j + n x_{ij} = (t) - (t + 1) + n = n - 1,

satisfying the constraint.
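The role of the dummy variables can be checked numerically. The sketch below (plain Python, no solver; the helper name mtz_holds is made up) sets u_i to the visiting step of a concrete tour and confirms that the subtour-elimination inequalities u_i − u_j + n·x_ij ≤ n − 1 all hold.

```python
# Verify the subtour-elimination (Miller-Tucker-Zemlin style) inequalities
# for a concrete tour. Cities are labelled 0..n; the tour starts at city 0.

def mtz_holds(tour):
    """tour: a permutation of 0..n beginning with 0, e.g. [0, 2, 1, 3]."""
    n = len(tour) - 1  # cities 1..n get dummy variables
    # u_i = t if city i is visited in step t (t = 1..n)
    u = {city: step for step, city in enumerate(tour) if city != 0}
    # x_ij = 1 exactly for consecutive cities on the tour (wrapping back to 0)
    edges = set(zip(tour, tour[1:] + [tour[0]]))
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i == j:
                continue
            x = 1 if (i, j) in edges else 0
            if not u[i] - u[j] + n * x <= n - 1:
                return False
    return True

print(mtz_holds([0, 2, 1, 3]))  # True: a single tour satisfies the constraints
```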

Computing a solution

The traditional lines of attack for the NP-hard problems are the following:
• Devising algorithms for finding exact solutions (they will work reasonably fast only for small problem sizes).
• Devising "suboptimal" or heuristic algorithms, i.e., algorithms that deliver either seemingly or probably good solutions, but which could not be proved to be optimal.
• Finding special cases for the problem ("subproblems") for which either better or exact heuristics are possible.


Computational complexity

The problem has been shown to be NP-hard (more precisely, it is complete for the complexity class FPNP; see function problem), and the decision problem version ("given the costs and a number x, decide whether there is a round-trip route cheaper than x") is NP-complete. The bottleneck travelling salesman problem is also NP-hard. The problem remains NP-hard even for the case when the cities are in the plane with Euclidean distances, as well as in a number of other restrictive cases. Removing the condition of visiting each city "only once" does not remove the NP-hardness, since it is easily seen that in the planar case there is an optimal tour that visits each city only once (otherwise, by the triangle inequality, a shortcut that skips a repeated visit would not increase the tour length).

Complexity of approximation

In the general case, finding a shortest travelling salesman tour is NPO-complete. If the distance measure is a metric and symmetric, the problem becomes APX-complete and Christofides's algorithm approximates it within 1.5. If the distances are restricted to 1 and 2 (but still are a metric) the approximation ratio becomes 8/7. In the asymmetric, metric case, only logarithmic performance guarantees are known; the best current algorithm achieves performance ratio 0.814 log n, and it is an open question if a constant factor approximation exists. The corresponding maximization problem of finding the longest travelling salesman tour is approximable within 63/38. If the distance function is symmetric, the longest tour can be approximated within 4/3 by a deterministic algorithm and within by a randomised algorithm.

Exact algorithms

The most direct solution would be to try all permutations (ordered combinations) and see which one is cheapest (using brute-force search). The running time for this approach lies within a polynomial factor of O(n!), the factorial of the number of cities, so this solution becomes impractical even for only 20 cities. One of the earliest applications of dynamic programming is the Held–Karp algorithm, which solves the problem in time O(n^2 2^n).

Improving these time bounds seems to be difficult. For example, it has not been determined whether an exact algorithm for TSP that runs in time O(1.9999^n) exists.

Other approaches include:
• Various branch-and-bound algorithms, which can be used to process TSPs containing 40–60 cities.
• Progressive improvement algorithms, which use techniques reminiscent of linear programming. These work well for up to 200 cities.
• Implementations of branch-and-bound and problem-specific cut generation (branch-and-cut); this is the method of choice for solving large instances. This approach holds the current record, solving an instance with 85,900 cities, see Applegate et al. (2006).

An exact solution for 15,112 German towns from TSPLIB was found in 2001 using the cutting-plane method proposed by George Dantzig, Ray Fulkerson, and Selmer M. Johnson in 1954, based on linear programming. The computations were performed on a network of 110 processors located at Rice University and Princeton University (see the Princeton external link). The total computation time was equivalent to 22.6 years on a single 500 MHz Alpha processor. In May 2004, the travelling salesman problem of visiting all 24,978 towns in Sweden was solved: a tour of length approximately 72,500 kilometers was found and it was proven that no shorter tour exists. In March 2005, the travelling salesman problem of visiting all 33,810 points in a circuit board was solved using Concorde TSP Solver: a tour of length 66,048,945 units was found and it was proven that no shorter tour exists. The computation took approximately 15.7 CPU-years (Cook et al. 2006). In April 2006 an instance with 85,900 points was solved using Concorde TSP Solver, taking over 136 CPU-years, see Applegate et al. (2006).
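The Held–Karp dynamic program mentioned above can be sketched in Python (an illustrative sketch, not production code; the function name held_karp and the sample distance matrix are made up for the example).

```python
from itertools import combinations

def held_karp(dist):
    """Exact TSP in O(n^2 * 2^n) time via dynamic programming.
    dist[i][j] is the distance from city i to city j; tour starts/ends at 0."""
    n = len(dist)
    # C[(S, j)] = cost of the cheapest path that starts at 0, visits every
    # city in S (a frozenset excluding 0), and ends at j.
    C = {(frozenset([j]), j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for S in combinations(range(1, n), size):
            S = frozenset(S)
            for j in S:
                C[(S, j)] = min(C[(S - {j}, k)] + dist[k][j] for k in S - {j})
    full = frozenset(range(1, n))
    # close the tour by returning to city 0
    return min(C[(full, j)] + dist[j][0] for j in range(1, n))

dist = [[0, 1, 15, 6],
        [2, 0, 7, 3],
        [9, 6, 0, 12],
        [10, 4, 8, 0]]
print(held_karp(dist))  # 21, the length of the optimal tour 0-1-3-2-0
```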


Heuristic and approximation algorithms

Various heuristics and approximation algorithms, which quickly yield good solutions, have been devised. Modern methods can find solutions for extremely large problems (millions of cities) within a reasonable time which are, with high probability, just 2–3% away from the optimal solution. Several categories of heuristics are recognized.

Constructive heuristics

The nearest neighbour (NN) algorithm (or so-called greedy algorithm) lets the salesman choose the nearest unvisited city as his next move. This algorithm quickly yields an effectively short route. For N cities randomly distributed on a plane, the algorithm on average yields a path 25% longer than the shortest possible path. However, there exist many specially arranged city distributions which make the NN algorithm give the worst route (Gutin, Yeo, and Zverovich, 2002). This is true for both asymmetric and symmetric TSPs (Gutin and Yeo, 2007). Rosenkrantz et al. showed that the NN algorithm has the approximation factor Θ(log |V|) for instances satisfying the triangle inequality. A variation of the NN algorithm, called the Nearest Fragment (NF) operator, which connects a group (fragment) of nearest unvisited cities, can find a shorter route with successive iterations. The NF operator can also be applied to an initial solution obtained by the NN algorithm for further improvement in an elitist model, where only better solutions are accepted.

Constructions based on a minimum spanning tree have an approximation ratio of 2. The Christofides algorithm achieves a ratio of 1.5. The bitonic tour of a set of points is the minimum-perimeter monotone polygon that has the points as its vertices; it can be computed efficiently by dynamic programming.

Another constructive heuristic, Match Twice and Stitch (MTS) (Kahng, Reda 2004), performs two sequential matchings, where the second matching is executed after deleting all the edges of the first matching, to yield a set of cycles.
The cycles are then stitched to produce the final tour.

Iterative improvement

Pairwise exchange

The pairwise exchange or 2-opt technique involves iteratively removing two edges and replacing them with two different edges that reconnect the fragments created by edge removal into a new and shorter tour. This is a special case of the k-opt method. Note that the label Lin–Kernighan is an often heard misnomer for 2-opt; Lin–Kernighan is actually the more general k-opt method.

k-opt heuristic, or Lin–Kernighan heuristics

Take a given tour and delete k mutually disjoint edges. Reassemble the remaining fragments into a tour, leaving no disjoint subtours (that is, don't connect a fragment's endpoints together). This in effect simplifies the TSP under consideration into a much simpler problem. Each fragment endpoint can be connected to 2k − 2 other possibilities: of 2k total fragment endpoints available, the two endpoints of the fragment under consideration are disallowed. Such a constrained 2k-city TSP can then be solved with brute-force methods to find the least-cost recombination of the original fragments. The k-opt technique is a special case of the V-opt or variable-opt technique. The most popular of the k-opt methods are 3-opt, introduced by Shen Lin of Bell Labs in 1965. There is a special case of 3-opt where the edges are not disjoint (two of the edges are adjacent to one another). In practice, it is often possible to achieve substantial improvement over 2-opt without the combinatorial cost of the general 3-opt by restricting the 3-changes to this special subset where two of the removed edges are adjacent. This so-called two-and-a-half-opt typically falls roughly midway between 2-opt and 3-opt, both in terms of the quality of tours achieved and the time required to achieve those tours.

V-opt heuristic
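The nearest-neighbour construction and the pairwise-exchange (2-opt) improvement described in this section can be sketched together in Python (an illustrative sketch on a small made-up symmetric distance matrix).

```python
def nearest_neighbour(dist, start=0):
    """Greedy tour construction: repeatedly visit the closest unvisited city."""
    n = len(dist)
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        last = tour[-1]
        tour.append(min(unvisited, key=lambda c: dist[last][c]))
        unvisited.remove(tour[-1])
    return tour

def tour_length(dist, tour):
    # closed tour: include the edge from the last city back to the first
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def two_opt(dist, tour):
    """Improve a tour: reverse segments while any exchange shortens it."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour) + 1):
                cand = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_length(dist, cand) < tour_length(dist, tour):
                    tour, improved = cand, True
    return tour

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
tour = two_opt(dist, nearest_neighbour(dist))
print(tour, tour_length(dist, tour))  # [0, 1, 3, 2] 18
```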


The variable-opt method is related to, and a generalization of, the k-opt method. Whereas the k-opt methods remove a fixed number (k) of edges from the original tour, the variable-opt methods do not fix the size of the edge set to remove. Instead they grow the set as the search process continues. The best known method in this family is the Lin–Kernighan method (mentioned above as a misnomer for 2-opt). Shen Lin and Brian Kernighan first published their method in 1972, and it was the most reliable heuristic for solving travelling salesman problems for nearly two decades. More advanced variable-opt methods were developed at Bell Labs in the late 1980s by David Johnson and his research team. These methods (sometimes called Lin–Kernighan–Johnson) build on the Lin–Kernighan method, adding ideas from tabu search and evolutionary computing. The basic Lin–Kernighan technique gives results that are guaranteed to be at least 3-opt. The Lin–Kernighan–Johnson methods compute a Lin–Kernighan tour and then perturb it by what has been described as a mutation that removes at least four edges and reconnects the tour in a different way, then v-opts the new tour. The mutation is often enough to move the tour out of the local minimum identified by Lin–Kernighan. V-opt methods are widely considered the most powerful heuristics for the problem, and are able to address special cases, such as the Hamilton Cycle Problem and other non-metric TSPs that other heuristics fail on. For many years Lin–Kernighan–Johnson had identified optimal solutions for all TSPs where an optimal solution was known and had identified the best known solutions for all other TSPs on which the method had been tried.

Randomised improvement

Optimized Markov chain algorithms which use local searching heuristic sub-algorithms can find a route extremely close to the optimal route for 700 to 800 cities.
TSP is a touchstone for many general heuristics devised for combinatorial optimization, such as genetic algorithms, simulated annealing, tabu search, ant colony optimization, river formation dynamics (see swarm intelligence) and the cross entropy method.

Ant colony optimization

Artificial intelligence researcher Marco Dorigo described in 1997 a method of heuristically generating "good solutions" to the TSP using a simulation of an ant colony called ACS (Ant Colony System). It models behavior observed in real ants to find short paths between food sources and their nest, an emergent behaviour resulting from each ant's preference to follow trail pheromones deposited by other ants. ACS sends out a large number of virtual ant agents to explore many possible routes on the map. Each ant probabilistically chooses the next city to visit based on a heuristic combining the distance to the city and the amount of virtual pheromone deposited on the edge to the city. The ants explore, depositing pheromone on each edge that they cross, until they have all completed a tour. At this point the ant which completed the shortest tour deposits virtual pheromone along its complete tour route (global trail updating). The amount of pheromone deposited is inversely proportional to the tour length: the shorter the tour, the more it deposits.
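A toy version of the ant-colony scheme described above can be sketched in Python. This is a simplified sketch, not Dorigo's ACS: the parameter names and values are illustrative, ACS's local pheromone update is omitted, and only evaporation plus a global deposit along the best tour so far is modelled.

```python
import random

def ant_colony_tsp(dist, n_ants=10, n_iter=50, alpha=1.0, beta=2.0,
                   rho=0.5, q=1.0, seed=42):
    """Toy ant-colony heuristic: ants build tours guided by pheromone (tau)
    and inverse distance; the best tour found deposits extra pheromone."""
    random.seed(seed)
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]  # pheromone on each edge

    def length(tour):
        return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

    def build_tour():
        tour, unvisited = [0], set(range(1, n))
        while unvisited:
            i = tour[-1]
            cand = sorted(unvisited)
            weights = [tau[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                       for j in cand]
            j = random.choices(cand, weights=weights)[0]
            tour.append(j)
            unvisited.remove(j)
        return tour

    best = build_tour()
    for _ in range(n_iter):
        tours = [build_tour() for _ in range(n_ants)]
        best = min(tours + [best], key=length)
        # evaporation, then a global deposit along the best tour so far,
        # inversely proportional to its length
        tau = [[(1 - rho) * t for t in row] for row in tau]
        for a, b in zip(best, best[1:] + best[:1]):
            tau[a][b] += q / length(best)
            tau[b][a] += q / length(best)
    return best, length(best)

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
tour, L = ant_colony_tsp(dist)
print(tour, L)
```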


Ant Colony Optimization Algorithm for a TSP with 7 cities: Red and thick lines in the pheromone map indicate presence of more pheromone

Special cases

Metric TSP

In the metric TSP, also known as delta-TSP or Δ-TSP, the intercity distances satisfy the triangle inequality. This is a very natural restriction: the direct connection from A to B is never farther than the route via an intermediate C,

  d(A, B) \le d(A, C) + d(C, B).

The edge lengths then define a metric on the set of vertices. When the cities are viewed as points in the plane, many natural distance functions are metrics, and so many natural instances of TSP satisfy this constraint. The following are some examples of metric TSPs for various metrics.
• In the Euclidean TSP (see below) the distance between two cities is the Euclidean distance between the corresponding points.
• In the rectilinear TSP the distance between two cities is the sum of the differences of their x- and y-coordinates. This metric is often called the Manhattan distance or city-block metric.
• In the maximum metric, the distance between two points is the maximum of the absolute values of differences of their x- and y-coordinates.

The last two metrics appear, for example, in routing a machine that drills a given set of holes in a printed circuit board. The Manhattan metric corresponds to a machine that adjusts first one co-ordinate, and then the other, so the time to move to a new point is the sum of both movements. The maximum metric corresponds to a machine that adjusts both co-ordinates simultaneously, so the time to move to a new point is the slower of the two movements.


In its definition, the TSP does not allow cities to be visited twice, but many applications do not need this constraint. In such cases, a symmetric, non-metric instance can be reduced to a metric one: the original graph is replaced with a complete graph in which the inter-city distance d(u, v) is the length of the shortest path between u and v in the original graph.

The length of the minimum spanning tree of the network is a natural lower bound for the length of the optimal route, because deleting any edge of the optimal route yields a Hamiltonian path, which is a spanning tree. In the TSP with triangle inequality it is possible to prove upper bounds in terms of the minimum spanning tree and design an algorithm that has a provable upper bound on the length of the route. The first published (and the simplest) example follows:
1. Construct a minimum spanning tree T for the graph G.
2. Duplicate all edges of T. That is, wherever there is an edge from u to v, add a second edge from v to u. This gives us an Eulerian graph H.
3. Find an Eulerian circuit in H. Clearly, its length is twice the length of the tree.
4. Convert the Eulerian circuit into a Hamiltonian cycle in the following way: walk along the circuit, and each time you are about to come into an already visited vertex, skip it and go directly to the next unvisited one.
It is easy to prove that the last step works. Moreover, thanks to the triangle inequality, each skipping at Step 4 is in fact a shortcut; i.e., the length of the cycle does not increase. Hence it gives us a TSP tour no more than twice as long as the optimal one. The Christofides algorithm follows a similar outline but combines the minimum spanning tree with a solution of another problem, minimum-weight perfect matching. This gives a TSP tour which is at most 1.5 times the optimal. The Christofides algorithm was one of the first approximation algorithms, and was in part responsible for drawing attention to approximation algorithms as a practical approach to intractable problems. As a matter of fact, the term "algorithm" was not commonly extended to approximation algorithms until later; the Christofides algorithm was initially referred to as the Christofides heuristic.

Euclidean TSP

The Euclidean TSP, or planar TSP, is the TSP with the distance being the ordinary Euclidean distance. The Euclidean TSP is a particular case of the metric TSP, since distances in a plane obey the triangle inequality. Like the general TSP, the Euclidean TSP is NP-hard. With discretized metric (distances rounded up to an integer), the problem is NP-complete. However, in some respects it seems to be easier than the general metric TSP. For example, the minimum spanning tree of the graph associated with an instance of the Euclidean TSP is a Euclidean minimum spanning tree, and so can be computed in expected O(n log n) time for n points (considerably less than the number of edges). This enables the simple 2-approximation algorithm for TSP with triangle inequality above to operate more quickly. In general, for any c > 0, where d is the number of dimensions in the Euclidean space, there is a polynomial-time algorithm that finds a tour of length at most (1 + 1/c) times the optimal for geometric instances of TSP; this is called a polynomial-time approximation scheme (PTAS).
Sanjeev Arora and Joseph S. B. Mitchell were awarded the Gödel Prize in 2010 for their concurrent discovery of a PTAS for the Euclidean TSP. In practice, heuristics with weaker guarantees continue to be used.
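The tree-doubling 2-approximation described above (steps 1–4; the shortcut pass is equivalent to a depth-first preorder of the minimum spanning tree) can be sketched in Python. The point set below is a made-up rectangle, so the distances are Euclidean and hence metric.

```python
import math  # math.dist requires Python 3.8+

def double_tree_tour(dist):
    """2-approximation for metric TSP: build a minimum spanning tree, then
    shortcut its Euler tour (equivalently, take a DFS preorder of the tree)."""
    n = len(dist)
    # Prim's algorithm for the minimum spanning tree, rooted at city 0
    parent = [0] * n
    key = [float('inf')] * n
    key[0] = 0
    in_tree = [False] * n
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: key[v])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and dist[u][v] < key[v]:
                key[v], parent[v] = dist[u][v], u
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[parent[v]].append(v)
    # DFS preorder = Euler tour of the doubled tree with repeated visits skipped
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

pts = [(0, 0), (0, 2), (3, 2), (3, 0)]  # corners of a rectangle
dist = [[math.dist(p, q) for q in pts] for p in pts]
print(double_tree_tour(dist))  # [0, 1, 2, 3]: at most twice the optimal length
```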


Asymmetric TSP

In most cases, the distance between two nodes in the TSP network is the same in both directions. The case where the distance from A to B is not equal to the distance from B to A is called asymmetric TSP. A practical application of an asymmetric TSP is route optimisation using street-level routing (which is made asymmetric by one-way streets, slip-roads, motorways, etc.).

Solving by conversion to symmetric TSP

Solving an asymmetric TSP graph can be somewhat complex. The following is a 3×3 matrix containing all possible path weights between the nodes A, B and C. One option is to turn an asymmetric matrix of size N into a symmetric matrix of size 2N.

Asymmetric path weights:

        A    B    C
   A         1    2
   B    6         3
   C    5    4

To double the size, each of the nodes in the graph is duplicated, creating a second ghost node. Using duplicate points with very low weights, such as −∞, provides a cheap route "linking" back to the real node and allowing symmetric evaluation to continue. The original 3×3 matrix shown above is visible in the bottom left, and the transpose of the original in the top right. Both copies of the matrix have had their diagonals replaced by the low-cost hop paths, represented by −∞.

Symmetric path weights:

        A    B    C    A′   B′   C′
   A                   −∞   6    5
   B                   1    −∞   4
   C                   2    3    −∞
   A′   −∞   1    2
   B′   6    −∞   3
   C′   5    4    −∞

The original 3×3 matrix would produce two Hamiltonian cycles (a path that visits every node once), namely A-B-C-A [score 9] and A-C-B-A [score 12]. Evaluating the 6×6 symmetric version of the same problem now produces many paths, including A-A′-B-B′-C-C′-A, A-B′-C-A′-A, A-A′-B-C′-A [all score 9 − ∞]. The important thing about each new sequence is that there will be an alternation between dashed (A′, B′, C′) and un-dashed (A, B, C) nodes, and that the link to "jump" between any related pair (A-A′) is effectively free. A version of the algorithm could use any weight for the A-A′ path, as long as that weight is lower than all other path weights present in the graph. As the path weight to "jump" must effectively be "free", the value zero (0) could be used to represent this cost, if zero is not being used for another purpose already (such as designating invalid paths). In the two tables above, non-existent paths between nodes are shown as blank squares.
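The doubling construction above can be sketched in Python. As the text suggests, 0 is used here as the "free" node-to-ghost link weight instead of −∞, and a large constant stands in for the blank (non-existent) paths; the function name to_symmetric is illustrative.

```python
INF = 10**6  # stands in for the blank (non-existent) paths

def to_symmetric(w):
    """Turn an asymmetric N x N weight matrix into a symmetric 2N x 2N one.
    Node i's ghost is node N + i; the i <-> ghost(i) link costs 0."""
    n = len(w)
    m = [[INF] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        m[i][n + i] = m[n + i][i] = 0  # free hop between node and its ghost
        for j in range(n):
            if i != j:
                # original weight w[i][j]: ghost(i) <-> j (bottom-left block
                # holds the original, top-right its transpose)
                m[n + i][j] = m[j][n + i] = w[i][j]
    return m

w = [[0, 1, 2],   # A -> B = 1, A -> C = 2
     [6, 0, 3],   # B -> A = 6, B -> C = 3
     [5, 4, 0]]   # C -> A = 5, C -> B = 4
m = to_symmetric(w)
for row in m:
    print(row)
```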


Benchmarks

For benchmarking of TSP algorithms, TSPLIB, a library of sample instances of the TSP and related problems, is maintained; see the TSPLIB external reference. Many of them are lists of actual cities and layouts of actual printed circuits.

Human performance on TSP The TSP, in particular the Euclidean variant of the problem, has attracted the attention of researchers in cognitive psychology. It is observed that humans are able to produce good quality solutions quickly. The first issue of the Journal of Problem Solving  is devoted to the topic of human performance on TSP.

TSP path length for random pointset in a square

Suppose N points are distributed uniformly at random in a 1×1 square, with N ≫ 1. Consider many such squares, and suppose we want to know the average length of the shortest tour (i.e. the TSP solution) over these squares.

Lower bound
• A lower bound is obtained by assuming i to be a point in the tour sequence and taking the nearest neighbor of i as its successor in the path.
• A better lower bound is obtained by assuming that i's successor is its nearest neighbor and i's predecessor is its second-nearest neighbor.
• An even better lower bound is obtained by dividing the path sequence into two parts, before_i and behind_i, each containing N/2 points, and then deleting the before_i part to form a diluted pointset (see discussion).
• David S. Johnson obtained a lower bound by computer experiment, in which the additive term 0.522 comes from the points near the square boundary, which have fewer neighbors.
• Christine L. Valenzuela and Antonia J. Jones obtained another lower bound by computer experiment.
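The square-root growth of the tour length can be illustrated empirically. The sketch below (Python; the nearest-neighbor heuristic stands in for the true optimum, so the constant it exhibits sits somewhat above the bounds quoted here) shows that tour length divided by √N stays roughly constant as N grows:

```python
import math
import random


def nn_tour_length(points):
    """Length of a nearest-neighbor tour: a cheap upper bound on the
    optimal tour, good enough to exhibit the sqrt(N) growth."""
    unvisited = points[1:]
    current = points[0]
    total = 0.0
    while unvisited:
        # Greedily hop to the closest remaining point.
        nxt = min(unvisited, key=lambda p: math.dist(current, p))
        total += math.dist(current, nxt)
        unvisited.remove(nxt)
        current = nxt
    return total + math.dist(current, points[0])  # close the cycle


rng = random.Random(42)
for n in (100, 400, 1600):
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    print(n, round(nn_tour_length(pts) / math.sqrt(n), 3))
# The printed ratio stays roughly constant while n grows 16-fold,
# illustrating the sqrt(N) scaling of the tour length.
```

Averaging this ratio over many random squares, as the passage describes, would give an empirical estimate of the constant for this heuristic.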

Analyst's travelling salesman problem There is an analogous problem in geometric measure theory which asks the following: under what conditions may a subset E of Euclidean space be contained in a rectifiable curve (that is, when is there a curve with finite length that visits every point in E)? This problem is known as the analyst's travelling salesman problem or the geometric travelling salesman problem.

Free software for solving TSP


Name (alphabetically)     | License       | API language    | Brief info
Concorde                  |               | only executable | requires a linear solver installation for its MILP subproblem
DynOpt                    | ?             | C               | an ANSI C implementation of a dynamic programming based algorithm developed by Balas and Simonetti, approximate solution
LKH                       | research only | C               | an effective implementation of the Lin-Kernighan heuristic for the Euclidean travelling salesman problem
OpenOpt                   | BSD           | Python          | exact and approximate solvers, STSP / ATSP, can handle multigraphs, constraints, multiobjective problems; see its TSP page for details and examples
R TSP package             | GPL           | R               | infrastructure and solvers for STSP / ATSP, interface to Concorde
TSP Solver and Generator  | GPL           | C++             | branch and bound algorithm
TSPGA                     | ?             | C               | approximate solution of the STSP using the "pgapack" package

Popular culture

Travelling Salesman, by director Timothy Lanzone, is the story of four mathematicians hired by the U.S. government to solve the most elusive problem in computer-science history: P vs. NP.

Notes

• "Der Handlungsreisende – wie er sein soll und was er zu tun hat, um Aufträge zu erhalten und eines glücklichen Erfolgs in seinen Geschäften gewiß zu sein – von einem alten Commis-Voyageur" (The traveling salesman – how he must be and what he should do in order to get commissions and be sure of happy success in his business – by an old commis-voyageur)
• A discussion of the early work of Hamilton and Kirkman can be found in Graph Theory 1736–1936.
• Cited and English translation in . Original German: "Wir bezeichnen als Botenproblem (weil diese Frage in der Praxis von jedem Postboten, übrigens auch von vielen Reisenden zu lösen ist) die Aufgabe, für endlich viele Punkte, deren paarweise Abstände bekannt sind, den kürzesten die Punkte verbindenden Weg zu finden. Dieses Problem ist natürlich stets durch endlich viele Versuche lösbar. Regeln, welche die Anzahl der Versuche unter die Anzahl der Permutationen der gegebenen Punkte herunterdrücken würden, sind nicht bekannt. Die Regel, man solle vom Ausgangspunkt erst zum nächstgelegenen Punkt, dann zu dem diesem nächstgelegenen Punkt gehen usw., liefert im allgemeinen nicht den kürzesten Weg." (English: "We denote as the messenger problem (since in practice this question must be solved by every postman, and also by many travellers) the task of finding, for finitely many points whose pairwise distances are known, the shortest route connecting the points. This problem is of course always solvable by finitely many trials. Rules that would push the number of trials below the number of permutations of the given points are not known. The rule that one should first go from the starting point to the nearest point, then to the point nearest that one, and so on, does not in general yield the shortest route.")
• A detailed treatment of the connection between Menger and Whitney, as well as the growth in the study of the TSP, can be found in Alexander Schrijver's 2005 paper "On the history of combinatorial optimization (till 1960)", Handbook of Discrete Optimization (K. Aardal, G. L. Nemhauser, R. Weismantel, eds.), Elsevier, Amsterdam, 2005, pp. 1–68. PS (http://homepages.cwi.nl/~lex/files/histco.ps), PDF (http://homepages.cwi.nl/~lex/files/histco.pdf)
• http://www.google.com/patents/US7054798
• , pp. 308–309.
• Tucker, A. W. (1960), "On Directed Graphs and Integer Programs", IBM Mathematical Research Project (Princeton University)
• Dantzig, George B. (1963), Linear Programming and Extensions, Princeton, NJ: Princeton UP, pp. 545–7, ISBN 0-691-08000-3, sixth printing, 1974.
• Berman & Karpinski (2006).
• , ,
• Work by David Applegate, AT&T Labs – Research, Robert Bixby, ILOG and Rice University, Vašek Chvátal, Concordia University, William Cook, University of Waterloo, and Keld Helsgaun, Roskilde University, is discussed on their project web page, hosted by the University of Waterloo and last updated in June 2004, here (http://www.math.uwaterloo.ca/tsp/sweden/)
• Johnson, D. S. and McGeoch, L. A., "The traveling salesman problem: A case study in local optimization", Local Search in Combinatorial Optimization, 1997, pp. 215–310
• S. S. Ray, S. Bandyopadhyay and S. K. Pal, "Genetic Operators for Combinatorial Optimization in TSP and Microarray Gene Ordering", Applied Intelligence, 2007, 26(3), pp. 183–195.
• A. B. Kahng and S. Reda, "Match Twice and Stitch: A New TSP Tour Construction Heuristic", Operations Research Letters, 2004, 32(6), pp. 499–509. http://dx.doi.org/10.1016/j.orl.2004.04.001
• Marco Dorigo, "Ant Colonies for the Traveling Salesman Problem", IRIDIA, Université Libre de Bruxelles; IEEE Transactions on Evolutionary Computation, 1(1):53–66, 1997. http://citeseer.ist.psu.edu/86357.html
• Papadimitriou (1977).
• Arora (1998).
• Roy Jonker and Ton Volgenant, "Transforming asymmetric into symmetric traveling salesman problems", Operations Research Letters 2:161–163, 1983.
• http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/
• http://docs.lib.purdue.edu/jps/
• David S. Johnson (http://www.research.att.com/~dsj/papers/HKsoda.pdf)
• Christine L. Valenzuela and Antonia J. Jones (http://users.cs.cf.ac.uk/Antonia.J.Jones/Papers/EJORHeldKarp/HeldKarp.pdf)
• http://www.math.uwaterloo.ca/tsp/concorde.html
• http://www.andrew.cmu.edu/user/neils/tsp/
• http://www.akira.ruc.dk/~keld/research/LKH/
• http://openopt.org/TSP
• http://tsp.r-forge.r-project.org/
• http://tspsg.info/
• http://www.rz.uni-karlsruhe.de/~lh71/

References

• Applegate, D. L.; Bixby, R. M.; Chvátal, V.; Cook, W. J. (2006), The Traveling Salesman Problem, ISBN 0-691-12993-2.
• Arora, Sanjeev (1998), "Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems", Journal of the ACM 45 (5): 753–782, doi:10.1145/290179.290180, MR 1668147.
• Bellman, R. (1960), "Combinatorial Processes and Dynamic Programming", in Bellman, R.; Hall, M., Jr. (eds.), Combinatorial Analysis, Proceedings of Symposia in Applied Mathematics 10, American Mathematical Society, pp. 217–249.
• Bellman, R. (1962), "Dynamic Programming Treatment of the Travelling Salesman Problem", J. Assoc. Comput. Mach. 9: 61–63, doi:10.1145/321105.321111.
• Berman, Piotr; Karpinski, Marek (2006), "8/7-approximation algorithm for (1,2)-TSP", Proc. 17th ACM-SIAM Symposium on Discrete Algorithms (SODA '06), pp. 641–648, doi:10.1145/1109557.1109627, ISBN 0898716055, ECCC TR05-069 (http://eccc.uni-trier.de/report/2005/069/).
• Christofides, N. (1976), Worst-case analysis of a new heuristic for the travelling salesman problem, Technical Report 388, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh.
• Hassin, R.; Rubinstein, S. (2000), "Better approximations for max TSP", Information Processing Letters 75 (4): 181–186, doi:10.1016/S0020-0190(00)00097-1.
• Held, M.; Karp, R. M. (1962), "A Dynamic Programming Approach to Sequencing Problems", Journal of the Society for Industrial and Applied Mathematics 10 (1): 196–210, doi:10.1137/0110015.
• Kaplan, H.; Lewenstein, L.; Shafrir, N.; Sviridenko, M. (2004), "Approximation Algorithms for Asymmetric TSP by Decomposing Directed Regular Multigraphs", Proc. 44th IEEE Symp. on Foundations of Comput. Sci., pp. 56–65.
• Kosaraju, S. R.; Park, J. K.; Stein, C. (1994), "Long tours and short superstrings", Proc. 35th Ann. IEEE Symp. on Foundations of Comput. Sci., IEEE Computer Society, pp. 166–177.
• Orponen, P.; Mannila, H. (1987), "On approximation preserving reductions: Complete problems and robust measures", Technical Report C-1987-28, Department of Computer Science, University of Helsinki.
• Padberg, M.; Rinaldi, G. (1991), "A Branch-and-Cut Algorithm for the Resolution of Large-Scale Symmetric Traveling Salesman Problems", SIAM Review: 60–100, doi:10.1137/1033004.


• Papadimitriou, Christos H. (1977), "The Euclidean traveling salesman problem is NP-complete", Theoretical Computer Science 4 (3): 237–244, doi:10.1016/0304-3975(77)90012-3, MR 0455550.
• Papadimitriou, C. H.; Yannakakis, M. (1993), "The traveling salesman problem with distances one and two", Math. Oper. Res. 18: 1–11, doi:10.1287/moor.18.1.1.
• Serdyukov, A. I. (1984), "An algorithm with an estimate for the traveling salesman problem of the maximum", Upravlyaemye Sistemy 25: 80–86.
• Woeginger, G. J. (2003), "Exact Algorithms for NP-Hard Problems: A Survey", Combinatorial Optimization – Eureka, You Shrink!, Lecture Notes in Computer Science, vol. 2570, Springer, pp. 185–207.

Further reading

• Adleman, Leonard (1994), "Molecular Computation of Solutions To Combinatorial Problems" (http://www.usc.edu/dept/molecular-science/papers/fp-sci94.pdf), Science 266 (5187): 1021–4, Bibcode:1994Sci...266.1021A, doi:10.1126/science.7973651, PMID 7973651.
• Arora, S. (1998), "Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems" (http://graphics.stanford.edu/courses/cs468-06-winter/Papers/arora-tsp.pdf), Journal of the ACM 45 (5): 753–782, doi:10.1145/290179.290180.
• Babin, Gilbert; Deneault, Stéphanie; Laportey, Gilbert (2005), Improvements to the Or-opt Heuristic for the Symmetric Traveling Salesman Problem (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.89.9953), Cahiers du GERAD, G-2005-02, Montreal: Group for Research in Decision Analysis.
• Cook, William (2011), In Pursuit of the Travelling Salesman: Mathematics at the Limits of Computation, Princeton University Press, ISBN 978-0-691-15270-7.
• Cook, William; Espinoza, Daniel; Goycoolea, Marcos (2007), "Computing with domino-parity inequalities for the TSP", INFORMS Journal on Computing 19 (3): 356–365, doi:10.1287/ijoc.1060.0204.
• Cormen, T. H.; Leiserson, C. E.; Rivest, R. L.; Stein, C. (2001), "35.2: The traveling-salesman problem", Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill, pp. 1027–1033, ISBN 0-262-03293-7.
• Dantzig, G. B.; Fulkerson, R.; Johnson, S. M. (1954), "Solution of a large-scale traveling salesman problem", Operations Research 2 (4): 393–410, doi:10.1287/opre.2.4.393, JSTOR 166695.
• Garey, M. R.; Johnson, D. S. (1979), "A2.3: ND22–24", Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, pp. 211–212, ISBN 0-7167-1045-5.
• Goldberg, D. E. (1989), Genetic Algorithms in Search, Optimization & Machine Learning, Reading: Addison-Wesley, Bibcode:1989gaso.book.....G, ISBN 0-201-15767-5.
• Gutin, G.; Yeo, A.; Zverovich, A. (2002), "Traveling salesman should not be greedy: domination analysis of greedy-type heuristics for the TSP", Discrete Applied Mathematics 117 (1–3): 81–86, doi:10.1016/S0166-218X(01)00195-0.
• Gutin, G.; Punnen, A. P. (2006), The Traveling Salesman Problem and Its Variations, Springer, ISBN 0-387-44459-9.
• Johnson, D. S.; McGeoch, L. A. (1997), "The Traveling Salesman Problem: A Case Study in Local Optimization", in Aarts, E. H. L.; Lenstra, J. K., Local Search in Combinatorial Optimisation, John Wiley and Sons Ltd, pp. 215–310.
• Lawler, E. L.; Lenstra, J. K.; Rinnooy Kan, A. H. G.; Shmoys, D. B. (1985), The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization, John Wiley & Sons, ISBN 0-471-90413-9.


• MacGregor, J. N.; Ormerod, T. (1996), "Human performance on the traveling salesman problem" (http://www.psych.lancs.ac.uk/people/uploads/TomOrmerod20030716T112601.pdf), Perception & Psychophysics 58 (4): 527–539, doi:10.3758/BF03213088.
• Mitchell, J. S. B. (1999), "Guillotine subdivisions approximate polygonal subdivisions: A simple polynomial-time approximation scheme for geometric TSP, k-MST, and related problems" (http://citeseer.ist.psu.edu/622594.html), SIAM Journal on Computing 28 (4): 1298–1309, doi:10.1137/S0097539796309764.
• Rao, S.; Smith, W. (1998), "Approximating geometrical graphs via 'spanners' and 'banyans'", Proc. 30th Annual ACM Symposium on Theory of Computing, pp. 540–550.
• Rosenkrantz, Daniel J.; Stearns, Richard E.; Lewis, Philip M., II (1977), "An Analysis of Several Heuristics for the Traveling Salesman Problem", SIAM Journal on Computing 6 (5): 563–581, doi:10.1137/0206041.
• Vickers, D.; Butavicius, M.; Lee, M.; Medvedev, A. (2001), "Human performance on visually presented traveling salesman problems", Psychological Research 65 (1): 34–45, doi:10.1007/s004260000031, PMID 11505612.
• Walshaw, Chris (2000), A Multilevel Approach to the Travelling Salesman Problem, CMS Press.
• Walshaw, Chris (2001), A Multilevel Lin-Kernighan-Helsgaun Algorithm for the Travelling Salesman Problem, CMS Press.

External links

• Traveling Salesman Problem (http://www.math.uwaterloo.ca/tsp/index.html) at the University of Waterloo
• TSPLIB (http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/) at the University of Heidelberg
• Traveling Salesman Problem (http://demonstrations.wolfram.com/TravelingSalesmanProblem/) by Jon McLoone at the Wolfram Demonstrations Project
• Source code library for the travelling salesman problem (http://www.adaptivebox.net/CILib/code/tspcodes_link.html)
• TSP solvers in R (http://tsp.r-forge.r-project.org/) for symmetric and asymmetric TSPs; implements various insertion, nearest-neighbor and 2-opt heuristics and an interface to the Concorde and Chained Lin-Kernighan heuristics.
• Traveling Salesman movie (on IMDb) (http://www.imdb.com/title/tt1801123/)
• "Traveling Salesman in Python and Linear Optimization" (http://www.ibm.com/developerworks/cloud/library/cl-optimizepythoncloud1/index.html), IBM developerWorks, with source code, by Noah Gift

