Algorithms and Data Structures


Algorithms and Data Structures Part 1: Introduction (Wikipedia Book 2014)

By Wikipedians

Editors: Reiner Creutzburg, Jenny Knackmuß

PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sun, 22 Dec 2013 11:59:19 UTC

Contents

Articles
Computer
Informatics (academic field)
Programming language
Algorithm
Deterministic algorithm
Data structure
List (abstract data type)
Array data structure
FIFO
Queue (abstract data type)
LIFO
Stack (abstract data type)
Computer program

References
Article Sources and Contributors
Image Sources, Licenses and Contributors

Article Licenses
License

Computer


A computer is a general-purpose device that can be programmed to carry out a set of arithmetic or logical operations. Since a sequence of operations can be readily changed, the computer can solve more than one kind of problem. Conventionally, a computer consists of at least one processing element, typically a central processing unit (CPU), and some form of memory. The processing element carries out arithmetic and logic operations, and a sequencing and control unit can change the order of operations based on stored information. Peripheral devices allow information to be retrieved from an external source, and the results of operations to be saved and retrieved.

In World War II, mechanical analog computers were used for specialized military applications. During this time the first electronic digital computers were developed. Originally they were the size of a large room, consuming as much power as several hundred modern personal computers (PCs).[1] Modern computers based on integrated circuits are millions to billions of times more capable than the early machines, and occupy a fraction of the space.[2] Simple computers are small enough to fit into mobile devices, and mobile computers can be powered by small batteries.

Personal computers in their various forms are icons of the Information Age and are what most people think of as “computers.” However, the embedded computers found in many devices, from MP3 players to fighter aircraft and from toys to industrial robots, are the most numerous.

History of computing

Etymology

The first recorded use of the word “computer” was in 1613, in a book called “The yong mans gleanings” by the English writer Richard Braithwait: “I haue read the truest computer of Times, and the best Arithmetician that euer breathed, and he reduceth thy dayes into a short number.” It referred to a person who carried out calculations, or computations, and the word continued with the same meaning until the middle of the 20th century. From the end of the 19th century the word began to take on its more familiar meaning, a machine that carries out computations.

Mechanical aids to computing

The history of the modern computer begins with two separate technologies, automated calculation and programmability. However, no single device can be identified as the earliest computer, partly because of the inconsistent application of that term.[3] A few precursors are worth mentioning, though. Some mechanical aids to computing were very successful and survived for centuries until the advent of the electronic calculator: the Sumerian abacus, designed around 2500 BC,[4] a descendant of which won a speed competition against a contemporary desk calculating machine in Japan in 1946;[5] the slide rule, invented in the 1620s, which was carried on five Apollo space missions, including to the Moon;[6] and arguably the astrolabe and the Antikythera mechanism, an ancient astronomical analog computer built by the Greeks around 80 BC.

The Jacquard loom, on display at the Museum of Science and Industry in Manchester, England, was one of the first programmable devices.

The Greek mathematician Hero of Alexandria (c. 10–70 AD) built a mechanical theater which performed a play lasting 10 minutes and was operated by a complex system of ropes and drums that might be considered to be a means of deciding which parts of the mechanism performed which actions and when. This is the essence of programmability.

Mechanical calculators and programmable looms

Blaise Pascal invented the mechanical calculator in 1642,[9] known as Pascal's calculator. It was the first machine to better human performance of arithmetical computations[10] and would turn out to be the only functional mechanical calculator in the 17th century.[11] Two hundred years later, in 1851, Thomas de Colmar released, after thirty years of development, his simplified arithmometer; it became the first machine to be commercialized because it was strong enough and reliable enough to be used daily in an office environment.

The mechanical calculator was at the root of the development of computers in two separate ways. First, it was in trying to develop more powerful and more flexible calculators[12] that the computer was first theorized by Charles Babbage[13][14] and then developed.[15] Second, development of a low-cost electronic calculator, successor to the mechanical calculator, resulted in the development by Intel of the first commercially available microprocessor integrated circuit.

In 1801, Joseph Marie Jacquard made an improvement to the textile loom by introducing a series of punched paper cards as a template which allowed his loom to weave intricate patterns automatically. The resulting Jacquard loom was an important step in the development of computers because the use of punched cards to define woven patterns can be viewed as an early, albeit limited, form of programmability.

First use of punched paper cards in computing

The Most Famous Image in the Early History of Computing (source: From cave paintings to the internet, HistoryofScience.com). This portrait of Jacquard was woven in silk on a Jacquard loom and required 24,000 punched cards to create (1839). It was only produced to order. Charles Babbage started exhibiting this portrait in 1840 to explain how his analytical engine would work (see James Essinger (2004), pp. 3-4; also Anthony Hyman, ed., Science and Reform: Selected Works of Charles Babbage, Cambridge, England: Cambridge University Press, 1989, page 298). It is in the collection of the Science Museum in London, England (Delve (2007), page 99).

It was the fusion of automatic calculation with programmability that produced the first recognizable computers. In 1837, Charles Babbage, "the actual father of the computer",[16] was the first to conceptualize and design a fully programmable mechanical calculator,[17] his analytical engine.[18] Babbage started in 1834. Initially he intended to program his analytical engine with drums similar to the ones used in Vaucanson's automata, which by design were limited in size, but he soon replaced them with Jacquard's card readers, one for data and one for the program. "The introduction of punched cards into the new engine was important not only as a more convenient form of control than the drums, or because programs could now be of unlimited extent, and could be stored and repeated without the danger of introducing errors in setting the machine by hand; it was important also because it served to crystallize Babbage's feeling that he had invented something really new, something much more than a sophisticated calculating machine."[19]

"Now it is obvious that no finite machine can include infinity... It is impossible to construct machinery occupying unlimited space; but it is possible to construct finite machinery, and to use it through unlimited time. It is this substitution of the infinity of time for the infinity of space which I have made use of, to limit the size of the engine and yet to retain its unlimited power."
(Charles Babbage, Passages from the Life of a Philosopher, Chapter VIII: On the Analytical Engine)

After this breakthrough, he redesigned his difference engine (No. 2, still not programmable), incorporating his new ideas. Allan Bromley came to the Science Museum in London starting in 1979 to study Babbage's engines and determined that difference engine No. 2 was the only engine that had a complete enough set of drawings to be built, and he convinced the museum to build it. This engine, finished in 1991, proved without doubt the validity of Charles Babbage's work.[20] Except for a pause between 1848 and 1857, Babbage would spend the rest of his life simplifying each part of his engine: "Gradually he developed plans for Engines of great logical power and elegant simplicity (although the term 'simple' is used here in a purely relative sense)."[21]

Between 1842 and 1843, Ada Lovelace, an analyst of Charles Babbage's analytical engine, translated an article by Italian military engineer Luigi Menabrea on the engine, which she supplemented with an elaborate set of notes of her own. These notes contained what is considered the first computer program, that is, an algorithm encoded for processing by a machine. She also stated: “We may say most aptly, that the Analytical Engine weaves algebraical patterns just as the Jacquard-loom weaves flowers and leaves.” Furthermore, she developed a vision of the capability of computers to go beyond mere calculating or number-crunching,[22] claiming that, should “...the fundamental relations of pitched sounds in the science of harmony and of musical composition...” be susceptible “...of adaptations to the action of the operating notation and mechanism of the engine...”, it “...might compose elaborate and scientific pieces of music of any degree of complexity or extent”.

Ada Lovelace, considered to be the first computer programmer

In the late 1880s, Herman Hollerith invented the recording of data on a machine-readable medium. Earlier uses of machine-readable media had been for control, not data. “After some initial trials with paper tape, he settled on punched cards...” To process these punched cards he invented the tabulator and the keypunch machine. These three inventions were the foundation of the modern information processing industry. Large-scale automated data processing of punched cards was performed for the 1890 United States Census by Hollerith's company, which later became the core of IBM.

By the end of the 19th century a number of ideas and technologies that would later prove useful in the realization of practical computers had begun to appear: Boolean algebra, the vacuum tube (thermionic valve), punched cards and tape, and the teleprinter.

Babbage's dream comes true

In 1888, Henry Babbage, Charles Babbage's son, completed a simplified version of the analytical engine's computing unit (the mill). He gave a successful demonstration of its use in 1906, calculating and printing the first 40 multiples of pi with a precision of 29 decimal places.[23] This machine was given to the Science Museum in South Kensington in 1910. He also gave a demonstration piece of one of his father's engines to Harvard University which convinced Howard Aiken, 50 years later, to incorporate the architecture of the analytical engine in what would become the ASCC/Mark I built by IBM.[24]

Leonardo Torres y Quevedo built two analytical machines to prove that all of the functions of Babbage's analytical engine could be replaced with electromechanical devices. The first one, built in 1914, had a little electromechanical memory and the second one, built in 1920 to celebrate the one hundredth anniversary of the invention of the arithmometer, received its commands and printed its results on a typewriter.[25] Torres y Quevedo published functional schematics of all of these functions: addition, multiplication, division ... and even a decimal comparator, in his "Essais sur l'automatique" in 1915.


Some inventors, like Percy Ludgate, Vannevar Bush and Louis Couffignal,[26] tried to improve on the analytical engine but did not succeed in building a machine.

Howard Aiken wanted to build a giant calculator and was looking for a sponsor to build it. He first presented his design to the Monroe Calculator Company and then to Harvard University, both without success. Carmello Lanza, a technician in Harvard's physics laboratory who had heard Aiken's presentation, "...couldn't see why in the world I (Howard Aiken) wanted to do anything like this in the Physics laboratory, because we already had such a machine and nobody used it... Lanza led him up into the attic... There, sure enough... were the wheels that Aiken later put on display in the lobby of the Computer Laboratory. With them was a letter from Henry Prevost Babbage describing these wheels as part of his father's proposed calculating engine. This was the first time Aiken ever heard of Babbage he said, and it was this experience that led him to look up Babbage in the library and to come across his autobiography", which gave a description of his analytical engine.

Aiken first contacted IBM in November 1937,[27] presenting a machine which, by then, had an architecture based on Babbage's analytical engine. This was the first development of a programmable calculator that would succeed and that would end up being used for many years to come: the ASCC/Mark I.

Zuse first heard of Aiken and IBM's work from the German Secret Service.[28] He considered his Z3 to be a Babbage-type machine.[29]

First general-purpose computers

During the first half of the 20th century, many scientific computing needs were met by increasingly sophisticated analog computers, which used a direct mechanical or electrical model of the problem as a basis for computation. However, these were not programmable and generally lacked the versatility and accuracy of modern digital computers.

The Zuse Z3, 1941, considered the world's first working programmable, fully automatic computing machine

Alan Turing is widely regarded as the father of modern computer science. In 1936, Turing provided an influential formalization of the concept of the algorithm and computation with the Turing machine, providing a blueprint for the electronic digital computer. Of his role in the creation of the modern computer, Time magazine, in naming Turing one of the 100 most influential people of the 20th century, states: “The fact remains that everyone who taps at a keyboard, opening a spreadsheet or a word-processing program, is working on an incarnation of a Turing machine.”

The first really functional computer was the Z1, originally created by Germany's Konrad Zuse in his parents' living room from 1936 to 1938, and it is considered to be the first electro-mechanical binary programmable (modern) computer.

George Stibitz is internationally recognized as a father of the modern digital computer. While working at Bell Labs in November 1937, Stibitz invented and built a relay-based calculator he dubbed the “Model K” (for “kitchen table,” on which he had assembled it), which was the first to use binary circuits to perform an arithmetic operation. Later models added greater sophistication, including complex arithmetic and programmability.

The ENIAC, which became operational in 1946, is considered to be the first general-purpose electronic computer. Programmers Betty Jean Jennings (left) and Fran Bilas (right) are depicted here operating the ENIAC's main control panel.


The Atanasoff–Berry Computer (ABC) was the world's first electronic digital computer, albeit not programmable. Atanasoff is considered to be one of the fathers of the computer. Conceived in 1937 by Iowa State College physics professor John Atanasoff, and built with the assistance of graduate student Clifford Berry, the machine was designed only to solve systems of linear equations. The computer did employ parallel computation. A 1973 court ruling in a patent dispute found that the patent for the 1946 ENIAC computer derived from the Atanasoff–Berry Computer.

The first program-controlled computer was invented by Konrad Zuse, who built the Z3, an electromechanical computing machine, in 1941. The first programmable electronic computer was the Colossus, built in 1943 by Tommy Flowers.

Key steps towards modern computers

EDSAC was one of the first computers to implement the stored-program (von Neumann) architecture.

A succession of steadily more powerful and flexible computing devices was constructed in the 1930s and 1940s, gradually adding the key features that are seen in modern computers. The use of digital electronics (largely invented by Claude Shannon in 1937) and more flexible programmability were vitally important steps, but defining one point along this road as “the first digital electronic computer” is difficult (Shannon 1940). Notable achievements include:

• Konrad Zuse's electromechanical “Z machines.” The Z3 (1941) was the first working machine featuring binary arithmetic, including floating point arithmetic and a measure of programmability. In 1998 the Z3 was proved to be Turing complete, therefore being the world's first operational computer. Thus, Zuse is often regarded as the inventor of the computer.[30][31][32][33]
• The non-programmable Atanasoff–Berry Computer (commenced in 1937, completed in 1941), which used vacuum tube based computation, binary numbers, and regenerative capacitor memory. The use of regenerative memory allowed it to be much more compact than its peers (being approximately the size of a large desk or workbench), since intermediate results could be stored and then fed back into the same set of computation elements.
• The secret British Colossus computers (1943),[34] which had limited programmability but demonstrated that a device using thousands of tubes could be reasonably reliable and electronically re-programmable. They were used for breaking German wartime codes.
• The Harvard Mark I (1944), a large-scale electromechanical computer with limited programmability.
• The U.S. Army's Ballistic Research Laboratory ENIAC (1946), which used decimal arithmetic and is sometimes called the first general-purpose electronic computer (since Konrad Zuse's Z3 of 1941 used electromagnets instead of electronics). Initially, however, ENIAC had an architecture which required rewiring a plugboard to change its programming.
• The Ferranti Mark 1, the world's first commercially available general-purpose computer.

Stored-program architecture

Several developers of ENIAC, recognizing its flaws, came up with a far more flexible and elegant design, which came to be known as the “stored-program architecture” or von Neumann architecture. This design was first formally described by John von Neumann in the paper First Draft of a Report on the EDVAC, distributed in 1945. A number of projects to develop computers based on the stored-program architecture commenced around this time, the first of which was completed in 1948 at the University of Manchester in England, the Manchester Small-Scale Experimental Machine (SSEM or “Baby”). The Electronic Delay Storage Automatic Calculator (EDSAC), completed a year after the SSEM at Cambridge University, was the first practical, non-experimental implementation of the stored-program design and was put to use immediately for research work at the university. Shortly thereafter, the machine originally described by von Neumann's paper, EDVAC, was completed but did not see full-time use for an additional two years.

Nearly all modern computers implement some form of the stored-program architecture, making it the single trait by which the word “computer” is now defined. While the technologies used in computers have changed dramatically since the first electronic, general-purpose computers of the 1940s, most still use the von Neumann architecture.

Beginning in the 1950s, Soviet scientists Sergei Sobolev and Nikolay Brusentsov conducted research on ternary computers, devices that operated on a base three numbering system of -1, 0, and 1 rather than the conventional binary numbering system upon which most computers are based. They designed the Setun, a functional ternary computer, at Moscow State University. The device was put into limited production in the Soviet Union, but was supplanted by the more common binary architecture.
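The base three system of -1, 0 and 1 mentioned above is usually called balanced ternary. The following Python sketch illustrates how an integer can be written with those three digits; the function names are hypothetical and the code makes no claim about how the Setun itself represented numbers.

```python
def to_balanced_ternary(n):
    """Return the balanced-ternary digits of n, most significant first.

    Each digit is -1, 0 or 1 (illustrative helper, not Setun's format).
    """
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        r = n % 3          # Python's % always yields 0, 1 or 2 here
        if r == 2:
            r = -1         # a digit of 2 becomes -1 with a carry into the next place
        digits.append(r)
        n = (n - r) // 3
    return digits[::-1]


def from_balanced_ternary(digits):
    """Inverse conversion: evaluate the digits as powers of three."""
    value = 0
    for d in digits:
        value = value * 3 + d
    return value
```

For instance, 5 is written as 1, -1, -1 because 9 - 3 - 1 = 5. Negating a number only requires flipping the sign of every digit, which is one reason the representation has been considered attractive.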

Semiconductors and microprocessors

Die of an Intel 80486DX2 microprocessor (actual size: 12×6.75 mm) in its packaging

Computers using vacuum tubes as their electronic elements were in use throughout the 1950s, but by the 1960s they had been largely replaced by transistor-based machines, which were smaller, faster, cheaper to produce, required less power, and were more reliable. The first transistorized computer was demonstrated at the University of Manchester in 1953. In the 1970s, integrated circuit technology and the subsequent creation of microprocessors, such as the Intel 4004, further decreased size and cost and further increased speed and reliability of computers. By the late 1970s, many products such as video recorders contained dedicated computers called microcontrollers, and they started to appear as a replacement for mechanical controls in domestic appliances such as washing machines. The 1980s witnessed home computers and the now ubiquitous personal computer. With the evolution of the Internet, personal computers are becoming as common as the television and the telephone in the household. Modern smartphones are fully programmable computers in their own right, and as of 2009 may well be the most common form of such computers in existence.

Programs

The defining feature of modern computers which distinguishes them from all other machines is that they can be programmed. That is to say that some type of instructions (the program) can be given to the computer, and it will process them. Modern computers based on the von Neumann architecture often have machine code in the form of an imperative programming language.

In practical terms, a computer program may be just a few instructions or extend to many millions of instructions, as do the programs for word processors and web browsers, for example. A typical modern computer can execute billions of instructions per second and rarely makes a mistake over many years of operation. Large computer programs consisting of several million instructions may take teams of programmers years to write, and due to the complexity of the task almost certainly contain errors.


Stored program architecture

This section applies to most common RAM machine-based computers.

In most cases, computer instructions are simple: add one number to another, move some data from one location to another, send a message to some external device, etc. These instructions are read from the computer's memory and are generally carried out (executed) in the order they were given. However, there are usually specialized instructions to tell the computer to jump ahead or backwards to some other place in the program and to carry on executing from there. These are called “jump” instructions (or branches). Furthermore, jump instructions may be made to happen conditionally so that different sequences of instructions may be used depending on the result of some previous calculation or some external event. Many computers directly support subroutines by providing a type of jump that “remembers” the location it jumped from and another instruction to return to the instruction following that jump instruction.

Replica of the Small-Scale Experimental Machine (SSEM), the world's first stored-program computer, at the Museum of Science and Industry in Manchester, England

Program execution might be likened to reading a book. While a person will normally read each word and line in sequence, they may at times jump back to an earlier place in the text or skip sections that are not of interest. Similarly, a computer may sometimes go back and repeat the instructions in some section of the program over and over again until some internal condition is met. This is called the flow of control within the program and it is what allows the computer to perform tasks repeatedly without human intervention.

By comparison, a person using a pocket calculator can perform a basic arithmetic operation such as adding two numbers with just a few button presses. But to add together all of the numbers from 1 to 1,000 would take thousands of button presses and a lot of time, with a near certainty of making a mistake. On the other hand, a computer may be programmed to do this with just a few simple instructions. For example:

            mov #0, sum     ; set sum to 0
            mov #1, num     ; set num to 1
    loop:   add num, sum    ; add num to sum
            add #1, num     ; add 1 to num
            cmp num, #1000  ; compare num to 1000
            ble loop        ; if num <= 1000, go back to 'loop'
            halt            ; end of program
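To make the role of the conditional jump concrete, the following Python sketch simulates the summation program with an explicit program counter. It is a toy illustration only: the tuple encoding of instructions, the `run` function and the `LABELS` table are assumptions made for this example and do not correspond to any real instruction set.

```python
def run(program, labels):
    """Execute (opcode, operands...) tuples until 'halt'; return the memory."""
    memory = {"sum": 0, "num": 0}
    less_or_equal = False      # outcome of the most recent 'cmp'
    pc = 0                     # program counter: index of the next instruction
    while True:
        op, *args = program[pc]
        pc += 1                # by default, continue with the following instruction
        if op == "mov":        # mov value, dest
            value, dest = args
            memory[dest] = value
        elif op == "add":      # add src, dest (src is a variable name or a constant)
            src, dest = args
            memory[dest] += memory[src] if isinstance(src, str) else src
        elif op == "cmp":      # cmp variable, constant
            left, right = args
            less_or_equal = memory[left] <= right
        elif op == "ble":      # conditional jump: taken only if the comparison held
            if less_or_equal:
                pc = labels[args[0]]
        elif op == "halt":
            return memory


PROGRAM = [
    ("mov", 0, "sum"),         # set sum to 0
    ("mov", 1, "num"),         # set num to 1
    ("add", "num", "sum"),     # loop: add num to sum
    ("add", 1, "num"),         # add 1 to num
    ("cmp", "num", 1000),      # compare num to 1000
    ("ble", "loop"),           # if num <= 1000, go back to 'loop'
    ("halt",),                 # stop
]
LABELS = {"loop": 2}           # the label 'loop' names instruction index 2

result = run(PROGRAM, LABELS)
```

After the run, `result["sum"]` holds the sum of the integers from 1 to 1,000. The seven instructions are executed thousands of times in total because the `ble` instruction repeatedly resets the program counter.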


Further reading

• Jean Luc Chabert (1999). A History of Algorithms: From the Pebble to the Microchip. Springer Verlag. ISBN 978-3-540-63369-3.
• Algorithmics: The Spirit of Computing. Addison-Wesley. 2004. ISBN 978-0-321-11784-7.
• Knuth, Donald E. (2000). Selected Papers on Analysis of Algorithms (http://www-cs-faculty.stanford.edu/~uno/aa.html). Stanford, California: Center for the Study of Language and Information.
• Knuth, Donald E. (2010). Selected Papers on Design of Algorithms (http://www-cs-faculty.stanford.edu/~uno/da.html). Stanford, California: Center for the Study of Language and Information.
• Berlinski, David (2001). The Advent of the Algorithm: The 300-Year Journey from an Idea to the Computer. Harvest Books. ISBN 978-0-15-601391-8.
• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein (2009). Introduction To Algorithms, Third Edition. MIT Press. ISBN 978-0262033848.


External links

• Hazewinkel, Michiel, ed. (2001), "Algorithm" (http://www.encyclopediaofmath.org/index.php?title=p/a011780), Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
• Algorithms (http://www.dmoz.org/Computers/Algorithms//) on the Open Directory Project
• Weisstein, Eric W., "Algorithm" (http://mathworld.wolfram.com/Algorithm.html), MathWorld.
• Dictionary of Algorithms and Data Structures (http://www.nist.gov/dads/), National Institute of Standards and Technology
• Algorithms and Data Structures by Dr Nikolai Bezroukov (http://www.softpanorama.org/Algorithms/index.shtml)

Algorithm repositories
• The Stony Brook Algorithm Repository (http://www.cs.sunysb.edu/~algorith/), State University of New York at Stony Brook
• Netlib Repository (http://www.netlib.org/), University of Tennessee and Oak Ridge National Laboratory
• Collected Algorithms of the ACM (http://calgo.acm.org/), Association for Computing Machinery
• The Stanford GraphBase (http://www-cs-staff.stanford.edu/~knuth/sgb.html), Stanford University
• Combinatorica (http://www.combinatorica.com/), University of Iowa and State University of New York at Stony Brook
• Library of Efficient Datastructures and Algorithms (LEDA) (http://www.algorithmic-solutions.com/), previously from Max-Planck-Institut für Informatik
• Archive of Interesting Code (http://www.keithschwarz.com/interesting/)
• A semantic wiki to collect, categorize and relate all algorithms and data structures (http://allmyalgorithms.org)

Lecture notes
• Algorithms Course Materials (http://compgeom.cs.uiuc.edu/~jeffe//teaching/algorithms/). Jeff Erickson. University of Illinois.

Community
• Algorithms (https://plus.google.com/communities/101392274103811461838) on Google+


Deterministic algorithm

In computer science, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states. Deterministic algorithms are by far the most studied and familiar kind of algorithm, as well as one of the most practical, since they can be run on real machines efficiently.

Formally, a deterministic algorithm computes a mathematical function; a function has a unique value for any given input, and the algorithm is a process that produces this particular value as output.

Formal definition

Deterministic algorithms can be defined in terms of a state machine: a state describes what a machine is doing at a particular instant in time. State machines pass in a discrete manner from one state to another. Just after we enter the input, the machine is in its initial state or start state. If the machine is deterministic, this means that from this point onwards, its current state determines what its next state will be; its course through the set of states is predetermined. Note that a machine can be deterministic and still never stop or finish, and therefore fail to deliver a result.

Examples of particular abstract machines which are deterministic include the deterministic Turing machine and deterministic finite automaton.
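As an illustration of the state-machine view, the Python sketch below runs a small deterministic finite automaton and records every state it passes through. The automaton (tracking whether a binary string contains an even or odd number of 1s) and the names `run_dfa` and `EVEN_ONES` are hypothetical choices made for this example.

```python
def run_dfa(transitions, start, text):
    """Return the list of states visited while reading text, start state first.

    transitions maps (state, symbol) to exactly one next state, which is
    what makes the machine deterministic.
    """
    state = start
    visited = [state]
    for symbol in text:
        state = transitions[(state, symbol)]
        visited.append(state)
    return visited


# Example automaton: is the number of 1s read so far even or odd?
EVEN_ONES = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}
```

Calling `run_dfa(EVEN_ONES, "even", "101")` always yields the same sequence of states for that input, because each (state, symbol) pair has a single successor.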

What makes algorithms non-deterministic?

A variety of factors can cause an algorithm to behave in a way which is not deterministic, or non-deterministic:
• If it uses external state other than the input, such as user input, a global variable, a hardware timer value, a random value, or stored disk data.
• If it operates in a way that is timing-sensitive, for example if it has multiple processors writing to the same data at the same time. In this case, the precise order in which each processor writes its data will affect the result.
• If a hardware error causes its state to change in an unexpected way.

Although real programs are rarely purely deterministic, it is easier for humans as well as other programs to reason about programs that are. For this reason, most programming languages and especially functional programming languages make an effort to prevent the above events from happening except under controlled conditions.

The prevalence of multi-core processors has resulted in a surge of interest in determinism in parallel programming, and the challenges of non-determinism have been well documented. A number of tools have been proposed[1] to help deal with deadlocks and race conditions.
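The first cause listed above, reliance on state other than the input, can be sketched in a few lines of Python. All three function names are hypothetical and exist only for this illustration.

```python
import random

_call_count = 0  # hidden global state shared between calls


def pure_total(values):
    """Deterministic: the result depends only on the argument."""
    return sum(values)


def total_with_hidden_state(values):
    """Not deterministic in its input: a global counter leaks into the result."""
    global _call_count
    _call_count += 1
    return sum(values) + _call_count


def noisy_total(values, rng):
    """The random source is passed in explicitly, so a caller can control it."""
    return sum(values) + rng.random()
```

Two calls to `total_with_hidden_state` with the same list return different values, while `noisy_total` becomes repeatable when the caller supplies generators created with the same seed, for example `random.Random(0)`. Passing the external state in as an argument is one way of keeping such effects "under controlled conditions".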

Problems with deterministic algorithms

Unfortunately, for some problems deterministic algorithms are also hard to find. For example, there are simple and efficient probabilistic algorithms that determine whether a given number is prime and have a very small chance of being wrong. These have been known since the 1970s (see for example Fermat primality test); the known deterministic algorithms remain considerably slower in practice. As another example, NP-complete problems, which include many of the most important practical problems, can be solved quickly using a machine called a nondeterministic Turing machine, but efficient practical algorithms have never been found for any of them. At best, we can currently only find approximate solutions or solutions in special cases.

Another major problem with deterministic algorithms is that sometimes we don't want the results to be predictable. For example, if you are playing an on-line game of blackjack that shuffles its deck using a pseudorandom number generator, a clever gambler might guess precisely the numbers the generator will choose and so determine the entire contents of the deck ahead of time, allowing him to cheat. The Software Security Group at Reliable Software Technologies was able to do this for an implementation of Texas Hold 'em Poker distributed by ASF Software, Inc, allowing them to consistently predict the outcome of hands ahead of time.[2] Similar problems arise in cryptography, where private keys are often generated using such a generator. This sort of problem is generally avoided using a cryptographically secure pseudo-random number generator.
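The predictability problem can be sketched in a few lines. The example below is an illustration only, not a reconstruction of the attack cited above: it assumes the attacker has learned or guessed the seed, and the helper name shuffled_deck is hypothetical.

```python
import random
import secrets

def shuffled_deck(rng):
    """Return the cards 0..51 in an order chosen by the generator `rng`."""
    deck = list(range(52))
    rng.shuffle(deck)
    return deck

# A seeded pseudorandom generator is deterministic: anyone who knows
# the seed can rebuild the whole deck.
server_deck = shuffled_deck(random.Random(1234))
attacker_deck = shuffled_deck(random.Random(1234))

# secrets.SystemRandom draws on the operating system's entropy source
# and has no seed that an attacker could replay.
secure_deck = shuffled_deck(secrets.SystemRandom())
```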

Failure / Success in algorithms

Exceptions

Exception throwing is a common mechanism for signalling failure due to unexpected or undesired states.

Failure as a return value

To overcome the problem of unhandled exceptions, which may result in non-termination, the "Total functional programming" approach is to wrap the result of a partial function in an option type.
• The option type in ML and the Maybe type in Haskell:

    (* Standard ML *)
    datatype 'a option = NONE | SOME of 'a

    (* OCaml *)
    type 'a option = None | Some of 'a

    -- Haskell
    data Maybe a = Nothing | Just a

• The Either type in Haskell, which includes the failure reason:

    data Either errorType resultType = Right resultType | Left errorType
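The same idea can be approximated in Python, which is used here only for illustration. Python has no built-in option type; in this sketch None stands in for NONE/Nothing, and the function name safe_div is hypothetical.

```python
from typing import Optional

def safe_div(a: float, b: float) -> Optional[float]:
    """Division made total: failure is reported in the return value
    instead of by raising ZeroDivisionError."""
    if b == 0:
        return None          # plays the role of NONE / Nothing
    return a / b             # plays the role of SOME x / Just x

quotient = safe_div(6, 3)
undefined = safe_div(1, 0)
```

Unlike a true option type, this encoding cannot distinguish "no result" from a successful result that happens to be None, so it is only a rough analogue.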

Failure in Monads, the Left zero property

As monads model sequential composition, the left zero property (z * s = z) in a monad means that the right side of the sequence will not be evaluated.

    -- Left zero in the Maybe monad
    Nothing >> k = Nothing
    Nothing >>= f = Nothing

    -- Left zero in the Either monad
    Left err >> k = Left err
    Left err >>= f = Left err
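The left zero behaviour can be imitated in Python with a small bind helper. This is a sketch under the assumption that None represents Nothing; bind and half are hypothetical names, and the calls list exists only to show which steps actually ran.

```python
def bind(value, f):
    """Maybe-style bind: None is a left zero, so f is never applied to it."""
    return None if value is None else f(value)

calls = []

def half(n):
    """Halve an even number; fail (return None) on an odd one."""
    calls.append(n)
    return n // 2 if n % 2 == 0 else None

ok = bind(bind(8, half), half)          # 8 -> 4 -> 2
failed = bind(bind(6, half), half)      # 6 -> 3 -> failure
stopped = bind(bind(None, half), half)  # nothing to the right is evaluated
```

After these three lines calls holds 8, 4, 6 and 3: the chain that started from None never invoked half at all.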


Determinism categories in languages

Mercury

This logic-functional programming language establishes different determinism categories for predicate modes, as explained in the references.[3][4]

Haskell

Haskell provides several mechanisms:

Non-determinism or notion of fail
• The Maybe and Either types include the notion of success in the result.
• The fail method of the class Monad may be used to signal failure as an exception.
• The Maybe monad and the MaybeT monad transformer provide for failed computations (stop the computation sequence and return Nothing).[5]

Determinism/non-determinism with multiple solutions
• You may retrieve all possible outcomes of a multiple-result computation by wrapping its result type in a MonadPlus monad (its method mzero makes an outcome fail and mplus collects the successful results).[6]

ML family and derived languages

As seen in Standard ML, OCaml and Scala:
• The option type includes the notion of success.

Java

• The null reference value may represent an unsuccessful (out-of-domain) result.

References

[1] Parallel Studio
[2] Gary McGraw and John Viega. Make your software behave: Playing the numbers: How to cheat in online gambling. http://www.ibm.com/developerworks/library/s-playing/#h4
[3] Determinism categories in the Mercury programming language (http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/Determinism-categories.html#Determinism-categories)
[4] Mercury predicate modes (http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/Predicate-and-function-mode-declarations.html#Predicate-and-function-mode-declarations)
[5] Representing failure using the Maybe monad (http://www.haskell.org/haskellwiki/Monad#Common_monads)
[6] The class MonadPlus (http://www.haskell.org/haskellwiki/MonadPlus)


Data structure

In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.[1][2] Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example, B-trees are particularly well-suited for implementation of databases, while compiler implementations usually use hash tables to look up identifiers.

A hash table

Data structures provide a means to manage large amounts of data efficiently, such as large databases and internet indexing services. Usually, efficient data structures are a key to designing efficient algorithms. Some formal design methods and programming languages emphasize data structures, rather than algorithms, as the key organizing factor in software design. Storing and retrieving can be carried out on data stored in both main memory and secondary memory.

Overview

• An array stores a number of elements in a specific order. They are accessed using an integer to specify which element is required (although the elements may be of almost any type). Arrays may be fixed-length or expandable.
• Records (also called tuples or structs) are among the simplest data structures. A record is a value that contains other values, typically in fixed number and sequence and typically indexed by names. The elements of records are usually called fields or members.
• A hash table (also called a dictionary or map) is a more flexible variation on a record, in which name-value pairs can be added and deleted freely.
• A union type specifies which of a number of permitted primitive types may be stored in its instances, e.g. "float or long integer". Contrast with a record, which could be defined to contain a float and an integer; whereas, in a union, there is only one value at a time.
• A tagged union (also called a variant, variant record, discriminated union, or disjoint union) contains an additional field indicating its current type, for enhanced type safety.
• A set is an abstract data structure that can store specific values, without any particular order, and with no repeated values. Values themselves are not retrieved from sets; rather, one tests a value for membership to obtain a boolean "in" or "not in".
• Graphs and trees are linked abstract data structures composed of nodes. Each node contains a value and also one or more pointers to other nodes. Graphs can be used to represent networks, while trees are generally used for sorting and searching, having their nodes arranged in some relative order based on their values.
• An object contains data fields, like a record, and also contains program code fragments for accessing or modifying those fields. Data structures not containing code, like those above, are called plain old data structures.

Many others are possible, but they tend to be further variations and compounds of the above.


Basic principles Data structures are generally based on the ability of a computer to fetch and store data at any place in its memory, specified by an address—a bit string that can be itself stored in memory and manipulated by the program. Thus the record and array data structures are based on computing the addresses of data items with arithmetic operations; while the linked data structures are based on storing addresses of data items within the structure itself. Many data structures use both principles, sometimes combined in non-trivial ways (as in XOR linking). The implementation of a data structure usually requires writing a set of procedures that create and manipulate instances of that structure. The efficiency of a data structure cannot be analyzed separately from those operations. This observation motivates the theoretical concept of an abstract data type, a data structure that is defined indirectly by the operations that may be performed on it, and the mathematical properties of those operations (including their space and time cost).
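The two principles can be contrasted on a toy model of memory. In the sketch below a Python list stands in for memory and its indices stand in for addresses; the layout, the constant names and the helper read_linked are assumptions made for illustration.

```python
# A toy "memory": list indices play the role of addresses.
memory = [0] * 12

# Array principle: element i lives at a computed address, base + i.
ARRAY_BASE = 0
for i, value in enumerate([10, 20, 30]):
    memory[ARRAY_BASE + i] = value

# Linked principle: each node occupies two cells (value, address of the
# next node), and the nodes may sit anywhere. END marks the last node.
END = -1
memory[8], memory[9] = 10, 4       # first node at address 8, next at 4
memory[4], memory[5] = 20, 10      # second node at address 4, next at 10
memory[10], memory[11] = 30, END   # third node at address 10

def read_linked(head):
    """Follow the stored addresses from `head` and collect the values."""
    values = []
    address = head
    while address != END:
        values.append(memory[address])
        address = memory[address + 1]
    return values

array_values = [memory[ARRAY_BASE + i] for i in range(3)]
linked_values = read_linked(8)
```

Both structures hold the same three values; they differ only in whether the address of an item is computed or stored.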

Language support Most assembly languages and some low-level languages, such as BCPL (Basic Combined Programming Language), lack support for data structures. Many high-level programming languages and some higher-level assembly languages, such as MASM, on the other hand, have special syntax or other built-in support for certain data structures, such as vectors (one-dimensional arrays) in the C language or multi-dimensional arrays in Pascal. Most programming languages feature some sort of library mechanism that allows data structure implementations to be reused by different programs. Modern languages usually come with standard libraries that implement the most common data structures. Examples are the C++ Standard Template Library, the Java Collections Framework, and Microsoft's .NET Framework. Modern languages also generally support modular programming, the separation between the interface of a library module and its implementation. Some provide opaque data types that allow clients to hide implementation details. Object-oriented programming languages, such as C++, Java and Smalltalk may use classes for this purpose. Many known data structures have concurrent versions that allow multiple computing threads to access the data structure simultaneously.

References

[1] Paul E. Black (ed.), entry for data structure in Dictionary of Algorithms and Data Structures. U.S. National Institute of Standards and Technology. 15 December 2004. Online version (http://www.itl.nist.gov/div897/sqg/dads/HTML/datastructur.html) Accessed May 21, 2009.
[2] Entry data structure in the Encyclopædia Britannica (2009) Online entry (http://www.britannica.com/EBchecked/topic/152190/data-structure) accessed on May 21, 2009.

Further reading

• Peter Brass, Advanced Data Structures, Cambridge University Press, 2008.
• Donald Knuth, The Art of Computer Programming, vol. 1. Addison-Wesley, 3rd edition, 1997.
• Dinesh Mehta and Sartaj Sahni, Handbook of Data Structures and Applications, Chapman and Hall/CRC Press, 2007.
• Niklaus Wirth, Algorithms and Data Structures, Prentice Hall, 1985.
• Diane Zak, Introduction to Programming with C++, Cengage Learning Asia Pte Ltd, 2011.


External links

• UC Berkeley video course on data structures (http://academicearth.org/courses/data-structures)
• Descriptions (http://nist.gov/dads/) from the Dictionary of Algorithms and Data Structures
• Data structures course (http://www.cs.auckland.ac.nz/software/AlgAnim/ds_ToC.html)
• An Examination of Data Structures from .NET perspective (http://msdn.microsoft.com/en-us/library/aa289148(VS.71).aspx)
• Shaffer, C. Data Structures and Algorithm Analysis (http://people.cs.vt.edu/~shaffer/Book/C++3e20110915.pdf)

List (abstract data type)

In computer science, a list or sequence is an abstract data type that implements a finite ordered collection of values, where the same value may occur more than once. An instance of a list is a computer representation of the mathematical concept of a finite sequence; the (potentially) infinite analog of a list is a stream. Lists are a basic example of containers, as they contain other values. Each instance of a value in the list is usually called an item, entry, or element of the list; if the same value occurs multiple times, each occurrence is considered a distinct item. Lists are distinguished from arrays in that lists only allow sequential access, while arrays allow random access.

The name list is also used for several concrete data structures that can be used to implement abstract lists, especially linked lists. The so-called static list structures allow only inspection and enumeration of the values. A mutable or dynamic list may allow items to be inserted, replaced, or deleted during the list's existence.

A singly linked list structure, implementing a list with 3 integer elements.

Many programming languages provide support for list data types, and have special syntax and semantics for lists and list operations. A list can often be constructed by writing the items in sequence, separated by commas, semicolons, or spaces, within a pair of delimiters such as parentheses '()', brackets '[]', braces '{}', or angle brackets '<>'. Some languages may allow list types to be indexed or sliced like array types, in which case the data type is more accurately described as an array. In object-oriented programming languages, lists are usually provided as instances of subclasses of a generic "list" class, and traversed via separate iterators. List data types are often implemented using array data structures or linked lists of some sort, but other data structures may be more appropriate for some applications. In some contexts, such as in Lisp programming, the term list may refer specifically to a linked list rather than an array.

In type theory and functional programming, abstract lists are usually defined inductively by four operations: nil, which yields the empty list; cons, which adds an item at the beginning of a list; head, which returns the first element of a list; and tail, which returns a list minus its first element. Formally, Peano's natural numbers can be defined as abstract lists with elements of unit type.


Operations

An implementation of the list data structure may provide some of the following operations:
• a constructor for creating an empty list;
• an operation for testing whether or not a list is empty;
• an operation for prepending an entity to a list;
• an operation for appending an entity to a list;
• an operation for determining the first component (or the "head") of a list;
• an operation for referring to the list consisting of all the components of a list except for its first (this is called the "tail" of the list).

Characteristics

Lists have the following properties:
• The size of a list indicates how many elements it contains.
• Equality of lists:
  • In mathematics, sometimes equality of lists is defined simply in terms of object identity: two lists are equal if and only if they are the same object.
  • In modern programming languages, equality of lists is normally defined in terms of structural equality of the corresponding entries, except that if the lists are typed, then the list types may also be relevant.
• Lists may be typed. This implies that the entries in a list must have types that are compatible with the list's type. It is common that lists are typed when they are implemented using arrays.
• Each element in the list has an index. The first element commonly has index 0 or 1 (or some other predefined integer). Subsequent elements have indices that are 1 higher than the previous element. The last element has index (initial index) + (size) − 1.
• It is possible to retrieve the element at a particular index.
• It is possible to traverse the list in the order of increasing index.
• It is possible to change the element at a particular index to a different value, without affecting any other elements.
• It is possible to insert an element at a particular index. The indices of the elements at and after that position are increased by 1.
• It is possible to remove an element at a particular index. The indices of the elements after that position are decreased by 1.

Implementations

Lists are typically implemented either as linked lists (either singly or doubly linked) or as arrays, usually variable length or dynamic arrays. The standard way of implementing lists, originating with the programming language Lisp, is to have each element of the list contain both its value and a pointer indicating the location of the next element in the list. This results in either a linked list or a tree, depending on whether the list has nested sublists. Some older Lisp implementations (such as the Lisp implementation of the Symbolics 3600) also supported "compressed lists" (using CDR coding) which had a special internal representation (invisible to the user).

Lists can be manipulated using iteration or recursion. The former is often preferred in imperative programming languages, while the latter is the norm in functional languages.

Lists can also be implemented as self-balancing binary search trees holding index-value pairs (for example, with all elements residing in the fringe and internal nodes storing the right-most child's index, which is used to guide the search). Access to any element then takes time logarithmic in the list's size. As long as the size doesn't change much, this provides the illusion of random access, and swap, prefix and append operations can be performed in logarithmic time as well.


Programming language support Some languages do not offer a list data structure, but offer the use of associative arrays or some kind of table to emulate lists. For example, Lua provides tables. Although Lua stores lists that have numerical indices as arrays internally, they still appear as hash tables. In Lisp, lists are the fundamental data type and can represent both program code and data. In most dialects, the list of the first three prime numbers could be written as (list 2 3 5). In several dialects of Lisp, including Scheme, a list is a collection of pairs, consisting of a value and a pointer to the next pair (or null value), making a singly linked list.

Applications

As the name implies, lists can be used to store a list of records. The items in a list can be sorted for the purpose of fast search (binary search).

Because lists are easier to realize in computing than sets, a finite set in the mathematical sense can be realized as a list with additional restrictions: duplicate elements are disallowed and order is irrelevant. Keeping the list sorted speeds up determining whether a given item is already in the set, but maintaining the order makes adding a new entry slower. In efficient implementations, however, sets are implemented using self-balancing binary search trees or hash tables, rather than a list.
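A set realized as a sorted list might look like the following sketch. It uses Python's standard bisect module for the binary search; the function names contains and add are hypothetical.

```python
import bisect

def contains(sorted_items, x):
    """Membership test by binary search: O(log n) comparisons."""
    i = bisect.bisect_left(sorted_items, x)
    return i < len(sorted_items) and sorted_items[i] == x

def add(sorted_items, x):
    """Insert x unless it is already present, keeping the list sorted.
    Finding the position is fast, but shifting the later elements
    takes O(n) time."""
    if not contains(sorted_items, x):
        bisect.insort(sorted_items, x)

s = []
for value in [5, 2, 5, 9, 2]:
    add(s, value)
```

The duplicates 5 and 2 are rejected on their second appearance, so s ends up as 2, 5, 9.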

Abstract definition

The abstract list type L with elements of some type E (a monomorphic list) is defined by the following functions:

    nil: () → L
    cons: E × L → L
    first: L → E
    rest: L → L

with the axioms

    first (cons (e, l)) = e
    rest (cons (e, l)) = l

for any element e and any list l. It is implicit that

    cons (e, l) ≠ l
    cons (e, l) ≠ e
    cons (e1, l1) = cons (e2, l2) if e1 = e2 and l1 = l2

Note that first (nil ()) and rest (nil ()) are not defined.

These axioms are equivalent to those of the abstract stack data type.

In type theory, the above definition is more simply regarded as an inductive type defined in terms of constructors: nil and cons. In algebraic terms, this can be represented as the transformation 1 + E × L → L. first and rest are then obtained by pattern matching on the cons constructor and separately handling the nil case.
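The four functions and their axioms can be tried out directly. The sketch below is one possible encoding, chosen for illustration: the empty list is represented by None and a cons cell by a two-element tuple.

```python
NIL = None                      # the empty list, nil ()

def cons(e, l):
    """cons: E x L -> L, a new list with e in front of l."""
    return (e, l)

def first(l):
    """first: L -> E; undefined (here: an error) on the empty list."""
    if l is NIL:
        raise ValueError("first (nil ()) is not defined")
    return l[0]

def rest(l):
    """rest: L -> L; undefined (here: an error) on the empty list."""
    if l is NIL:
        raise ValueError("rest (nil ()) is not defined")
    return l[1]

primes = cons(2, cons(3, cons(5, NIL)))
```

With this encoding first (cons (e, l)) = e and rest (cons (e, l)) = l hold by construction, since cons merely pairs its arguments.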


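The list monad

The list type forms a monad with the following functions (using E* rather than L to represent monomorphic lists with elements of type E):

\[
\begin{aligned}
\mathrm{return} &\colon A \to A^{*} = a \mapsto \mathrm{cons}\ a\ \mathrm{nil} \\
\mathrm{bind} &\colon A^{*} \to (A \to B^{*}) \to B^{*} = l \mapsto f \mapsto
  \begin{cases}
    \mathrm{nil} & \text{if } l = \mathrm{nil} \\
    \mathrm{append}\ (f\ a)\ (\mathrm{bind}\ l'\ f) & \text{if } l = \mathrm{cons}\ a\ l'
  \end{cases}
\end{aligned}
\]

where append is defined as:

\[
\mathrm{append} \colon A^{*} \to A^{*} \to A^{*} = l_1 \mapsto l_2 \mapsto
  \begin{cases}
    l_2 & \text{if } l_1 = \mathrm{nil} \\
    \mathrm{cons}\ a\ (\mathrm{append}\ l_1'\ l_2) & \text{if } l_1 = \mathrm{cons}\ a\ l_1'
  \end{cases}
\]

Alternatively, the monad may be defined in terms of operations return, fmap and join, with:

\[
\begin{aligned}
\mathrm{fmap} &\colon (A \to B) \to (A^{*} \to B^{*}) = f \mapsto l \mapsto
  \begin{cases}
    \mathrm{nil} & \text{if } l = \mathrm{nil} \\
    \mathrm{cons}\ (f\ a)\ (\mathrm{fmap}\ f\ l') & \text{if } l = \mathrm{cons}\ a\ l'
  \end{cases} \\
\mathrm{join} &\colon {A^{*}}^{*} \to A^{*} = l \mapsto
  \begin{cases}
    \mathrm{nil} & \text{if } l = \mathrm{nil} \\
    \mathrm{append}\ a\ (\mathrm{join}\ l') & \text{if } l = \mathrm{cons}\ a\ l'
  \end{cases}
\end{aligned}
\]

Note that fmap, join, append and bind are well-defined, since they're applied to progressively deeper arguments at each recursive call.

The list type is an additive monad, with nil as the monadic zero and append as monadic sum.

Lists form a monoid under the append operation. The identity element of the monoid is the empty list, nil. In fact, this is the free monoid over the set of list elements.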
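The operations named in this section can be sketched with ordinary Python lists. This is an illustrative encoding rather than a definitive one: unit stands for return (a reserved word in Python), and fmap and join are derived from bind.

```python
def unit(a):
    """return: wrap a single value in a one-element list."""
    return [a]

def append(l1, l2):
    """Concatenate two lists, recursing on the first."""
    return l2 if not l1 else [l1[0]] + append(l1[1:], l2)

def bind(l, f):
    """Apply f (which returns a list) to every element and
    concatenate the results in order."""
    return [] if not l else append(f(l[0]), bind(l[1:], f))

def fmap(f, l):
    """Map f over the list, expressed through bind and unit."""
    return bind(l, lambda a: unit(f(a)))

def join(ll):
    """Flatten a list of lists by one level."""
    return bind(ll, lambda l: l)

pairs = bind([1, 2, 3], lambda x: [x, x * 10])
flat = join([[1], [2, 3], []])
```

Each recursive call works on a shorter list (l[1:]), which is why the recursion terminates. The empty list acts as the monadic zero: binding it to any function gives the empty list again.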


Array data structure

Array data structure In computer science, an array data structure or simply an array is a data structure consisting of a collection of elements (values or variables), each identified by at least one array index or key. An array is stored so that the position of each element can be computed from its index tuple by a mathematical formula. For example, an array of 10 integer variables, with indices 0 through 9, may be stored as 10 words at memory addresses 2000, 2004, 2008, … 2036, so that the element with index i has the address 2000 + 4 × i.[1] Because the mathematical concept of a matrix can be represented as a two-dimensional grid, two-dimensional arrays are also sometimes called matrices. In some cases the term "vector" is used in computing to refer to an array, although tuples rather than vectors are more correctly the mathematical equivalent. Arrays are often used to implement tables, especially lookup tables; the word table is sometimes used as a synonym of array. Arrays are among the oldest and most important data structures, and are used by almost every program. They are also used to implement many other data structures, such as lists and strings. They effectively exploit the addressing logic of computers. In most modern computers and many external storage devices, the memory is a one-dimensional array of words, whose indices are their addresses. Processors, especially vector processors, are often optimized for array operations. Arrays are useful mostly because the element indices can be computed at run time. Among other things, this feature allows a single iterative statement to process arbitrarily many elements of an array. For that reason, the elements of an array data structure are required to have the same size and should use the same data representation. The set of valid index tuples and the addresses of the elements (and hence the element addressing formula) are usually,[2] but not always, fixed while the array is in use. 
The term array is often used to mean array data type, a kind of data type provided by most high-level programming languages that consists of a collection of values or variables that can be selected by one or more indices computed at run-time. Array types are often implemented by array structures; however, in some languages they may be implemented by hash tables, linked lists, search trees, or other data structures. The term is also used, especially in the description of algorithms, to mean associative array or "abstract array", a theoretical computer science model (an abstract data type or ADT) intended to capture the essential properties of arrays.

History The first digital computers used machine-language programming to set up and access array structures for data tables, vector and matrix computations, and for many other purposes. Von Neumann wrote the first array-sorting program (merge sort) in 1945, during the building of the first stored-program computer.[3]p. 159 Array indexing was originally done by self-modifying code, and later using index registers and indirect addressing. Some mainframes designed in the 1960s, such as the Burroughs B5000 and its successors, used memory segmentation to perform index-bounds checking in hardware. Assembly languages generally have no special support for arrays, other than what the machine itself provides. The earliest high-level programming languages, including FORTRAN (1957), COBOL (1960), and ALGOL 60 (1960), had support for multi-dimensional arrays, and so has C (1972). In C++ (1983), class templates exist for multi-dimensional arrays whose dimension is fixed at runtime as well as for runtime-flexible arrays.


Applications Arrays are used to implement mathematical vectors and matrices, as well as other kinds of rectangular tables. Many databases, small and large, consist of (or include) one-dimensional arrays whose elements are records. Arrays are used to implement other data structures, such as heaps, hash tables, deques, queues, stacks, strings, and VLists. One or more large arrays are sometimes used to emulate in-program dynamic memory allocation, particularly memory pool allocation. Historically, this has sometimes been the only way to allocate "dynamic memory" portably. Arrays can be used to determine partial or complete control flow in programs, as a compact alternative to (otherwise repetitive) multiple IF statements. They are known in this context as control tables and are used in conjunction with a purpose built interpreter whose control flow is altered according to values contained in the array. The array may contain subroutine pointers (or relative subroutine numbers that can be acted upon by SWITCH statements) that direct the path of the execution.

Array element identifier and addressing formulas

When data objects are stored in an array, individual objects are selected by an index that is usually a non-negative scalar integer. Indices are also called subscripts. An index maps the array value to a stored object. There are three ways in which the elements of an array can be indexed:
• 0 (zero-based indexing): The first element of the array is indexed by subscript of 0.
• 1 (one-based indexing): The first element of the array is indexed by subscript of 1.
• n (n-based indexing): The base index of an array can be freely chosen. Usually programming languages allowing n-based indexing also allow negative index values, and other scalar data types such as enumerations or characters may be used as an array index.

Arrays can have multiple dimensions, thus it is not uncommon to access an array using multiple indices. For example, a two-dimensional array A with three rows and four columns might provide access to the element at the 2nd row and 4th column by the expression A[1, 3] (in a row major language) or A[3, 1] (in a column major language) in the case of a zero-based indexing system. Thus two indices are used for a two-dimensional array, three for a three-dimensional array, and n for an n-dimensional array. The number of indices needed to specify an element is called the dimension, dimensionality, or rank of the array.

In standard arrays, each index is restricted to a certain range of consecutive integers (or consecutive values of some enumerated type), and the address of an element is computed by a "linear" formula on the indices.

One-dimensional arrays

A one-dimensional array (or single dimension array) is a type of linear array. Accessing its elements involves a single subscript, which can represent either a row or a column index. As an example, consider the C declaration

    int anArrayName[10];

which follows the general syntax

    datatype anArrayname[sizeofArray];

In the given example the array can contain 10 elements of any value available to the int type. In C, the array element indices are 0-9 inclusive in this case. For example, the expressions anArrayName[0] and anArrayName[9] are the first and last elements respectively.

For a vector with linear addressing, the element with index i is located at the address B + c · i, where B is a fixed base address and c a fixed constant, sometimes called the address increment or stride. If the valid element indices begin at 0, the constant B is simply the address of the first element of the array. For this reason, the C programming language specifies that array indices always begin at 0; and many programmers will call that element "zeroth" rather than "first".

However, one can choose the index of the first element by an appropriate choice of the base address B. For example, if the array has five elements, indexed 1 through 5, and the base address B is replaced by B − 30c, then the indices of those same elements will be 31 to 35. If the numbering does not start at 0, the constant B may not be the address of any element.
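The linear addressing formula B + c · i is easy to check numerically. The sketch below reuses the figures from the introduction of this article (ten 4-byte integers starting at address 2000); the function name element_address is hypothetical.

```python
def element_address(base, stride, index):
    """Address of the element with the given index: B + c * i."""
    return base + stride * index

# Zero-based indexing: B is the address of the first element.
zero_based = [element_address(2000, 4, i) for i in range(10)]

# One-based indexing over the same storage: the base must move back by
# one stride, so B = 1996 is no longer the address of any element.
one_based = [element_address(1996, 4, i) for i in range(1, 11)]
```

Both lists contain the same addresses, 2000 through 2036 in steps of 4.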

Multidimensional arrays

For a two-dimensional array, the element with indices i,j would have address B + c · i + d · j, where the coefficients c and d are the row and column address increments, respectively. More generally, in a k-dimensional array, the address of an element with indices i1, i2, …, ik is

    B + c1 · i1 + c2 · i2 + … + ck · ik.

For example:

    int a[3][2];

This means that array a has 3 rows and 2 columns, and that its elements are of integer type. It holds 6 elements, which are stored linearly, starting with the first row and continuing with the second and third rows. The above array will be stored as a11, a12, a21, a22, a31, a32.

The addressing formula requires only k multiplications and k additions, for any array that can fit in memory. Moreover, if any coefficient is a fixed power of 2, the multiplication can be replaced by bit shifting. The coefficients ck must be chosen so that every valid index tuple maps to the address of a distinct element.

If the minimum legal value for every index is 0, then B is the address of the element whose indices are all zero. As in the one-dimensional case, the element indices may be changed by changing the base address B. Thus, if a two-dimensional array has rows and columns indexed from 1 to 10 and 1 to 20, respectively, then replacing B by B + c1 − 3 c2 will cause them to be renumbered from 0 through 9 and 4 through 23, respectively. Taking advantage of this feature, some languages (like FORTRAN 77) specify that array indices begin at 1, as in mathematical tradition; while other languages (like Fortran 90, Pascal and Algol) let the user choose the minimum value for each index.
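The general formula can be sketched as follows. The helper names are hypothetical, and the 4-byte element size is an assumption standing in for the size of a C int on a typical platform.

```python
def row_major_strides(shape, element_size):
    """Increments c1..ck for a contiguous row-major layout:
    the last index varies fastest."""
    strides = []
    step = element_size
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return list(reversed(strides))

def address(base, strides, indices):
    """B + c1*i1 + c2*i2 + ... + ck*ik."""
    return base + sum(c * i for c, i in zip(strides, indices))

# Layout of a 3 x 2 array of 4-byte elements, as in "int a[3][2]".
strides = row_major_strides((3, 2), 4)
offsets = [address(0, strides, (i, j)) for i in range(3) for j in range(2)]
```

Here strides is [8, 4]: moving to the next row skips two elements, moving to the next column skips one. Visiting the elements row by row therefore yields the offsets 0, 4, 8, 12, 16, 20.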

Dope vectors

The addressing formula is completely defined by the dimension k, the base address B, and the increments c1, c2, …, ck. It is often useful to pack these parameters into a record called the array's descriptor or stride vector or dope vector. The size of each element, and the minimum and maximum values allowed for each index may also be included in the dope vector. The dope vector is a complete handle for the array, and is a convenient way to pass arrays as arguments to procedures. Many useful array slicing operations (such as selecting a sub-array, swapping indices, or reversing the direction of the indices) can be performed very efficiently by manipulating the dope vector.
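A minimal dope vector might be modelled as below. The class and its field names are assumptions made for this sketch; strides are counted in elements rather than bytes, and the row method handles only the two-dimensional case.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DopeVector:
    base: int        # offset of the element whose indices are all zero
    shape: tuple     # number of valid values for each index
    strides: tuple   # increments c1..ck, counted in elements

    def offset(self, *indices):
        """Position of an element in the underlying flat storage."""
        return self.base + sum(c * i for c, i in zip(self.strides, indices))

    def transposed(self):
        """Swap the indices without touching the stored elements."""
        return DopeVector(self.base, self.shape[::-1], self.strides[::-1])

    def row(self, i):
        """Select row i of a two-dimensional array as a 1-D sub-array."""
        return DopeVector(self.offset(i, 0), self.shape[1:], self.strides[1:])

storage = [1, 2, 3, 4, 5, 6]     # a 2 x 3 matrix in row-major order
m = DopeVector(base=0, shape=(2, 3), strides=(3, 1))
t = m.transposed()               # a 3 x 2 view of the same storage
second_row = m.row(1)
```

Transposing and slicing create new descriptors only; storage itself is never copied or rearranged.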

Compact layouts

Often the coefficients are chosen so that the elements occupy a contiguous area of memory. However, that is not necessary. Even if arrays are always created with contiguous elements, some array slicing operations may create non-contiguous sub-arrays from them.

There are two systematic compact layouts for a two-dimensional array. For example, consider the matrix

\[
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}
\]

In the row-major order layout (adopted by C for statically declared arrays), the elements in each row are stored in consecutive positions and all of the elements of a row have a lower address than any of the elements of a consecutive row:

    1 2 3 4 5 6 7 8 9

In column-major order (traditionally used by Fortran), the elements in each column are consecutive in memory and all of the elements of a column have a lower address than any of the elements of a consecutive column:

    1 4 7 2 5 8 3 6 9
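Both orders can be reproduced by flattening a nested list. The sketch takes the 3 × 3 matrix whose rows are 1 2 3, 4 5 6 and 7 8 9, which is the matrix implied by the two listings above.

```python
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

# Row-major: walk each row in turn; the column index varies fastest.
row_major = [matrix[i][j] for i in range(3) for j in range(3)]

# Column-major: walk each column in turn; the row index varies fastest.
column_major = [matrix[i][j] for j in range(3) for i in range(3)]
```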

For arrays with three or more indices, "row major order" puts in consecutive positions any two elements whose index tuples differ only by one in the last index. "Column major order" is analogous with respect to the first index. In systems which use processor cache or virtual memory, scanning an array is much faster if successive elements are stored in consecutive positions in memory, rather than sparsely scattered. Many algorithms that use multidimensional arrays will scan them in a predictable order. A programmer (or a sophisticated compiler) may use this information to choose between row- or column-major layout for each array. For example, when computing the product A·B of two matrices, it would be best to have A stored in row-major order, and B in column-major order.

Array resizing Static arrays have a size that is fixed when they are created and consequently do not allow elements to be inserted or removed. However, by allocating a new array and copying the contents of the old array to it, it is possible to effectively implement a dynamic version of an array; see dynamic array. If this operation is done infrequently, insertions at the end of the array require only amortized constant time. Some array data structures do not reallocate storage, but do store a count of the number of elements of the array in use, called the count or size. This effectively makes the array a dynamic array with a fixed maximum size or capacity; Pascal strings are examples of this.
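The allocate-and-copy scheme can be sketched as a small class. Doubling the capacity is one common growth policy, assumed here for illustration; the copies counter is not part of the technique and exists only to make the amortized cost visible.

```python
class DynamicArray:
    """Growable array built on fixed-size blocks of storage."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._data = [None] * self._capacity   # the fixed-size block
        self.copies = 0                        # elements moved so far

    def append(self, value):
        if self._size == self._capacity:       # block is full
            self._grow(2 * self._capacity)
        self._data[self._size] = value
        self._size += 1

    def _grow(self, new_capacity):
        """Allocate a larger block and copy the old contents into it."""
        new_data = [None] * new_capacity
        for i in range(self._size):
            new_data[i] = self._data[i]
            self.copies += 1
        self._data = new_data
        self._capacity = new_capacity

    def __len__(self):
        return self._size

    def __getitem__(self, i):
        if not 0 <= i < self._size:
            raise IndexError(i)
        return self._data[i]

arr = DynamicArray()
for n in range(1000):
    arr.append(n)
```

Because the capacity doubles, reallocations become rarer as the array grows, and the total number of copied elements stays below twice the number of appends.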

Non-linear formulas More complicated (non-linear) formulas are occasionally used. For a compact two-dimensional triangular array, for instance, the addressing formula is a polynomial of degree 2.
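For instance, a lower-triangular array stored row by row, with row i holding i + 1 elements, can use the degree-2 formula i(i + 1)/2 + j. The zero-based indices and the function name tri_index in this sketch are assumptions.

```python
def tri_index(i, j):
    """Offset of element (i, j), with 0 <= j <= i, in a packed
    lower-triangular array: rows 0..i-1 hold i*(i+1)/2 elements."""
    if not 0 <= j <= i:
        raise IndexError((i, j))
    return i * (i + 1) // 2 + j

n = 4
packed = [0] * (n * (n + 1) // 2)     # 10 cells instead of 16
for i in range(n):
    for j in range(i + 1):
        packed[tri_index(i, j)] = 10 * i + j
```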

Efficiency

Both store and select take (deterministic worst case) constant time. Arrays take linear (O(n)) space in the number of elements n that they hold.

In an array with element size k and on a machine with a cache line size of B bytes, iterating through an array of n elements requires a minimum of ceiling(nk/B) cache misses, because its elements occupy contiguous memory locations. This is roughly a factor of B/k better than the number of cache misses needed to access n elements at random memory locations. As a consequence, sequential iteration over an array is noticeably faster in practice than iteration over many other data structures, a property called locality of reference. (This does not mean, however, that using a perfect hash or trivial hash within the same (local) array will not be even faster, and achievable in constant time.)

Libraries provide low-level optimized facilities for copying ranges of memory (such as memcpy) which can be used to move contiguous blocks of array elements significantly faster than can be achieved through individual element access. The speedup of such optimized routines varies by array element size, architecture, and implementation.

Memory-wise, arrays are compact data structures with no per-element overhead. There may be a per-array overhead, e.g. to store index bounds, but this is language-dependent. It can also happen that elements stored in an array require less memory than the same elements stored in individual variables, because several array elements can be stored in a single word; such arrays are often called packed arrays. An extreme (but commonly used) case is the bit array, where every bit represents a single element. A single octet can thus hold up to 256 different combinations of up to 8 different conditions, in the most compact form.

Array accesses with statically predictable access patterns are a major source of data parallelism.
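The bit array mentioned above can be sketched with a few helper functions. The names bit_set, bit_clear and bit_test are hypothetical, and the sketch assumes the bits are packed into an array of unsigned char with the lowest-numbered bit in the least significant position.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helpers for a bit array packed into unsigned chars.
   Element i lives in byte i / CHAR_BIT at bit position i % CHAR_BIT. */
void bit_set(unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    bits[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

void bit_clear(unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    bits[i / CHAR_BIT] &= (unsigned char)~(1u << (i % CHAR_BIT));
}

/* Returns 1 if element i is set, 0 otherwise. */
int bit_test(const unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    return (bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1u;
}
```

With 8-bit bytes, sixteen boolean conditions fit into two bytes instead of sixteen separate variables.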

Efficiency comparison with other data structures

Operation | Linked list | Array | Dynamic array | Balanced tree | Random access list
Indexing | Θ(n) | Θ(1) | Θ(1) | Θ(log n) | Θ(log n)
Insert/delete at beginning | Θ(1) | N/A | Θ(n) | Θ(log n) | Θ(1)
Insert/delete at end | Θ(n) if last element is unknown; Θ(1) if last element is known | N/A | Θ(1) amortized | Θ(log n) | Θ(log n) updating
Insert/delete in middle | search time + Θ(1) [4][5][6] | N/A | Θ(n) | Θ(log n) | Θ(log n) updating
Wasted space (average) | Θ(n) | 0 | Θ(n) | Θ(n) | Θ(n)

Growable arrays are similar to arrays but add the ability to insert and delete elements; adding and deleting at the end is particularly efficient. However, they reserve linear (Θ(n)) additional storage, whereas arrays do not reserve additional storage. Associative arrays provide a mechanism for array-like functionality without huge storage overheads when the index values are sparse. For example, an array that contains values only at indexes 1 and 2 billion may benefit from using such a structure. Specialized associative arrays with integer keys include Patricia tries, Judy arrays, and van Emde Boas trees. Balanced trees require O(log n) time for indexed access, but also permit inserting or deleting elements in O(log n) time,[7] whereas growable arrays require linear (Θ(n)) time to insert or delete elements at an arbitrary position. Linked lists allow constant time removal and insertion in the middle but take linear time for indexed access. Their memory use is typically worse than arrays, but is still linear. An Iliffe vector is an alternative to a multidimensional array structure. It uses a one-dimensional array of references to arrays of one dimension less. For two dimensions, in particular, this alternative structure would be a vector of pointers to vectors, one for each row. Thus an element in row i and column j of an array A would be accessed by double indexing (A[i][j] in typical notation). This alternative structure allows ragged or jagged arrays, where each row may have a different size — or, in general, where the valid range of each index depends on the values of all preceding indices. It also saves one multiplication (by the column address increment) replacing it by a bit shift (to index the vector of row pointers) and one extra memory access (fetching the row address), which may be worthwhile in some architectures.
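The Iliffe vector described above, in its two-dimensional form, can be sketched in C as a vector of row pointers. The function names jagged_new and jagged_free are hypothetical, and the sketch assumes every row length is at least 1.

```c
#include <assert.h>
#include <stdlib.h>

/* Build a jagged two-dimensional array: rows[i] points to a separately
   allocated, zero-initialized row of lens[i] ints. Returns NULL on
   allocation failure. Assumes nrows > 0 and every lens[i] > 0. */
int **jagged_new(size_t nrows, const size_t *lens)
{
    assert(nrows > 0 && lens != NULL);
    int **rows = calloc(nrows, sizeof *rows);
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < nrows; i++) {
        assert(lens[i] > 0);
        rows[i] = calloc(lens[i], sizeof **rows);
        if (rows[i] == NULL) {
            /* Undo the rows allocated so far before giving up. */
            while (i > 0)
                free(rows[--i]);
            free(rows);
            return NULL;
        }
    }
    return rows;
}

/* Free every row, then the vector of row pointers itself. */
void jagged_free(int **rows, size_t nrows)
{
    for (size_t i = 0; i < nrows; i++)
        free(rows[i]);
    free(rows);
}
```

An element is then reached by double indexing, a[i][j]: one memory access fetches the row address and a second fetches the element, with no multiplication by a row length.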


Meaning of dimension The dimension of an array is the number of indices needed to select an element. Thus, if the array is seen as a function on a set of possible index combinations, it is the dimension of the space of which its domain is a discrete subset. Thus a one-dimensional array is a list of data, a two-dimensional array a rectangle of data, a three-dimensional array a block of data, etc. This should not be confused with the dimension of the set of all matrices with a given domain, that is, the number of elements in the array. For example, an array with 5 rows and 4 columns is two-dimensional, but such matrices form a 20-dimensional space. Similarly, a three-dimensional vector can be represented by a one-dimensional array of size three.

References
[1] David R. Richardson (2002), The Book on Data Structures. iUniverse, 112 pages. ISBN 0-595-24039-9, ISBN 978-0-595-24039-5.
[2] T. Veldhuizen. Arrays in Blitz++. In Proc. of the 2nd Int. Conf. on Scientific Computing in Object-Oriented Parallel Environments (ISCOPE), LNCS 1505, pages 223-220. Springer, 1998.
[3] Donald Knuth, The Art of Computer Programming, vol. 3. Addison-Wesley.
[4] Gerald Kruse. CS 240 Lecture Notes (http://www.juniata.edu/faculty/kruse/cs240/syllabus.htm): Linked Lists Plus: Complexity Trade-offs (http://www.juniata.edu/faculty/kruse/cs240/linkedlist2.htm). Juniata College. Spring 2008.
[5] Day 1 Keynote - Bjarne Stroustrup: C++11 Style (http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/Keynote-Bjarne-Stroustrup-Cpp11-Style) at GoingNative 2012 on channel9.msdn.com, from minute 45 or foil 44.
[6] Number crunching: Why you should never, ever, EVER use linked-list in your code again (http://kjellkod.wordpress.com/2012/02/25/why-you-should-never-ever-ever-use-linked-list-in-your-code-again/) at kjellkod.wordpress.com.
[7] Counted B-Tree (http://www.chiark.greenend.org.uk/~sgtatham/algorithms/cbtree.html).

FIFO

FIFO is an acronym for First In, First Out, a method for organizing and manipulating a data buffer, or data stack, where the oldest entry, or 'bottom' of the stack, is processed first. It is analogous to processing a queue with first-come, first-served (FCFS) behaviour, where the people leave the queue in the order in which they arrive. FCFS is also the jargon term for the FIFO operating system scheduling algorithm, which gives every process CPU time in the order in which it is demanded. FIFO's opposite is LIFO, Last-In-First-Out, where the youngest entry or 'top of the stack' is processed first. A priority queue is neither FIFO nor LIFO but may adopt similar behaviour temporarily or by default. Queueing theory encompasses these methods for processing data structures, as well as interactions between strict-FIFO queues.

Computer science

Data structure

A typical data structure in the C++ language will look like

    #include <cstddef>   // NULL

    // value_type is assumed to be defined elsewhere (e.g. typedef int value_type;).
    struct fifo_node
    {
        fifo_node *next;
        value_type value;
    };

    class fifo
    {
        fifo_node *front;
        fifo_node *back;

    public:
        fifo() : front(NULL), back(NULL) {}

        // Detach and return the oldest node, or NULL if the queue is empty.
        // The caller takes ownership of the returned node.
        fifo_node *dequeue()
        {
            fifo_node *tmp = front;
            if (front != NULL)
                front = front->next;
            if (front == NULL)
                back = NULL;
            return tmp;
        }

        // Append a new node holding value at the back of the queue.
        void queue(value_type value)
        {
            fifo_node *tempNode = new fifo_node;
            tempNode->value = value;
            tempNode->next = NULL;
            if (front == NULL)
            {
                front = tempNode;
                back = tempNode;
            }
            else
            {
                back->next = tempNode;
                back = tempNode;
            }
        }
    };

(For information on the abstract data structure, see Queue. For details of a common implementation, see Circular buffer.)

Popular Unix systems include a sys/queue.h C/C++ header file which provides macros usable by applications which need to create FIFO queues.

Head or tail first Controversy over the terms "head" and "tail" exists in reference to FIFO queues. To many people, items should enter a queue at the tail, remain in the queue until they reach the head and leave the queue from there. This point of view is justified by analogy with queues of people waiting for some kind of service and parallels the use of "front" and "back" in the above example. Other people believe that objects enter a queue at the head and leave at the tail, in the manner of food passing through a snake. Queues written in that way appear in places that might be considered authoritative, such as the GNU/Linux operating system.


Pipes In computing environments that support the pipes and filters model for interprocess communication, a FIFO is another name for a named pipe.

Disk scheduling Disk controllers can use the FIFO as a disk scheduling algorithm to determine the order to service disk I/O requests.

Communications and networking

Communications bridges, switches and routers used in computer networks use FIFOs to hold data packets en route to their next destination. Typically at least one FIFO structure is used per network connection. Some devices feature multiple FIFOs for simultaneously and independently queuing different types of information.

Electronics

FIFOs are commonly used in electronic circuits for buffering and flow control, in implementations ranging from hardware to software. In its hardware form, a FIFO primarily consists of a set of read and write pointers, storage and control logic. Storage may be SRAM, flip-flops, latches or any other suitable form of storage. For FIFOs of non-trivial size, a dual-port SRAM is usually used, where one port is dedicated to writing and the other to reading.

A synchronous FIFO is a FIFO where the same clock is used for both reading and writing. An asynchronous FIFO uses different clocks for reading and writing. Asynchronous FIFOs introduce metastability issues. A common implementation of an asynchronous FIFO uses a Gray code (or any unit distance code) for the read and write pointers to ensure reliable flag generation. One further note concerning flag generation is that one must necessarily use pointer arithmetic to generate flags for asynchronous FIFO implementations. Conversely, one may use either a "leaky bucket" approach or pointer arithmetic to generate flags in synchronous FIFO implementations.

Examples of FIFO status flags include: full, empty, almost full, almost empty, etc.

The first known FIFO implemented in electronics was done by Peter Alfke in 1969 at Fairchild Semiconductor. Peter Alfke was later a Director at Xilinx.

FIFO full/empty

A hardware FIFO is used for synchronization purposes. It is often implemented as a circular queue, and thus has two pointers:
1. Read Pointer/Read Address Register
2. Write Pointer/Write Address Register

Read and write addresses are initially both at the first memory location and the FIFO queue is Empty.

FIFO Empty: When the read address register reaches the write address register, the FIFO triggers the Empty signal.
FIFO FULL: When the write address register reaches the read address register, the FIFO triggers the FULL signal.

In both cases, the read and write addresses end up being equal. To distinguish between the two situations, a simple and robust solution is to add one extra bit for each read and write address which is inverted each time the address wraps. With this setup, the conditions are:

FIFO Empty: When the read address register equals the write address register (including the extra bit), the FIFO is empty.
FIFO FULL: When the read address LSBs equal the write address LSBs and the extra MSBs are different, the FIFO is full.
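The extra-bit scheme can be modelled in software as a rough sketch. The type HwFifo, the function names and the depth of 8 entries are assumptions made for this example; real hardware would implement the same comparisons in logic rather than in C.

```c
#include <assert.h>
#include <stdbool.h>

#define ADDR_BITS 3u                    /* assumed: 8-entry FIFO */
#define DEPTH     (1u << ADDR_BITS)
#define PTR_MASK  ((DEPTH << 1) - 1u)   /* address bits plus one wrap bit */

typedef struct {
    unsigned rd;     /* read pointer, ADDR_BITS + 1 bits wide */
    unsigned wr;     /* write pointer, ADDR_BITS + 1 bits wide */
    int mem[DEPTH];
} HwFifo;

/* Empty: the pointers are equal, wrap bit included. */
bool fifo_empty(const HwFifo *f) { return f->rd == f->wr; }

/* Full: address bits equal, wrap bits different. */
bool fifo_full(const HwFifo *f) { return (f->rd ^ f->wr) == DEPTH; }

bool fifo_write(HwFifo *f, int x)
{
    if (fifo_full(f))
        return false;
    f->mem[f->wr & (DEPTH - 1u)] = x;
    /* Incrementing the wider pointer flips the wrap bit on each wrap. */
    f->wr = (f->wr + 1u) & PTR_MASK;
    return true;
}

bool fifo_read(HwFifo *f, int *out)
{
    if (fifo_empty(f))
        return false;
    *out = f->mem[f->rd & (DEPTH - 1u)];
    f->rd = (f->rd + 1u) & PTR_MASK;
    return true;
}
```

Because the wrap bit is simply the most significant bit of a pointer that is one bit wider than the address, no separate toggle logic is needed in this model.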

Notes and references
• Cummings et al., Simulation and Synthesis Techniques for Asynchronous FIFO Design with Asynchronous Pointer Comparisons, SNUG San Jose 2002 (http://www.sunburst-design.com/papers/CummingsSNUG2002SJ_FIFO2.pdf)
• Ronen Perry & Tal Zarsky, Queues in Law (http://ssrn.com/abstract=2147333), Iowa Law Review (August 10, 2012)

Queue (abstract data type)

In computer science, a queue (/ˈkjuː/ KEW) is a particular kind of abstract data type or collection in which the entities in the collection are kept in order and the principal (or only) operations on the collection are the addition of entities to the rear terminal position, known as enqueue, and removal of entities from the front terminal position, known as dequeue. This makes the queue a First-In-First-Out (FIFO) data structure. In a FIFO data structure, the first element added to the queue will be the first one to be removed. This is equivalent to the requirement that once a new element is added, all elements that were added before have to be removed before the new element can be removed. Often a peek or front operation is also provided, returning the value of the front element without dequeuing it. A queue is an example of a linear data structure, or more abstractly a sequential collection.

[Figure: Representation of a queue with the FIFO (First In First Out) property]

Queues provide services in computer science, transport, and operations research where various entities such as data, objects, persons, or events are stored and held to be processed later. In these contexts, the queue performs the function of a buffer. Queues are common in computer programs, where they are implemented as data structures coupled with access routines, as an abstract data structure or in object-oriented languages as classes. Common implementations are circular buffers and linked lists.

Queue implementation

Theoretically, one characteristic of a queue is that it does not have a specific capacity. Regardless of how many elements are already contained, a new element can always be added. It can also be empty, at which point removing an element will be impossible until a new element has been added again.

Fixed length arrays are limited in capacity, but it is not true that items need to be copied towards the head of the queue. The simple trick of turning the array into a closed circle and letting the head and tail drift around endlessly in that circle makes it unnecessary to ever move items stored in the array. If n is the size of the array, then computing indices modulo n will turn the array into a circle. This is still the conceptually simplest way to construct a queue in a high level language. It does admittedly slow things down a little, because the array indices must be compared to zero and the array size; the cost is comparable to the time taken to check whether an array index is out of bounds, which some languages do. Even so, this will certainly be the method of choice for a quick and dirty implementation, or for any high level language that does not have pointer syntax. The array size must be declared ahead of time, but some implementations simply double the declared array size when overflow occurs. Most modern languages with objects or pointers can implement or come with libraries for dynamic lists. Such data structures may have no specified fixed capacity limit besides memory constraints. Queue overflow results from trying to add an element onto a full queue and queue underflow happens when trying to remove an element from an empty queue.

A bounded queue is a queue limited to a fixed number of items.

There are several efficient implementations of FIFO queues. An efficient implementation is one that can perform the operations (enqueuing and dequeuing) in O(1) time.
• Linked list
  • A doubly linked list has O(1) insertion and deletion at both ends, so is a natural choice for queues.
  • A regular singly linked list only has efficient insertion and deletion at one end. However, a small modification (keeping a pointer to the last node in addition to the first one) will enable it to implement an efficient queue.
• A deque implemented using a modified dynamic array
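The circular-array technique described above can be sketched as a bounded queue in C. The names Queue, enqueue and dequeue and the capacity QCAP are assumptions made for this example; the sketch tracks a head index and an element count rather than separate head and tail indices, which is one of several equivalent bookkeeping choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QCAP 4   /* assumed fixed capacity for the sketch */

typedef struct {
    int items[QCAP];
    size_t head;    /* index of the front element */
    size_t count;   /* number of stored elements */
} Queue;

/* Add x at the rear; returns false on overflow (queue full). */
bool enqueue(Queue *q, int x)
{
    assert(q != NULL);
    if (q->count == QCAP)
        return false;
    /* The rear position wraps around via the modulo operation. */
    q->items[(q->head + q->count) % QCAP] = x;
    q->count++;
    return true;
}

/* Remove the front element into *out; returns false on underflow. */
bool dequeue(Queue *q, int *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    return true;
}
```

Both operations run in O(1) time and never move stored items; only the head index and the count change.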

Queues and programming languages Queues may be implemented as a separate data type, or may be considered a special case of a double-ended queue (deque) and not implemented separately. For example, Perl and Ruby allow pushing and popping an array from both ends, so one can use push and shift functions to enqueue and dequeue a list (or, in reverse, one can use unshift and pop), although in some cases these operations are not efficient. C++'s Standard Template Library provides a "queue" templated class which is restricted to only push/pop operations. Since J2SE5.0, Java's library contains a Queue [1] interface that specifies queue operations; implementing classes include LinkedList [2] and (since J2SE 1.6) ArrayDeque [3]. PHP has an SplQueue [4] class and third party libraries like beanstalk'd and Gearman.

References

General
• Donald Knuth. The Art of Computer Programming, Volume 1: Fundamental Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89683-4. Section 2.2.1: Stacks, Queues, and Deques, pp. 238–243.
• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 10.1: Stacks and queues, pp. 200–204.
• William Ford, William Topp. Data Structures with C++ and STL, Second Edition. Prentice Hall, 2002. ISBN 0-13-085850-1. Chapter 8: Queues and Priority Queues, pp. 386–390.
• Adam Drozdek. Data Structures and Algorithms in C++, Third Edition. Thomson Course Technology, 2005. ISBN 0-534-49182-0. Chapter 4: Stacks and Queues, pp. 137–169.

Citations
[1] http://download.oracle.com/javase/7/docs/api/java/util/Queue.html
[2] http://download.oracle.com/javase/7/docs/api/java/util/LinkedList.html
[3] http://download.oracle.com/javase/7/docs/api/java/util/ArrayDeque.html
[4] http://www.php.net/manual/en/class.splqueue.php


External links
• Queues with algo and 'c' programe (http://scanftree.com/Data_Structure/Queues)
• STL Quick Reference (http://www.halpernwightsoftware.com/stdlib-scratch/quickref.html#containers14)
• VBScript implementation of stack, queue, deque, and Red-Black Tree (http://www.ludvikjerabek.com/downloads.html)
• Paul E. Black, Bounded queue (http://www.nist.gov/dads/HTML/boundedqueue.html) at the NIST Dictionary of Algorithms and Data Structures.

LIFO

LIFO may refer to:

Queues
• FIFO and LIFO accounting
• LIFO (computing)
• LIFO (education), a layoff policy

Other
• LIFO (magazine), a magazine published in Greece

Stack (abstract data type) In computer science, a stack is a particular kind of abstract data type or collection in which the principal (or only) operations on the collection are the addition of an entity to the collection, known as push and removal of an entity, known as pop. The relation between the push and pop operations is such that the stack is a Last-In-First-Out (LIFO) data structure. In a LIFO data structure, the last element added to the structure must be the first one to be removed. This is equivalent to the requirement that, considered as a linear data structure, or more abstractly a sequential collection, the push and pop operations occur only at one end of the structure, referred to as the top of the stack. Often a peek or top operation is also implemented, returning the value of the top element without removing it.

Simple representation of a stack

A stack may be implemented to have a bounded capacity. If the stack is full and does not contain enough space to accept an entity to be pushed, the stack is then considered to be in an overflow state. The pop operation removes an item from the top of the stack. A pop either reveals previously concealed items or results in an empty stack; if the stack is already empty, a pop puts it into an underflow state, which means no items are present in the stack to be removed.

A stack is a restricted data structure, because only a small number of operations are performed on it. The nature of the pop and push operations also means that stack elements have a natural order. Elements are removed from the stack in the reverse order to the order of their addition. Therefore, the lower elements are those that have been on the stack the longest.[1]

History

The stack was first proposed in 1946, in the computer design of Alan M. Turing (who used the terms "bury" and "unbury") as a means of calling and returning from subroutines. The Germans Klaus Samelson and Friedrich L. Bauer of Technical University Munich proposed the idea in 1955 and filed a patent in 1957. The same concept was developed, independently, by the Australian Charles Leonard Hamblin in the first half of 1957.[2]

Abstract definition

A stack is a basic computer science data structure and can be defined in an abstract, implementation-free manner, or it can be generally defined as a linear list of items in which all additions and deletions are restricted to one end, called the top.

This is a VDM (Vienna Development Method) description of a stack:[3]

Function signatures:

    init: -> Stack
    push: N x Stack -> Stack
    top: Stack -> (N U ERROR)
    pop: Stack -> Stack
    isempty: Stack -> Boolean

(where N indicates an element (natural numbers in this case), and U indicates set union)

Semantics:

    top(init()) = ERROR
    top(push(i, s)) = i
    pop(init()) = init()
    pop(push(i, s)) = s
    isempty(init()) = true
    isempty(push(i, s)) = false

Inessential operations In many implementations, a stack has more operations than "push" and "pop". An example is "top of stack", or "peek", which observes the top-most element without removing it from the stack.[4] Since this can be done with a "pop" and a "push" with the same data, it is not essential. An underflow condition can occur in the "stack top" operation if the stack is empty, the same as "pop". Often implementations have a function which just returns if the stack is empty.

Software stacks Implementation In most high level languages, a stack can be easily implemented either through an array or a linked list. What identifies the data structure as a stack in either case is not the implementation but the interface: the user is only allowed to pop or push items onto the array or linked list, with few other helper operations. The following will demonstrate both implementations, using C.

Array

The array implementation aims to create an array where the first element (usually at the zero-offset) is the bottom. That is, array[0] is the first element pushed onto the stack and the last element popped off. The program must keep track of the size, or the length of the stack. The stack itself can therefore be effectively implemented as a two-element structure in C:

    #include <stdio.h>
    #include <stdlib.h>

    #define STACKSIZE 100   /* capacity; this value is an assumption, the text does not fix one */

    typedef struct {
        size_t size;
        int items[STACKSIZE];
    } STACK;

A new stack must start with size set to 0 (for example, STACK s = {0};). The push() operation stores values to the stack. It is responsible for inserting (copying) the value into the ps->items[] array and for incrementing the element counter (ps->size). In a responsible C implementation, it is also necessary to check whether the array is already full to prevent an overrun.

    void push(STACK *ps, int x)
    {
        if (ps->size == STACKSIZE) {
            fputs("Error: stack overflow\n", stderr);
            abort();
        } else
            ps->items[ps->size++] = x;
    }

The pop() operation is responsible for removing a value from the stack, and decrementing the value of ps->size. A responsible C implementation will also need to check that the array is not already empty.

    int pop(STACK *ps)
    {
        if (ps->size == 0) {
            fputs("Error: stack underflow\n", stderr);
            abort();
        } else
            return ps->items[--ps->size];
    }

If we use a dynamic array, then we can implement a stack that can grow or shrink as much as needed. The size of the stack is simply the size of the dynamic array. A dynamic array is a very efficient implementation of a stack, since adding items to or removing items from the end of a dynamic array is amortized O(1) time.

Linked list

The linked-list implementation is equally simple and straightforward. In fact, a simple singly linked list is sufficient to implement a stack: it only requires that the head node or element can be removed, or popped, and a node can only be inserted by becoming the new head node. Unlike the array implementation, our structure typedef corresponds not to the entire stack structure, but to a single node:

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct stack {
        int data;
        struct stack *next;
    } STACK;

    /* Helper assumed by the listings below: a stack is empty when its head is NULL. */
    static int empty(const STACK *head) { return head == NULL; }

Such a node is identical to a typical singly linked list node, at least to those that are implemented in C. The push() operation both initializes an empty stack, and adds a new node to a non-empty one. It works by receiving a data value to push onto the stack, along with a target stack, creating a new node by allocating memory for it, and then inserting it into a linked list as the new head:

    void push(STACK **head, int value)
    {
        STACK *node = malloc(sizeof(STACK));   /* create a new node */

        if (node == NULL) {
            fputs("Error: no space available for node\n", stderr);
            abort();
        } else {                                /* initialize node */
            node->data = value;
            node->next = empty(*head) ? NULL : *head;   /* link to the old head, if any */
            *head = node;                       /* the new node becomes the head */
        }
    }

A pop() operation removes the head from the linked list, and assigns the pointer to the head to the previous second node. It checks whether the list is empty before popping from it:

    int pop(STACK **head)
    {
        if (empty(*head)) {                     /* stack is empty */
            fputs("Error: stack underflow\n", stderr);
            abort();
        } else {                                /* pop a node */
            STACK *top = *head;
            int value = top->data;
            *head = top->next;
            free(top);
            return value;
        }
    }

Stacks and programming languages

Some languages, like LISP and Python, do not call for stack implementations, since push and pop functions are available for any list. All Forth-like languages (such as Adobe PostScript) are also designed around language-defined stacks that are directly visible to and manipulated by the programmer. Examples from Common Lisp:

    (setf list (list 'a 'b 'c))   ;; ⇒ (A B C)
    (pop list)                    ;; ⇒ A
    list                          ;; ⇒ (B C)
    (push 'new list)              ;; ⇒ (NEW B C)

C++'s Standard Template Library provides a "stack" templated class which is restricted to only push/pop operations. Java's library contains a Stack [5] class that is a specialization of Vector [6]; this could be considered a design flaw, since the inherited get() method from Vector [6] ignores the LIFO constraint of the Stack [5]. PHP has an SplStack [7] class.

Hardware stacks A common use of stacks at the architecture level is as a means of allocating and accessing memory.

Basic architecture of a stack

A typical stack is an area of computer memory with a fixed origin and a variable size. Initially the size of the stack is zero. A stack pointer, usually in the form of a hardware register, points to the most recently referenced location on the stack; when the stack has a size of zero, the stack pointer points to the origin of the stack.

The two operations applicable to all stacks are:
• a push operation, in which a data item is placed at the location pointed to by the stack pointer, and the address in the stack pointer is adjusted by the size of the data item;
• a pop or pull operation: a data item at the current location pointed to by the stack pointer is removed, and the stack pointer is adjusted by the size of the data item.

There are many variations on the basic principle of stack operations. Every stack has a fixed location in memory at which it begins. As data items are added to the stack, the stack pointer is displaced to indicate the current extent of the stack, which expands away from the origin.

A typical stack, storing local data and call information for nested procedure calls (not necessarily nested procedures!). This stack grows downward from its origin. The stack pointer points to the current topmost datum on the stack. A push operation decrements the pointer and copies the data to the stack; a pop operation copies data from the stack and then increments the pointer. Each procedure called in the program stores procedure return information (in yellow) and local data (in other colors) by pushing them onto the stack. This type of stack implementation is extremely common, but it is vulnerable to buffer overflow attacks (see the text).

Stack pointers may point to the origin of a stack or to a limited range of addresses either above or below the origin (depending on the direction in which the stack grows); however, the stack pointer cannot cross the origin of the stack. In other words, if the origin of the stack is at address 1000 and the stack grows downwards (towards addresses 999, 998, and so on), the stack pointer must never be incremented beyond 1000 (to 1001, 1002, etc.). If a pop operation on the stack causes the stack pointer to move past the origin of the stack, a stack underflow occurs. If a push operation causes the stack pointer to increment or decrement beyond the maximum extent of the stack, a stack overflow occurs.

Some environments that rely heavily on stacks may provide additional operations, for example:
• Duplicate: the top item is popped, and then pushed again (twice), so that an additional copy of the former top item is now on top, with the original below it.
• Peek: the topmost item is inspected (or returned), but the stack pointer is not changed, and the stack size does not change (meaning that the item remains on the stack). This is also called top operation in many articles.
• Swap or exchange: the two topmost items on the stack exchange places.
• Rotate (or Roll): the n topmost items are moved on the stack in a rotating fashion. For example, if n=3, items 1, 2, and 3 on the stack are moved to positions 2, 3, and 1 on the stack, respectively. Many variants of this operation are possible, with the most common being called left rotate and right rotate.

Stacks are often visualized growing from the bottom up (like real-world stacks). They may also be visualized growing from left to right, so that "topmost" becomes "rightmost", or even growing from top to bottom. The important feature is that the top of the stack is in a fixed position. The image to the right is an example of a top to bottom growth visualization: the top (28) is the stack 'bottom', since the stack 'top' is where items are pushed or popped from. Sometimes stacks are also visualized metaphorically, such as coin holders or Pez dispensers.

A right rotate will move the first element to the third position, the second to the first and the third to the second. Here are two equivalent visualizations of this process:

    apple                          banana
    banana    ===right rotate==>   cucumber
    cucumber                       apple

    cucumber                       apple
    banana    ===left rotate==>    cucumber
    apple                          banana

A stack is usually represented in computers by a block of memory cells, with the "bottom" at a fixed location, and the stack pointer holding the address of the current "top" cell in the stack. The top and bottom terminology is used irrespective of whether the stack actually grows towards lower memory addresses or towards higher memory addresses.

Pushing an item on to the stack adjusts the stack pointer by the size of the item (either decrementing or incrementing, depending on the direction in which the stack grows in memory), pointing it to the next cell, and copies the new top item to the stack area. Depending again on the exact implementation, at the end of a push operation, the stack pointer may point to the next unused location in the stack, or it may point to the topmost item in the stack. If the stack pointer points to the current topmost item, it will be updated before a new item is pushed onto the stack; if it points to the next available location in the stack, it will be updated after the new item is pushed onto the stack.

Popping the stack is simply the inverse of pushing. The topmost item in the stack is removed and the stack pointer is updated, in the opposite order of that used in the push operation.
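One of the variants described above (a stack that grows towards lower addresses, with the stack pointer holding the location of the topmost item) can be modelled as a short C sketch. The type HwStack, the function names and the area size STACK_WORDS are assumptions made for this example; array indices stand in for memory addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_WORDS 4   /* assumed size of the stack area */

typedef struct {
    int mem[STACK_WORDS];
    size_t sp;  /* index of the current top item; STACK_WORDS is the origin */
} HwStack;

/* An empty stack has its pointer at the origin. */
void hw_init(HwStack *s) { s->sp = STACK_WORDS; }

bool hw_push(HwStack *s, int x)
{
    if (s->sp == 0)
        return false;           /* overflow: the area is exhausted */
    s->sp--;                    /* update the pointer first ... */
    s->mem[s->sp] = x;          /* ... then store the new top item */
    return true;
}

bool hw_pop(HwStack *s, int *out)
{
    if (s->sp == STACK_WORDS)
        return false;           /* underflow: pointer is at the origin */
    *out = s->mem[s->sp];       /* read the top item first ... */
    s->sp++;                    /* ... then update the pointer */
    return true;
}
```

Pop performs the two steps of push in the opposite order, which is the symmetry the text describes.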

Hardware support Stack in main memory Most CPUs have registers that can be used as stack pointers. Processor families like the x86, Z80, 6502, and many others have special instructions that implicitly use a dedicated (hardware) stack pointer to conserve opcode space. Some processors, like the PDP-11 and the 68000, also have special addressing modes for implementation of stacks, typically with a semi-dedicated stack pointer as well (such as A7 in the 68000). However, in most processors, several different registers may be used as additional stack pointers as needed (whether updated via addressing modes or via add/sub instructions).

Stack in registers or dedicated memory

The x87 floating point architecture is an example of a set of registers organised as a stack where direct access to individual registers (relative to the current top) is also possible. As with stack-based machines in general, having the top-of-stack as an implicit argument allows for a small machine code footprint with a good usage of bus bandwidth and code caches, but it also prevents some types of optimizations possible on processors permitting random access to the register file for all (two or three) operands. A stack structure also makes superscalar implementations with register renaming (for speculative execution) somewhat more complex to implement, although it is still feasible, as exemplified by modern x87 implementations.

Sun SPARC, AMD Am29000, and Intel i960 are all examples of architectures using register windows within a register-stack as another strategy to avoid the use of slow main memory for function arguments and return values.

There are also a number of small microprocessors that implement a stack directly in hardware, and some microcontrollers have a fixed-depth stack that is not directly accessible. Examples are the PIC microcontrollers, the Computer Cowboys MuP21, the Harris RTX line, and the Novix NC4016. Many stack-based microprocessors were used to implement the programming language Forth at the microcode level. Stacks were also used as a basis of a number of mainframes and mini computers. Such machines were called stack machines, the most famous being the Burroughs B5000.

Applications

Stacks have numerous applications. We see stacks in everyday life, from the books in our library to the blank sheets of paper in our printer tray. All of them follow the Last In First Out (LIFO) logic: when we add a book to a pile of books, we add it to the top of the pile, and when we remove a book from the pile, we generally remove it from the top of the pile. Given below are a few applications of stacks in the world of computers:

Converting a decimal number into a binary number

The logic for transforming a decimal number into a binary number is as follows:

1. Read a number
2. Iteration (while number is greater than zero)
   1. Find out the remainder after dividing the number by 2
   2. Print the remainder
   3. Divide the number by 2
3. End the iteration

Figure: Decimal to binary conversion of 23

However, there is a problem with this logic. Suppose the number whose binary form we want to find is 23. Using this logic, we get the result 11101 instead of 10111, because the remainders are produced starting with the least significant bit. To solve this problem, we use a stack and make use of its LIFO property. Instead of printing each binary digit directly, we push it onto the stack. After the entire number has been converted, we pop one digit at a time from the stack and print it. The digits therefore come out in the proper order.

Algorithm:


function outputInBinary(Integer n)
    Stack s = new Stack
    while n > 0 do
        Integer bit = n modulo 2
        s.push(bit)
        if s is full then
            return error
        end if
        n = floor(n / 2)
    end while
    while s is not empty do
        output(s.pop())
    end while
end function
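The pseudocode above can be sketched in Python as follows. This is only an illustration: the function name `to_binary` is an assumption, a plain list stands in for the stack, and the digits are returned as a string rather than printed one by one.

```python
def to_binary(n):
    """Return the binary digits of a non-negative integer as a string.

    Hypothetical helper: a Python list plays the role of the stack.
    """
    if n == 0:
        return "0"
    stack = []
    while n > 0:
        stack.append(n % 2)   # push the remainder (least significant bit first)
        n //= 2               # integer division by 2
    digits = []
    while stack:
        digits.append(str(stack.pop()))  # pop: most significant bit first
    return "".join(digits)


print(to_binary(23))  # prints 10111
```

Because a Python list grows as needed, the "stack is full" check of the pseudocode has no counterpart here.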

Towers of Hanoi

One of the most interesting applications of stacks can be found in solving a puzzle called Tower of Hanoi. According to an old Brahmin story, the existence of the universe is calculated in terms of the time taken by a number of monks, who are working all the time, to move 64 disks from one pole to another. There are some rules about how this should be done:

1. Move only one disk at a time.
2. For temporary storage, a third pole may be used.
3. A disk of larger diameter may not be placed on a disk of smaller diameter.

For the algorithm of this puzzle see Tower of Hanoi. Assume that A is the first tower, B is the second tower, and C is the third tower.

Figures: Tower of Hanoi; Towers of Hanoi example, steps 1-2, 3-4, 5-6 and 7-8

Output: (when there are 3 disks)

Let 1 be the smallest disk, 2 be the disk of medium size and 3 be the largest disk.

Move disk    From peg    To peg
1            A           C
2            A           B
1            C           B
3            A           C
1            B           A
2            B           C
1            A           C

The C++ code for this solution can be implemented in two ways:

First implementation (using stacks implicitly by recursion)

    #include <stdio.h>

    // Move n disks from tower a to tower b, using tower c as temporary storage.
    void TowersofHanoi(int n, int a, int b, int c)
    {
        if (n > 0)
        {
            TowersofHanoi(n-1, a, c, b);   // recursion
            printf("> Move top disk from tower %d to tower %d.\n", a, b);
            TowersofHanoi(n-1, c, b, a);   // recursion
        }
    }

Second implementation (using stacks explicitly)

    // Global variable, tower[1:3] are three towers.
    // arrayStack (a stack class with push, pop and top) and showState
    // are assumed to be defined elsewhere.
    arrayStack<int> tower[4];

    void moveAndShow(int n, int a, int b, int c);

    void TowerofHanoi(int n)
    {
        // Preprocessor for moveAndShow.
        for (int d = n; d > 0; d--)   // initialize
            tower[1].push(d);         // add disk d to tower 1
        // move n disks from tower 1 to tower 3 using tower 2 as intermediate tower
        moveAndShow(n, 1, 2, 3);
    }

    void moveAndShow(int n, int a, int b, int c)
    {
        // Move the top n disks from tower a to tower c showing states.
        // Use tower b for intermediate storage.
        if (n > 0)
        {
            moveAndShow(n-1, a, c, b);   // recursion
            int d = tower[a].top();      // move a disk from top of tower a
            tower[a].pop();              // to top of tower c
            tower[c].push(d);
            showState();                 // show state of 3 towers
            moveAndShow(n-1, b, a, c);   // recursion
        }
    }

However, the complexity of the implementations written above is O(2^n), because moving n disks takes 2^n - 1 moves. The problem can therefore only be solved in practice for small values of n. In the case of the monks, the number of moves needed to transfer 64 disks by following the above rules is 2^64 - 1 = 18,446,744,073,709,551,615, which will surely take a lot of time!
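The move table for three disks and the move count can be checked with a short Python sketch. The names `hanoi` and `solve` are assumptions made for this illustration, and each tower is modelled as a Python list used as a stack (bottom of the tower first).

```python
def hanoi(n, source, target, spare, towers, moves):
    """Move the top n disks from source to target, using spare as storage."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, towers, moves)
    disk = towers[source].pop()          # take the top disk of the source tower
    towers[target].append(disk)          # place it on top of the target tower
    moves.append((disk, source, target))
    hanoi(n - 1, spare, target, source, towers, moves)


def solve(n):
    """Return (moves, towers) after moving n disks from tower A to tower C."""
    towers = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    moves = []
    hanoi(n, "A", "C", "B", towers, moves)
    return moves, towers


moves, towers = solve(3)
for disk, src, dst in moves:
    print(disk, src, dst)
```

For n disks the list of moves has 2^n - 1 entries, which is where the exponential running time comes from.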

Expression evaluation and syntax parsing

Calculators employing reverse Polish notation use a stack structure to hold values. Expressions can be represented in prefix, postfix or infix notations and conversion from one form to another may be accomplished using a stack. Many compilers use a stack for parsing the syntax of expressions, program blocks etc. before translating into low level code. Most programming languages are context-free languages, allowing them to be parsed with stack based machines.

Evaluation of an infix expression that is fully parenthesized

Input: (((2 * 5) - (1 * 2)) / (11 - 9))
Output: 4
Analysis: Five types of input characters

1. Opening bracket
2. Numbers
3. Operators
4. Closing bracket
5. New line character

Data structure requirement: A character stack

Algorithm

1. Read one input character
2. Actions at end of each input
   (2.1) Opening bracket: push into stack and then go to step (1)
   (2.2) Number: push into stack and then go to step (1)
   (2.3) Operator: push into stack and then go to step (1)
   (2.4) Closing bracket: pop from character stack
       (2.4.1) If it is an opening bracket, then discard it and go to step (1)
       (2.4.2) Otherwise, pop is used four times in total:
               The first popped element is assigned to op2
               The second popped element is assigned to op
               The third popped element is assigned to op1
               The fourth popped element is the remaining opening bracket, which can be discarded
               Evaluate op1 op op2
               Convert the result into character and push into the stack
               Go to step (1)
   (2.5) New line character: pop from stack and print the answer; STOP
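The steps above can be sketched in Python. This is a minimal illustration under a few assumptions: the function name `evaluate_fully_parenthesized` and the table `OPS` are hypothetical, the input is split into tokens with a regular expression, and numbers are stored on the stack directly instead of being converted back to characters.

```python
import operator
import re

# Assumed operator table for this sketch; "/" is true division.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate_fully_parenthesized(expr):
    """Evaluate a fully parenthesized infix expression with one stack."""
    stack = []
    for token in re.findall(r"\d+|[()+\-*/]", expr):
        if token == ")":
            op2 = stack.pop()      # first pop: right operand
            op = stack.pop()       # second pop: operator
            op1 = stack.pop()      # third pop: left operand
            stack.pop()            # fourth pop: the matching "("
            stack.append(OPS[op](op1, op2))
        elif token.isdigit():
            stack.append(int(token))
        else:
            stack.append(token)    # "(" or an operator
    return stack.pop()


print(evaluate_fully_parenthesized("(((2 * 5) - (1 * 2)) / (11 - 9))"))
```

The sketch assumes well-formed input; a malformed expression would raise an exception when popping from an empty stack.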

Result: The evaluation of the fully parenthesized infix expression is printed as follows:

Input String: (((2 * 5) - (1 * 2)) / (11 - 9))

Input Symbol    Stack (from bottom to top)    Operation
(               (
(               ( (
(               ( ( (
2               ( ( ( 2
*               ( ( ( 2 *
5               ( ( ( 2 * 5
)               ( ( 10                        2 * 5 = 10 & Push
-               ( ( 10 -
(               ( ( 10 - (
1               ( ( 10 - ( 1
*               ( ( 10 - ( 1 *
2               ( ( 10 - ( 1 * 2
)               ( ( 10 - 2                    1 * 2 = 2 & Push
)               ( 8                           10 - 2 = 8 & Push
/               ( 8 /
(               ( 8 / (
11              ( 8 / ( 11
-               ( 8 / ( 11 -
9               ( 8 / ( 11 - 9
)               ( 8 / 2                       11 - 9 = 2 & Push
)               4                             8 / 2 = 4 & Push
New line        Empty                         Pop & Print


Evaluation of infix expression which is not fully parenthesized

Input: (2 * 5 - 1 * 2) / (11 - 9)
Output: 4
Analysis: There are five types of input characters which are:

1. Opening parentheses
2. Numbers
3. Operators
4. Closing parentheses
5. New line character (\n)

We do not know what to do if an operator is read as an input character. By implementing the priority rule for operators, we have a solution to this problem.

The priority rule: when an operator is read, we perform a comparative priority check before pushing it. If the top of the character stack contains an operator of priority higher than or equal to the priority of the input operator, then we pop it and apply it to the top two numbers of the integer stack. We keep on performing the priority check until the top of the stack either contains an operator of lower priority or does not contain an operator. Then we push the input operator.

Data structure requirement for this problem: a character stack and an integer stack

Algorithm:

1. Read an input character
2. Actions that will be performed at the end of each input
   (2.1) Opening parentheses: push it into character stack and then go to step (1)
   (2.2) Number: push into integer stack, go to step (1)
   (2.3) Operator: do the comparative priority check
       (2.3.1) If the character stack's top contains an operator with equal or higher priority, then pop it into op
               Pop a number from integer stack into op2
               Pop another number from integer stack into op1
               Calculate op1 op op2 and push the result into the integer stack
               Repeat the check (2.3.1)
       (2.3.2) Otherwise, push the input operator into the character stack and go to step (1)
   (2.4) Closing parentheses: pop from the character stack
       (2.4.1) If it is an opening parenthesis, then discard it and go to step (1)
       (2.4.2) Otherwise, assign the popped element to op
               Pop a number from integer stack and assign it to op2
               Pop another number from integer stack and assign it to op1
               Calculate op1 op op2 and push the result into the integer stack
               Go to step (2.4)
   (2.5) New line character: apply any operators remaining on the character stack as in (2.4.2), then pop the result from the integer stack and print it; STOP
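The two-stack procedure with the priority rule can be sketched in Python as follows. The names `evaluate_infix`, `apply_top`, `OPS` and `PRIORITY` are assumptions made for this illustration, and only the four operators +, -, *, / on non-negative integers are handled.

```python
import operator
import re

# Assumed tables for this sketch.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}
PRIORITY = {"+": 1, "-": 1, "*": 2, "/": 2}


def apply_top(operators, operands):
    """Pop one operator and two numbers, push the result."""
    op = operators.pop()
    op2 = operands.pop()
    op1 = operands.pop()
    operands.append(OPS[op](op1, op2))


def evaluate_infix(expr):
    """Evaluate an infix expression using an operator stack and a number stack."""
    operators, operands = [], []
    for token in re.findall(r"\d+|[()+\-*/]", expr):
        if token.isdigit():
            operands.append(int(token))
        elif token == "(":
            operators.append(token)
        elif token == ")":
            while operators[-1] != "(":
                apply_top(operators, operands)
            operators.pop()                      # discard the "("
        else:
            # Priority rule: apply stacked operators of equal or higher priority.
            while (operators and operators[-1] != "("
                   and PRIORITY[operators[-1]] >= PRIORITY[token]):
                apply_top(operators, operands)
            operators.append(token)
    while operators:                             # end of input
        apply_top(operators, operands)
    return operands.pop()


print(evaluate_infix("(2 * 5 - 1 * 2) / (11 - 9)"))
```

Applying operators of equal priority before pushing the new one is what makes subtraction and division associate from left to right.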

Result: The evaluation of an infix expression that is not fully parenthesized is printed as follows:

Input String: (2 * 5 - 1 * 2) / (11 - 9)

Input Symbol    Character Stack         Integer Stack           Operation performed
                (from bottom to top)    (from bottom to top)
(               (
2               (                       2
*               ( *                     2                       Push as * has higher priority
5               ( *                     2 5
-               ( *                     2 5                     Since '-' has less priority, we do 2 * 5 = 10
                ( -                     10                      We push 10 and then push '-'
1               ( -                     10 1
*               ( - *                   10 1                    Push * as it has higher priority
2               ( - *                   10 1 2
)               ( -                     10 2                    Perform 1 * 2 = 2 and push it
                (                       8                       Pop - and 10 - 2 = 8 and push, Pop (
/               /                       8
(               / (                     8
11              / (                     8 11
-               / ( -                   8 11
9               / ( -                   8 11 9
)               /                       8 2                     Perform 11 - 9 = 2 and push it
New line                                4                       Perform 8 / 2 = 4 and push it
                                        4                       Print the output, which is 4

Evaluation of prefix expression

Input: / - * 2 5 * 1 2 - 11 9
Output: 4
Analysis: There are three types of input characters

1. Numbers
2. Operators
3. New line character (\n)

Data structure requirement: a character stack and an integer stack

Algorithm:

1. Read one character input at a time and keep pushing it into the character stack until the new line character is reached
2. Perform pop from the character stack. If the stack is empty, go to step (3)
   (2.1) Number: push into the integer stack and then go to step (2)
   (2.2) Operator: assign the operator to op
         Pop a number from integer stack and assign it to op1
         Pop another number from integer stack and assign it to op2
         Calculate op1 op op2 and push the output into the integer stack
         Go to step (2)
3. Pop the result from the integer stack and display the result
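A minimal Python sketch of this procedure follows. The name `evaluate_prefix` and the table `OPS` are assumptions, and the input symbols are assumed to be separated by spaces, so that `split()` fills the "character stack" in one step.

```python
import operator

# Assumed operator table for this sketch; "/" is true division.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate_prefix(expr):
    """Evaluate a space-separated prefix expression with two stacks."""
    symbols = expr.split()        # step 1: all symbols pushed, last one on top
    numbers = []
    while symbols:                # step 2: pop symbols until the stack is empty
        token = symbols.pop()
        if token in OPS:
            op1 = numbers.pop()   # first popped number is the left operand
            op2 = numbers.pop()
            numbers.append(OPS[token](op1, op2))
        else:
            numbers.append(int(token))
    return numbers.pop()          # step 3: the result


print(evaluate_prefix("/ - * 2 5 * 1 2 - 11 9"))
```

Because the symbols are processed from right to left, the first number popped for an operator is its left operand, which matters for subtraction and division.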


Result: the evaluation of prefix expression is printed as follows:

Input String: / - * 2 5 * 1 2 - 11 9

Input Symbol    Character Stack (from bottom to top)    Integer Stack           Operation performed
                                                        (from bottom to top)
/               /
-               / -
*               / - *
2               / - * 2
5               / - * 2 5
*               / - * 2 5 *
1               / - * 2 5 * 1
2               / - * 2 5 * 1 2
-               / - * 2 5 * 1 2 -
11              / - * 2 5 * 1 2 - 11
9               / - * 2 5 * 1 2 - 11 9
\n              / - * 2 5 * 1 2 - 11                    9
                / - * 2 5 * 1 2 -                       9 11
                / - * 2 5 * 1 2                         2                       11 - 9 = 2
                / - * 2 5 * 1                           2 2
                / - * 2 5 *                             2 2 1
                / - * 2 5                               2 2                     1 * 2 = 2
                / - * 2                                 2 2 5
                / - *                                   2 2 5 2
                / -                                     2 2 10                  5 * 2 = 10
                /                                       2 8                     10 - 2 = 8
                Stack is empty                          4                       8 / 2 = 4
                Stack is empty                                                  Print 4

Evaluation of postfix expression

The calculation 1 + 2 * 4 + 3 can be written down like this in postfix notation, with the advantage that no precedence rules and parentheses are needed:

1 2 4 * + 3 +

The expression is evaluated from left to right using a stack:

1. when encountering an operand: push it
2. when encountering an operator: pop two operands, evaluate the result and push it.

This proceeds in the following way (the stack is displayed after the operation has taken place):

Input    Operation       Stack (after op)
1        Push operand    1
2        Push operand    2, 1
4        Push operand    4, 2, 1
*        Multiply        8, 1
+        Add             9
3        Push operand    3, 9
+        Add             12

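The final result, 12, lies on the top of the stack at the end of the calculation.

Example in C

    #include <stdio.h>

    #define MAX 100   /* capacity of the array-based stack */

    int main(void)
    {
        int a[MAX], i = 0, x;
        printf("To pop enter -1\n");
        for (;;)
        {
            printf("Push ");
            if (scanf("%d", &x) != 1)   /* stop on end of input or invalid input */
                break;
            if (x == -1)
            {
                if (i == 0)
                    printf("Underflow\n");
                else
                    printf("pop = %d\n", a[--i]);
            }
            else if (i == MAX)
                printf("Overflow\n");
            else
                a[i++] = x;
        }
        return 0;
    }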
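The postfix evaluation rule described in this section (push operands; on an operator, pop two operands, evaluate and push the result) can be sketched in Python. The name `evaluate_postfix` and the table `OPS` are assumptions, and tokens are assumed to be separated by spaces.

```python
import operator

# Assumed operator table for this sketch; "/" is true division.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate_postfix(expr):
    """Evaluate a space-separated postfix expression with one stack."""
    stack = []
    for token in expr.split():
        if token in OPS:
            right = stack.pop()   # the operand pushed last is the right one
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))
    return stack.pop()


print(evaluate_postfix("1 2 4 * + 3 +"))  # prints 12
```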


Evaluation of postfix expression (Pascal)

This is an implementation in Pascal, using a marked sequential file as data archive.

    {
    programmer : clx321
    file : stack.pas
    unit : Pstack.tpu
    }
    program TestStack;
    {this program uses ADT of Stack, I will assume that the unit
     of ADT of Stack has already existed}

    uses
      PStack;   {ADT of STACK}

    {dictionary}
    const
      mark = '.';
    var
      data : stack;
      f : text;
      cc : char;
      ccInt, cc1, cc2 : integer;

    {functions}
    IsOperand (cc : char) : boolean;            {JUST Prototype}
      {return TRUE if cc is operand}
    ChrToInt (cc : char) : integer;             {JUST Prototype}
      {change char to integer}
    Operator (cc1, cc2 : integer) : integer;    {JUST Prototype}
      {operate two operands}

    {algorithms}
    begin
      assign (f, cc);
      reset (f);
      read (f, cc);   {first elmt}
      if (cc = mark) then
      begin
        writeln ('empty archives !');
      end
      else
      begin
        repeat
          if (IsOperand (cc)) then
          begin
            ccInt := ChrToInt (cc);
            push (ccInt, data);
          end
          else
          begin
            pop (cc1, data);
            pop (cc2, data);
            push (Operator (cc2, cc1), data);
          end;
          read (f, cc);   {next elmt}
        until (cc = mark);
      end;
      close (f);
    end.

Conversion of an Infix expression that is fully parenthesized into a Postfix expression

Input: (((8 + 1) - (7 - 4)) / (11 - 9))
Output: 8 1 + 7 4 - - 11 9 - /
Analysis: There are five types of input characters which are:

* Opening parentheses
* Numbers
* Operators
* Closing parentheses
* New line character (\n)

Requirement: A character stack

Algorithm:

1. Read a character input
2. Actions to be performed at end of each input
   (2.1) Opening parentheses: push into stack and then go to step (1)
   (2.2) Number: print and then go to step (1)
   (2.3) Operator: push into stack and then go to step (1)
   (2.4) Closing parentheses: pop it from the stack
       (2.4.1) If it is an operator, print it and go to step (2.4)
       (2.4.2) If the popped element is an opening parenthesis, discard it and go to step (1)
   (2.5) New line character: STOP
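The conversion steps above can be sketched in Python. The function name `to_postfix` is an assumption, the input is tokenized with a regular expression, and the output symbols are collected in a list and joined with spaces instead of being printed one at a time.

```python
import re


def to_postfix(expr):
    """Convert a fully parenthesized infix expression to postfix."""
    stack, output = [], []
    for token in re.findall(r"\d+|[()+\-*/]", expr):
        if token.isdigit():
            output.append(token)          # (2.2) numbers go straight to output
        elif token == ")":
            while True:                   # (2.4) pop until the matching "("
                top = stack.pop()
                if top == "(":
                    break
                output.append(top)
        else:
            stack.append(token)           # (2.1) and (2.3): "(" or an operator
    return " ".join(output)


print(to_postfix("(((8 + 1) - (7 - 4)) / (11 - 9))"))
```

Since the input is fully parenthesized, no priority check is needed: every operator is printed when the closing parenthesis of its own group is read.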

Therefore, the final output after conversion of an infix expression to a postfix expression is as follows:

Input       Operation                                         Stack (after op)    Output on monitor
(           (2.1) Push opening parenthesis into stack         (
(           (2.1) Push opening parenthesis into stack         ( (
(           (2.1) Push opening parenthesis into stack         ( ( (
8           (2.2) Print it                                    ( ( (               8
+           (2.3) Push operator into stack                    ( ( ( +             8
1           (2.2) Print it                                    ( ( ( +             8 1
)           (2.4) Pop from the stack: '+', print it           ( ( (               8 1 +
            (2.4) Pop from the stack: '(', discard it         ( (                 8 1 +
-           (2.3) Push operator into stack                    ( ( -               8 1 +
(           (2.1) Push opening parenthesis into stack         ( ( - (             8 1 +
7           (2.2) Print it                                    ( ( - (             8 1 + 7
-           (2.3) Push operator into stack                    ( ( - ( -           8 1 + 7
4           (2.2) Print it                                    ( ( - ( -           8 1 + 7 4
)           (2.4) Pop from the stack: '-', print it           ( ( - (             8 1 + 7 4 -
            (2.4) Pop from the stack: '(', discard it         ( ( -               8 1 + 7 4 -
)           (2.4) Pop from the stack: '-', print it           ( (                 8 1 + 7 4 - -
            (2.4) Pop from the stack: '(', discard it         (                   8 1 + 7 4 - -
/           (2.3) Push operator into stack                    ( /                 8 1 + 7 4 - -
(           (2.1) Push opening parenthesis into stack         ( / (               8 1 + 7 4 - -
11          (2.2) Print it                                    ( / (               8 1 + 7 4 - - 11
-           (2.3) Push operator into stack                    ( / ( -             8 1 + 7 4 - - 11
9           (2.2) Print it                                    ( / ( -             8 1 + 7 4 - - 11 9
)           (2.4) Pop from the stack: '-', print it           ( / (               8 1 + 7 4 - - 11 9 -
            (2.4) Pop from the stack: '(', discard it         ( /                 8 1 + 7 4 - - 11 9 -
)           (2.4) Pop from the stack: '/', print it           (                   8 1 + 7 4 - - 11 9 - /
            (2.4) Pop from the stack: '(', discard it         Stack is empty      8 1 + 7 4 - - 11 9 - /
New line    (2.5) STOP                                        Stack is empty      8 1 + 7 4 - - 11 9 - /

After a '(' is discarded, the next input character is read.


Rearranging railroad cars

Problem Description

This is one useful application of stacks. Consider a freight train with n railroad cars, each to be left at a different station. The cars are numbered 1 through n, and the freight train visits these stations in order n through 1. The railroad cars are labeled by their destination. To facilitate removal of the cars from the train, we must rearrange them in ascending order of their number (i.e. 1 through n). When the cars are in this order, they can be detached at each station. We rearrange the cars at a shunting yard that has an input track, an output track and k holding tracks between the input and output tracks.

Solution Strategy

To rearrange the cars, we examine the cars on the input track from front to back. If the car being examined is the next one in the output arrangement, we move it directly to the output track. If not, we move it to a holding track and leave it there until it is time to place it on the output track. The holding tracks operate in a LIFO manner, as the cars enter and leave these tracks from the top. When rearranging cars, only the following moves are permitted:

• A car may be moved from the front (i.e. right end) of the input track to the top of one of the holding tracks or to the left end of the output track.
• A car may be moved from the top of a holding track to the left end of the output track.

The figure shows a shunting yard with k = 3 holding tracks H1, H2 and H3, and n = 9. The n cars of the freight train begin on the input track and are to end up on the output track in order 1 through n from right to left. The cars are initially in the order 5, 8, 1, 7, 4, 2, 9, 6, 3 from back to front. Later the cars are rearranged into the desired order.


A Three Track Example

Figure: Railroad cars example

• Consider the input arrangement from the figure. Car 3 is at the front, so it cannot be output yet, as it must be preceded by cars 1 and 2. So car 3 is detached and moved to holding track H1.
• The next car, 6, cannot be output either, and it is moved to holding track H2, because we have to output car 3 before car 6 and this would not be possible if we moved car 6 to holding track H1.
• Now it is obvious that we move car 9 to H3. The requirement for the arrangement of cars on any holding track is that the cars should preferably be in ascending order from top to bottom.
• So car 2 is now moved to holding track H1, so that it satisfies the previous statement. If we moved car 2 to H2 or H3, we would have no place to move cars 4, 5, 7 and 8. The least restrictions on future car placement arise when the new car λ is moved to the holding track that has at its top the car with the smallest label Ψ such that λ < Ψ. We may call this an assignment rule for deciding whether a particular car belongs on a specific holding track.
• When car 4 is considered, there are three places to move the car: H1, H2 and H3. The tops of these tracks are 2, 6 and 9. So, using the assignment rule mentioned above, we move car 4 to H2.
• Car 7 is moved to H3.
• The next car, 1, has the least label, so it is moved to the output track.
• Now it is time for cars 2 and 3 to be output from H1 (in short, all the cars from H1 are appended to car 1 on the output track). Car 4 is moved to the output track. No other cars can be moved to the output track at this time.
• The next car, 8, is moved to holding track H1.
• Car 5 is output from the input track. Car 6 is moved to the output track from H2, followed by car 7 from H3, car 8 from H1 and car 9 from H3.
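The strategy and the assignment rule can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `rearrange_cars` is hypothetical, the input list is given front car first, each holding track is a list used as a stack, and the function returns None when the k holding tracks are not sufficient.

```python
def rearrange_cars(input_front_first, k):
    """Return the output order of the cars, or None if k tracks do not suffice."""
    holding = [[] for _ in range(k)]
    output = []
    next_out = 1                                  # label of the next car to output
    for car in input_front_first:
        if car == next_out:
            output.append(car)
            next_out += 1
            # Release every car that can now leave the top of a holding track.
            released = True
            while released:
                released = False
                for track in holding:
                    if track and track[-1] == next_out:
                        output.append(track.pop())
                        next_out += 1
                        released = True
        else:
            # Assignment rule: track whose top is the smallest label above car.
            best = None
            for track in holding:
                if track and track[-1] > car:
                    if best is None or track[-1] < best[-1]:
                        best = track
            if best is None:                      # otherwise use an empty track
                best = next((t for t in holding if not t), None)
            if best is None:
                return None                       # no legal holding track
            best.append(car)
    return output


# Cars 5,8,1,7,4,2,9,6,3 from back to front, i.e. car 3 is at the front.
print(rearrange_cars([3, 6, 9, 2, 4, 7, 1, 8, 5], 3))
```

With the example arrangement and k = 3 the sketch performs the same moves as the walkthrough; with a single holding track it gives up at car 6, which cannot be placed on top of car 3.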

Backtracking

Another important application of stacks is backtracking. Consider a simple example of finding the correct path in a maze. There is a series of points from the starting point to the destination, and there are several paths that may be tried. We start from one point and choose a path. After following it for a while, we may realise that the path we have chosen is wrong, so we need a way to return to the beginning of that path. This can be done with the use of stacks. With the help of a stack, we remember each point we have reached by pushing that point onto the stack. In case we end up on the wrong path, we can pop the last point from the stack, thus returning to the previous point, and continue our quest to find the right path. This is called backtracking.
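A minimal Python sketch of backtracking in a maze is given below. The grid encoding (0 for an open cell, 1 for a wall), the function name `find_path` and the set of visited cells are assumptions made for this illustration; the stack always holds the path from the start to the current cell.

```python
def find_path(maze, start, goal):
    """Return a list of (row, col) cells from start to goal, or None."""
    stack = [start]                 # the path walked so far
    visited = {start}
    while stack:
        row, col = stack[-1]        # current point is on top of the stack
        if (row, col) == goal:
            return list(stack)
        for d_row, d_col in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            nxt = (row + d_row, col + d_col)
            inside = 0 <= nxt[0] < len(maze) and 0 <= nxt[1] < len(maze[0])
            if inside and maze[nxt[0]][nxt[1]] == 0 and nxt not in visited:
                visited.add(nxt)
                stack.append(nxt)   # remember the new point
                break
        else:
            stack.pop()             # dead end: backtrack to the previous point
    return None


maze = [[0, 0, 1],
        [1, 0, 1],
        [1, 0, 0]]
print(find_path(maze, (0, 0), (2, 2)))
```

The visited set prevents the search from walking in circles; it is an addition of this sketch and is not mentioned in the text above.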


Quicksort

Sorting means arranging the list of elements in a particular order. In the case of numbers, it could be ascending order; in the case of letters, alphabetic order. Quicksort is an algorithm of the divide and conquer type. In this method, to sort a set of numbers, we reduce it to two smaller sets, and then sort these smaller sets. This can be explained with the help of the following example: Suppose A is a list of the following numbers:

In the reduction step, we find the final position of one of the numbers. In this case, let us assume that we have to find the final position of 48, which is the first number in the list. To accomplish this, we adopt the following method. Begin with the last number, and move from right to left. Compare each number with 48. If the number is smaller than 48, we stop at that number and swap it with 48. In our case, the number is 24. Hence, we swap 24 and 48.

The numbers 96 and 72 to the right of 48, are greater than 48. Now beginning with 24, scan the numbers in the opposite direction, that is from left to right. Compare every number with 48 until you find a number that is greater than 48. In this case, it is 60. Therefore we swap 48 and 60.

Note that the numbers 12, 24 and 36 to the left of 48 are all smaller than 48. Now, start scanning the numbers from 60, in the right to left direction. As soon as you find a smaller number, swap it with 48. In this case, it is 44. The final result is:


Now, beginning with 44, scan the list from left to right, until you find a number greater than 48. Such a number is 84. Swap it with 48. The final result is:

Now, beginning with 84, traverse the list from right to left, until you reach a number smaller than 48. We do not find such a number before reaching 48. This means that all the numbers in the list have been scanned and compared with 48. Also, we notice that all numbers less than 48 are to the left of it, and all numbers greater than 48 are to its right. The final partitions look as follows:

Therefore, 48 has been placed in its proper position and now our task is reduced to sorting the two partitions. The above step of creating partitions can be repeated with every partition containing 2 or more elements. As we can process only a single partition at a time, we should be able to keep track of the other partitions for future processing. This is done by using two stacks called LOWERBOUND and UPPERBOUND to temporarily store these partitions. The addresses of the first and last elements of the partitions are pushed into the LOWERBOUND and UPPERBOUND stacks respectively. The reduction step is applied to a partition only after its boundary values are popped from the stacks. We can understand this from the following example:

Take the above list A with 12 elements. The algorithm starts by pushing the boundary values of A, that is 1 and 12, into the LOWERBOUND and UPPERBOUND stacks respectively. Therefore the stacks look as follows:

LOWERBOUND: 1
UPPERBOUND: 12

To perform the reduction step, the values at the top of the stacks are popped. Therefore, both the stacks become empty.

LOWERBOUND: {empty}
UPPERBOUND: {empty}

Now, the reduction step causes 48 to be fixed to the 5th position and creates two partitions, one from position 1 to 4 and the other from position 6 to 12. Hence, the values 1 and 6 are pushed into the LOWERBOUND stack and 4 and 12 are pushed into the UPPERBOUND stack.

LOWERBOUND: 1, 6
UPPERBOUND: 4, 12

For applying the reduction step again, the values at the top of the stacks are popped. Therefore, the values 6 and 12 are popped, and the stacks look like:

LOWERBOUND: 1
UPPERBOUND: 4

The reduction step is now applied to the second partition, that is from the 6th to 12th element.

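After the reduction step, 98 is fixed in the 11th position. This splits the partition into a first part from position 6 to 10 and a second part consisting of position 12 only. The second part has only one element, so it needs no further processing. Therefore, we push only the lower and upper boundary values of the first part onto the stacks. So, the stacks are as follows:

LOWERBOUND: 1, 6
UPPERBOUND: 4, 10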

The processing proceeds in the following way and ends when the stacks do not contain any upper and lower bounds of the partition to be processed, and the list gets sorted.
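The procedure can be sketched in Python as follows. The names `reduce_step` and `quicksort` are assumptions, positions are counted from 0 rather than from 1 as in the text, and the list in the usage line is a made-up illustration, not the list A of the example.

```python
def reduce_step(a, low, high):
    """Move a[low] to its final position within a[low..high]; return it."""
    loc, left, right = low, low, high
    while True:
        # Scan from right to left for a number smaller than a[loc].
        while a[loc] <= a[right] and loc != right:
            right -= 1
        if loc == right:
            return loc
        a[loc], a[right] = a[right], a[loc]
        loc = right
        # Scan from left to right for a number greater than a[loc].
        while a[left] <= a[loc] and left != loc:
            left += 1
        if loc == left:
            return loc
        a[left], a[loc] = a[loc], a[left]
        loc = left


def quicksort(a):
    """Sort the list a in place using two explicit stacks of boundaries."""
    lowerbound, upperbound = [], []
    if len(a) > 1:
        lowerbound.append(0)
        upperbound.append(len(a) - 1)
    while lowerbound:
        low, high = lowerbound.pop(), upperbound.pop()
        loc = reduce_step(a, low, high)
        if low < loc - 1:                 # left partition has 2+ elements
            lowerbound.append(low)
            upperbound.append(loc - 1)
        if loc + 1 < high:                # right partition has 2+ elements
            lowerbound.append(loc + 1)
            upperbound.append(high)
    return a


print(quicksort([48, 7, 3, 99, 15, 48, 1]))
```

Partitions with fewer than two elements are never pushed, so the loop ends exactly when both stacks are empty.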

The Stock Span Problem

In the stock span problem, we will solve a financial problem with the help of stacks. Suppose, for a stock, we have a series of n daily price quotes. The span of the stock's price on a given day is defined as the maximum number of consecutive days just before the given day for which the price of the stock is less than or equal to its price on the given day. Let Price(i) = price of the stock on day "i". Then Span(i) = Max{k : k >= 0 and Price(j) <= Price(i) for j = i - k, .., i}. Thus, if Price(i-1) > Price(i), then Span(i) = 0.

An algorithm which has Quadratic Time Complexity

Input: An array P with n elements
Output: An array S of n elements such that S[i] is the largest integer k such that k