Algorithms and Data Structures

Algorithms and Data Structures Part 2: Basic Mathematics and Geometry Basics (Wikipedia Book 2014)

By Wikipedians

Editors: Reiner Creutzburg, Jenny Knackmuß

PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sun, 22 Dec 2013 13:24:17 UTC

Contents

Articles

Basic Mathematics
    Absolute value
    Floor and ceiling functions
    Greatest common divisor
    Euclidean algorithm
    Power function
    Nth root
    Factorial
    Stirling's approximation
    Exponential function
    Logarithm
    Equivalence relation
    Modular arithmetic
    Multiplicative group of integers modulo n
    Proofs of Fermat's little theorem

Geometry Basics
    Thales' theorem
    Pythagorean theorem
    Lune of Hippocrates
    Arbelos
    Salinon

Mean Values
    Average
    Arithmetic mean
    Geometric mean theorem
    Harmonic mean
    Inequality of arithmetic and geometric means

References
    Article Sources and Contributors
    Image Sources, Licenses and Contributors

Article Licenses
    License

Basic Mathematics

Absolute value

In mathematics, the absolute value (or modulus) |x| of a real number x is the non-negative value of x without regard to its sign. Namely, |x| = x for a positive x, |x| = −x for a negative x, and |0| = 0. For example, the absolute value of 3 is 3, and the absolute value of −3 is also 3. The absolute value of a number may be thought of as its distance from zero.

Generalisations of the absolute value for real numbers occur in a wide variety of mathematical settings. For example, an absolute value is also defined for the complex numbers, the quaternions, ordered rings, fields and vector spaces. The absolute value is closely related to the notions of magnitude, distance, and norm in various mathematical and physical contexts.

Terminology and notation

Jean-Robert Argand introduced the term "module", meaning 'unit of measure' in French, in 1806, specifically for the complex absolute value,[1][2] and it was borrowed into English in 1866 as the Latin equivalent "modulus". The term "absolute value" has been used in this sense since at least 1806 in French[3] and 1857 in English.[4] The notation |x| was introduced by Karl Weierstrass in 1841.[5] Other names for absolute value include "the numerical value" and "the magnitude". The same notation is used with sets to denote cardinality; the meaning depends on context.

Definition and properties

Real numbers

For any real number x the absolute value or modulus of x is denoted by |x| (a vertical bar on each side of the quantity) and is defined as[6]

\[ |x| = \begin{cases} x, & \text{if } x \ge 0 \\ -x, & \text{if } x < 0. \end{cases} \]

As can be seen from the above definition, the absolute value of x is always either positive or zero, but never negative. From an analytic geometry point of view, the absolute value of a real number is that number's distance from zero along the real number line, and more generally the absolute value of the difference of two real numbers is the distance between them. Indeed the notion of an abstract distance function in mathematics can be seen to be a generalisation of the absolute value of the difference (see "Distance" below). Since the square root notation without sign represents the positive square root, it follows that

\[ |x| = \sqrt{x^2} \qquad (1) \]

which is sometimes used as a definition of absolute value.[7] The absolute value has the following four fundamental properties:

(2) \(|a| \ge 0\): Non-negativity
(3) \(|a| = 0 \iff a = 0\): Positive-definiteness
(4) \(|ab| = |a|\,|b|\): Multiplicativeness
(5) \(|a + b| \le |a| + |b|\): Subadditivity

Other important properties of the absolute value include:

(6) \(\big||a|\big| = |a|\): Idempotence (the absolute value of the absolute value is the absolute value)
(7) \(|-a| = |a|\): Evenness (reflection symmetry of the graph)
(8) \(|a - b| = 0 \iff a = b\): Identity of indiscernibles (equivalent to positive-definiteness)
(9) \(|a - b| \le |a - c| + |c - b|\): Triangle inequality (equivalent to subadditivity)
(10) \(|a / b| = |a| / |b|\) (if \(b \ne 0\)): Preservation of division (equivalent to multiplicativeness)
(11) \(|a - b| \ge \big||a| - |b|\big|\) (equivalent to subadditivity)

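Two other useful properties concerning inequalities are:

\[ |a| \le b \iff -b \le a \le b \]

\[ |a| \ge b \iff a \le -b \ \text{ or } \ b \le a \]

These relations may be used to solve inequalities involving absolute values. For example:

\[ |x - 3| \le 9 \iff -9 \le x - 3 \le 9 \iff -6 \le x \le 12 \]

Absolute value is used to define the absolute difference, the standard metric on the real numbers.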

Complex numbers

Since the complex numbers are not ordered, the definition given above for the real absolute value cannot be directly generalised for a complex number. However the geometric interpretation of the absolute value of a real number as its distance from 0 can be generalised. The absolute value of a complex number is defined as its distance in the complex plane from the origin using the Pythagorean theorem. More generally the absolute value of the difference of two complex numbers is equal to the distance between those two complex numbers. For any complex number

\[ z = x + iy, \]

where x and y are real numbers, the absolute value or modulus of z is denoted |z| and is given by

\[ |z| = \sqrt{x^2 + y^2}. \]

When the imaginary part y is zero this is the same as the absolute value of the real number x. When a complex number z is expressed in polar form as

\[ z = r e^{i\theta} \]

with r ≥ 0 and θ real, its absolute value is

\[ |z| = r. \]

The absolute value of a complex number can be written in the complex analogue of equation (1) above as:

\[ |z| = \sqrt{z \cdot \bar{z}}, \]

where \(\bar{z}\) is the complex conjugate of z.

The absolute value of a complex number z is the distance r from z to the origin. It is also seen in the picture that z and its complex conjugate \(\bar{z}\) have the same absolute value.

The complex absolute value shares all the properties of the real absolute value given in equations (2)–(11) above. Since the positive reals form a subgroup of the nonzero complex numbers under multiplication, we may think of absolute value as an endomorphism of the multiplicative group of the complex numbers.
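The three expressions for the complex modulus given above can be compared numerically. The following Python sketch is only an illustration: the sample value z = 3 + 4i and the variable names are assumptions, not taken from the text, while abs, math.sqrt and cmath.polar are standard-library functions.

```python
import cmath
import math

# Sample value chosen for illustration only (not from the text).
z = complex(3.0, 4.0)

# |z| from the real and imaginary parts: sqrt(x^2 + y^2).
modulus_from_parts = math.sqrt(z.real ** 2 + z.imag ** 2)

# |z| from the complex conjugate: sqrt(z * conj(z)).
# The product is real up to rounding, so only its real part is used.
modulus_from_conjugate = math.sqrt((z * z.conjugate()).real)

# |z| as the radius r of the polar form z = r * e^(i*theta).
r, theta = cmath.polar(z)

print(abs(z), modulus_from_parts, modulus_from_conjugate, r)
```

All four printed values should agree up to floating-point rounding, and abs(z.conjugate()) gives the same result as abs(z).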

Absolute value function

The real absolute value function is continuous everywhere. It is differentiable everywhere except for x = 0. It is monotonically decreasing on the interval (−∞,0] and monotonically increasing on the interval [0,+∞). Since a real number and its opposite have the same absolute value, it is an even function, and is hence not invertible. Both the real and complex functions are idempotent. The real absolute value function is a piecewise linear, convex function.

The graph of the absolute value function for real numbers

Relationship to the sign function

The absolute value function of a real number returns its value irrespective of its sign, whereas the sign (or signum) function returns a number's sign irrespective of its value. The following equations show the relationship between these two functions:

\[ |x| = x \operatorname{sgn}(x), \]

or

\[ |x| \operatorname{sgn}(x) = x, \]

and for x ≠ 0,

\[ \operatorname{sgn}(x) = \frac{|x|}{x}. \]

Derivative

The real absolute value function has a derivative for every x ≠ 0, but is not differentiable at x = 0. Its derivative for x ≠ 0 is given by the step function[8][9]

\[ \frac{d|x|}{dx} = \frac{x}{|x|} = \begin{cases} -1, & x < 0 \\ 1, & x > 0. \end{cases} \]

Composition of absolute value with a cubic function in different orders

The subdifferential of |x| at x = 0 is the interval [−1,1].[10] The complex absolute value function is continuous everywhere but complex differentiable nowhere because it violates the Cauchy–Riemann equations. The second derivative of |x| with respect to x is zero everywhere except zero, where it does not exist. As a generalised function, the second derivative may be taken as two times the Dirac delta function.

Antiderivative

The antiderivative (indefinite integral) of the absolute value function is

\[ \int |x| \, dx = \frac{x|x|}{2} + C, \]

where C is an arbitrary constant of integration.

Distance

The absolute value is closely related to the idea of distance. As noted above, the absolute value of a real or complex number is the distance from that number to the origin, along the real number line, for real numbers, or in the complex plane, for complex numbers, and more generally, the absolute value of the difference of two real or complex numbers is the distance between them. The standard Euclidean distance between two points

\[ a = (a_1, a_2, \dots, a_n) \]

and

\[ b = (b_1, b_2, \dots, b_n) \]

in Euclidean n-space is defined as:

\[ \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}. \]

This can be seen to be a generalisation of |a − b|, since if a and b are real, then by equation (1),

\[ |a - b| = \sqrt{(a - b)^2}. \]

While if

\[ a = a_1 + i a_2 \]

and

\[ b = b_1 + i b_2 \]

are complex numbers, then

\[ |a - b| = |(a_1 + i a_2) - (b_1 + i b_2)| = |(a_1 - b_1) + i(a_2 - b_2)| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}. \]

The above shows that the "absolute value" distance for the real numbers or the complex numbers agrees with the standard Euclidean distance they inherit as a result of considering them as the one- and two-dimensional Euclidean spaces respectively. The properties of the absolute value of the difference of two real or complex numbers (non-negativity, identity of indiscernibles, symmetry and the triangle inequality given above) can be seen to motivate the more general notion of a distance function as follows: A real-valued function d on a set X × X is called a metric (or a distance function) on X, if it satisfies the following four axioms:[11]

\(d(a, b) \ge 0\): Non-negativity
\(d(a, b) = 0 \iff a = b\): Identity of indiscernibles
\(d(a, b) = d(b, a)\): Symmetry
\(d(a, b) \le d(a, c) + d(c, b)\): Triangle inequality
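The agreement between the Euclidean distance and the absolute value of a difference can be checked on sample values. The sketch below is illustrative only: the function name euclidean_distance and the sample points are assumptions made for this example.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two points given as equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# One dimension: the distance equals |a - b| for real numbers.
real_a, real_b = 2.0, -5.0
one_dim = euclidean_distance([real_a], [real_b])

# Two dimensions: identify a complex number with the point (real, imag).
z_a, z_b = complex(1.0, 2.0), complex(4.0, 6.0)
two_dim = euclidean_distance([z_a.real, z_a.imag], [z_b.real, z_b.imag])

print(one_dim, abs(real_a - real_b))
print(two_dim, abs(z_a - z_b))
```

Each printed pair should agree up to rounding, which mirrors the two special cases derived above.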

Generalizations

Ordered rings

The definition of absolute value given for real numbers above can be extended to any ordered ring. That is, if a is an element of an ordered ring R, then the absolute value of a, denoted by |a|, is defined to be:[12]

\[ |a| = \begin{cases} a, & \text{if } a \ge 0 \\ -a, & \text{if } a < 0, \end{cases} \]

where −a is the additive inverse of a, and 0 is the additive identity element.

Fields

The fundamental properties of the absolute value for real numbers given in (2)–(5) above can be used to generalise the notion of absolute value to an arbitrary field, as follows. A real-valued function v on a field F is called an absolute value (also a modulus, magnitude, value, or valuation)[13] if it satisfies the following four axioms:

\(v(a) \ge 0\): Non-negativity
\(v(a) = 0 \iff a = 0\): Positive-definiteness
\(v(ab) = v(a)\,v(b)\): Multiplicativeness
\(v(a + b) \le v(a) + v(b)\): Subadditivity or the triangle inequality

where 0 denotes the additive identity element of F. It follows from positive-definiteness and multiplicativeness that v(1) = 1, where 1 denotes the multiplicative identity element of F. The real and complex absolute values defined above are examples of absolute values for an arbitrary field. If v is an absolute value on F, then the function d on F × F, defined by d(a, b) = v(a − b), is a metric and the following are equivalent:

• d satisfies the ultrametric inequality \(d(x, y) \le \max(d(x, z), d(y, z))\) for all x, y, z in F.
• \(\{ v(\sum_{k=1}^{n} 1) : n \in \mathbb{N} \}\) is bounded in R.
• \(v(\sum_{k=1}^{n} 1) \le 1\) for every \(n \in \mathbb{N}\).
• \(v(a) \le 1 \Rightarrow v(1 + a) \le 1\) for all \(a \in F\).
• \(v(a + b) \le \max\{v(a), v(b)\}\) for all \(a, b \in F\).

An absolute value which satisfies any (hence all) of the above conditions is said to be non-Archimedean, otherwise it is said to be Archimedean.[14]

Vector spaces

Again the fundamental properties of the absolute value for real numbers can be used, with a slight modification, to generalise the notion to an arbitrary vector space. A real-valued function on a vector space V over a field F, represented as ‖·‖, is called an absolute value, but more usually a norm, if it satisfies the following axioms:

For all a in F, and v, u in V,

\(\|v\| \ge 0\): Non-negativity
\(\|v\| = 0 \iff v = 0\): Positive-definiteness
\(\|a v\| = |a| \, \|v\|\): Positive homogeneity or positive scalability
\(\|v + u\| \le \|v\| + \|u\|\): Subadditivity or the triangle inequality

The norm of a vector is also called its length or magnitude. In the case of Euclidean space R^n, the function defined by

\[ \|(x_1, x_2, \dots, x_n)\| = \sqrt{\sum_{i=1}^{n} x_i^2} \]

is a norm called the Euclidean norm. When the real numbers R are considered as the one-dimensional vector space R^1, the absolute value is a norm, and is the p-norm (see Lp space) for any p. In fact the absolute value is the "only" norm on R^1, in the sense that, for every norm ‖·‖ on R^1, ‖x‖ = ‖1‖ ⋅ |x|. The complex absolute value is a special case of the norm in an inner product space. It is identical to the Euclidean norm, if the complex plane is identified with the Euclidean plane R^2.

Notes

[1] Oxford English Dictionary, Draft Revision, June 2008
[2] Nahin (http://www.amazon.com/gp/reader/0691027951), O'Connor and Robertson (http://www-history.mcs.st-andrews.ac.uk/Mathematicians/Argand.html), and functions.Wolfram.com (http://functions.wolfram.com/ComplexComponents/Abs/35/); for the French sense, see Littré, 1877
[3] Lazare Nicolas M. Carnot, Mémoire sur la relation qui existe entre les distances respectives de cinq point quelconques pris dans l'espace, p. 105 at Google Books (http://books.google.com/books?id=YyIOAAAAQAAJ&pg=PA105)
[4] James Mill Peirce, A Text-book of Analytic Geometry at Google Books (http://books.google.com/books?id=RJALAAAAYAAJ&pg=PA42). The oldest citation in the 2nd edition of the Oxford English Dictionary is from 1907. The term "absolute value" is also used in contrast to "relative value".
[5] Nicholas J. Higham, Handbook of writing for the mathematical sciences, SIAM. ISBN 0-89871-420-6, p. 25
[6] Mendelson, p. 2 (http://books.google.com/books?id=A8hAm38zsCMC&pg=PA2).
[7] , p. A5
[8] Weisstein, Eric W. Absolute Value. From MathWorld – A Wolfram Web Resource. (http://mathworld.wolfram.com/AbsoluteValue.html)
[9] Bartle and Sherbert, p. 163
[10] Peter Wriggers, Panagiotis Panagiotopoulos, eds., New Developments in Contact Problems, 1999, ISBN 3-211-83154-1, pp. 31–32 (http://books.google.com/books?id=tiBtC4GmuKcC&pg=PA31)
[11] These axioms are not minimal; for instance, non-negativity can be derived from the other three: \(0 = d(a, a) \le d(a, b) + d(b, a) = 2d(a, b)\).
[12] Mac Lane, p. 264 (http://books.google.com/books?id=L6FENd8GHIUC&pg=PA264).
[13] Schechter, p. 260 (http://books.google.com/books?id=eqUv3Bcd56EC&pg=PA260). This meaning of valuation is rare. Usually, a valuation is the logarithm of the inverse of an absolute value.
[14] Schechter, pp. 260–261 (http://books.google.com/books?id=eqUv3Bcd56EC&pg=PA260).

References

• Bartle; Sherbert; Introduction to real analysis (4th ed.), John Wiley & Sons, 2011. ISBN 978-0-471-43331-6.
• Nahin, Paul J.; An Imaginary Tale (http://www.amazon.com/gp/reader/0691027951); Princeton University Press; (hardcover, 1998). ISBN 0-691-02795-1.
• Mac Lane, Saunders, Garrett Birkhoff, Algebra, American Mathematical Soc., 1999. ISBN 978-0-8218-1646-2.
• Mendelson, Elliott, Schaum's Outline of Beginning Calculus, McGraw-Hill Professional, 2008. ISBN 978-0-07-148754-2.
• O'Connor, J.J. and Robertson, E.F.; "Jean Robert Argand" (http://www-history.mcs.st-andrews.ac.uk/Mathematicians/Argand.html).
• Schechter, Eric; Handbook of Analysis and Its Foundations, pp. 259–263, "Absolute Values" (http://books.google.com/books?id=eqUv3Bcd56EC&pg=PA259), Academic Press (1997). ISBN 0-12-622760-8.

External links

• Hazewinkel, Michiel, ed. (2001), "Absolute value" (http://www.encyclopediaofmath.org/index.php?title=p/a010370), Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
• absolute value (http://planetmath.org/encyclopedia/AbsoluteValue.html) at PlanetMath
• Weisstein, Eric W., "Absolute Value" (http://mathworld.wolfram.com/AbsoluteValue.html), MathWorld.

Floor and ceiling functions

Floor function

Ceiling function

In mathematics and computer science, the floor and ceiling functions map a real number to the largest previous or the smallest following integer, respectively. More precisely, floor(x) = ⌊x⌋ is the largest integer not greater than x and ceiling(x) = ⌈x⌉ is the smallest integer not less than x.[1]
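The definitions above can be tried out with the standard-library functions math.floor, math.ceil and math.trunc. The helper name describe and the sample inputs in this sketch are assumptions made for illustration; the third value shows rounding towards 0, which differs from the floor for negative non-integers.

```python
import math

def describe(x):
    """Return (floor, ceiling, truncation towards zero) of a real number x."""
    return math.floor(x), math.ceil(x), math.trunc(x)

# Sample inputs chosen for illustration only.
for value in (2.5, -2.5, 4.0):
    print(value, describe(value))
```

For an integer input all three values coincide; for a negative non-integer the truncation equals the ceiling rather than the floor.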

Notation

Carl Friedrich Gauss introduced the square bracket notation [x] for the floor function in his third proof of quadratic reciprocity (1808).[2] This remained the standard[3] in mathematics until Kenneth E. Iverson introduced the names "floor" and "ceiling" and the corresponding notations ⌊x⌋ and ⌈x⌉ in his 1962 book A Programming Language.[4][5] Both notations are now used in mathematics;[6] this article follows Iverson.

The floor function is also called the greatest integer or entier (French for "integer") function, and its value at x is called the integral part or integer part of x; for negative values of x the latter terms are sometimes instead taken to be the value of the ceiling function, i.e., the value of x rounded to an integer towards 0. The language APL uses ⌊x; other computer languages commonly use notations like entier(x) (Algol), INT(x) (BASIC), or floor(x) (C, C++, R, and Python).[7] In mathematics, it can also be written with boldface or double brackets ⟦x⟧.[8]

The ceiling function is usually denoted by ceil(x) or ceiling(x) in non-APL computer languages that have a notation for this function. The J Programming Language, a follow-on to APL that is designed to use standard keyboard symbols, uses >. for ceiling and <. for floor.

[...]

Greatest common divisor

[...] b if b > a

Complexity of Euclidean method

The existence of the Euclidean algorithm places (the decision problem version of) the greatest common divisor problem in P, the class of problems solvable in polynomial time. The GCD problem is not known to be in NC, and so there is no known way to parallelize its computation efficiently across many processors; nor is it known to be P-complete, which would imply that it is unlikely to be possible to parallelize GCD computation. In this sense the GCD problem is analogous to e.g. the integer factorization problem, which has no known polynomial-time algorithm, but is not known to be NP-complete. Shallcross et al. showed that a related problem (EUGCD, determining the remainder sequence arising during the Euclidean algorithm) is NC-equivalent to the problem of integer linear programming with two variables; if either problem is in NC or is P-complete, the other is as well. Since NC contains NL, it is also unknown whether a space-efficient algorithm for computing the GCD exists, even for nondeterministic Turing machines.

Although the problem is not known to be in NC, parallel algorithms with time superior to the Euclidean algorithm exist; the best known deterministic algorithm is by Chor and Goldreich, which (in the CRCW-PRAM model) can solve the problem in \(O(n / \log n)\) time with \(n^{1+\varepsilon}\) processors. Randomized algorithms can solve the problem in \(O((\log n)^2)\) time on \(\exp\left[O\left(\sqrt{n \log n}\right)\right]\) processors (note this is superpolynomial).

Binary method

An alternative method of computing the gcd is the binary gcd method, which uses only subtraction and division by 2. In outline the method is as follows: Let a and b be the two non-negative integers. Also set the integer d to 1. There are now four possibilities:

• Both a and b are even. In this case 2 is a common factor. Divide both a and b by 2, double d, and continue.
• a is even and b is odd. In this case 2 is not a common factor. Divide a by 2 and continue.
• a is odd and b is even. Like the previous case 2 is not a common factor. Divide b by 2 and continue.
• Both a and b are odd. Without loss of generality, assume that for a and b as they are now, a ≥ b. In this case let c = (a − b)/2. Then gcd(a,b) = gcd(a,c) = gcd(b,c). Because b ≤ a it is usually easier (and computationally faster) to determine gcd(b,c). If computing this algorithm by hand, gcd(b,c) may be apparent. Otherwise continue the algorithm until c = 0.

Note that the gcd of the original a and b is still d times larger than the gcd of the odd a and odd b above. For further details see Binary GCD algorithm.

Example: (a, b, d) = (48, 18, 1) → (24, 9, 2) → (12, 9, 2) → (6, 9, 2) → (3, 9, 2) → c = 3; since gcd(9,3) = 3, the gcd originally sought is d times larger, namely 6.
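The four cases of the binary method can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a tuned implementation: the function name binary_gcd is hypothetical, and the loop simply continues with the pair (b, c) in the odd/odd case until one value reaches 0.

```python
import math

def binary_gcd(a, b):
    """gcd of two non-negative integers using only subtraction and halving."""
    if a < 0 or b < 0:
        raise ValueError("arguments must be non-negative")
    d = 1  # collects the common factors of 2
    while a != 0 and b != 0:
        if a % 2 == 0 and b % 2 == 0:
            # Both even: 2 is a common factor.
            a //= 2
            b //= 2
            d *= 2
        elif a % 2 == 0:
            # Only a is even: 2 is not a common factor.
            a //= 2
        elif b % 2 == 0:
            # Only b is even: 2 is not a common factor.
            b //= 2
        else:
            # Both odd: order them so that a >= b, then continue with (c, b),
            # where c = (a - b) / 2.
            if a < b:
                a, b = b, a
            a = (a - b) // 2
    # One of a, b is now 0; the other is the gcd of the odd parts.
    return d * (a + b)

print(binary_gcd(48, 18), math.gcd(48, 18))
```

With the example from the text, binary_gcd(48, 18) follows the same sequence of halvings and returns 6. The sketch returns 0 for the pair (0, 0), which matches the convention gcd(0, 0) = 0.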

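Other methods

If a and b are both nonzero, the greatest common divisor of a and b can be computed by using the least common multiple (lcm) of a and b:

\[ \gcd(a, b) = \frac{a \cdot b}{\operatorname{lcm}(a, b)}, \]

but more commonly the lcm is computed from the gcd. Using Thomae's function f,

\[ \gcd(a, b) = a f\!\left(\frac{b}{a}\right), \]

which generalizes to a and b rational or commensurate reals. Keith Slavin has shown that for odd a ≥ 1:

\[ \gcd(a, b) = \log_2 \prod_{k=0}^{a-1} \left(1 + e^{-2i\pi k b / a}\right), \]

which is a function that can be evaluated for complex b. Wolfgang Schramm has shown that

\[ \gcd(a, b) = \sum_{k=1}^{a} \exp(2\pi i k b / a) \cdot \sum_{d \mid a} \frac{c_d(k)}{d} \]

is an entire function in the variable b for all positive integers a, where \(c_d(k)\) is Ramanujan's sum. Donald Knuth proved the following reduction:

\[ \gcd(2^a - 1, 2^b - 1) = 2^{\gcd(a, b)} - 1 \]

for non-negative integers a and b, where a and b are not both zero. More generally

\[ \gcd(n^a - 1, n^b - 1) = n^{\gcd(a, b)} - 1, \]

which can be proven by considering the Euclidean algorithm in base n. Another useful identity relates gcd(a, b) to Euler's totient function:

\[ \gcd(a, b) = \sum_{k \mid a \text{ and } k \mid b} \varphi(k). \]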

Properties

• Every common divisor of a and b is a divisor of gcd(a, b).
• gcd(a, b), where a and b are not both zero, may be defined alternatively and equivalently as the smallest positive integer d which can be written in the form d = a·p + b·q, where p and q are integers. This expression is called Bézout's identity. Numbers p and q like this can be computed with the extended Euclidean algorithm.
• gcd(a, 0) = |a|, for a ≠ 0, since any number is a divisor of 0, and the greatest divisor of a is |a|. This is usually used as the base case in the Euclidean algorithm.
• If a divides the product b·c, and gcd(a, b) = d, then a/d divides c.
• gcd(a, b·c) = 1 if and only if gcd(a, b) = 1 and gcd(a, c) = 1.
• If m is a non-negative integer, then gcd(m·a, m·b) = m·gcd(a, b).
• If m is any integer, then gcd(a + m·b, b) = gcd(a, b).
• If m is a positive common divisor of a and b, then gcd(a/m, b/m) = gcd(a, b)/m.
• The gcd is a multiplicative function in the following sense: if a1 and a2 are relatively prime, then gcd(a1·a2, b) = gcd(a1, b)·gcd(a2, b).
• The gcd is a commutative function: gcd(a, b) = gcd(b, a).
• The gcd is an associative function: gcd(a, gcd(b, c)) = gcd(gcd(a, b), c).
• The gcd of three numbers can be computed as gcd(a, b, c) = gcd(gcd(a, b), c), or in some different way by applying commutativity and associativity. This can be extended to any number of numbers.
• gcd(a, b) is closely related to the least common multiple lcm(a, b): for non-negative a and b we have gcd(a, b)·lcm(a, b) = a·b. This formula is often used to compute least common multiples: one first computes the gcd with Euclid's algorithm and then divides the product of the given numbers by their gcd.
• The following versions of distributivity hold true: gcd(a, lcm(b, c)) = lcm(gcd(a, b), gcd(a, c)) and lcm(a, gcd(b, c)) = gcd(lcm(a, b), lcm(a, c)).
• It is sometimes useful to define gcd(0, 0) = 0 and lcm(0, 0) = 0 because then the natural numbers become a complete distributive lattice with gcd as meet and lcm as join operation. This extension of the definition is also compatible with the generalization for commutative rings given below.
• In a Cartesian coordinate system, gcd(a, b) can be interpreted as the number of points with integral coordinates on the straight line segment joining the points (0, 0) and (a, b), excluding (0, 0).
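Several of the properties listed above can be spot-checked on sample values. The following Python sketch is illustrative only: the helper names lcm and lattice_points_on_segment and the sample numbers are assumptions, while math.gcd is the standard-library function.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple computed from the gcd, with lcm(a, 0) = 0."""
    if a == 0 or b == 0:
        return 0
    return abs(a * b) // gcd(a, b)

def lattice_points_on_segment(a, b):
    """Count integer points on the segment from (0, 0) to (a, b), excluding
    (0, 0), for positive integers a and b (brute force over x)."""
    # The point (x, y) lies on the segment exactly when y = x * b / a,
    # so y is an integer exactly when a divides x * b.
    return sum(1 for x in range(1, a + 1) if (x * b) % a == 0)

a, b, c = 12, 18, 30  # sample values chosen for illustration

print(gcd(a, b) * lcm(a, b) == a * b)                       # gcd-lcm product
print(gcd(a, lcm(b, c)) == lcm(gcd(a, b), gcd(a, c)))       # distributivity
print(lcm(a, gcd(b, c)) == gcd(lcm(a, b), lcm(a, c)))       # distributivity
print(gcd(a + 5 * b, b) == gcd(a, b))                       # gcd(a + m*b, b)
print(lattice_points_on_segment(a, b) == gcd(a, b))         # lattice points
```

Each line should print True for these inputs; a check on a few values is of course not a proof of the identities.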

Probabilities and expected value

In 1972, James E. Nymann showed that k integers, chosen independently and uniformly from {1,...,n}, are coprime with probability 1/ζ(k) as n goes to infinity. (See coprime for a derivation.) This result was extended in 1987 to show that the probability that k random integers have greatest common divisor d is \(d^{-k}/\zeta(k)\). Using this information, the expected value of the greatest common divisor function can be seen (informally) to not exist when k = 2. In this case the probability that the gcd equals d is \(d^{-2}/\zeta(2)\), and since \(\zeta(2) = \pi^2/6\) we have

\[ E(2) = \sum_{d=1}^{\infty} d \, \frac{6}{\pi^2 d^2} = \frac{6}{\pi^2} \sum_{d=1}^{\infty} \frac{1}{d}. \]

This last summation is the harmonic series, which diverges. However, when k ≥ 3, the expected value is well-defined, and by the above argument, it is

\[ E(k) = \sum_{d=1}^{\infty} d^{1-k} \, \zeta(k)^{-1} = \frac{\zeta(k-1)}{\zeta(k)}. \]

For k = 3, this is approximately equal to 1.3684. For k = 4, it is approximately 1.1106.
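The quoted values can be reproduced approximately by truncating the series for the zeta function. The sketch below is a rough numerical illustration: the function names zeta_partial and expected_gcd and the default number of terms are assumptions, and plain partial sums converge slowly, so only a few decimal places should be trusted.

```python
def zeta_partial(s, terms=200_000):
    """Partial sum of the series for zeta(s), valid as an estimate for s > 1."""
    return sum(n ** -s for n in range(1, terms + 1))

def expected_gcd(k, terms=200_000):
    """Estimate of zeta(k - 1) / zeta(k), the expected gcd of k random integers."""
    if k < 3:
        raise ValueError("the expected value exists only for k >= 3")
    return zeta_partial(k - 1, terms) / zeta_partial(k, terms)

print(round(expected_gcd(3), 3))
print(round(expected_gcd(4), 3))
```

The printed estimates should be close to the values 1.3684 and 1.1106 given in the text.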

The gcd in commutative rings

The notion of greatest common divisor can more generally be defined for elements of an arbitrary commutative ring, although in general there need not exist one for every pair of elements. If R is a commutative ring, and a and b are in R, then an element d of R is called a common divisor of a and b if it divides both a and b (that is, if there are elements x and y in R such that d·x = a and d·y = b). If d is a common divisor of a and b, and every common divisor of a and b divides d, then d is called a greatest common divisor of a and b.

Note that with this definition, two elements a and b may very well have several greatest common divisors, or none at all. If R is an integral domain then any two gcd's of a and b must be associate elements, since by definition either one must divide the other; indeed if a gcd exists, any one of its associates is a gcd as well. Existence of a gcd is not assured in arbitrary integral domains. However if R is a unique factorization domain, then any two elements have a gcd, and more generally this is true in gcd domains. If R is a Euclidean domain in which Euclidean division is given algorithmically (as is the case for instance when R = F[X] where F is a field, or when R is the ring of Gaussian integers), then greatest common divisors can be computed using a form of the Euclidean algorithm based on the division procedure.

The following is an example of an integral domain with two elements that do not have a gcd:

\[ R = \mathbb{Z}\left[\sqrt{-3}\right], \quad a = 4 = 2 \cdot 2 = \left(1 + \sqrt{-3}\right)\left(1 - \sqrt{-3}\right), \quad b = \left(1 + \sqrt{-3}\right) \cdot 2. \]

The elements 2 and 1 + √(−3) are two "maximal common divisors" (i.e. any common divisor which is a multiple of 2 is associated to 2, the same holds for 1 + √(−3)), but they are not associated, so there is no greatest common divisor of a and b.

Corresponding to the Bézout property we may, in any commutative ring, consider the collection of elements of the form pa + qb, where p and q range over the ring. This is the ideal generated by a and b, and is denoted simply (a, b). In a ring all of whose ideals are principal (a principal ideal domain or PID), this ideal will be identical with the set of multiples of some ring element d; then this d is a greatest common divisor of a and b. But the ideal (a, b) can be useful even when there is no greatest common divisor of a and b. (Indeed, Ernst Kummer used this ideal as a replacement for a gcd in his treatment of Fermat's Last Theorem, although he envisioned it as the set of multiples of some hypothetical, or ideal, ring element d, whence the ring-theoretic term.)

References

• Long, Calvin T. (1972), Elementary Introduction to Number Theory (2nd ed.), Lexington: D. C. Heath and Company, LCCN 77-171950 (http://lccn.loc.gov/77-171950)
• Pettofrezzo, Anthony J.; Byrkit, Donald R. (1970), Elements of Number Theory, Englewood Cliffs: Prentice Hall, LCCN 77-81766 (http://lccn.loc.gov/77-81766)

Further reading

• Donald Knuth. The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89684-2. Section 4.5.2: The Greatest Common Divisor, pp. 333–356.
• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 31.2: Greatest common divisor, pp. 856–862.
• Saunders MacLane and Garrett Birkhoff. A Survey of Modern Algebra, Fourth Edition. MacMillan Publishing Co., 1977. ISBN 0-02-310070-2. 1–7: "The Euclidean Algorithm."

External links

• greatest common divisor at Everything2.com (http://everything2.com/?node_id=482506)
• Greatest Common Measure: The Last 2500 Years (http://www.stepanovpapers.com/gcd.pdf), by Alexander Stepanov

Euclidean algorithm

In mathematics, the Euclidean algorithm[a], or Euclid's algorithm, is a method for computing the greatest common divisor (GCD) of two (usually positive) integers, also known as the greatest common factor (GCF) or highest common factor (HCF). It is named after the Greek mathematician Euclid, who described it in Books VII and X of his Elements.[1]

The GCD of two positive integers is the largest integer that divides both of them without leaving a remainder (the GCD of two integers in general is defined in a more subtle way). In its simplest form, Euclid's algorithm starts with a pair of positive integers, and forms a new pair that consists of the smaller number and the difference between the larger and smaller numbers. The process repeats until the numbers in the pair are equal. That number then is the greatest common divisor of the original pair of integers.

The main principle is that the GCD does not change if the smaller number is subtracted from the larger number. For example, the GCD of 252 and 105 is exactly the GCD of 147 (= 252 − 105) and 105. Since the larger of the two numbers is reduced, repeating this process gives successively smaller numbers, so this repetition will necessarily stop sooner or later, namely when the numbers are equal (if the process is attempted once more, one of the numbers will become 0).
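The subtraction form described above can be written down directly. The following Python sketch assumes two positive integers; the function name gcd_by_subtraction is hypothetical and chosen for this illustration.

```python
def gcd_by_subtraction(a, b):
    """GCD of two positive integers by repeatedly subtracting the smaller
    number from the larger until both numbers are equal."""
    if a <= 0 or b <= 0:
        raise ValueError("both arguments must be positive integers")
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
    return a

print(gcd_by_subtraction(252, 105))
```

For the pair (252, 105) the successive pairs are (147, 105), (42, 105), (42, 63), (42, 21) and (21, 21), so the result is 21. This form can need many iterations when one number is much larger than the other, which is why the remainder-based form is usually preferred.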

Euclid's method for finding the greatest common divisor (GCD) of two starting lengths BA and DC, both defined to be multiples of a common "unit" length. The length DC being shorter, it is used to "measure" BA, but only once because remainder EA is less than CD. EA now measures (twice) the shorter length DC, with remainder FC shorter than EA. Then FC measures (three times) length EA. Because there is no remainder, the process ends with FC being the GCD. On the right Nicomachus' example with numbers 49 and 21 resulting in their GCD of 7 (derived from Heath 1908:300).

The earliest surviving description of the Euclidean algorithm is in Euclid's Elements (c. 300 BC), making it one of the oldest numerical algorithms still in common use. The original algorithm was described only for natural numbers and geometric lengths (real numbers), but the algorithm was generalized in the 19th century to other types of numbers, such as Gaussian integers and polynomials in one variable. This led to modern abstract algebraic notions, such as Euclidean domains. The Euclidean algorithm has been generalized further to other mathematical structures, such as knots and multivariate polynomials. The algorithm has many theoretical and practical applications. It may be used to generate almost all the most important traditional musical rhythms used in different cultures throughout the world. It is a key element of the RSA algorithm, a public-key encryption method widely used in electronic commerce. It is used to solve Diophantine equations, such as finding numbers that satisfy multiple congruences (Chinese remainder theorem) or multiplicative inverses of a finite field. It can also be used to construct continued fractions, in the Sturm chain method for finding real roots of a polynomial, and in several modern integer factorization algorithms. Finally, it is a basic tool for proving theorems in modern number theory, such as Lagrange's four-square theorem and the fundamental theorem of arithmetic (unique factorization).


If implemented using remainders of Euclidean division rather than subtractions, Euclid's algorithm computes the GCD of large numbers efficiently: it never requires more division steps than five times the number of digits (in base 10) of the smaller integer. This was proved by Gabriel Lamé in 1844, and marks the beginning of computational complexity theory. Methods for improving the algorithm's efficiency were developed in the 20th century. By reversing the steps in the Euclidean algorithm, the GCD can be expressed as a sum of the two original numbers each multiplied by a positive or negative integer, e.g., the GCD of 252 and 105 is 21, and 21 = [5 × 105] + [(−2) × 252]. This important property is known as Bézout's identity.
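The remainder-based form and the Bézout coefficients can be illustrated together. The sketch below uses the common iterative formulation of the extended algorithm, which tracks the coefficients forward rather than literally reversing the steps; the function name extended_euclid and the step counter are assumptions added for illustration.

```python
from math import gcd

def extended_euclid(a, b):
    """Return (g, u, v, steps) where g = gcd(a, b) = u*a + v*b and
    steps is the number of division steps performed."""
    old_r, r = a, b      # remainders
    old_u, u = 1, 0      # coefficients of a
    old_v, v = 0, 1      # coefficients of b
    steps = 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
        steps += 1
    return old_r, old_u, old_v, steps

g, u, v, steps = extended_euclid(252, 105)
print(g, u, v, steps)

# Lamé's bound from the text: at most five times the number of
# decimal digits of the smaller integer.
lame_bound = 5 * len(str(min(252, 105)))
print(steps <= lame_bound, g == gcd(252, 105))
```

For 252 and 105 this yields g = 21 with u = −2 and v = 5, which is the identity 21 = 5 × 105 + (−2) × 252 quoted above, found in 3 division steps against a bound of 15.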

Background — Greatest common divisor The Euclidean algorithm calculates the greatest common divisor (GCD) of two natural numbers a and b. The greatest common divisor g is the largest natural number that divides both a and b without leaving a remainder. Synonyms for the GCD include the greatest common factor (GCF), the highest common factor (HCF), and the greatest common measure (GCM). The greatest common divisor is often written as gcd(a, b) or, more simply, as (a, b), although the latter notation is also used for other mathematical concepts, such as two-dimensional vectors. If gcd(a, b) = 1, then a and b are said to be coprime (or relatively prime). This property does not imply that a or b are themselves prime numbers. For example, neither 6 nor 35 is a prime number, since they both have two prime factors: 6 = 2 × 3 and 35 = 5 × 7. Nevertheless, 6 and 35 are coprime. No natural number other than 1 divides both 6 and 35, since they have no prime factors in common. Let g = gcd(a, b). Since a and b are both multiples of g, they can be written a = mg and b = ng, and there is no larger number G > g for which this is true. The natural numbers m and n must be coprime, since any common factor could be factored out of m and n to make g greater. Thus, any other number c that divides both a and b must also divide g. The greatest common divisor g of a and b is the unique (positive) common divisor of a and b that is divisible by any other common divisor c. The GCD can be visualized as follows. Consider a rectangular area a by b, and any common divisor c that divides both a and b exactly. The sides of the rectangle can be divided into segments of length c, which divides the rectangle into a grid of squares of side length c. The greatest common divisor g is the largest value of c for which this is possible. 
For illustration, a 24-by-60 rectangular area can be divided into a grid of 1-by-1 squares, 2-by-2 squares, 3-by-3 squares, 4-by-4 squares, 6-by-6 squares or 12-by-12 squares. Therefore, 12 is the greatest common divisor of 24 and 60. A 24-by-60 rectangular area can be divided into a grid of 12-by-12 squares, with two squares along one edge (24/12 = 2) and five squares along the other (60/12 = 5).

Figure: A 24-by-60 rectangle is covered with ten 12-by-12 square tiles, where 12 is the GCD of 24 and 60. More generally, an a-by-b rectangle can be covered with square tiles of side-length c only if c is a common divisor of a and b.

The GCD of two numbers a and b is the product of the prime factors shared by the two numbers, where the same prime factor can be used multiple times, but only as long as the product of these factors divides both a and b. For example, since 1386 can be factored into 2 × 3 × 3 × 7 × 11, and 3213 can be factored into 3 × 3 × 3 × 7 × 17, the greatest common divisor of 1386 and 3213 equals 63 = 3 × 3 × 7, the product of their shared prime factors. If two numbers have no prime factors in common, their greatest common divisor is 1 (obtained here as an instance of the empty product); in other words, they are coprime. A key advantage of the Euclidean algorithm is that it can find the GCD efficiently without having to compute the prime factors. Factorization of large integers is believed to be a computationally very difficult problem, and the security of many modern cryptography systems is based upon its infeasibility.

Another definition of the GCD is helpful in advanced mathematics, particularly ring theory. The greatest common divisor g of two nonzero numbers a and b is also their smallest positive integral linear combination, that is, the smallest positive number of the form ua + vb where u and v are integers. The set of all integral linear combinations of a and b is actually the same as the set of all multiples of g (mg, where m is an integer). In modern mathematical language, the ideal generated by a and b is the ideal generated by g alone (an ideal generated by a single element is called a principal ideal, and all ideals of the integers are principal ideals). Some properties of the GCD are in fact easier to see with this description, for instance the fact that any common divisor of a and b also divides the GCD (it divides both terms of ua + vb). The equivalence of this GCD definition with the other definitions is described below.

The GCD of three or more numbers equals the product of the prime factors common to all the numbers, but it can also be calculated by repeatedly taking the GCDs of pairs of numbers. For example, gcd(a, b, c) = gcd(a, gcd(b, c)) = gcd(gcd(a, b), c) = gcd(gcd(a, c), b). Thus, Euclid's algorithm, which computes the GCD of two integers, suffices to calculate the GCD of arbitrarily many integers.
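The reduction of a many-number GCD to repeated two-number GCDs can be illustrated with a short Python sketch. The helper name `gcd_many` is hypothetical; the two-argument `math.gcd` from the standard library stands in for Euclid's algorithm.

```python
from functools import reduce
from math import gcd


def gcd_many(*numbers):
    """GCD of one or more integers, folded pairwise from left to right."""
    return reduce(gcd, numbers)


print(gcd_many(1386, 3213))   # 63, the product 3 * 3 * 7 of the shared primes
print(gcd_many(24, 60, 36))   # 12
```

Because the pairwise GCD is associative and commutative, the order in which the numbers are folded does not affect the result.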

Description

The simple form of Euclid's algorithm uses only subtraction and comparison. Starting with a pair of positive integers, form a new pair consisting of the smaller number and the difference between the larger number and the smaller number. This process repeats until the numbers in the new pair are equal to each other; that value is the greatest common divisor of the original pair. If one number is much smaller than the other, many subtraction steps will be needed before the larger number is reduced to a value less than or equal to the other number in the pair.

The common form of Euclid's algorithm replaces subtracting the small positive number from the big number (possibly many times) with finding the remainder in long division. This form of Euclid's algorithm also starts with a pair of positive integers, then forms a new pair consisting of the smaller number and the remainder obtained by dividing the larger number by the smaller number. The process repeats until one number is zero. The other number then is the greatest common divisor of the original pair.
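The difference in cost between the two forms can be made concrete with a small Python sketch. The function names and the step counters are assumptions added for illustration; each function returns the GCD together with the number of loop iterations it used.

```python
def gcd_subtract(a, b):
    """Subtraction form; assumes a and b are positive integers.

    Returns (gcd, number_of_subtractions).
    """
    steps = 0
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
        steps += 1
    return a, steps


def gcd_divide(a, b):
    """Remainder form; returns (gcd, number_of_divisions)."""
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return a, steps


print(gcd_subtract(1000, 3))  # (1, 335)
print(gcd_divide(1000, 3))    # (1, 2)
```

For the pair 1000 and 3, the subtraction form needs 335 subtractions, whereas the remainder form finishes after two divisions.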

Procedure

The Euclidean algorithm proceeds in a series of steps such that the output of each step is used as an input for the next one. Let k be an integer that counts the steps of the algorithm, starting with zero. Thus, the initial step corresponds to k = 0, the next step corresponds to k = 1, and so on. Each step begins with two nonnegative remainders $r_{k-1}$ and $r_{k-2}$. Since the algorithm ensures that the remainders decrease steadily with every step, $r_{k-1}$ is less than its predecessor $r_{k-2}$. The goal of the kth step is to find a quotient $q_k$ and remainder $r_k$ such that the equation

$r_{k-2} = q_k r_{k-1} + r_k$

is satisfied, where $r_k < r_{k-1}$. [...]

function gcd(a, b)
    while a ≠ b
        if a > b
            a := a − b
        else
            b := b − a
    return a

The variables a and b alternate holding the previous remainders $r_{k-1}$ and $r_{k-2}$. Assume that a is larger than b at the beginning of an iteration; then a equals $r_{k-2}$, since $r_{k-2} > r_{k-1}$. During the loop iteration, a is reduced by multiples of the previous remainder b until a is smaller than b. Then a is the next remainder $r_k$. Then b is reduced by multiples of a until it is again smaller than a, giving the next remainder $r_{k+1}$, and so on.


The recursive version is based on the equality of the GCDs of successive remainders and the stopping condition $\gcd(r_{N-1}, 0) = r_{N-1}$.

function gcd(a, b)
    if b = 0
        return a
    else
        return gcd(b, a mod b)

For illustration, the gcd(1071, 462) is calculated from the equivalent gcd(462, 1071 mod 462) = gcd(462, 147). The latter GCD is calculated from the gcd(147, 462 mod 147) = gcd(147, 21), which in turn is calculated from the gcd(21, 147 mod 21) = gcd(21, 0) = 21.
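The chain of reductions in this illustration can be reproduced with a short Python sketch that records each division step. The function name `euclid_trace` and the row layout are assumptions made for this example.

```python
def euclid_trace(a, b):
    """Return one (dividend, divisor, quotient, remainder) row per division."""
    rows = []
    while b != 0:
        q, r = divmod(a, b)
        rows.append((a, b, q, r))
        a, b = b, r
    return rows


for a, b, q, r in euclid_trace(1071, 462):
    print(f"{a} = {q} x {b} + {r}")
# 1071 = 2 x 462 + 147
# 462 = 3 x 147 + 21
# 147 = 7 x 21 + 0
```

The divisor of the last row, 21, is the last nonzero remainder and therefore the GCD.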

Method of least absolute remainders

In another version of Euclid's algorithm, the quotient at each step is increased by one if the resulting negative remainder is smaller in magnitude than the typical positive remainder. Previously, the equation

$r_{k-2} = q_k r_{k-1} + r_k$

assumed that $|r_{k-1}| > r_k > 0$. However, an alternative negative remainder $e_k$ can be computed:

$r_{k-2} = (q_k + 1) r_{k-1} + e_k$ if $r_{k-1} > 0$

or

$r_{k-2} = (q_k - 1) r_{k-1} + e_k$ if $r_{k-1} < 0$. [...]

[...] a > b > 0, the smallest values of a and b for which this is true are the Fibonacci numbers $F_{N+2}$ and $F_{N+1}$, respectively.[12] This can be shown by induction. If N = 1, b divides a with no remainder; the smallest natural numbers for which this is true are b = 1 and a = 2, which are $F_2$ and $F_3$, respectively. Now assume that the result holds for all values of N up to M − 1. The first step of the M-step algorithm is $a = q_0 b + r_0$, and the second step is $b = q_1 r_0 + r_1$. Since the algorithm is recursive, it requires M − 1 steps to find $\gcd(b, r_0)$, and the smallest values of b and $r_0$ are $F_{M+1}$ and $F_M$. The smallest value of a is therefore obtained when $q_0 = 1$, which gives $a = b + r_0 = F_{M+1} + F_M = F_{M+2}$. This proof, published by Gabriel Lamé in 1844, represents the beginning of computational complexity theory, and also the first practical application of the Fibonacci numbers.

This result suffices to show that the number of steps in Euclid's algorithm can never be more than five times the number of digits (base 10) of the smaller number. For if the algorithm requires N steps, then b is greater than or equal to $F_{N+1}$, which in turn is greater than or equal to $\varphi^{N-1}$, where $\varphi$ is the golden ratio. Since $b \ge \varphi^{N-1}$, then $N - 1 \le \log_\varphi b$. Since $\log_{10} \varphi > 1/5$, $(N - 1)/5 < \log_{10} \varphi \, \log_\varphi b = \log_{10} b$. [...]

[...] n > 5 is a composite number if and only if

$(n-1)! \equiv 0 \pmod{n}$.

A stronger result is Wilson's theorem, which states that

$(p-1)! \equiv -1 \pmod{p}$

if and only if p is prime. Adrien-Marie Legendre found that the multiplicity of the prime p occurring in the prime factorization of n! can be expressed exactly as

$\sum_{i=1}^{\infty} \left\lfloor \frac{n}{p^i} \right\rfloor$.


This fact is based on counting the number of factors p of the integers from 1 to n. The number of multiples of p among the numbers 1 to n is given by $\lfloor n/p \rfloor$; however, this formula counts those numbers with two factors of p only once. Hence another $\lfloor n/p^2 \rfloor$ factors of p must be counted too. Similarly for three, four, five factors, to infinity. The sum is finite since $p^i$ can only be less than or equal to n for finitely many values of i, and the floor function results in 0 when applied for $p^i > n$.

The only factorial that is also a prime number is 2, but there are many primes of the form n! ± 1, called factorial primes. All factorials greater than 1! are even, as they are all multiples of 2. Also, all factorials from 5! upwards are multiples of 10 (and hence have a trailing zero as their final digit), because they are multiples of 5 and 2.
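The counting argument can be checked numerically with a short Python sketch. It assumes Legendre's formula in its usual form, the sum of ⌊n/p^i⌋ over i ≥ 1, and Wilson's congruence (p − 1)! ≡ −1 (mod p); all function names are hypothetical and chosen for this example.

```python
from math import factorial


def legendre_exponent(n, p):
    """Exponent of the prime p in n!, summing floor(n / p**i) for i = 1, 2, ..."""
    total = 0
    power = p
    while power <= n:          # later terms are all zero
        total += n // power
        power *= p
    return total


def exponent_by_division(m, p):
    """Exponent of p in m, found by dividing out p repeatedly (cross-check)."""
    count = 0
    while m % p == 0:
        m //= p
        count += 1
    return count


def wilson_is_prime(p):
    """Wilson's criterion: p > 1 is prime iff (p - 1)! leaves remainder p - 1."""
    return p > 1 and factorial(p - 1) % p == p - 1


print(legendre_exponent(100, 5))                        # 24 = 20 + 4
print([p for p in range(2, 30) if wilson_is_prime(p)])
```

The value 24 is also the number of trailing zeros of 100!, since factors of 2 are more plentiful than factors of 5. Wilson's criterion is shown only as a check of the theorem; computing (p − 1)! is far too slow to serve as a practical primality test.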

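Series of reciprocals

The reciprocals of factorials produce a convergent series (see e):

$\sum_{n=0}^{\infty} \frac{1}{n!} = \frac{1}{1} + \frac{1}{1} + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \cdots = e$.

Although the sum of this series is an irrational number, it is possible to multiply the factorials by positive integers to produce a convergent series with a rational sum:

$\sum_{n=0}^{\infty} \frac{1}{(n+2)\, n!} = \frac{1}{2} + \frac{1}{3} + \frac{1}{8} + \frac{1}{30} + \frac{1}{144} + \cdots = 1$.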

The convergence of this series to 1 can be seen from the fact that its partial sums are less than one by an inverse factorial. Therefore, the factorials do not form an irrationality sequence.
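The statement about the partial sums can be verified exactly with rational arithmetic. The sketch below assumes the series in question is the sum of 1/((n + 2) · n!) for n = 0, 1, 2, ...; the function name `partial_sum` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def partial_sum(terms):
    """Exact sum of 1 / ((n + 2) * n!) for n = 0 .. terms - 1."""
    return sum(Fraction(1, (n + 2) * factorial(n)) for n in range(terms))


for terms in range(1, 6):
    s = partial_sum(terms)
    print(terms, s, 1 - s)   # the gap 1 - s is 1/2!, 1/3!, 1/4!, ...
```

Each term equals 1/(n + 1)! − 1/(n + 2)!, so the sum telescopes and the gap to 1 after the term with index n is exactly 1/(n + 2)!.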

Rate of growth and approximations for large n

As n grows, the factorial n! increases faster than all polynomials and exponential functions (but slower than double exponential functions) in n. Most approximations for n! are based on approximating its natural logarithm

$\log n! = \sum_{x=1}^{n} \log x$.

The graph of the function f(n) = log n! is shown in the figure. It looks approximately linear for all reasonable values of n, but this intuition is false. We get one of the simplest approximations for log n! by bounding the sum with an integral from above and below as follows:

$\int_1^n \log x \, dx \le \sum_{x=1}^{n} \log x \le \int_0^n \log(x+1) \, dx$,

which gives us the estimate

$n \log\left(\frac{n}{e}\right) + 1 \le \log n! \le (n+1) \log\left(\frac{n+1}{e}\right) + 1$.

Figure: Plot of the natural logarithm of the factorial.
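The integral bounds on log n! can be checked numerically. The sketch below assumes the bounds take the form n log(n/e) + 1 ≤ log n! ≤ (n + 1) log((n + 1)/e) + 1, obtained by evaluating the two integrals in closed form; the function names are hypothetical.

```python
import math


def log_factorial(n):
    """Natural logarithm of n!, computed as the sum of log x for x = 1 .. n."""
    return sum(math.log(x) for x in range(1, n + 1))


def integral_bounds(n):
    """Lower and upper bounds n*log(n/e) + 1 and (n+1)*log((n+1)/e) + 1."""
    lower = n * math.log(n / math.e) + 1
    upper = (n + 1) * math.log((n + 1) / math.e) + 1
    return lower, upper


for n in (2, 10, 100, 1000):
    lower, upper = integral_bounds(n)
    print(n, round(lower, 3), round(log_factorial(n), 3), round(upper, 3))
```

Both bounds grow like n log n, which is the source of the Θ(n log n) behaviour of log n!.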


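Hence log n! is Θ(n log n) (see Big O notation). This result plays a key role in the analysis of the computational complexity of sorting algorithms (see comparison sort). From the bounds on log n! deduced above we get that

$e \left(\frac{n}{e}\right)^n \le n! \le e \left(\frac{n+1}{e}\right)^{n+1}$.

It is sometimes practical to use weaker but simpler estimates. Using the above formula it is easily shown that for all n we have $(n/3)^n < n!$, and for all n ≥ 6 we have $n! < (n/2)^n$. For large n we get a better estimate for the number n! using Stirling's approximation:

$n! \sim \sqrt{2\pi n} \left(\frac{n}{e}\right)^n$.

In fact, it can be proved that for all n we have

$\sqrt{2\pi n} \left(\frac{n}{e}\right)^n < n! < \sqrt{2\pi n} \left(\frac{n}{e}\right)^n e^{1/(12n)}$.

Another approximation for log n! is given by Srinivasa Ramanujan (Ramanujan 1988):

$\log n! \approx n \log n - n + \frac{\log\bigl(n(1 + 4n(1 + 2n))\bigr)}{6} + \frac{\log \pi}{2}$.

Thus it is even smaller than the next correction term of Stirling's formula.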
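The quality of these approximations can be examined with a short Python sketch. It assumes Stirling's approximation in the form √(2πn)(n/e)^n with the upper bound factor e^{1/(12n)}, and Ramanujan's formula in the form n log n − n + log(n(1 + 4n(1 + 2n)))/6 + log(π)/2; the function names are hypothetical.

```python
import math


def stirling(n):
    """Stirling's approximation sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


def ramanujan_log_factorial(n):
    """Ramanujan's approximation to log n! (form assumed in the lead-in)."""
    return (n * math.log(n) - n
            + math.log(n * (1 + 4 * n * (1 + 2 * n))) / 6
            + math.log(math.pi) / 2)


for n in (1, 5, 10, 20):
    exact = math.factorial(n)
    ratio = stirling(n) / exact                      # approaches 1 from below
    error = ramanujan_log_factorial(n) - math.log(exact)
    print(n, exact, ratio, error)
```

The ratio column stays slightly below 1 and approaches it as n grows, while the error of the Ramanujan form is already small for n = 1.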

Computation

If efficiency is not a concern, computing factorials is trivial from an algorithmic point of view: successively multiplying a variable initialized to 1 by the integers 2 up to n (if any) will compute n!, provided the result fits in the variable. In functional languages, the recursive definition is often implemented directly to illustrate recursive functions.

The main practical difficulty in computing factorials is the size of the result. To ensure that the exact result will fit for all legal values of even the smallest commonly used integral type (8-bit signed integers) would require more than 700 bits, so no reasonable specification of a factorial function using fixed-size types can avoid questions of overflow. The values 12! and 20! are the largest factorials that can be stored in, respectively, the 32-bit and 64-bit integers commonly used in personal computers. Floating-point representation of an approximated result allows going a bit further, but this also remains quite limited by possible overflow. Most calculators use scientific notation with 2-digit decimal exponents, and the largest factorial that fits is then 69!, because 69!