INFORMATION AND COMPLEXITY (HOW TO MEASURE THEM?)

László Lovász
Dept. of Computer Science, Eötvös Loránd University, Budapest, and Princeton University, Princeton, NJ

Computer science, also called informatics, is often defined as the theory of storing, processing, and communicating information. The key notion in the theory of computing is that of complexity. The basic tasks of computer science (and their variations) lead to various measures of complexity. We may speak of the complexity of a structure, meaning the amount of information (number of bits) in the most economical “blueprint” of the structure; this is the minimum space we need to store enough information about the structure to allow its reconstruction. We may also speak of the algorithmic complexity of a certain task: this is the minimum time (or other computational resource) needed to carry out this task on a computer. And we may also speak of the communication complexity of tasks involving more than one processor: this is the number of bits that have to be transmitted in solving this task (I will not discuss this last notion in these notes).

It is important to emphasize that the notions of the theory of computing (algorithms, encodings, machine models, complexity) can be defined and measured in a mathematically precise way. The resulting theory is as exact as Euclidean geometry. The elaboration of the mathematical theory would, of course, be beyond these notes; but I hope that I can sketch the motivation for introducing these complexity measures and indicate their possible interest in various areas.

Complexity, I believe, should play a central role in the study of a large variety of phenomena, from computers to genetics to brain research to statistical mechanics. In fact, these mathematical ideas and tools may prove as important in the life sciences as the tools of classical mathematics (calculus and algebra) have proved in physics and chemistry.

Like most phenomena of the world, complexity appears first as an obstacle in the way of knowledge (or as a convenient excuse for ignorance). As a next phase, we begin to understand it, measure it, and determine its laws and its connections to our previous knowledge. Finally, we make use of it in engineering. Complexity has reached this level: it is widely used in cryptography, random number generation, data security and other areas. Some of these aspects are discussed by Adi Shamir in this volume.

Some examples

As a computer scientist, I will consider every structure (or object) as a sequence of 0’s and 1’s; this is no restriction of generality, at least as long as we study objects that have a finite description, since such objects can be encoded as sequences of 0’s and 1’s (every computer uses such an encoding). For example, a positive integer is a finite sequence of 0’s and 1’s when written in base 2 instead of the usual base 10. Rational numbers can be encoded as pairs of integers, with some notational trick to show the sign, the place where the first integer ends, etc.

There is a sequence which, in a sense, contains the whole of mathematics. Imagine that we write down every conceivable mathematical statement (whether it is true or false). We start, say, with the equality $0 = 0$; second is the “equality” $0 = 1$; somewhere we write down the (true) identity $(xy)^2 = x^2 y^2$, and then also the (false) identity $(x + y)^2 = x^2 + y^2$; Pythagoras’ Theorem appears and then also Thales’ Theorem; Fermat’s Last Theorem is listed (even though we don’t know whether it is true or not), etc. Details of how we do this are irrelevant; it is enough to know that this way every mathematical statement, true or false, gets a number (its position in the list); given a statement, we can compute its position, and given a position, we can write down the corresponding statement. Now we write down a sequence of 0’s and 1’s. The first element is 1, because the first mathematical statement in our list is true; the second is 0, because the second statement in the list is false, etc. Anybody knowing this sequence knows the whole of mathematics!

So let us ask the question: how complex is a sequence of 0’s and 1’s?
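Before turning to that question, the encodings of integers and rational numbers mentioned at the start of this section can be made concrete. The notes do not say which “notational trick” is meant, so the following is only a sketch of one common choice, assumed here for illustration: a sign bit, then the numerator with every bit doubled, then the pair 01 to mark where the numerator ends, then the denominator in plain base 2. The function names (to_binary, encode_rational, decode_rational) and the 64-bit limit are hypothetical and not part of the original text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write n in base 2 as '0'/'1' characters into out (room for 65 chars).
   Returns the number of bits written. */
size_t to_binary(uint64_t n, char *out)
{
    char tmp[64];
    size_t len = 0;
    do {                                  /* collect bits, lowest first */
        tmp[len++] = (char)('0' + (n & 1u));
        n >>= 1;
    } while (n != 0);
    for (size_t i = 0; i < len; i++)      /* reverse: highest bit first */
        out[i] = tmp[len - 1 - i];
    out[len] = '\0';
    return len;
}

/* Assumed encoding of the rational (+/-) p/q as one 0-1 sequence:
   sign bit, doubled bits of p, marker "01", plain bits of q.
   out needs room for 200 chars. Returns the length of the code. */
size_t encode_rational(int negative, uint64_t p, uint64_t q, char *out)
{
    char bits[65];
    size_t len, k = 0;
    assert(q != 0);                       /* a denominator cannot be zero */
    out[k++] = negative ? '1' : '0';
    len = to_binary(p, bits);
    for (size_t i = 0; i < len; i++) {    /* each bit of p is written twice */
        out[k++] = bits[i];
        out[k++] = bits[i];
    }
    out[k++] = '0';                       /* unequal pair: p ends here */
    out[k++] = '1';
    len = to_binary(q, bits);
    memcpy(out + k, bits, len + 1);       /* copies the terminating '\0' too */
    return k + len;
}

/* Reconstruct sign, p and q from a code; returns 0 on success, -1 if malformed. */
int decode_rational(const char *code, int *negative, uint64_t *p, uint64_t *q)
{
    size_t i = 1;
    if (code[0] == '\0')
        return -1;
    *negative = (code[0] == '1');
    *p = 0;
    /* equal pairs carry one bit of p each */
    while (code[i] != '\0' && code[i] == code[i + 1]) {
        *p = (*p << 1) | (uint64_t)(code[i] - '0');
        i += 2;
    }
    if (code[i] != '0' || code[i + 1] != '1')
        return -1;                        /* marker "01" is missing */
    i += 2;
    *q = 0;
    for (; code[i] != '\0'; i++)          /* the rest is q in plain base 2 */
        *q = (*q << 1) | (uint64_t)(code[i] - '0');
    return 0;
}
```

For instance, under this assumed scheme encode_rational(1, 5, 3, buf) encodes -5/3 as 11100110111: the sign bit 1, then 101 with every bit doubled (110011), then the marker 01, then 11. Doubling the bits is wasteful but makes the code self-delimiting, so the original pair of integers can be reconstructed from the sequence alone.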
For example, consider the following sequence:

000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000

This is perhaps the simplest possible object, and not very interesting; the only thing that can be said about it is that it consists of 576 0’s. The following sequence is only slightly more interesting:

010101010101010101010101010101010101010101010101010101010101010101010101
010101010101010101010101010101010101010101010101010101010101010101010101
010101010101010101010101010101010101010101010101010101010101010101010101
010101010101010101010101010101010101010101010101010101010101010101010101
010101010101010101010101010101010101010101010101010101010101010101010101
010101010101010101010101010101010101010101010101010101010101010101010101
010101010101010101010101010101010101010101010101010101010101010101010101
010101010101010101010101010101010101010101010101010101010101010101010101

This also consists of 576 entries, alternating between 0 and 1, and is still far from complex. Now consider the following sequence:

101101001001101100101101001101101001001101001011011001001101001001101101
001011001001101001001101100101101001001101101001101100101101100100110100
101100100101101100100110100101101100101101001101101001011001001101001001
101101001101100100101101001101100101101100100101101001101101001001101001
011011001001011001001101001001101100101101001001101001011001001101101001
001101001011001001011010011011001001011011001011010010011011001011011001
001011010011011001001011001001101001001101100101101001101101001001101100
101101100100101100100110100101101100100110100100110110100110110010010110

This looks much more complicated! You really have to be a wizard to memorize it: there is no periodicity, no obvious regularity or pattern in the succession of 0’s and 1’s, and one feels that the sequence is very complex. But in fact it is generated by a very simple program:

main() { int n, m=1, a[576]; a[1]=1; for (n=1; m