Extractors for Kolmogorov Complexity

Lance Fortnow
CWI & The University of Chicago

Sophie Laplante
University of Chicago

December 4, 1996

Abstract

We show two sets of results applying the theory of extractors to resource-bounded Kolmogorov complexity:

- Most strings in easy sets have nearly optimal polynomial-time CD complexity. This extends work of Sipser [Sip83] and Buhrman and Fortnow [BF97].
- We use extractors to extract the randomness of strings. In particular, we show how to get a random string of high polynomial-time C complexity from a potentially nonrandom string of high polynomial-time CND complexity.

1 Introduction

Extractors are a family of bipartite graphs designed to "extract" randomness. One would expect some connections with Kolmogorov complexity, one of the best known measures of "randomness," yet previously no such connection had been established. We show how to apply extractors to the theory of resource-bounded Kolmogorov complexity in two separate applications. In one case, we use extractors to strengthen upper bounds on the Kolmogorov complexity of strings in easily computable sets. In the other application, we use extractors, true to their name, to extract a Kolmogorov random string capturing all of the randomness of some longer string.

Extractors are a collection of bipartite graphs with many more nodes on the left than on the right and of relatively small degree. Extractors have the property that for any large enough collection S of nodes on the left, the process of randomly choosing a neighbor of a node in S generates a close to uniform distribution on the nodes on the right. We give more details in Section 2.2. We also recommend the excellent survey on extractors by Nisan [Nis96].

Lance Fortnow: University of Chicago, Department of Computer Science, 1100 E. 58th St., Chicago, IL 60637. Email: [email protected]. URL: http://www.cs.uchicago.edu/~fortnow. Supported in part by NSF grant CCR 92-53582 and the Fulbright scholar program.
Sophie Laplante: University of Chicago, Department of Computer Science, 1100 E. 58th St., Chicago, IL 60637. Email: [email protected]. Supported in part by NSF grant CCR 92-53582.



We concentrate on three definitions of polynomial-time Kolmogorov complexity: C^p(x), the length of the smallest program that generates x in polynomial time; CD^p(x), the length of the smallest program that distinguishes x in polynomial time; and CND^p(x), the length of the smallest program that nondeterministically distinguishes x in polynomial time.

Sipser [Sip83] shows that for every set A in P and all strings x of length n in A, the CD^p complexity of x given a "random" string r is bounded by log|A| + O(log n). Buhrman and Fortnow [BF97] remove the dependency on the random string, but at the cost of only bounding the CD^p complexity of x by 2 log|A| + O(log n). We nearly achieve the optimal bound of Sipser without the random string, by bounding the CD^p complexity of most strings x in A by log|A| + log^{O(1)} n. Following Buhrman and Fortnow [BF97], we also establish a similar relationship between sets in NP and CND complexity. We can also achieve Sipser's bound for most strings at the cost of only polylogarithmically many random bits.

How hard is it, given a string x, to find a string y such that y is random and y has roughly the same length as x? Note that y "extracts" the randomness from x. In traditional Kolmogorov complexity one can describe y by x and only log n additional bits: the size of the smallest program for x. For polynomial-time complexity this approach appears not to work. However, we can use extractors to extract the randomness. We show that given a string x of high CND^p complexity, we can find a string y that captures most of the randomness of x using only a small additional number of bits.

2 Preliminaries

We start by giving some basic definitions needed to introduce the notion of an extractor graph. We follow the presentation in Nisan's survey paper [Nis96]. An extractor can be thought of as a bipartite graph whose first color class is larger than the second. By convention we think of the first color class as being on the left and the second on the right. The vertices on the left side are all the strings of length n, so the first color class can be equated with the set [N], where N = 2^n. Likewise, the vertices on the right side of the graph are labeled by strings of length m ≤ n, so we let M = 2^m and equate [M] with the vertices in the second color class.

2.1 On distributions

We will be choosing a node on the left side of the graph at random according to a distribution X. The min-entropy of a distribution X over [N] can be thought of as a measure of the randomness present in a string x chosen according to X. The min-entropy of X is defined to be min{-log_2 X(x) | x ∈ [N]}. A distribution Y is said to be ε-close to X if both distributions are over the same space (say [M]) and for any S ⊆ [M], |X(S) - Y(S)| ≤ ε.
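These definitions are easy to make concrete. The following Python sketch (the helper names are ours, for illustration only) computes the min-entropy of a distribution and the statistical distance underlying the definition of ε-closeness.

```python
import math

def min_entropy(dist):
    # dist maps outcomes to probabilities.  The min-entropy is
    # min over x of -log2 Pr[x], i.e. -log2 of the largest probability.
    return -math.log2(max(dist.values()))

def statistical_distance(p, q):
    # max over events S of |p(S) - q(S)|; this equals half the L1 distance.
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# A distribution over 4 points whose largest probability is 1/2
# has min-entropy exactly 1.
X = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}
U = {k: 0.25 for k in range(4)}  # uniform over the same space
print(min_entropy(X))              # 1.0
print(statistical_distance(X, U))  # 0.25, so X is 0.25-close to uniform
```

Note that a distribution over [N] with min-entropy k puts probability at most 2^{-k} on every point, which is why min-entropy is the right measure of usable randomness here.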

2.2 Extractors

A bipartite graph G with (independent) vertex sets [N] and [M], N = 2^n, M = 2^m, in which the degree of every vertex in the first color class is bounded by D = 2^d, is an (ε, k) extractor if, given any distribution X on the N vertices whose min-entropy is at least k, the result of choosing an x according to this distribution and then a random neighbor y of x in the graph is ε-close to the uniform distribution over [M]. We will also say that G is an extractor with parameters (n, k, d, m, ε). In our setting, the distribution X will be the uniform distribution over a subset A ⊆ [N], so k will be log|A|.

Γ(x) denotes the set of neighbors of x in G when x is a vertex on the left side of the graph. The number of edges originating at a vertex x on the left side of the graph is called the outdegree of x, and the number of edges adjacent to a vertex y on the right side of the graph is called the indegree of y. G(x, r) denotes the r-th neighbor of x in the graph, where multiple edges are allowed. When y is a vertex on the right side of the graph, Γ_A^{-1}(y) is the subset of preimages of y which lie in A. The notation extends to sets in the natural way.
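On small examples the extractor guarantee can be verified exhaustively. The sketch below uses a toy neighbor map G(x, r) (not a real extractor construction) and checks, for one flat source A, that the induced output distribution is ε-close to uniform.

```python
def output_distribution(G, A, D, M):
    # Pick x uniformly from A and a seed r uniformly from [D];
    # tabulate the distribution of the neighbor G(x, r) over [M].
    dist = [0.0] * M
    for x in A:
        for r in range(D):
            dist[G(x, r)] += 1.0 / (len(A) * D)
    return dist

def close_to_uniform(G, A, D, M, eps):
    # The extractor guarantee for the flat source uniform on A:
    # statistical distance from uniform on [M] is at most eps.
    dist = output_distribution(G, A, D, M)
    sd = 0.5 * sum(abs(p - 1.0 / M) for p in dist)
    return sd <= eps

# Toy instance: n = 3 (N = 8), D = 4 seeds, m = 1 (M = 2).
# G xors one bit of the source with one bit of the seed.
G = lambda x, r: ((x >> (r % 3)) & 1) ^ (r & 1)
A = {0, 3, 5, 6}
print(close_to_uniform(G, A, 4, 2, eps=0.25))  # True
```

A real (ε, k) extractor must pass this test simultaneously for every distribution of min-entropy at least k; it is a standard observation that it suffices to check flat sources, i.e. uniform distributions over sets of size 2^k.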

2.3 Best known explicit constructions

The results we state are subject to improvement if better explicit extractor constructions are found. We have stated our results in general terms so that new results on extractors will be immediately applicable. The current best known explicit constructions for extractors are due to Ta-Shma and Zuckerman [Ta-96, Zuc96]. They are described by the theorems below.

Theorem 2.1 (Ta-Shma) There is an explicit construction which for every n, ε = ε(n), and every m = m(n) ≤ n yields an extractor with parameters (n, m, log^{O(1)}(n/ε), m, ε).

Theorem 2.2 (Zuckerman) There is an explicit construction which for every constant α, every n, δ = δ(n), and ε = ε(n) yields an extractor with parameters (n, δn, c_α log(n/ε), (1 - α)δn, ε), where c_α = c log(1/α)/α.

We use both these constructions in this paper in order to obtain concrete bounds. It is useful to compare these constructions to the current lower bound on extractors, due to Nisan and Zuckerman [NZ93].

Theorem 2.3 (Nisan-Zuckerman) There is a constant c such that for all n, m, k ≤ n - 1, and ε < 1/2, if there is an extractor with parameters (n, k, d, m, ε), then it must be the case that

d ≥ max{m - k, c log(n/ε)}.

This lower bound also gives a good indication of the limits of the techniques described in this paper. For more information on extractors we recommend the survey paper by Nisan [Nis96].

2.4 Kolmogorov and CD Complexity

Several variants of Kolmogorov complexity are used in this paper. We give the definitions below, following the notation put forth in [LV93].

The time-bounded Kolmogorov complexity of a string x relative to a string y, written C^t(x|y), is the length of the shortest program p which, on input y, prints out the string x in time bounded by t(|x| + |y|). When y is the empty string, we write C^t(x).

The time-bounded Kolmogorov distinguishing complexity of a string x relative to a string y, written CD^t(x|y), is the length of the shortest program p which, on input y, z, outputs 1 when z = x and 0 on all other strings. The time taken on all inputs y, z must be bounded by t(|p| + |y| + |z|). When y is the empty string, we write CD^t(x).

The nondeterministic time-bounded Kolmogorov distinguishing complexity of a string x relative to a string y, written CND^t(x|y), is the length of the shortest program p with the following behavior: if z = x, then there is a witness w of length bounded by t(|x|) such that p accepts on input y, z, w; if z ≠ x, then no witness can cause p to accept. The time taken on all inputs y, z, w must be bounded by t(|p| + |y| + |z| + |w|). When y is the empty string, we write CND^t(x).

For more information on Kolmogorov complexity we recommend the comprehensive book by Li and Vitányi [LV93].
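The gap between generating and distinguishing is easiest to see on a toy example. Both helpers below are hypothetical illustrations, not part of the formal definitions: a printing program must output x, while a distinguishing program only has to accept x and reject every other string.

```python
def printer(n):
    # A program that prints x = 0^n; only n needs to be hard-coded,
    # so its description length is O(log n).
    return "0" * n

def distinguisher(n):
    # A program p with p(z) = 1 iff z = 0^n.  A CND-style program would
    # additionally be allowed a nondeterministic witness w.
    return lambda z: 1 if z == "0" * n else 0

x = printer(5)
p = distinguisher(5)
print(x)                       # 00000
print(p("00000"), p("00001"))  # 1 0
```

For simple strings like 0^n the two notions coincide, but in the time-bounded setting a short distinguisher may exist even when no short generator is known, which is exactly the gap the CD and CND measures capture.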

3 Complexity Bounds on Easy Sets

Our theorems improve upon the results of Sipser [Sip83] and Buhrman and Fortnow [BF97] in the sense that our bounds are stronger and the strings we obtain have high mutual information. The price we pay for these improvements is that our bounds apply only to "most" strings, not to all strings as in [Sip83, BF97].

Theorem 3.1 (Sipser) For every set A ∈ P, there is a polynomial p and a constant c such that for every n, for most r of length p(n), and for every x ∈ A ∩ Σ^n,

CD^p(x|r) ≤ log(|A ∩ Σ^n|) + c log n.

Is the random string r necessary to prove Theorem 3.1? Buhrman and Fortnow show how to eliminate r at the cost of doubling the complexity.

Theorem 3.2 (Buhrman-Fortnow) For any set A ∈ P, there is a polynomial p and a constant c such that for all strings x ∈ A ∩ Σ^n,

CD^p(x) ≤ 2 log(|A ∩ Σ^n|) + c log n.

We extend the work of Buhrman and Fortnow [BF97] by nearly matching the bound of Sipser [Sip83] without the random string used by Sipser. However, our result only works for most strings in A. Using the extractor construction of Ta-Shma [Ta-96] we get the following theorem.

Theorem 3.3 For any set A ∈ P and ε = ε(n), there is a polynomial p such that for all n and for all but a 2ε fraction of the x ∈ A ∩ Σ^n,

CD^p(x) ≤ log|A ∩ Σ^n| + log^{O(1)}(n/ε).

Using a similar proof based on the Zuckerman [Zuc96] extractor construction, we can get a slightly different bound.

Theorem 3.4 For any set A ∈ P and ε = ε(n), there is a polynomial p such that for all n and for all but a 2ε fraction of the x ∈ A ∩ Σ^n,

CD^p(x) ≤ (1 + ε(n)) log|A ∩ Σ^n| + O(log(n/ε)).

We give a few extensions of this result. In one extension, we consider sets in NP and find short strings whose CD complexity is close to the nondeterministic CD complexity of the original string. We also give a randomized version of these theorems, stating that the shorter string can be chosen at random and that the probability of getting a short string which encodes as much information as the original string is bounded away from 1/2.

Buhrman and Fortnow also bound the nondeterministic CD complexity of strings in sets in time-bounded nondeterministic classes. We state their result here and use it in the proof of one of our theorems.

Theorem 3.5 (Buhrman-Fortnow) For any set A ∈ NTIME[t(n)], there is a polynomial p and a constant c such that for all strings x ∈ A ∩ Σ^n,

CND^{p·t}(x) ≤ 2 log(|A ∩ Σ^n|) + c log n.

3.1 Extracting CD complexity for sets in P

Theorem 3.3 follows immediately from the following result:

Theorem 3.6 Fix a set A in P, a polynomial q(n), and ε = ε(n). Then there is a polynomial p(n) such that for all n and for all but a 2ε fraction of the x ∈ A ∩ Σ^n, there is a y such that

1. |y| = log|A ∩ Σ^n|
2. C^p(y|x) ≤ log D + O(1)
3. CD^p(x) ≤ C^q(y) + 3 log D + 2 log n + O(1),

where the value D depends on the construction of explicit extractor graphs; currently the best known is D = 2^{log^{O(1)}(n/ε)}.


For the remainder of this section, we fix n and we let S = |A ∩ Σ^n|. We use an (ε, log S) extractor G for which M = S and d = log^{O(1)}(n/ε) [Ta-96]. In our setting, we think of the set A ∩ Σ^n as defining a distribution of min-entropy log S. The string x represents an element of A ∩ Σ^n and y is one of its neighbors in the graph G. Hence y has length log S; computing y from x requires knowing only a short ("random") string of length log D; and, as we will see, y together with some short additional distinguishing information will suffice to distinguish the string x (in the sense of CD complexity).

The following lemmas are at the heart of the probability arguments. They allow us to upper-bound the number of "bad" elements in A ∩ Σ^n, where "bad" means the strings x to which Theorem 3.6 will not apply. In order to get a short description for x, we need to find a string y in its range which has small indegree. In Lemma 3.1, we use the properties of the extractor to obtain an upper bound on the number of y which have large indegree, where "large" is parameterized by the value w_0. Lemma 3.2 gives an upper bound on the number of x on the left side of the extractor whose neighbors all lie within a small subset of the right side of the graph. These x are the "bad" x to which the theorem will not apply.

Lemma 3.1 Let w_0 be an indegree threshold with D < w_0 ≤ DS, and let Y be a subset of vertices on the right-hand side of the extractor graph. If w(y) > w_0 for all y ∈ Y, then

|Y| ≤ εDS / (w_0 - D).

Proof: This is just a simple counting argument. Let Y be the set of vertices whose indegree exceeds w_0. Because the graph is an extractor, it must be the case that

w(Y)/(DS) - |Y|/S ≤ ε.

Since w(Y) ≥ w_0|Y|, we get by simple rearrangement that |Y| ≤ εDS/(w_0 - D), as claimed. □
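The counting step in Lemma 3.1 can be replayed numerically. In the sketch below (a hypothetical edge list, with the deviation ε measured directly rather than guaranteed by an extractor), we tabulate indegrees, form the heavy set Y, and confirm the bound |Y| ≤ εDS/(w_0 - D).

```python
from collections import Counter

def lemma_3_1_bound_holds(edges, S, D, w0):
    # edges: list of (x, y) pairs, D edges per left vertex, S left vertices.
    indeg = Counter(y for _, y in edges)
    Y = {y for y, w in indeg.items() if w > w0}   # the heavy right vertices
    if not Y:
        return True
    w_Y = sum(indeg[y] for y in Y)
    # For an extractor, eps below is at most the extraction error.
    eps = w_Y / (D * S) - len(Y) / S
    # The lemma's rearrangement: w_Y >= w0*|Y| forces |Y| <= eps*D*S/(w0-D).
    return len(Y) <= eps * D * S / (w0 - D) + 1e-9

# S = 4 left vertices, D = 2 edges each, threshold w0 = 3 > D.
edges = [(0, 'a'), (0, 'a'), (1, 'a'), (1, 'b'),
         (2, 'a'), (2, 'b'), (3, 'b'), (3, 'b')]
print(lemma_3_1_bound_holds(edges, S=4, D=2, w0=3))  # True
```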

Lemma 3.2 If Y is a set on the right side of the graph containing at most (1 - ε)S vertices, then

|{x : Γ(x) ⊆ Y}| ≤ εS + |Y|.

Proof: Once again this is a simple counting argument. The fact that the graph is an extractor tells us that w(Y) ≤ εDS + D|Y|. Each x for which Γ(x) ⊆ Y contributes D edges to the indegree of Y, therefore the total number of such x cannot exceed εS + |Y|, as claimed. □

To conclude, we give the proof of Theorem 3.6.

Proof:

Let A be a set in P and let ε, n be given as in the statement of the theorem. By Lemma 3.1, applied with w_0 = 2D, and Lemma 3.2, with Y as in the hypothesis of Lemma 3.1, the subset B ⊆ A ∩ Σ^n such that for all x ∈ B and all y ∈ Γ(x), y has indegree at least w_0, has size at most 2εS. Therefore for all but 2εS of the x in A ∩ Σ^n, there is a y in its range whose indegree is at most 2D. For each such x, let r_x be the label of one of the edges in G which connects x to such a y. We need to verify three properties for each of these pairs x, y.

1. |y| = log S: This is simply by choice of the extractor G.

2. C(y|x) ≤ log D + O(1): y = G(x, r_x) for some r_x ∈ Σ^d, so the algorithm to print y will contain an encoding of r_x, and on input x it computes G(x, r_x) and outputs the result.

3. CD(x) ≤ C(y) + 3 log D + 2 log n + O(1): The program to recognize x will contain an encoding of an r_x and a y for which G(x, r_x) = y and the indegree of y is at most 2D. It must also contain a distinguishing program p which recognizes the pair x, r_x among the at most 2D edges originating in A that are adjacent to y. The length of p is bounded by 2 log(2D) + 2 log(n + log D), as given by Theorem 3.2. The algorithm follows. On input z:

(1) Check if z is in A. If not, output 0 and terminate.
(2) Check if G(z, r_x) = y. If not, output 0 and terminate.
(3) If p(z) = 1, then z = x. Output 1 and terminate. Otherwise output 0.

So the program needs an encoding of y, r_x, and the distinguishing program p, for a total length of C(y) + log D + 2 log D + 2 log(n + log D) + O(1). □

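The three-step recognizer constructed in the proof can be sketched directly. Every component below (the membership test for A, the neighbor function G, the edge label r_x, and the distinguishing program p) is a hypothetical stand-in for the corresponding object in the proof.

```python
def make_recognizer(in_A, G, r_x, y, p):
    # in_A: polynomial-time membership test for A (A is in P).
    # G:    neighbor function of the extractor, G(z, r) = r-th neighbor of z.
    # r_x:  hard-coded edge label with G(x, r_x) = y, y of indegree <= 2D.
    # p:    distinguishing program for the pair (x, r_x) among the at most
    #       2D edges from A into y.
    def accepts(z):
        if not in_A(z):         # step (1): reject strings outside A
            return 0
        if G(z, r_x) != y:      # step (2): reject z not mapped to y by r_x
            return 0
        return p(z)             # step (3): p singles out x among survivors
    return accepts

# Toy instantiation: A = even numbers below 16, G folds z down to 2 bits.
in_A = lambda z: z % 2 == 0 and 0 <= z < 16
G = lambda z, r: (z >> r) & 3
x, r_x = 6, 1
y = G(x, r_x)                        # y = 3
p = lambda z: 1 if z == x else 0
rec = make_recognizer(in_A, G, r_x, y, p)
print(rec(6), rec(7), rec(14))       # 1 0 0
```

The encoding cost matches the accounting in the proof: y (roughly C(y) bits), r_x (log D bits), and p (at most 2 log(2D) + 2 log(n + log D) bits by Theorem 3.2).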

3.2 Extracting CND complexity for sets in NP

Using a slight variant of the proof of Theorem 3.6, we can get the following result about CND complexity. It is stronger in that it applies to sets in NP, and although the bound on the complexity of x is for CND instead of CD complexity, the trade-off is that the bound is smaller by a term of log D and it involves only CD(y) instead of C(y).

Theorem 3.7 Fix a set A in NP, a polynomial q(n), and ε = ε(n). Then there is a polynomial p(n) such that for every n and for all but a 2ε fraction of the x ∈ A ∩ Σ^n, there is a y such that

1. |y| = log|A ∩ Σ^n|
2. C^p(y|x) ≤ log D + O(1)
3. CND^p(x) ≤ CD^q(y) + 2 log D + c log(n + log D) + O(1).

The proof is essentially the same as that of Theorem 3.6. To obtain property 3, we need only guess y and verify our guess using a distinguishing program for y whose length is bounded by CD^q(y). Likewise, we can simply guess r and omit its encoding, and use the distinguishing program p to verify our guess for r.

3.3 Randomly extracting CD complexity

Another trade-off we can make to save a log D term is to choose the counterpart y of a string x in a set in P at random. We only require that for most x, at least half of the edges from x map to a "good" y. Although this comes at the cost of only applying to "most" strings x, it improves upon the result of Sipser [Sip83] by reducing the length of the random string from n^{O(1)} to log^{O(1)}(n/ε). The proof is similar to that of Theorem 3.6; it requires only a slight modification to the counting argument.

Theorem 3.8 Fix a set A in P, a polynomial q(n), and ε = ε(n). Then there is a polynomial p(n) such that for every n, for all but a 4ε fraction of the x ∈ A ∩ Σ^n, and for at least half of the strings r of length d, there is a y such that

1. |y| = log|A ∩ Σ^n|
2. C^p(y|x, r) ≤ O(1)
3. CD^p(x|r) ≤ C^q(y) + 2 log D + 2 log(n + log D) + O(1).

4 Extracting random strings

In the previous section, we used the fact that the strings examined were in a small set of bounded complexity, and we showed the existence of strings for which the mutual information was roughly the CD complexity of the original string. Here we use extractor techniques to achieve a slightly different goal. We obtain an incompressible string whose length is close to the CD complexity of x and which can be computed from x using only log(n/ε) bits.

Theorem 4.1 Fix a polynomial q(n) and ε = ε(n), and let c > log(1/(1 - ε)) for large enough n. Then there exists a polynomial p(n) such that for any string x, there is a string y of length (1 - ε)(CND^p(x)/2 - O(log n)) such that:

1. C^p(y|x) ≤ log(n/ε)
2. CND^p(x) ≤ C^q(y) + O(1)
3. C^q(y) > |y| - O(1).

We can improve the bound on the length of y if we are willing to state the theorem for "most" x of any given CND complexity, instead of stating it for all x. More specifically, we can improve the coefficient of CND^p(x) from 1/2, as stated above, to 1 - ε. (The only change in the proof is to use a bound based on Zuckerman's extractor instead of Theorem 3.5.) Instead of giving the proof of Theorem 4.1, we prove the result in the following more general form, which may be improved as explicit extractor constructions are improved.

Theorem 4.2 Fix a polynomial q(n) and ε = ε(n), and let c > log(1/(1 - ε)) for large enough n. Then there exists a polynomial p(n) such that for any string x, there is a string y of length m such that:

1. C^p(y|x) ≤ d + O(1)
2. CND^{Dp}(x) ≤ C^q(y) + O(1)
3. C^q(y) > |y| - O(1),

provided there is an explicit extractor with parameters (n, k, d, m, ε) where k = (CND^{Dp}(x) - O(log n))/2.

Theorem 4.1 follows by applying Theorem 4.2 with parameters obtained from Zuckerman's extractor [Zuc96].

Proof: (Sketch) The idea of the proof is to consider a family of extractors parameterized by n, k, m(k), and to look at what happens in the extractor G_{n,k,m(k)} to strings x of length n. For fixed n, m, we let

A_{n,m} = {x : Γ(x) ⊆ C[q(n), m - c]},

where C[q(n), m - c] denotes the set of strings whose C^{q(n)} complexity is at most m - c. The fact that G is an extractor prohibits the set A_{n,m} from being large, as we now see. If |A_{n,m}| > 2^k, then by the properties of the extractor graph, the weight on C[q(n), m - c] induced by A_{n,m} must be close to uniform, namely:

w(C[q(n), m - c]) / (D|A_{n,m}|) ≤ |C[q(n), m - c]| / 2^m + ε.

Using the fact that w(C[q(n), m - c]) = D|A_{n,m}| by definition of A_{n,m}, and that |C[q(n), m - c]| ≤ 2^{m-c}, we get that

1 ≤ 2^{-c} + ε.

However, we have chosen c > log(1/(1 - ε)) precisely to eliminate this possibility. Hence we must conclude that |A_{n,m}| ≤ 2^k.

Now we may apply Theorem 3.5 to conclude that all x ∈ A_{n,m} must have small CND complexity. First notice that verifying membership in A_{n,m} is in NTIME[D · p] for some polynomial p, since it suffices to guess, for each neighbor y of x in G_{n,m}, a program of length at most m - c which prints out y. Hence, for every x ∈ A_{n,m},

CND^{Dp}(x) ≤ 2 log(|A_{n,m}|) + 2 log n + O(1).

Now consider x with respect to the extractor G_{n,k,m(k)}, where k = (CND^{Dp}(x) - 2 log n - O(1) - 1)/2 and m is maximal for this k. By the observation above, it must be the case that x ∉ A_{n,m}. Therefore there must be a y not in C[q(n), m - c] to which x is mapped under G_{n,k,m}. It is easy to verify that y satisfies the properties claimed in the statement of the theorem. □

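The role of the constant c in the proof above is pure arithmetic: the derived inequality 1 ≤ 2^{-c} + ε is impossible precisely when c > log_2(1/(1 - ε)). A quick check, with illustrative values only:

```python
import math

def contradiction_possible(c, eps):
    # The proof of Theorem 4.2 derives 1 <= 2^(-c) + eps from the
    # assumption |A_{n,m}| > 2^k; return whether that can hold.
    return 1 <= 2 ** (-c) + eps

eps = 0.25
c = math.log2(1 / (1 - eps)) + 0.01    # any c > log2(1/(1 - eps)) works
print(contradiction_possible(c, eps))    # False: the assumption is refuted
print(contradiction_possible(0.0, eps))  # True: too small a c proves nothing
```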


5 Extensions

It makes sense to state the results above in a yet more general form. Instead of requiring that the reference set A be in P, all results carry over to the setting where the instance complexity of all instances in A is small, adding the appropriate (small) term to the bounds we obtain. We refer the reader to the paper of Ko, Orponen, Schöning, and Watanabe [KOSW86] or to Section 7.3.3 of the textbook by Li and Vitányi [LV93] for more information on instance complexity. Note that this observation applies to the results of Buhrman and Fortnow [BF97] as well. Of particular interest are the sets in the class IC[log, poly], which is known to sit properly between the nonuniform classes P/log and P/poly.

6 Acknowledgments

We would like to thank Stuart Kurtz, Amber Settle, Harry Buhrman, and David Zuckerman for several helpful discussions.

References

[BF97] H. Buhrman and L. Fortnow. Resource-bounded Kolmogorov complexity revisited. In Proceedings of the 14th Symposium on Theoretical Aspects of Computer Science, Lecture Notes in Computer Science. Springer, Berlin, 1997. To appear.

[KOSW86] K. Ko, P. Orponen, U. Schöning, and O. Watanabe. What is a hard instance of a computational problem? In A. Selman, editor, Proc. Conference on Structure in Complexity Theory, pages 197-217. Springer-Verlag, 1986.

[LV93] M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Texts and Monographs in Computer Science. Springer, New York, 1993.

[Nis96] N. Nisan. Extracting randomness: How and why (a survey). In Proceedings of the 11th IEEE Conference on Computational Complexity, pages 44-58. IEEE, New York, 1996.

[NZ93] N. Nisan and D. Zuckerman. More deterministic simulation in logspace. In Proceedings of the 25th ACM Symposium on the Theory of Computing, pages 235-244. ACM, New York, 1993.

[Sip83] M. Sipser. A complexity theoretic approach to randomness. In Proceedings of the 15th ACM Symposium on the Theory of Computing, pages 330-335. ACM, New York, 1983.

[Ta-96] A. Ta-Shma. On extracting randomness from weak random sources (extended abstract). In Proceedings of the 28th ACM Symposium on the Theory of Computing, pages 276-285. ACM, New York, 1996.

[Zuc96] D. Zuckerman. Randomness-optimal sampling, extractors, and constructive leader election. In Proceedings of the 28th ACM Symposium on the Theory of Computing, pages 286-295. ACM, New York, 1996.