Some Remarks on the Generative Power of Collage Grammars and Chain-Code Grammars

Frank Drewes*

Fachbereich 3 - Mathematik und Informatik, Universität Bremen
Postfach 33 04 40, D-28334 Bremen
E-Mail: [email protected]

Abstract. Collage grammars and context-free chain-code grammars are compared with respect to their generative power. It is shown that the generated classes of line-drawing languages are incomparable, but that chain-code grammars can simulate collage grammars that use only similarity transformations.

1 Introduction

Inspired by the comparison of chain-code and collage grammars in [DHT96], this paper points out some further observations concerning the generative power of these two types of picture-generating grammars.

A context-free chain-code grammar [MRW82] is a type-2 Chomsky grammar generating a language of words over the alphabet $\{u, d, l, r, \uparrow, \downarrow\}$. Such a word is then interpreted as a sequence of instructions to a plotter-like device in order to produce a line drawing. The letters $u$, $d$, $l$, and $r$ are interpreted as instructions to draw a unit line from the current position of the pen upwards, downwards, to the left, and to the right, respectively. Furthermore, $\uparrow$ lifts the pen (so that subsequent drawing instructions only affect the position of the pen, rather than actually drawing a line) and $\downarrow$ sets the pen down again.

Collage grammars, as introduced in [HK91], are quite different, as they produce pictures by transforming any sort of basic geometric objects using affine transformations. In particular, they are not at all restricted to the generation of line drawings. However, collage grammars can of course generate line drawings in the sense of chain-code grammars, so it is natural to compare these two devices with respect to their capabilities in generating this sort of picture. Three results in this respect are presented in this paper.

(1) Linear collage grammars can generate languages of line drawings that cannot be generated by context-free chain-code grammars.
(2) Conversely, linear context-free chain-code grammars can generate languages which cannot be generated by collage grammars.¹ Thus, the two classes of languages are incomparable.
(3) In contrast to (1), every language of line drawings which can be generated by a collage grammar using only similarity transformations can also be generated by a context-free chain-code grammar.

The results mentioned in (1) and (2) are in fact slightly stronger because they remain valid if line drawings are required to be equal only up to translation (i.e., if one is only interested in the figures being generated, rather than in their exact positions).

This is an extended abstract. Except for some proof sketches, all proofs are omitted.

* Partially supported by the EC TMR Network GETGRATS (General Theory of Graph Transformation Systems) through the University of Bremen.
¹ This fact was already claimed in [DHT96], but a proof was missing until now.

2 Basic notions

It is assumed that the reader is familiar with the basic notions of affine geometry. The sets of natural numbers, integers, and real numbers are denoted by $\mathbb{N}$, $\mathbb{Z}$, and $\mathbb{R}$, respectively. $\mathbb{N}_+$ denotes $\mathbb{N} \setminus \{0\}$ and $[n]$ denotes $\{1, \ldots, n\}$ for $n \in \mathbb{N}$. The identical transformation on $\mathbb{R}^2$ is denoted by $id$. The cardinality of a set $S$ is denoted by $|S|$. As usual, the set of all finite words (or strings) over an alphabet $A$ is denoted by $A^*$ and $\lambda$ denotes the empty word.

A signature $\Sigma$ is a finite set whose elements are called symbols, such that for every $f \in \Sigma$ a natural number called its rank is specified. The fact that $f \in \Sigma$ has rank $n$ is indicated by writing $f^{(n)}$ instead of $f$. The set $T_\Sigma$ of terms over $\Sigma$ is defined as usual, i.e., it is the smallest set such that $f \in T_\Sigma$ for every $f^{(0)} \in \Sigma$ and $g[t_1, \ldots, t_n] \in T_\Sigma$ for every $g^{(n)} \in \Sigma$ ($n \geq 1$) and all $t_1, \ldots, t_n \in T_\Sigma$.

A regular tree grammar (cf. [GS97]) is a tuple $g = (N, \Sigma, P, S)$ such that $N$ is a finite set of nonterminals considered as symbols of rank 0, $\Sigma$ is a signature disjoint from $N$, $P$ is a set of term rewrite rules of the form $A \to t$ where $A \in N$ and $t \in T_{\Sigma \cup N}$, and $S \in N$ is the start symbol. The rules in $P$ are also called productions. The tree language generated by $g$ is given by $L(g) = \{t \in T_\Sigma \mid S \to_P^* t\}$, where $\to_P^*$ denotes the transitive and reflexive closure of the term rewrite relation $\to_P$ determined by $P$. A regular tree grammar or context-free Chomsky grammar is said to be linear if every right-hand side contains at most one nonterminal symbol.

3 Context-free chain-code picture languages

In this section the notion of context-free chain-code grammars [MRW82] is recalled (with the addition of the symbols $\uparrow$ and $\downarrow$, which appeared later in the literature). For every point $p = (x, y) \in \mathbb{Z}^2$, we denote by $u(p)$, $d(p)$, $l(p)$, and $r(p)$ the points $(x, y + 1)$, $(x, y - 1)$, $(x - 1, y)$, and $(x + 1, y)$, respectively. Furthermore, for every $a \in \{u, d, l, r\}$, $\mathit{aline}(p)$ denotes the subset of $\mathbb{R}^2$ given by the straight line segment between $p$ and $a(p)$. A line drawing is a finite set $D$ such that every $d \in D$ has the form $\mathit{uline}(p)$ or $\mathit{rline}(p)$ for some $p \in \mathbb{Z}^2$. The set of all line drawings is denoted by $\mathcal{D}$.

A picture description is a word over the alphabet $A_{cc} = \{u, d, l, r, \uparrow, \downarrow\}$. Every word $w \in A_{cc}^*$ determines a drawn picture $\mathit{dpic}(w) \in \mathcal{D} \times \mathbb{Z}^2 \times \{\uparrow, \downarrow\}$, as follows.

(i) $\mathit{dpic}(\lambda) = (\emptyset, (0, 0), \downarrow)$.
(ii) For every picture description $v \in A_{cc}^*$ with $\mathit{dpic}(v) = (D, p, s)$ and every $a \in A_{cc}$, if $a \in \{\uparrow, \downarrow\}$ then $\mathit{dpic}(va) = (D, p, a)$. Otherwise,

$$\mathit{dpic}(va) = \begin{cases} (D \cup \{\mathit{aline}(p)\},\; a(p),\; s) & \text{if } s = {\downarrow} \\ (D,\; a(p),\; s) & \text{if } s = {\uparrow}. \end{cases}$$

The line drawing $\mathit{drawing}(w)$ described by $w \in A_{cc}^*$ is the first component of $\mathit{dpic}(w)$, i.e., $\mathit{drawing}(w) = D$ if $\mathit{dpic}(w) = (D, p, s)$.

A (context-free) chain-code grammar is a context-free Chomsky grammar $g$ whose alphabet of terminal symbols is $A_{cc}$. The chain-code picture language generated by $g$ is the set $\mathcal{L}(g) = \{\mathit{drawing}(w) \mid w \in L(g)\}$.² The set of all chain-code picture languages generated by context-free chain-code grammars is denoted by CFCC. A language of line drawings which is generated by a linear chain-code grammar is called linear.
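The definition of $\mathit{dpic}$ can be read as a small interpreter. The following is a minimal Python sketch of it, not part of the original paper. It assumes an ASCII encoding in which `^` stands for $\uparrow$ and `v` for $\downarrow$, and it represents a unit line as a pair such as `("u", p)` for $\mathit{uline}(p)$; both choices are purely illustrative.

```python
# Illustrative sketch of dpic/drawing. Assumptions: '^' encodes pen-up,
# 'v' encodes pen-down, and ("u", p) / ("r", p) encode uline(p) / rline(p).
MOVES = {"u": (0, 1), "d": (0, -1), "l": (-1, 0), "r": (1, 0)}

def line(p, a):
    """Unit line between p and a(p), normalised to the form uline(q) or rline(q)."""
    dx, dy = MOVES[a]
    q = (p[0] + dx, p[1] + dy)
    lower = min(p, q)  # lexicographic minimum is the lower/left endpoint
    return ("u" if dx == 0 else "r", lower)

def dpic(word):
    """Return (drawing, pen position, pen status) for a picture description."""
    drawing, pos, pen = frozenset(), (0, 0), "v"  # dpic of the empty word
    for a in word:
        if a in ("^", "v"):
            pen = a  # only the pen status changes
        else:
            if pen == "v":
                drawing = drawing | {line(pos, a)}
            pos = (pos[0] + MOVES[a][0], pos[1] + MOVES[a][1])
    return drawing, pos, pen

def drawing(word):
    """First component of dpic(word)."""
    return dpic(word)[0]
```

For instance, `drawing("ud")` contains a single line, since drawing downwards along $\mathit{uline}((0,0))$ retraces the line that was just drawn.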

4 Collage grammars in $\mathbb{R}^2$

In this section the basic definitions concerning collage grammars are recalled. For technical convenience, we shall define collage grammars in the way introduced in [Dre96a, Dre96b] (which is also employed in [DKL97]) rather than using the original definitions from [HK91].

A collage is a finite set of parts, every part being a subset of $\mathbb{R}^2$. (Thus, in particular, line drawings are collages.) A collage signature is a signature $\Sigma$ consisting of collages (viewed as symbols of rank 0) and symbols of the form $\langle\langle f_1 \cdots f_k \rangle\rangle$ of rank $k \in \mathbb{N}_+$, where $f_1, \ldots, f_k$ are affine transformations on $\mathbb{R}^2$. A term $t \in T_\Sigma$ denotes a collage $\mathit{val}(t)$ which is determined as follows. If $t$ is a collage $C$ then $\mathit{val}(t) = C$. Otherwise, if $t = \langle\langle f_1 \cdots f_k \rangle\rangle[t_1, \ldots, t_k]$ then $\mathit{val}(t) = f_1(\mathit{val}(t_1)) \cup \cdots \cup f_k(\mathit{val}(t_k))$ (where the $f_i$ are canonically extended to collages).

A (context-free) collage grammar is a regular tree grammar $g = (N, \Sigma, P, S)$ such that $\Sigma$ is a collage signature. The collage language generated by $g$ is $\mathcal{L}(g) = \{\mathit{val}(t) \mid t \in L(g)\}$. The set of all languages generated by collage grammars is denoted by CFCL. A collage language generated by a linear collage grammar is called linear. Furthermore, if $\mathcal{S}$ is a set of affine transformations then $\mathrm{CFCL}_{\mathcal{S}}$ denotes the set of all collage languages $L$ such that $L = \mathcal{L}(g)$ for some collage grammar $g = (N, \Sigma, P, S)$, where $f_1, \ldots, f_k \in \mathcal{S}$ for all $\langle\langle f_1 \cdots f_k \rangle\rangle \in \Sigma$. Thus, $\mathrm{CFCL}_{\mathcal{S}}$ is the set of all collage languages that can be generated by collage grammars using only transformations in $\mathcal{S}$.

² For a Chomsky grammar $g$, $L(g)$ denotes the language generated by $g$.
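The evaluation of collage terms can be illustrated by a short Python sketch. It is only an illustration under stated assumptions: a part is represented by the set of its two endpoints (adequate for line segments, because affine maps send segments to segments), a collage is a `frozenset` of parts, and an operation symbol applied to subterms is encoded as a pair `(transformations, subterms)`. None of these encodings come from the paper.

```python
# Illustrative sketch of val(t). Assumption: a part is a frozenset holding the
# two endpoints of a line segment; a collage is a frozenset of such parts.
def uline(p):
    return frozenset({p, (p[0], p[1] + 1)})

def rline(p):
    return frozenset({p, (p[0] + 1, p[1])})

def identity(p):
    return p

def translation(r, s):
    """Grid translation by (r, s)."""
    return lambda p: (p[0] + r, p[1] + s)

def transform(f, collage):
    """Canonical extension of a point map f to collages."""
    return frozenset(frozenset(f(p) for p in part) for part in collage)

def val(t):
    """A term is either a collage or a pair (transformations, subterms)."""
    if isinstance(t, frozenset):
        return t
    fs, subterms = t
    result = frozenset()
    for f, sub in zip(fs, subterms):
        result |= transform(f, val(sub))
    return result

# Example term: <<id t_10>>[C, <<t_01>>[C]] with C = {rline(0, 0)}
C = frozenset({rline((0, 0))})
t = ((identity, translation(1, 0)), (C, ((translation(0, 1),), (C,))))
```

Here `val(t)` consists of $\mathit{rline}((0,0))$ and $\mathit{rline}((1,1))$: the inner copy of `C` is first shifted up by one unit and then right by one unit.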

5 Collage grammars vs. chain-code grammars

In this section collage grammars and chain-code grammars are compared with respect to their generative power. Clearly, as mentioned in [DHT96], collage grammars can generate languages which cannot be generated by chain-code grammars, simply because all languages in CFCC consist of line drawings whereas collages may contain arbitrary subsets of $\mathbb{R}^2$ as parts. But can chain-code grammars at least generate every language $L \in$ CFCL for which $L \subseteq \mathcal{D}$? It turns out that the answer is 'no' in this case, too. In fact, this negative result can be strengthened by considering line drawings as equivalent if they are equal up to translation. For this (and also for future use), let $GT$ be the set of all grid translations $\tau_{rs}$ ($r, s \in \mathbb{Z}$) in $\mathbb{R}^2$, where $\tau_{rs}(x, y) = (x + r, y + s)$ for all $(x, y) \in \mathbb{R}^2$. Now, for two line drawings $D$ and $D'$, let $D \equiv D'$ if (and only if) there is some $\tau \in GT$ such that $\tau(D) = D'$. Clearly, $\equiv$ is an equivalence relation. The equivalence class of $D \in \mathcal{D}$ is denoted by $[D]$. For languages $L, L' \subseteq \mathcal{D}$, we write $L \equiv L'$ if $\{[D] \mid D \in L\} = \{[D'] \mid D' \in L'\}$. Now, the first result can be formulated as follows.

Theorem 1. There is a linear language $L \in$ CFCL of line drawings such that there is no language $L' \in$ CFCC that satisfies $L \equiv L'$.

For the proof, consider the linear collage grammar

$$g = (\{S, A\}, \{\langle\langle id\ id \rangle\rangle, \langle\langle f \rangle\rangle, C_0, C_1\}, P, S),$$

where

$$P = \{S \to \langle\langle id\ id \rangle\rangle[C_0, A],\; A \to \langle\langle f \rangle\rangle[A],\; A \to C_1\},$$

$f$ maps $(x, y) \in \mathbb{R}^2$ to $(2x, y)$, $C_0 = \{\mathit{rline}(0, 0)\}$, and $C_1 = \{\mathit{uline}(1, 0)\}$. Obviously, $\mathcal{L}(g) = \{\{\mathit{rline}(0, 0), \mathit{uline}(2^n, 0)\} \mid n \in \mathbb{N}\}$. In order to show that $\mathcal{L}(g)$ cannot be generated by a chain-code grammar one can make use of an observation that may be interesting in its own right. As usual, call a set $Z \subseteq \mathbb{Z}^m$ linear if there are $z_0, \ldots, z_n \in \mathbb{Z}^m$ for some $n \in \mathbb{N}$, such that $Z = \{z_0 + \sum_{i \in [n]} a_i \cdot z_i \mid (a_1, \ldots, a_n) \in \mathbb{N}^n\}$, and say that $Z$ is semi-linear if it is a finite union of linear sets. Furthermore, for $L \subseteq \mathcal{D}$ let $\mathit{grid\text{-}points}(L)$ denote the set of all points $p \in \mathbb{Z}^2$ for which there is a line $r \in \bigcup L$ such that $p \in r$. Then the following can be shown.

Lemma 2. The set $\mathit{grid\text{-}points}(L)$ is semi-linear for every chain-code picture language $L \in$ CFCC.

Proof. Using Parikh's theorem [Par66] it is easy to construct a semi-linear set $N \subseteq \mathbb{N}^4$ such that $\mathit{grid\text{-}points}(L) = \{(v - w, x - y) \mid (v, w, x, y) \in N\}$. Consequently, $\mathit{grid\text{-}points}(L)$ itself is semi-linear, too. □

Now, choose $L = \mathcal{L}(g)$, where $g$ is as above. By a straightforward construction one can show that every chain-code grammar $g'$ satisfying $\mathcal{L}(g') \equiv L$ can be transformed into a chain-code grammar $g''$ such that $\mathcal{L}(g'') = L$. In other words, if there was a language $L'$ as in Theorem 1 it would follow that $L \in$ CFCC. This is impossible because

$$\mathit{grid\text{-}points}(L) = \{(0, 0), (1, 0)\} \cup \bigcup_{n \in \mathbb{N}} \{(2^n, 0), (2^n, 1)\}$$

is not semi-linear.

Now, consider the converse question: Can collage grammars generate all the languages in CFCC, or at least the linear ones? Again, the answer is 'no'.
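For the collage grammar used in the proof of Theorem 1, the effect of the horizontal scaling can be checked directly. The Python sketch below is an illustration only; it assumes that a line is represented by the set of its two endpoints and that the collage derived with $n$ applications of the scaling production is computed by a helper called `collage`, a name introduced here.

```python
# Illustrative sketch: the collages generated by the grammar of Theorem 1.
# Assumption: a unit line is represented by the frozenset of its two endpoints.
def uline(p):
    return frozenset({p, (p[0], p[1] + 1)})

def rline(p):
    return frozenset({p, (p[0] + 1, p[1])})

def f(p):
    """Horizontal scaling by the factor 2."""
    return (2 * p[0], p[1])

def collage(n):
    """Collage derived by applying the production A -> <<f>>[A] exactly n times."""
    part = uline((1, 0))  # C1
    for _ in range(n):
        part = frozenset(f(p) for p in part)
    return frozenset({rline((0, 0)), part})  # C0 together with the scaled C1

def grid_points(drawings):
    """Endpoints of all unit lines occurring in the given drawings."""
    return {p for d in drawings for part in d for p in part}

points = grid_points(collage(n) for n in range(4))
upper_xs = sorted(x for (x, y) in points if y == 1)
```

The scaled vertical line stays a unit line, and `upper_xs` lists the powers of two $1, 2, 4, 8$. The gaps between consecutive values grow without bound, which is the behaviour that a semi-linear set cannot show.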

Theorem 3. There is a linear language $L \in$ CFCC such that there is no language $L' \in$ CFCL that satisfies $L \equiv L'$.

In fact, the proof of the theorem reveals that this holds even if the chain-code grammars are not allowed to make use of the symbols $\uparrow$ and $\downarrow$. Consider the linear chain-code grammar

$$g = (\{S\}, A_{cc}, \{S \to ruSdr,\; S \to r\}, S),$$

which generates the set of all line drawings consisting of two "stairs" of equal height, as shown in Figure 1.

[Figure 1: a staircase ascending to the right followed by a staircase of equal height descending to the right]

Fig. 1. The type of line drawings generated by the chain-code grammar used in the proof of Theorem 3
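The words generated by this grammar can be enumerated and traced with a few lines of Python. The sketch is illustrative; the helper names `derive` and `trace` are introduced here, and since the grammar does not use the pen symbols the pen is assumed to stay down throughout.

```python
# Illustrative sketch: words of the grammar S -> ruSdr | r and their drawings.
MOVES = {"u": (0, 1), "d": (0, -1), "l": (-1, 0), "r": (1, 0)}

def derive(n):
    """Apply S -> ruSdr exactly n times, then S -> r."""
    w = "S"
    for _ in range(n):
        w = w.replace("S", "ruSdr")
    return w.replace("S", "r")

def trace(word):
    """Return the set of drawn unit lines and the final pen position."""
    pos, lines = (0, 0), set()
    for a in word:
        nxt = (pos[0] + MOVES[a][0], pos[1] + MOVES[a][1])
        lines.add(frozenset({pos, nxt}))  # a line is the set of its endpoints
        pos = nxt
    return lines, pos

lines, end = trace(derive(2))
height = max(y for segment in lines for (_, y) in segment)
```

For `derive(2)`, i.e. the word `rururdrdr`, the pen climbs two steps and descends two steps, ending at $(5, 0)$. In general the two bottom steps are $2n$ units apart horizontally, which is the distance exploited in the proof of Theorem 3.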

Denote $\mathcal{L}(g)$ by $L$. In order to show that this language proves Theorem 3 one needs a suitable criterion for context-freeness of collage languages, i.e., a criterion that allows one to show that the class CFCL does not contain any language $L'$ satisfying $L' \equiv L$. Criteria of this kind have been established in [DKL97]. Unfortunately, for the present aim these criteria do not suffice. It turns out, however, that one of the results in [DKL97], namely Theorem 1, can be generalized in a nice way. For this, view a pair $r = (L, R)$ of collages as a replacement rule in the following way. If, for a collage $C$, there is an affine transformation $f$ such that $f(L) \subseteq C$ then $C \Rightarrow_r C'$ where $C' = (C \setminus f(L)) \cup f(R)$. As usual, if $S$ is a set of such rules let $C \Rightarrow_S C'$ if $C \Rightarrow_r C'$ for some $r \in S$. Then the following holds.

Lemma 4. For every collage language $L' \in$ CFCL there is a constant $n_0$ and a finite set $S$ of pairs of collages such that the following holds. For every collage $C_0 \in L'$ there are collages $C_1, \ldots, C_n \in L'$ (for some $n \in \mathbb{N}$) such that $C_0 \Rightarrow_S C_1 \Rightarrow_S \cdots \Rightarrow_S C_n$ and $|C_n| \leq n_0$.

The lemma can be proved using the pumping lemma for regular tree languages, the proof being an easy reformulation of the proof of [DKL97, Theorem 1].
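A single replacement step $C \Rightarrow_r C'$ with $r = (L, R)$ can be sketched in Python as follows. This is an illustration under assumptions: parts are sets of segment endpoints, the affine transformation $f$ is passed in explicitly rather than searched for, and the function name `apply_rule` is introduced here.

```python
# Illustrative sketch of one replacement step C =>_r C' for a rule r = (L, R).
# Assumption: a part is the frozenset of the endpoints of a line segment.
def uline(p):
    return frozenset({p, (p[0], p[1] + 1)})

def rline(p):
    return frozenset({p, (p[0] + 1, p[1])})

def transform(f, collage):
    return frozenset(frozenset(f(p) for p in part) for part in collage)

def apply_rule(C, rule, f):
    """Return (C minus f(L)) united with f(R), or None if f(L) is not contained in C."""
    L, R = rule
    fL, fR = transform(f, L), transform(f, R)
    if not fL <= C:
        return None  # the rule is not applicable under this transformation
    return (C - fL) | fR

# Hypothetical rule: replace a horizontal line by the same line plus a step upwards.
rule = (frozenset({rline((0, 0))}), frozenset({rline((0, 0)), uline((1, 0))}))
C = frozenset({rline((2, 3))})
shift = lambda p: (p[0] + 2, p[1] + 3)
C_next = apply_rule(C, rule, shift)
```

Each rule touches only the parts in $f(L)$, so a finite rule set can only modify parts that lie in a bounded configuration relative to one another (up to the chosen transformation).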

Obviously, for a language $L'$ with $L' \equiv L$ one cannot find $n_0$ and $S$ as required in the lemma. This is because every line drawing in $L'$ is determined by the positions of the two bottom steps of the stairs. Thus, the only way to alter a sufficiently large element by a rule in $S$ without leaving $L'$ would be to add or remove steps at the bottom of the two stairs simultaneously. Since there is an arbitrarily large distance between these steps this cannot be accomplished with a finite set of rules, which proves Theorem 3.

Theorems 1 and 3 reveal that the classes CFCC and CFCL are incomparable even if the latter is restricted to line-drawing languages and equality of line drawings is required only up to translation. The collage grammar used to prove Theorem 1 makes use of a non-uniform scaling, however. Intuitively, its horizontal scaling causes, in effect, an exponential translation of vertical lines. The point here is that a horizontal scaling of a unit vertical line results in a unit vertical line again, because lines are one-dimensional objects. It is thus natural to wonder what happens if the collage grammars are only allowed to make use of similarity transformations. The answer is given by the following theorem.

Theorem 5. Let $L \in \mathrm{CFCL}_{SIM}$ be a language of line drawings, where $SIM$ is the set of all similarity transformations on $\mathbb{R}^2$. Then it holds that $L \in$ CFCC.

The proof mainly consists of two steps. Intuitively, since all the parts in a collage of the language $L$ are required to be unit lines, a collage grammar cannot make significant use of uniform scalings, rotations, and translations other than those in $GT$, because all the information about such transformations would have to be remembered in the (finitely many) nonterminals in order to avoid producing "wrong" parts. This is why the following can be proved in a rather straightforward way.

Lemma 6. Let $L \in \mathrm{CFCL}_{SIM}$ be a set of line drawings. Then it holds that $L \in \mathrm{CFCL}_{GT}$.

Using Lemma 6, Theorem 5 can easily be verified.
Let $L \in \mathrm{CFCL}_{GT}$ be a language of line drawings, where $g = (N, \Sigma, P, S)$ is the corresponding collage grammar (i.e., $g$ is assumed to use only grid translations). Then it may be assumed without loss of generality that $P$ contains only productions of the form $A \to \langle\langle \tau_{r_1 s_1} \cdots \tau_{r_n s_n} \rangle\rangle[A_1, \ldots, A_n]$, where $A_1, \ldots, A_n \in N$, and $A \to \{h\}$, where $h = \mathit{uline}(0, 0)$ or $h = \mathit{rline}(0, 0)$. Now, turn $g$ into a chain-code grammar $g' = (N \cup \{S_0\}, A_{cc}, P', S_0)$, where $S_0$ is a new nonterminal, as follows.

- For every production $A \to \langle\langle \tau_{r_1 s_1} \cdots \tau_{r_n s_n} \rangle\rangle[A_1, \ldots, A_n]$ in $P$, choose any words $v_i, w_i \in \{u, d, l, r\}^*$ such that $\mathit{dpic}({\uparrow}v_i) = (\emptyset, (r_i, s_i), \uparrow)$ and $\mathit{dpic}({\uparrow}w_i) = (\emptyset, (-r_i, -s_i), \uparrow)$. Then $P'$ contains the production $A \to v_1 A_1 w_1 v_2 A_2 w_2 \cdots v_n A_n w_n$.
- For every production $A \to \{h\}$ in $P$, $P'$ contains the production $A \to {\downarrow}ud{\uparrow}$ if $h = \mathit{uline}(0, 0)$ and $A \to {\downarrow}rl{\uparrow}$ if $h = \mathit{rline}(0, 0)$.
- In addition, $P'$ contains the production $S_0 \to {\uparrow}S$.
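The construction of $P'$ can be sketched in Python. The sketch is not the paper's formal construction: the dictionary encoding of productions, the ASCII symbols `^` (pen up) and `v` (pen down), the particular choice of the words $v_i, w_i$ made by `move`, and all function names are assumptions made for the sake of illustration.

```python
# Illustrative sketch of the translation of a grid-translation collage grammar
# into a chain-code grammar. Assumptions: '^' = pen up, 'v' = pen down; a
# right-hand side is "uline", "rline", or a list of ((r, s), nonterminal).
MOVES = {"u": (0, 1), "d": (0, -1), "l": (-1, 0), "r": (1, 0)}

def move(r, s):
    """One possible word over {u, d, l, r} shifting the pen by (r, s)."""
    return ("r" * r if r >= 0 else "l" * -r) + ("u" * s if s >= 0 else "d" * -s)

def convert(productions, start):
    cc = {"S0": [["^", start]]}  # S0 -> ^S
    for A, rhss in productions.items():
        cc[A] = []
        for rhs in rhss:
            if rhs == "uline":
                cc[A].append(list("vud^"))
            elif rhs == "rline":
                cc[A].append(list("vrl^"))
            else:  # A -> v1 A1 w1 ... vn An wn
                out = []
                for (r, s), B in rhs:
                    out += list(move(r, s)) + [B] + list(move(-r, -s))
                cc[A].append(out)
    return cc

def expand(cc, symbol):
    """Derive a word using the first production of every nonterminal
    (only suitable for non-recursive example grammars)."""
    if symbol not in cc:
        return symbol
    return "".join(expand(cc, x) for x in cc[symbol][0])

def drawing(word):
    lines, pos, pen = set(), (0, 0), "v"
    for a in word:
        if a in ("^", "v"):
            pen = a
            continue
        nxt = (pos[0] + MOVES[a][0], pos[1] + MOVES[a][1])
        if pen == "v":
            lines.add(("u" if a in "ud" else "r", min(pos, nxt)))
        pos = nxt
    return lines

# Hypothetical example: A -> <<t_00 t_21>>[U, R], U -> {uline(0,0)}, R -> {rline(0,0)}
P = {"A": [[((0, 0), "U"), ((2, 1), "R")]], "U": ["uline"], "R": ["rline"]}
word = expand(convert(P, "A"), "S0")
```

The derived word is `^vud^rruvrl^lld`. Every nonterminal is entered and left with the pen up at its own origin, so the drawing of the word consists of $\mathit{uline}((0,0))$ and $\mathit{rline}((2,1))$, exactly the collage denoted by the corresponding term.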

It should be clear that $\mathcal{L}(g') = \mathcal{L}(g)$, which proves Theorem 5.

The construction of productions above can in fact be simplified if $g$ is a linear collage grammar. In this case it can be assumed that all productions have the form $A \to D$ or $A \to \langle\langle id\ \tau_{r_1 s_1} \rangle\rangle[D, A_1]$, where $D$ is a line drawing and $A_1$ a nonterminal. Thus, the corresponding productions of a chain-code grammar would be $A \to {\downarrow}v_0$ and $A \to {\downarrow}v_0 v_1 A_1$, where $\mathit{dpic}(v_0) = (D, (0, 0), \uparrow)$ and $v_1$ is as in the construction above. In particular, no $w_1$ to the right of $A_1$ is needed. Thus, the resulting grammar is regular. Together with the fact that every regular chain-code grammar can easily be transformed into an equivalent collage grammar that uses only grid translations, the following corollary is obtained, which is a slight extension of Theorems 10 and 13 in [DHT96].

Corollary 7. For every set $L$ of line drawings the following are equivalent.
(i) $L$ can be generated by a regular chain-code grammar;
(ii) $L$ can be generated by a linear collage grammar using only similarity transformations;
(iii) $L$ can be generated by a linear collage grammar using only grid translations.

6 Conclusion

In the previous section it was shown that collage grammars and context-free chain-code grammars yield incomparable classes of line drawings, but that the former can be simulated by the latter if only similarity transformations are used (and, of course, the generated language is a language of line drawings). Despite this second result one may say that incomparability is the main characteristic of the relation between the two devices. As Theorems 1 and 3 show, in either case not even the assumption of linearity makes a simulation possible. Intuitively, the reason for this is that the two notions of grammar are based on quite different concepts. While collage grammars employ a completely local generation mechanism, where the generation of a part has no effect on the rest of the generated collage, the main principle of chain-code grammars is the concatenation of line drawings. Thus, the latter can insert new lines by shifting already generated parts to the side, as was done in the example used to prove Theorem 3. On the other hand, chain-code grammars do not provide any means of scaling, rotation, shearing, etc., which are essential in the definition of collage grammars.

One may argue that the point of view taken in this paper is somewhat unfair to collage grammars. Most of their nice capabilities cannot be used if they are restricted to the generation of pictures consisting of unit lines. In fact, as a matter of experience (but certainly, to some extent, also as a matter of taste) collage grammars often turn out to be more appropriate and manageable than chain-code grammars because of their flexibility and their strictly local behaviour. Nevertheless, one should also notice that there are quite natural picture languages (like the one used to prove Theorem 3) that can be generated by chain-code grammars but not by collage grammars.

References

[DHT96] Jürgen Dassow, Annegret Habel, and Stefan Taubenberger. Chain-code pictures and collages generated by hyperedge replacement. In H. Ehrig, H.-J. Kreowski, and G. Rozenberg, editors, Graph Grammars and Their Application to Computer Science, number 1073 in Lecture Notes in Computer Science, pages 412-427, 1996.
[DKL97] Frank Drewes, Hans-Jörg Kreowski, and Denis Lapoire. Criteria to disprove context-freeness of collage languages. In B.S. Chlebus and L. Czaja, editors, Proc. Fundamentals of Computation Theory XI, volume 1279 of Lecture Notes in Computer Science, pages 169-178, 1997.
[Dre96a] Frank Drewes. Computation by tree transductions. Doctoral dissertation, University of Bremen, Germany, 1996.
[Dre96b] Frank Drewes. Language theoretic and algorithmic properties of d-dimensional collages and patterns in a grid. Journal of Computer and System Sciences, 53:33-60, 1996.
[GS97] Ferenc Gécseg and Magnus Steinby. Tree languages. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages. Vol. III: Beyond Words, chapter 1, pages 1-68. Springer, 1997.
[HK91] Annegret Habel and Hans-Jörg Kreowski. Collage grammars. In H. Ehrig, H.-J. Kreowski, and G. Rozenberg, editors, Proc. Fourth Intl. Workshop on Graph Grammars and Their Application to Comp. Sci., volume 532 of Lecture Notes in Computer Science, pages 411-429. Springer, 1991.
[MRW82] Hermann A. Maurer, Grzegorz Rozenberg, and Emo Welzl. Using string languages to describe picture languages. Information and Control, 54:155-185, 1982.
[Par66] Rohit J. Parikh. On context-free languages. Journal of the Association for Computing Machinery, 13:570-581, 1966.