Inheritance with Overriding Without Non-monotonic Reasoning in Datalog++ ∗

Hasan M. Jamil
Department of Computing, School of MPCE, Macquarie University
Sydney, NSW 2109, Australia
e-mail: [email protected]

Abstract

We present a query language, called Datalog++, for deductive object-oriented databases. While a direct semantics for inheritance with overriding and encapsulation is desirable, we propose an alternative method for capturing these two features in Datalog++ based on program transformation and compilation. A reduction technique from Datalog++ programs to Datalogneg is discussed. The elegance of the proposed reduction technique is that the transformed programs use purely deductive means to capture these two important features in Datalogneg without the need for stratification or non-monotonic reasoning. The strength of the reduction technique rests on the so-called i-completion and context resolution schemes, which handle overriding and encapsulation respectively. We also outline a prototype implementation of Datalog++ on the CORAL deductive database system. Unlike most others, our implementation does not require meta-interpretation and consequently readily exploits the rich set of optimization techniques available in CORAL.

1 Introduction

In logic programming and in artificial intelligence, it has been a tradition to assign default values to a set of objects through inheritance. For example, the following logic program says that every player is tall (6 feet and above, say), and that every basket ball player is a player, i.e., a basket ball player is a special player. Thus the set of basket ball players is a subset of the set of players.

r1 : tall(X) ← player(X).
r2 : player(X) ← basket_ball_player(X).
r3 : player(john).
r4 : basket_ball_player(max).

It is easy to see that if we pose the following queries, we will be able to prove from the above program that max and john are tall.

r5 : ? tall(max).
r6 : ? tall(john).

∗ Research supported in part by the Macquarie University Research Grant and the Macquarie University, Department of Computing Travel Grant.
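As a minimal sketch of how a bottom-up evaluator answers queries r5 and r6 from rules r1 to r4, the following Python fragment applies the two rules to the two facts until no new fact appears. The tuple encoding of atoms and the helper name `derive` are assumptions made for illustration; they are not part of the language described in this paper.

```python
# Facts r3 and r4, encoded as (predicate, constant) pairs (an assumed encoding).
FACTS = {("player", "john"), ("basket_ball_player", "max")}

# Rules r1 and r2, encoded as (head predicate, body predicate).
RULES = [("tall", "player"), ("player", "basket_ball_player")]


def derive(facts, rules):
    """Naive bottom-up evaluation: apply every rule until a fixpoint is reached."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for pred, const in list(known):
                if pred == body and (head, const) not in known:
                    known.add((head, const))
                    changed = True
    return known


model = derive(FACTS, RULES)
# Queries r5 and r6: both atoms are in the derived model.
print(("tall", "max") in model, ("tall", "john") in model)
```

Here max is derived to be tall only through rule r2, which is the sense in which implication is made to carry inheritance.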

The idea captured in the above program is as follows. Every player is normally tall. Every basket ball player is a player. Hence they are tall too. john, being a player, and max, being a basket ball player (and thus a special player), are also tall. In other words, basket ball players inherit the property tall by virtue of each being a player. Notice that rule r2 defines the membership of basket ball players in the player class through logical implication (←) and thus helps them inherit the property tall. This method of inheriting default values works well so long as we do not have to define any exception. Exceptions, however, are traditionally handled using logical negation (¬). For example, consider a set of oriental players who are not tall (less than 6 feet, say). This scenario may be captured by the following set of rules.

r7 : ¬tall(X) ← oriental_player(X).
r8 : oriental_player(cheng).

Rule r7 says that oriental players are not tall. Rule r8 (in conjunction with rule r7) asserts that cheng, being an oriental player, is not tall. Now, if we assume (as shown below in rule r9) that every oriental player is also a player, then we have a serious problem. Rule r9 in conjunction with rules r1 and r7 defines an exception to the default assumption that every player is tall. Specifically, it says that although all players are tall, oriental players are not tall even though they are players, i.e., they are an exception. Now, using a top-down evaluator we can prove not only tall(cheng) but also ¬tall(cheng), leading to an inconsistency. The same remark applies if we use a bottom-up evaluator. In short, the problem is that the evaluators fail to recognize the exception defined at a more specialized class of objects, i.e., a subclass (oriental player) of the class (player) where the default has been defined, and hence that the default should not be inherited. This led to the observation that defining is-a specificity through logical implication, and overriding through logical negation alone, has serious consistency-related consequences.

r9 : player(X) ← oriental_player(X).

The matter becomes worse if we allow multiple inheritance by adding the following rule (rule r10). The rule says that cheng is also a basket ball player in addition to being an oriental player. In this case, even if we can force the traditional evaluators to recognize the exception defined in the oriental player subclass, cheng will now also inherit the property tall from the basket ball player class in addition to inheriting not tall from the oriental player subclass, which renders the program inconsistent.

r10 : basket_ball_player(cheng).

These observations led researchers to develop techniques that make the evaluators recognize rule priority, evaluation hierarchy, etc. (e.g., [23]) through various extensions to the language and to the evaluators.
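The inconsistency can be reproduced with a small sketch. It encodes the negative conclusion ¬tall as a separate predicate `not_tall`, which is an assumption of this illustration (negation in rule heads is not modeled directly), and it uses a hypothetical helper `derive` for naive bottom-up evaluation.

```python
# Facts r3, r4, r8 and r10 as (predicate, constant) pairs.
FACTS = {
    ("player", "john"),
    ("basket_ball_player", "max"),
    ("oriental_player", "cheng"),
    ("basket_ball_player", "cheng"),
}

# Rules r1, r2, r7 and r9 as (head, body); "not_tall" stands in for ¬tall.
RULES = [
    ("tall", "player"),
    ("player", "basket_ball_player"),
    ("not_tall", "oriental_player"),
    ("player", "oriental_player"),
]


def derive(facts, rules):
    """Naive bottom-up evaluation: apply every rule until a fixpoint is reached."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for pred, const in list(known):
                if pred == body and (head, const) not in known:
                    known.add((head, const))
                    changed = True
    return known


model = derive(FACTS, RULES)
# Objects for which both tall and its negation are derivable.
conflicts = sorted(c for (p, c) in model if p == "tall" and ("not_tall", c) in model)
print(conflicts)  # prints ['cheng']
```

Removing r10 does not help: cheng is still derived to be a player through r9, so both conclusions remain.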
These methods, in one way or another, involve handling negation and thus non-monotonic reasoning. Besides the facts that the computational complexity of these methods is too high and that semantics are occasionally assigned to programs non-deterministically, they still do not provide an intuitive framework for data modeling and knowledge representation. By and large, these techniques could not make inheritance with overriding a language feature. Rather, they leave the modeling of overriding as a programming technique to be implemented by application programmers if required. The above knowledge, however, can easily be captured in imperative languages such as C++ [22]. In such a language, overriding is based on a value being redefined at a more specialized class, i.e., value based overriding. For example, the class player defines a variable height and assigns it the default value tall, whereas oriental player, as a subclass of the player class, overrides the value tall by defining a new value short for the height attribute or variable. Thus the value tall will not be inherited in the oriental player subclass, while it will be in the basket ball player subclass. While imperative languages address this issue quite elegantly, no straightforward mapping of these techniques has been developed for logic based languages so far.

Recently, there has been intense research activity aimed at developing a deductive object-oriented language by taking an object-oriented view and actually incorporating features peculiar to this paradigm. In the last few years, several proposals have also addressed the issue of a direct semantics for logic based object-oriented languages, for example [6, 13, 14]. These proposals attempt to provide a higher level abstraction for object-oriented features such as objects, classes, subclasses, methods, inheritance, overriding, encapsulation, etc. They also recognize that implication and negation are not the answer to the problems of subclass specification (and thus of handling inheritance) and overriding respectively. Hence these languages adopt new language constructs other than implication to specify the subclass relationship. It was, however, readily observed that giving a direct semantics to a language capable of modeling the desired object-oriented features is a daunting task. While a few experimental systems have started to emerge, they are far less likely to be used commercially than their deductive database counterparts. This is simply because these systems either lack a formal foundation, are too complex to be used as modeling tools, or are plainly inadequate for useful applications.

We consider an extension of Datalog, called Datalog++, in the direction of the so-called object-relational models. The goal here is to develop a language that has most of the desired object-oriented features well within the setting of an existing system, is adequate to use as a modeling tool, and at the same time has a formal foundation.
Hence, we try to be practical: instead of trying to develop a direct semantics for our language, we develop a rewriting based semantics. That is, to give a semantics to a program in our language we rewrite every Datalog++ program into another language, Datalogneg to be exact. Similar approaches have been studied by Abiteboul et al. [1], Dalal and Gangopadhyay [8], Naish [3], Lawley [16], etc. The advantage of such an approach is that inheritance with overriding, and also encapsulation, can be modeled in a practical manner¹. However, the lack of a complete logical semantics for both of these features, inheritance with overriding and encapsulation, necessitates a meta-logical treatment of them and thus motivates the rewriting based approach to object modeling presented in this paper. We introduce a few less familiar concepts such as value and code inheritance, locality and inheritability of clauses, accessibility of methods, etc. These concepts allow us to develop a reduction technique from Datalog++ programs to Datalogneg and thus help explain the meta-logical features of our language in a logical way. Furthermore, the reduction technique helps us model inheritance and encapsulation in purely deductive ways without having to deal with non-monotonic reasoning.

We first describe an object-oriented data model in section 2 and explain the meanings of the features we incorporate in our language. We then introduce the Datalog++ language in section 3 using an example, along with its intended semantics on intuitive grounds only. For lack of space, we discuss neither the formal syntax of Datalog++ nor the theoretical foundations of the reduction technique used in section 3 to illustrate the salient features of Datalog++. We, however, refer interested readers to [11], where we present the formal syntax and semantics of the language and give a formal treatment of the reduction technique by giving a translation algorithm that reduces every Datalog++ program to Datalogneg. We then proceed to discuss the principle by which non-monotonic reasoning is avoided in Datalog++ in section 5, refer to a Unix based implementation of Datalog++ on the CORAL deductive database system in section 6, and finally give our conclusion in section 7.

¹ While encapsulation can be incorporated by developing suitable machinery and semantics, as we demonstrate in this paper, unfortunately none of the works known to us has addressed this important issue so far.


2 The Data Model

It is perhaps beneficial to fix, ahead of time, the data model on which a language is to be developed. This helps keep the view focused and also provides a justification for the features and constructs incorporated in the new language. Proposals that do not do so leave themselves open to interpretation and thus make it hard to verify the effectiveness of the language and its functionalities. The data model presented here is adapted from the OR model [12] and the SDM [19], and adequately explains the features we capture in Datalog++. In this simple model, we have objects, a partial order on objects that captures the is-a hierarchy, and relationships among the objects organized in class hierarchies.

2.1 Objects

An object o is an abstraction of a real life entity and is represented by the expression o = ⟨ι, µ, σ⟩, where ι is an immutable identifier unique to o, µ is a set of methods (attributes, and operations defined on the attributes and methods of objects)², and σ is a set of signatures, one for every method definition in o, such that |σ| ≥ |µ|. Objects are of two types: class objects and instance objects. Instances belong to a single class and inherit³ all their signatures from the class to which they belong. An instance object definition may nevertheless include method definitions. In other words, in an instance object, |σ| = 0 and |µ| ≥ 0. A class object o_c defines signatures and methods for the class itself and for its instances. Hence it defines class and instance signatures, and likewise class and instance methods. Instances of o_c will inherit only the signatures and methods defined for instances, while any subclass object o_s of o_c will inherit all of them. We insist that method definitions in objects be well-typed. By well-typed we mean that for every method definition at any given object, there must be a corresponding signature definition, either locally or by inheritance. While an object may have a method definition and make it well-typed by inheriting the signature from one of its superclasses, the converse, having a local signature and inheriting the method definition, is not allowed since it violates the structural integrity of objects.

2.2 Encapsulation

A class also defines an interface through which it makes some of its methods visible and accessible to other objects by declaring a set of method signatures, σ′ ⊆ σ, as public. This is known as encapsulation. The set σ \ σ′ is regarded as private and is accessible only by the objects themselves and their class objects. This implies that an object responds to a message only if (i) the method is public, (ii) the object is an instance of the class object that sent the message, or (iii) the object is trying to access its own method. This definition of objects coincides for the most part with the definition of abstract objects in abstract data types, and hence objects in our model are called abstract objects.

2.3 Class Hierarchy

In this model objects are grouped into classes, and classes are organized in a specialization-generalization (SG) hierarchy in the sense of the extended ER model. Thus classes have objects as their instances, where the instances may have specialized methods. The SG (or is-a) hierarchy naturally induces a superclass/subclass association among the classes. This leads to specialization, where the classes lower in the SG hierarchy inherit the properties (signatures and methods) of those higher up in an overriding fashion. A class object o_s lower in the hierarchy than a class object o_c is called a subclass of o_c, o_c is called a superclass of o_s, and we say that the ordering relation o_s < o_c holds. The objective of organizing objects in a hierarchy of classes is to share properties of the objects in useful, economical and meaningful ways via inheritance based on the specificity dominance principle. This means that a property is inherited in a subclass from a superclass (similarly, in an instance from a class) only if a local definition⁴ is not available. This implies that a subclass will inherit the most specific definition of a property in a hierarchy. Our approach to definition based overriding is shared by most classical object-oriented models including O2 [17], Orion [15], Gemstone [18], C++ [22], Smalltalk [5], etc.

However, in the case of multiple inheritance, a subclass may be required to inherit a property from more than one superclass, and an inconsistency may result due to differences in the implementation of the method or in the signature. This is known as inheritance conflict or ambiguity. Our model resolves this problem by requiring that the subclass inherit neither unless a single preference is specified. The logic behind this resolution is that a subclass should inherit properties from a unique source, and in the event of a conflict should reject all of them to avoid a non-deterministic interpretation of the hierarchy. Finally, a subclass object inherits all the class properties from any of its superclasses in an overriding fashion. In a similar way, any instance of a subclass will inherit all the instance properties from any of the superclasses of the subclass, again in an overriding way. This is in the spirit of SDM and has important modeling applications (see Example 3.1).

² Note that we do not differentiate between state variables and operations for technical reasons. Instead we uniformly view both as methods, the reason for which will become clear when we present the syntax and semantics of the Datalog++ language.
³ Discussed in detail later in section 2.3.

2.4 Relationships

Unlike many other models, we allow relationships in the sense of the ER model. This allows us to capture symmetric associations among objects in a way that is effective, efficient and intuitive, by avoiding the method based inter-object association called inversion [19]. Note that inversion is the sole basis for establishing associations among objects in most object-oriented data models. Research shows that implementing constraints and maintaining inversion costs too much, especially when private methods are involved in the inversion [2, 21].

3 The Datalog++ Language Overview

We now introduce the informal syntax and semantics of the Datalog++ language with an overview of its salient features. The goal here is to develop a logic based language to represent and query deductive object-oriented databases. Datalog++ extends Datalog syntax to capture concepts such as classes, objects, signatures, is-a, methods, etc., sometimes in a meta-logical way. While the extension in Datalog++ is syntactic, the semantic interpretation of every Datalog++ program is still given in Datalogneg, a variant of Datalog that incorporates negation⁵ and hence has a first-order interpretation. Since some of the Datalog++ features lack a logical interpretation (being meta-logical in nature), the semantics relies on a translation function to Datalogneg such that every translated feature is given a relational characterization. In this way, it becomes possible to capture most of the non-standard object-oriented notions in a meta-logical way and yet simultaneously give a logical interpretation to these features. We will show that not only is our semantics richer, stronger and more intuitive than that of Datalogmeth [1], OOLP+ [8], the logic language in [4], NUOO Prolog [3] and other similar languages, but our syntax is also much more flexible, allowing users to model their universe of discourse in an object-oriented way as opposed to the relational ways of some of these proposals [1, 4].

⁴ Note an important difference here from most of the logic based languages. Languages such as [1, 4, 9, 14, 16] insist that overriding will take place only if the execution of the method definition does not produce any value. In our case, the definition is enough for overriding to take place, and the success or failure of the execution does not affect the overriding. This choice also has serious conceptual and implementation efficiency related consequences that we discuss at length in [11] and in [10].
⁵ Notice that the choice of the negation semantics in Datalogneg does not affect the semantics of Datalog++ programs.

3.1 Informal Syntax of Datalog++

There are five types of atoms in our language L of Datalog++: (global) predicates, local predicates, message predicates, instance is-a, and subclass is-a atoms⁶.

- Global predicate: Let p be a predicate symbol of arity k, denoted p/k, and a1, ..., ak be terms. Then p(a1, ..., ak) is a global predicate. A predicate assumes a meaning depending on its position in a rule. Usually a predicate represents a relation, except when a predicate is in the rule body and a local predicate is in the rule head. In that case the predicate represents a property (method or attribute) of an object.
- Local predicate: Let p(a1, ..., ak) be a predicate and o be a term, then o : p(a1, ..., ak) is a local predicate. Intuitively, a local predicate of the form o : p(a1, ..., ak) means that the predicate p(a1, ..., ak) holds in the object o. o is called the context or the descriptor of the local predicate atom.
- Message predicate: Let p(a1, ..., ak) be a predicate and o be a term, then o  p(a1, ..., ak) is a message predicate. Intuitively, a message predicate of the form o  p(a1, ..., ak) means: evaluate the predicate p(a1, ..., ak) in the object o.
- Is-a: Let o_c, o_i and o_s be terms. Then o_i ∈ o_c and o_s :: o_c are respectively instance and subclass is-a atoms. Intuitively, they say that o_i and o_s are an instance and a subclass of o_c respectively. When the difference between the two is unimportant, we will write op in the remainder of this paper.

Horn formulas involving atoms of L are defined as usual. Furthermore, let A be a predicate atom (a global, local or message predicate). Then the notation pred(A) = p/k denotes the predicate symbol of A. For any local predicate (or local clause) A of the form o : p(a1, ..., ak), the function context(A) = o returns the object o where the predicate is defined. If A is a head atom, then we call o the context of the clause A ← B1, ..., Bm when A = o : p(a1, ..., ak).
Similarly, for any message predicate of the form A = o  p(a1, ..., ak), recipient(A) = o returns the target object.

Definition 3.1 (Programs and Queries) A database ∆, or equivalently a program P, in Datalog++ is an expression of the form Σ, […]

[…] add σ_i[o_c, π, ϕ, p/k] in D′_Σ, and
• for every control definition of the form < reject γ p/k from o_a >, add ρ[γ, p/k, o_c, o_a] in D′_Σ.

Example 4.1 (Disassembling) The application of the disassembling discussed above to the database D produces the set D_1 of reduction expressions in Figure 2. As a result, D′ = ⟨D_1, D_<, D_Ψ, D_Π⟩.

α[grad_stud].
α[faculty].
α[gta].
β[gta, grad_stud].
β[gta, faculty].

σ_i[grad_stud, pub, val, name/1].
σ_i[grad_stud, pub, val, sid/1].
σ_i[grad_stud, priv, val, stipend/1].
σ_i[grad_stud, pub, code, income/1].
σ_i[grad_stud, pub, val, avg_income/1].
σ_i[grad_stud, pub, code, meandev/1].

σ_i[faculty, pub, val, name/1].
σ_i[faculty, pub, val, eid/1].
σ_i[faculty, priv, val, salary/1].
σ_i[faculty, pub, code, income/1].
σ_i[faculty, pub, val, avg_income/1].
σ_i[faculty, pub, code, meandev/1].
σ_c[faculty, pub, code, total_faculty/1].

σ_i[gta, priv, val, taship/1].
ρ[sig, avg_income/1, gta, faculty].
ρ[sig, salary/1, gta, faculty].
ρ[sig, income/1, gta, faculty].

Figure 2: Effect of disassembling on the database D.

4.2 Exposing Clause Locality Through L-closures

The notion of clause locality introduced in Example 3.1 is, however, meta-knowledge and is not explicitly captured in a Datalog++ program. The l-closure defined below helps syntactically expose this important piece of knowledge, which is implicitly assumed by every Datalog++ programmer: whenever there is a local clause of the form o : p(t1, ..., tk) ← G ∈ Π, add an expression of the form λ[o, p/k] to Π. The expression λ[o, p/k] captures the fact that a method predicate p of arity k is locally defined in object o.

Example 4.2 (L-closure) Recall that l-closure concerns the locality of the method clauses. Taking the l-closure of D′ syntactically exposes the locality of the method clauses in it and adds the reduction expression set D_2 to D_Π, giving D∗ = ⟨D_1, D_<, D_Ψ, D_Π ∪ D_2⟩. D_2 is shown in Figure 3.

Once we have the knowledge about locality in this form and have the signature expressions in D′_Σ, we can readily determine the inheritability of methods in Π using the ∇_m function presented in Definition 4.1. Let the expression ω_m[p/k, o, q, ϕ] denote the fact that the object o inherits the method p/k with mode ϕ from q. We can now use this explicit expression to capture inheritance of methods with overriding and conflict resolution, as discussed next.

¹⁰ Observe that the expressions in D′_Σ are neither in the language of Datalog++ nor in Datalogneg. The goal here is to recognize them as special expressions that will ultimately be converted to Datalogneg vocabulary. The same remark applies to the expressions introduced in the next few sections. To distinguish between Datalog++ formulas and these expressions, we will call them reduction expressions, since they are introduced during the reduction process. For simplicity, we will refer to both Datalog++ formulas and reduction expressions as expressions in the rest of this paper.

λ[grad_stud, stipend/1].
λ[grad_stud, income/1].
λ[grad_stud, avg_income/1].
λ[grad_stud, meandev/1].
λ[joe, stipend/1].
λ[kelly, income/1].

λ[faculty, total_faculty/1].
λ[faculty, salary/1].
λ[faculty, income/1].
λ[faculty, avg_income/1].
λ[faculty, meandev/1].
λ[john, name/1].
λ[max, salary/1].

λ[gta, taship/1].
λ[gta, income/1].
λ[sally, taship/1].

Figure 3: Database D∗ obtained as a result of taking l-closure on the database D′.
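The l-closure step can be sketched in a few lines. The sketch assumes that each local clause is represented only by the (context, predicate, arity) triple of its head; this encoding and the name `l_closure` are illustrative assumptions. It produces one λ[o, p/k] entry per locally defined method, shown here for a handful of the contexts in Figure 3.

```python
def l_closure(local_clause_heads):
    """Return one (object, predicate, arity) triple per locally defined method.

    Each triple corresponds to a reduction expression λ[o, p/k].
    """
    return {(ctx, pred, arity) for ctx, pred, arity in local_clause_heads}


# Heads of some local clauses; the last entry repeats an earlier method,
# as happens when a method is defined by more than one clause.
heads = [
    ("grad_stud", "stipend", 1),
    ("grad_stud", "income", 1),
    ("joe", "stipend", 1),
    ("gta", "income", 1),
    ("gta", "income", 1),
]

for o, p, k in sorted(l_closure(heads)):
    print(f"λ[{o}, {p}/{k}].")
```

Using a set makes the step idempotent, so applying it to several clauses of the same method yields a single locality expression.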

4.3 Inheritance Based on I-completion

To be able to code inherit a rule (i.e., apply the rule) at a lower subclass or instance, we need to change the context of the rule to a variable and restrict the instantiation of the variable to only those contexts where the rule can be legitimately applied (i.e., inherited). The objects, or the contexts, where a rule corresponding to a method is inheritable can be easily and statically determined (at compile time) by the following function¹¹. Note that the idea of conflict resolution is built into this function as discussed in section 3.2.2.

Definition 4.1 (Method Inheritability) Let S be a set of (ground) is-a atoms, p/k be a predicate symbol, and o be an object. Then the inheritability of method p/k in the object o is defined by the method inheritability function ∇_m as follows¹², where "o is-a q" stands for an instance or subclass is-a atom:

\[
\nabla_m(S, p/k, o) =
\begin{cases}
o_s, & \text{if method } p/k \text{ is not local to } o \text{ and } [\exists q \text{ such that } (o \text{ is-a } q) \in S,\\
     & \nabla_m(S, p/k, q) = o_s,\ \text{method } p/k \text{ is local to } o_s,\\
     & \nabla_s(S, p/k, o) = \nabla_s(S, p/k, o_s), \text{ and } (\forall r \text{ such that } (o \text{ is-a } r) \in S,\\
     & \text{one of the following holds:}\\
     & \quad \bullet\ \nabla_m(S, p/k, r) = r \text{ and method } p/k \text{ is not local to } r,\\
     & \quad \bullet\ \nabla_m(S, p/k, r) = o_s, \text{ or } o \text{ rejects method or signature } p/k \text{ from } r)],\\
o, & \text{in all other cases.}
\end{cases}
\]

Once the inheritability of a method (a method defined through a local clause) is known, the instantiation of the context variable of the (rewritten) clause can be restricted to legitimate objects as follows: replace every clause of the form Cl ≡ A ← G ∈ Π with a clause of the form Cl′ ≡ (A)[o//V] ← (G)[o//V], ω_m[p/k, V, o, code] ∈ Πⁱ such that context(Cl) = o, pred(A) = p/k and V is a distinct variable not occurring in Cl.

Example 4.3 (I-completion) Observe that the purpose of taking the i-completion of programs is to restrict the instantiation of the context variables of the method clauses to the legitimate objects in the program by applying a suitable control mechanism. I-completion of D∗ replaces the clauses in D_Ψ and D_Π with sets D_3 and D_4 respectively.
Since there are no global clauses in D, D_3 is empty while D_4 is the set in Figure 4 (derived from the corresponding method clauses (1) through (22) in the set D). After taking i-completion, D∗ becomes Dⁱ = ⟨D_1, D_<, D_Ψ ∪ D_3, D_2 ∪ D_4⟩. Notice that i-completion can only handle code inheritance; hence value inheritance, which is a much simpler problem, must be handled separately. There are two ways to address this issue: (i) by rewriting every rule in Π, or (ii) by adding an axiom to Πⁱ. We take the latter approach for efficiency reasons.

¹¹ A similar function, in a slightly different setting, was used in ORLog [13].
¹² We, in fact, need another function ∇_s similar to ∇_m to determine signature inheritability. The definition of ∇_s may be found in [11].


(1′) V:stipend(12K) ← ω_m[stipend/1, V, grad_stud, code];
(2′) V:income(X) ← stipend(X), ω_m[income/1, V, grad_stud, code];
(3′) V:avg_income(avg()) ← O ∈ grad_stud, Oincome(I), ω_m[avg_income/1, V, grad_stud, code];
(4′) V:meandev(X) ← income(I), avg_income(A), X = abs(I − A), ω_m[meandev/1, V, grad_stud, code];
(6′) V:stipend(15K) ← ω_m[stipend/1, V, joe, code];
(8′) V:income(X) ← johnsalary(X), ω_m[income/1, V, kelly, code];
(9′) V:total_faculty(count()) ← O ∈ faculty, ω_m[total_faculty/1, V, faculty, code];
(10′) V:salary(60K) ← ω_m[salary/1, V, faculty, code];
(11′) V:income(X) ← salary(X), ω_m[income/1, V, faculty, code];
(12′) V:avg_income(avg()) ← O ∈ faculty, Oincome(I), ω_m[avg_income/1, V, faculty, code];
(13′) V:meandev(X) ← income(I), avg_income(A), X = abs(I − A), ω_m[meandev/1, V, faculty, code];
(16′) V:name("John") ← ω_m[name/1, V, john, code];
(17′) V:salary(75K) ← ω_m[salary/1, V, max, code];
(18′) V:taship(16K) ← ω_m[taship/1, V, gta, code];
(19′) V:income(X) ← stipend(S), taship(T), X = S + T, ω_m[income/1, V, gta, code];
(22′) V:taship(20K) ← ω_m[taship/1, V, sally, code];

Figure 4: Database Dⁱ obtained from D∗ after taking i-completion.
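The method inheritability function of Definition 4.1 can be approximated by the following sketch. Several things here are assumptions rather than the paper's implementation: the function name `nabla_m`, the encoding of is-a atoms as (child, parent) pairs, the reading of β[gta, grad_stud] as "gta is a subclass of grad_stud", and above all the omission of the signature condition on ∇_s, whose definition is given in [11] and not in this paper. The data is a subset of Figures 2 and 3.

```python
def nabla_m(isa, local, rejects, method, o):
    """Return the object whose definition of `method` applies at `o`.

    Returns `o` itself when the method is local to `o`, when nothing is
    inherited, or when an unresolved conflict forces `o` to inherit nothing.
    The signature condition on nabla_s is deliberately left out (assumption).
    """
    if (o, method) in local:
        return o
    parents = [q for (child, q) in isa if child == o]
    for q in parents:
        source = nabla_m(isa, local, rejects, method, q)
        if (source, method) not in local:
            continue  # q offers no definition of the method
        agreed = True
        for r in parents:
            via_r = nabla_m(isa, local, rejects, method, r)
            offers_nothing = via_r == r and (r, method) not in local
            rejected = (o, method, r) in rejects
            if not (offers_nothing or via_r == source or rejected):
                agreed = False  # r supplies a competing definition
        if agreed:
            return source
    return o


ISA = {("gta", "grad_stud"), ("gta", "faculty")}
LOCAL = {
    ("grad_stud", "stipend/1"), ("grad_stud", "avg_income/1"),
    ("grad_stud", "meandev/1"),
    ("faculty", "avg_income/1"), ("faculty", "meandev/1"),
    ("gta", "income/1"),
}
REJECTS = {("gta", "avg_income/1", "faculty")}

for m in ("stipend/1", "avg_income/1", "meandev/1", "income/1"):
    print(m, "->", nabla_m(ISA, LOCAL, REJECTS, m, "gta"))
```

Under these assumptions gta takes stipend/1 and avg_income/1 from grad_stud (the latter because the competing source faculty is rejected), keeps its own income/1, and inherits meandev/1 from neither superclass because the conflict is left unresolved.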

4.4 Encapsulation Through Context Resolution

Let the expression [p/k, o_r, o_s] state the fact that p/k is accessible in object o_r by object o_s. The accessibility function presented below formalizes the notion of encapsulation discussed in section 2.2 in the form of accessibility of methods.

Definition 4.2 (Method Accessibility) Let S be a set of (ground) signature expressions, I be a set of (ground) is-a atoms, p/k be a predicate symbol, and o_s and o_r be two object symbols. Then the accessibility of method p/k in the object o_r with respect to o_s is defined by the context resolution function Υ as follows:

\[
\Upsilon(S, p/k, o_s, o_r) =
\begin{cases}
\mathit{true}, & \text{if one of the following conditions holds:}\\
 & \quad \bullet\ o_s = o_r,\\
 & \quad \bullet\ o_s \neq o_r \text{ and } o_r \text{ is-a } o_s \text{ holds, or}\\
 & \quad \bullet\ o_s \neq o_r,\ \pi = \mathit{pub}, \text{ and } ((\sigma_c[o_c, \pi, \varphi, p/k] \in S \text{ and } \nabla_s(I, p/k, o_r) = o_c)\\
 & \qquad \text{or } (\sigma_i[o_c, \pi, \varphi, p/k] \in S \text{ and } \nabla_s(I, p/k, o_r) = o_c)) \text{ holds,}\\
\mathit{false}, & \text{in all other cases.}
\end{cases}
\]

A method p/k is accessible in the recipient o_r by a sender o_s of a message if and only if Υ(S, p/k, o_s, o_r) = true¹³. To capture this behavior, we transform every local clause in Π as follows: replace every local clause Cl ≡ A ← B1, ..., Bn ∈ Π with the clause Cl′ obtained from Cl as follows:
• Cl′ = Cl,
• for every Bi ∈ Cl′ such that 0 ≤ i ≤ n and context(Cl) = o_s, do the following:
  – if Bi is a message predicate, add [p/k, o_r, o_s] as Bn+1 in Cl′, where context(Cl) = o_s, recipient(Bi) = o_r, pred(Bi) = p/k and o_s ≠ o_r.
  – if Bi is a (self) predicate, then replace Bi by o_s  Bi.

Notice that an accessibility expression is added only when it is necessary to do so; that is, the addition is avoided when the terms representing the sender and receiver objects are identical, implying a self invocation.

¹³ We must now be able to (dynamically) determine the accessibility of methods in every Datalog++ program by adding suitable computing machinery. We also need to transform every query in a similar way. A complete discussion of accessibility may again be found in [11].


Example 4.4 (Context Resolution) The clauses that are affected by context resolution are (3′), (8′), (12′) in D_4 and the query clauses (23) through (32) in D_Ψ. Recall that the purpose of resolving context is to enforce encapsulation, and we do so only if an object is sending a message to an object other than itself. After context resolution, we will have the sets D_5 and D_6 as in Figure 5. Consequently, D_c = ⟨D_1, D_<, D_3 ∪ D_6, D_2 ∪ ((D_4 \ {3′, 8′, 12′}) ∪ D_5)⟩, where D_5 = {3″, 8″, 12″} and D_6 = {23′, 24′, 25′, 26′, 27′, 28′, 29′, 30′, 31′, 32′}.

(3″) V:avg_income(avg()) ← O ∈ grad_stud, Oincome(I), ω_m[avg_income/1, V, grad_stud, code], [income/1, O, V];
(8″) V:income(X) ← johnsalary(X), ω_m[income/1, V, kelly, code], [salary/1, john, V];
(12″) V:avg_income(avg()) ← O ∈ faculty, Oincome(I), ω_m[avg_income/1, V, faculty, code], [income/1, O, V];

(23′) ? sallyincome(X), [income/1, sally, o ];
(24′) ? sueincome(X), [income/1, sue, o ];
(25′) ? joeavg_income(X), [avg_income/1, joe, o ];
(26′) ? joemeandev(X), [meandev/1, joe, o ];
(27′) ? johntotal_faculty(X), [total_faculty/1, john, o ];
(28′) ? facultytotal_faculty(X), [total_faculty/1, faculty, o ];
(29′) ? gtatotal_faculty(X), [total_faculty/1, gta, o ];
(30′) ? joestipend(X), [stipend/1, joe, o ];
(31′) ? kellyincome(X), [income/1, kelly, o ];
(32′) ? joeincome(X), [income/1, joe, o ];

Figure 5: Database D_c from Dⁱ after context resolution.
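The accessibility check of Definition 4.2 can be illustrated with a simplified sketch. All names here are hypothetical: `accessible` plays the role of Υ, `sig_source` is a precomputed table standing in for ∇_s (which this paper does not define), and `below` holds the pairs (recipient, sender) for which the recipient lies below the sender in the is-a hierarchy. The sketch also ignores the mode ϕ and the class/instance distinction between σ_c and σ_i. It further assumes, as Figures 2 and 3 suggest, that john is an instance of faculty and that kelly is not.

```python
def accessible(signatures, sig_source, below, method, sender, recipient):
    """Simplified stand-in for the context resolution function.

    True when `sender` may invoke `method` on `recipient`.
    """
    if sender == recipient:              # self invocation
        return True
    if (recipient, sender) in below:     # recipient lies below the sender
        return True
    # Otherwise the signature that applies at the recipient must be public.
    owner = sig_source.get((method, recipient))
    return (owner, "pub", method) in signatures


SIGNATURES = {
    ("faculty", "pub", "income/1"),
    ("faculty", "priv", "salary/1"),
}
SIG_SOURCE = {
    ("income/1", "john"): "faculty",
    ("salary/1", "john"): "faculty",
}
BELOW = {("john", "faculty")}

print(accessible(SIGNATURES, SIG_SOURCE, BELOW, "income/1", "kelly", "john"))
print(accessible(SIGNATURES, SIG_SOURCE, BELOW, "salary/1", "kelly", "john"))
```

With this data the first call succeeds because income/1 is public, while the second fails because salary/1 is private to faculty and kelly is neither john nor one of his class objects. This mirrors the accessibility literal [salary/1, john, V] added to clause (8) by context resolution.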

4.5 Datalogneg Rewriting of Datalog++ Programs

We are now ready to define an algorithm to reduce every Datalog++ program to an equivalent Datalogneg program. This requires us to develop a translation function τ, as stipulated in Definition 4.3, that will map every Datalog++ expression and the reduction expressions to Datalogneg expressions. Given any Datalog++ expression φ, its encoding τ(φ) into Datalogneg is given by the following recursive transformation rules. In the following, τ is an identity function on terms and symbols in Datalog++.

• Encoding of complex formulas:
  – τ(A ← B1, ..., Bm) = τ(A) ← τ(B1), ..., τ(Bm)

• Encoding of atomic Datalog++ formulas (given case by case):
  – τ(p(a1, ..., ak)) = rel(p, arg(a1, ..., ak)) (footnote 14).
  – τ(o  p(a1, ..., ak)) = rel(p, arg(a1, ..., ak)).
  – τ(o : p(a1, ..., ak)) = rel(p, arg(a1, ..., ak)).
  – τ(o:p(a1, ..., ak)) = meth(o, p, k, arg(a1, ..., ak)) when o = o .
  – τ(op(a1, ..., ak)) = meth(o, p, k, arg(a1, ..., ak)) when o = o .
  – τ(o ∈ q) = ins(o, q).
  – τ(o :: q) = sub(o, q).

• Encoding of reduction expressions (given case by case):
  – τ(α[o_c]) = class(o_c).
  – τ(β[o_c, o_s]) = sub(o_c, o_s).
  – τ(σc[o_c, π, ϕ, p/k]) = sig(o_c, π, ϕ, p, k, class).
  – τ(σi[o_c, π, ϕ, p/k]) = sig(o_c, π, ϕ, p, k, ins).
  – τ(ρ[γ, p/k, o_c, o_a]) = rej(γ, p, k, o_c, o_a).
  – τ(λ[o, p/k]) = loc(o, p, k).
  – τ(ωm[p/k, o, q, ϕ]) = meth_inh(p, k, o, q, ϕ).
  – τ([p/k, o_r, o_s]) = vis(p, k, o_r, o_s).

Footnote 14: Note that τ(p(a1, ..., ak)) = p(a1, ..., ak) is also possible.

Example 4.5 (Translation) Let Dτ = τ(Dc) denote the translation of Dc to CORAL. Then Dτ is equivalent to the set shown in Figure 6.
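A few of the translation rules can be sketched in Python as a recursive string encoder. The tuple encoding of atoms (`"method"`, `"instance"`, `"subclass"`, `"inherit"`, `"access"`) is an assumption made for this illustration and is not the representation used by the actual interface; the sketch covers only the meth, ins, sub, meth_inh and vis cases and writes the arrow as `<-`.

```python
def arg(terms):
    """Wrap the argument terms of a method in a single arg(...) term."""
    return "arg(" + ",".join(terms) + ")"


def tau_atom(atom):
    """Translate one (hypothetically encoded) atom into a flat predicate."""
    kind = atom[0]
    if kind == "method":        # method or message atom on object obj
        _, obj, pred, terms = atom
        return f"meth({obj},{pred},{len(terms)},{arg(terms)})"
    if kind == "instance":      # o ∈ q
        _, obj, cls = atom
        return f"ins({obj},{cls})"
    if kind == "subclass":      # o :: q
        _, obj, cls = atom
        return f"sub({obj},{cls})"
    if kind == "inherit":       # method inheritability expression
        _, pred, arity, obj, source, mode = atom
        return f"meth_inh({pred},{arity},{obj},{source},{mode})"
    if kind == "access":        # accessibility expression
        _, pred, arity, recipient, sender = atom
        return f"vis({pred},{arity},{recipient},{sender})"
    raise ValueError(f"no translation rule sketched for {kind!r}")


def tau_clause(head, body):
    """tau distributes over the head and every body atom of a clause."""
    return tau_atom(head) + " <- " + ", ".join(tau_atom(b) for b in body) + "."


# Context-resolved clause (8) of Example 4.4.
clause8 = tau_clause(
    ("method", "V", "income", ["X"]),
    [
        ("method", "john", "salary", ["X"]),
        ("inherit", "income", 1, "V", "kelly", "code"),
        ("access", "salary", 1, "john", "V"),
    ],
)
```

Under this encoding, `clause8` has the same shape as rule r50 of Figure 6.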

4.6 Reduction

Finally, we have the following definition for inheritance and encapsulation reduced first-order Datalog++ programs. Let the reduction expressions denote the set of meta-expressions introduced in a Datalog++ program during disassembling, l-closure, i-completion and context resolution.

Definition 4.3 Let P be a Datalog++ program, and τ be a translation function for every expression in P. Let P∇s, P∇m, and PΥ be Datalogneg programs that implement respectively the ∇s, ∇m, and Υ functions. Also let Pisa and Pval be Datalogneg programs for computing the reflexive transitive closure of < and the value inheritance axiom respectively. If P↓ is a disassembled, l-closed, i-completed and context resolved program of P, then Pr ≡ τ(P↓) ∪ P∇s ∪ P∇m ∪ PΥ ∪ Pisa ∪ Pval is the inheritance and encapsulation reduced Datalogneg program of the Datalog++ program P.

The reduction algorithm suggested by Definition 4.3 can be given as follows. Note that the programs P∇s, P∇m, PΥ, Pisa and Pval are already in Datalogneg.

Input: A Datalog++ program P.
Output: A reduced program Pr of program P in Datalogneg.
begin
  - P′ = Apply disassembling to program P.
  - P∗ = Apply l-closure to program P′.
  - Pi = Apply i-completion to program P∗.
  - Pc = Apply context resolution to program Pi.
  - Pr = τ(Pc) ∪ P∇s ∪ P∇m ∪ PΥ ∪ Pisa ∪ Pval.
end.

Note that, so long as P remains a definite program, the reduction Pr is always stratified. The inheritability axioms we introduce as part of the implementation for P∇s and P∇m contain negative literals and thus necessitate Datalogneg. However, the rules in P∇s and P∇m are locally stratified, as shown in Example 4.6.

Example 4.6 (Reduction) The reduction algorithm in section 4.3 requires the addition of the following set of axioms (in Figure 7), namely Disa, D∇m, D∇s, DΥ, and Dval, to the translated database Dτ in Example 4.5 as a final step (footnote 15). Notice the simplicity of the Dval component as discussed earlier in section 4.3.
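The order of the steps in the reduction algorithm can be sketched as a simple function composition. Everything below is a hypothetical skeleton: the four rewriting stages and the translation are passed in as ordinary functions, programs are modelled as sets of clause strings, and the placeholder stages used in the demonstration only record the order in which they run.

```python
from functools import reduce


def reduce_program(program, stages, translate, support_programs):
    """Apply the rewriting stages in order, translate the result,
    and take the union with the fixed support programs."""
    rewritten = reduce(lambda prog, stage: stage(prog), stages, program)
    reduced = set(translate(rewritten))
    for support in support_programs:
        reduced |= set(support)
    return reduced


log = []


def make_stage(name):
    """Placeholder stage: records its name and returns the program unchanged."""
    def stage(prog):
        log.append(name)
        return prog
    return stage


STAGES = [make_stage(n) for n in
          ("disassembling", "l-closure", "i-completion", "context resolution")]

result = reduce_program(
    {"c1"},                                   # input program P
    STAGES,
    lambda prog: {f"tau({c})" for c in prog},  # stand-in for the translation
    [{"a1"}, {"a26"}],                        # stand-ins for the support programs
)
```

The point of the sketch is only that the support programs are added once, after translation, and are independent of the input program.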
So finally, Dr = Dτ ∪ Disa ∪ D∇m ∪ D∇s ∪ DΥ ∪ Dval as expected. It is easy to see from Examples 4.1 through 4.6 that the increase in database size is negligible. Specifically, if the size of a database D = ⟨DΣ, D<, DΨ, DΠ⟩ is ⟨SDΣ, SD<, SDΨ, SDΠ⟩ component-wise, then the size of the reduced database is almost ⟨SDΣ, SD<, SDΨ, SDΠ × 2⟩ plus 26 axioms. That is, we add only one locality rule for each method definition in DΠ. But, in an actual implementation where we pre-compile inheritability, we throw away this information too. Hence, the database size becomes almost identical to the input database at run time.

Footnote 15: Observe that in Example 4.6, and in the reduction algorithm above, we added D∇m and D∇s to compute the method and signature inheritability respectively using logical negation, and thus require stratification. We actually do not need to add them in a real implementation where a static compiler is used to determine the inheritability. This important information can then be added as facts to the reduced programs in the form of inheritability relations, thus obviating the need for a dynamic computation of D∇m and D∇s using stratification. We have shown them here for expository purposes only.

r1 class(grad_stud).
r2 class(faculty).
r3 class(gta).

r4 sub(gta,grad_stud).
r5 sub(gta,faculty).

r6 ins(joe,grad_stud).
r7 ins(kelly,grad_stud).
r8 ins(john,faculty).
r9 ins(max,faculty).
r10 ins(sally,gta).
r11 ins(sue,gta).

r12 sig(grad_stud,pub,val,name,1,ins).
r13 sig(grad_stud,pub,val,sid,1,ins).
r14 sig(grad_stud,priv,val,stipend,1,ins).
r15 sig(grad_stud,pub,code,income,1,ins).
r16 sig(grad_stud,pub,val,avg_income,1,ins).
r17 sig(grad_stud,pub,code,meandev,1,ins).

r18 sig(faculty,pub,val,name,1,ins).
r19 sig(faculty,pub,val,eid,1,ins).
r20 sig(faculty,priv,val,salary,1,ins).
r21 sig(faculty,pub,code,income,1,ins).
r22 sig(faculty,pub,val,avg_income,1,ins).
r23 sig(faculty,pub,code,meandev,1,ins).
r24 sig(faculty,pub,code,total_faculty,1,class).

r25 sig(gta,priv,val,taship,1,ins).

r26 rej(sig,avg_income,1,gta,faculty).
r27 rej(sig,salary,1,gta,faculty).
r28 rej(sig,income,1,gta,faculty).

r29 loc(grad_stud,stipend,1).
r30 loc(grad_stud,income,1).
r31 loc(grad_stud,avg_income,1).
r32 loc(grad_stud,meandev,1).
r33 loc(joe,stipend,1).
r34 loc(kelly,income,1).
r35 loc(faculty,total_faculty,1).
r36 loc(faculty,salary,1).
r37 loc(faculty,income,1).
r38 loc(faculty,avg_income,1).
r39 loc(faculty,meandev,1).
r40 loc(john,name,1).
r41 loc(max,salary,1).
r42 loc(gta,taship,1).
r43 loc(gta,income,1).
r44 loc(sally,taship,1).

r45 meth(V,stipend,1,arg(12K)) ← meth_inh(stipend,1,V,grad_stud,code).
r46 meth(V,income,1,arg(X)) ← meth(V,stipend,1,arg(X)), meth_inh(income,1,V,grad_stud,code).
r47 meth(V,avg_income,1,arg(avg())) ← tins(O,grad_stud), meth(O,income,1,arg(I)), vis(income,1,O,V), meth_inh(avg_income,1,V,grad_stud,code).
r48 meth(V,meandev,1,arg(X)) ← meth(V,income,1,arg(I)), meth(V,avg_income,1,arg(A)), X=abs(I−A), meth_inh(meandev,1,V,grad_stud,code).
r49 meth(V,stipend,1,arg(15K)) ← meth_inh(stipend,1,V,joe,code).
r50 meth(V,income,1,arg(X)) ← meth(john,salary,1,arg(X)), meth_inh(income,1,V,kelly,code), vis(salary,1,john,V).
r51 meth(V,total_faculty,1,arg(count())) ← tins(O,faculty), meth_inh(total_faculty,1,V,faculty,code).
r52 meth(V,salary,1,arg(60K)) ← meth_inh(salary,1,V,faculty,code).
r53 meth(V,income,1,arg(X)) ← meth(V,salary,1,arg(X)), meth_inh(income,1,V,faculty,code).
r54 meth(V,avg_income,1,arg(avg())) ← tins(O,faculty), meth(O,income,1,arg(I)), vis(income,1,O,V), meth_inh(avg_income,1,V,faculty,code).
r55 meth(V,meandev,1,arg(X)) ← meth(V,income,1,arg(I)), meth(V,avg_income,1,arg(A)), X=abs(I−A), meth_inh(meandev,1,V,faculty,code).
r56 meth(V,name,1,arg("John")) ← meth_inh(name,1,V,john,code).
r57 meth(V,salary,1,arg(75K)) ← meth_inh(salary,1,V,max,code).
r58 meth(V,taship,1,arg(16K)) ← meth_inh(taship,1,V,gta,code).
r59 meth(V,income,1,arg(X)) ← meth(V,stipend,1,arg(S)), meth(V,taship,1,arg(T)), X=S+T, meth_inh(income,1,V,gta,code).
r60 meth(V,taship,1,arg(20K)) ← meth_inh(taship,1,V,sally,code).

r61 ? meth(sally,income,1,arg(X)), vis(income,1,sally,o).
r62 ? meth(sue,income,1,arg(X)), vis(income,1,sue,o).
r63 ? meth(joe,avg_income,1,arg(X)), vis(avg_income,1,joe,o).
r64 ? meth(joe,meandev,1,arg(X)), vis(meandev,1,joe,o).
r65 ? meth(john,total_faculty,1,arg(X)), vis(total_faculty,1,john,o).
r66 ? meth(faculty,total_faculty,1,arg(X)), vis(total_faculty,1,faculty,o).
r67 ? meth(gta,total_faculty,1,arg(X)), vis(total_faculty,1,gta,o).
r68 ? meth(joe,stipend,1,arg(X)), vis(stipend,1,joe,o).
r69 ? meth(kelly,income,1,arg(X)), vis(income,1,kelly,o).
r70 ? meth(joe,income,1,arg(X)), vis(income,1,joe,o).

Figure 6: Dτ = τ(Dc) – Datalogneg encoded database Dc.

Disa:
a1 parent(X,Y) ← sub(X,Y).
a2 parent(X,Y) ← ins(X,Y).
a3 tins(X,Y) ← ins(X,Y).
a4 tins(X,Y) ← ins(X,Z), tsub(Z,Y).
a5 tsub(X,X) ← sig(A,B,C,D,E,X).
a6 tsub(X,X) ← sub(X,Y).
a7 tsub(Y,Y) ← sub(X,Y).
a8 tsub(X,Y) ← sub(X,Y).
a9 tsub(X,Y) ← sub(X,Z), sub(Z,Y).
a10 tisa(X,Y) ← tins(X,Y).
a11 tisa(X,Y) ← tsub(X,Y).

D∇m:
a12 meth_inh(M_nam,Ar,Obj,Obj,M_mod) ← sig_inh(M_nam,Ar,Vis,M_mod,ins,Obj,S_obj), loc(M_nam,Ar,Obj), ins(Obj,S_obj).
a13 meth_inh(M_nam,Ar,Obj,Obj,M_mod) ← sig_inh(M_nam,Ar,Vis,M_mod,class,Obj,S_obj), loc(M_nam,Ar,Obj), sub(Obj,S_obj).
a14 meth_inh(M_nam,Ar,Obj,Sou_obj,M_mod) ← poss(M_nam,Ar,Obj,Sou_obj,M_mod), Obj ≠ Sou_obj, ¬loc(M_nam,Ar,Obj), loc(M_nam,Ar,Sou_obj), sig_inh(M_nam,Ar,Vis,M_mod,ins,Obj,Ssou_obj), sub(Sou_obj,Ssou_obj), ins(Obj,Sou_obj).
a15 meth_inh(M_nam,Ar,Obj,Sou_obj,M_mod) ← poss(M_nam,Ar,Obj,Sou_obj,M_mod), Obj ≠ Sou_obj, ¬loc(M_nam,Ar,Obj), loc(M_nam,Ar,Sou_obj), sig_inh(M_nam,Ar,Vis,M_mod,class,Obj,Ssou_obj), sub(Sou_obj,Ssou_obj), sub(Obj,Sou_obj).
a16 poss(M_nam,Ar,Obj,Sou_obj,M_mod) ← Obj ≠ S_obj, meth_inh(M_nam,Ar,S_obj,Sou_obj,M_mod), ¬rej(M_nam,Ar,Obj,S_obj), parent(Obj,S_obj), ¬conflict(M_nam,Ar,Obj,Sou_obj,S_obj,M_mod).
a17 conflict(M_nam,Ar,Obj,Sou_obj,S_obj,M_mod) ← meth_inh(M_nam,Ar,A_obj,Asou_obj,O_M_mod), parent(Obj,A_obj), S_obj ≠ A_obj, Obj ≠ A_obj, Asou_obj ≠ Sou_obj, ¬rej(M_nam,Ar,Obj,A_obj).

D∇s:
a18 sig_inh(M_nam,Ar,Vis,M_mod,Level,Obj,Obj) ← sig(M_nam,Ar,Vis,M_mod,Level,Obj).
a19 sig_inh(M_nam,Ar,Vis,M_mod,ins,Obj,Sou_obj) ← Obj ≠ Sou_obj, poss_sig(M_nam,Ar,Obj,Sou_obj), ¬sig(M_nam,Ar,O_vis,O_m_mod,O_level,Obj), sig(M_nam,Ar,Vis,M_mod,ins,Sou_obj), ins(Obj,Sou_obj).
a20 sig_inh(M_nam,Ar,Vis,M_mod,class,Obj,Sou_obj) ← Obj ≠ Sou_obj, poss_sig(M_nam,Ar,Obj,Sou_obj), ¬sig(M_nam,Ar,O_vis,O_m_mod,O_level,Obj), sig(M_nam,Ar,Vis,M_mod,class,Sou_obj), sub(Obj,Sou_obj).
a21 poss_sig(M_nam,Ar,Obj,Sou_obj) ← parent(Obj,S_obj), Obj ≠ S_obj, ¬rej(M_nam,Ar,Obj,S_obj), sig_inh(M_nam,Ar,O_vis,O_m_mod,O_level,S_obj,Sou_obj), ¬conflict_sig(M_nam,Ar,Obj,Sou_obj,S_obj).
a22 conflict_sig(M_nam,Ar,Obj,Sou_obj,S_obj) ← parent(Obj,A_obj), S_obj ≠ A_obj, Asou_obj ≠ Sou_obj, sig_inh(M_nam,Ar,O_vis,O_m_mod,O_level,A_obj,Asou_obj), Obj ≠ A_obj, ¬rej(M_nam,Ar,Obj,A_obj).

DΥ:
a23 vis(M_nam,Ar,Rec,Sen) ← tisa(Rec,Sen).
a24 vis(M_nam,Ar,Rec,Sen) ← Rec = Sen.
a25 vis(M_nam,Ar,Rec,Sen) ← sig_inh(M_nam,Ar,pub,M_mod,Level,Rec,S_obj).

Dval:
a26 meth(Obj,M_nam,Ar,Arg) ← meth(Sup,M_nam,Ar,Arg), meth_inh(M_nam,Ar,Obj,Sup,val).

Figure 7: Reduction of D into database Dr.
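The is-a axioms a3 through a11 of Figure 7 describe a bottom-up closure computation, which can be sketched in Python as a naive fixpoint iteration. The function name `isa_closure` is hypothetical, the facts are taken from Figure 6, and two simplifications are assumptions of this sketch: reflexive pairs are generated only for objects mentioned in `sub`, and the subclass relation is closed under transitivity of arbitrary depth.

```python
def isa_closure(sub, ins):
    """Compute tsub, tins and tisa from the direct sub and ins facts."""
    objects = {o for pair in sub for o in pair}
    # Reflexive pairs plus the direct subclass facts.
    tsub = {(o, o) for o in objects} | set(sub)
    while True:
        # One more step along a direct subclass edge.
        derived = {(x, y) for (x, z) in sub for (z2, y) in tsub if z == z2}
        if derived <= tsub:   # fixpoint reached
            break
        tsub |= derived
    # An instance of a class is an instance of every superclass.
    tins = set(ins) | {(x, y) for (x, z) in ins for (z2, y) in tsub if z == z2}
    return tsub, tins, tsub | tins


SUB = {("gta", "grad_stud"), ("gta", "faculty")}
INS = {("joe", "grad_stud"), ("sally", "gta")}
tsub, tins, tisa = isa_closure(SUB, INS)
```

With these facts, `sally` is derived to be an instance of both `grad_stud` and `faculty`, while `joe` is not related to `faculty`.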

5 Avoiding Non-monotonic Reasoning

We must mention here that inheritance with overriding and exception handling are inherently non-monotonic concepts. To be able to capture overriding in a logic based language in particular, the knowledge required to manage exceptions must be encoded using some form of "negation". While most models use logical negation in various ways, and fall back to stratification, we do not. We exploit the syntactic properties of programs, such as locality of method clauses, to determine the inheritability of these clauses using static compilation. This is possible because we adopt a static overriding approach similar to many imperative languages such as C++ [22], Smalltalk [5], and object-oriented databases such as Gemstone [18], Orion [15], and O2 [17]. Although the inheritability function ∇m uses negative information, it is easy to see that the concepts of rule locality and inheritability of clauses help eliminate the need for stratification, and thus help improve the overall efficiency significantly. This use of negative information should not be confused with the logical negation used in other languages.
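The idea of determining inheritability statically from locality information can be sketched as follows. This is a deliberately simplified illustration and not the compiler described in the paper: it assumes single inheritance, no rejections and no signature or mode checks, and the function name `compile_inheritability` is hypothetical. The locality and is-a facts are a subset of those in Figure 6.

```python
def compile_inheritability(parent, local):
    """Precompute (method, arity, object, source) facts: the source is the
    nearest object on the is-a chain with a local definition of the method."""
    methods = {(m, k) for (_, m, k) in local}
    objects = set(parent) | set(parent.values())
    facts = set()
    for obj in objects:
        for (m, k) in methods:
            source = obj
            # Walk up the chain until a local definition overrides the rest.
            while source is not None and (source, m, k) not in local:
                source = parent.get(source)
            if source is not None:
                facts.add((m, k, obj, source))
    return facts


PARENT = {"joe": "grad_stud", "kelly": "grad_stud",
          "john": "faculty", "max": "faculty"}
LOCAL = {
    ("grad_stud", "stipend", 1), ("grad_stud", "income", 1),
    ("joe", "stipend", 1), ("kelly", "income", 1),
    ("faculty", "salary", 1), ("faculty", "income", 1),
    ("max", "salary", 1),
}
FACTS = compile_inheritability(PARENT, LOCAL)
```

Because the lookup uses only the locality facts and the is-a chain, it can run once at compile time, and the resulting facts can be stored in the reduced program in place of rules with negative literals.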

6 Datalog++ Interface to CORAL

We have developed a Datalog++ interface to the CORAL deductive database system under Unix. The interface is written in C++ and X, and provides a complete programming environment. Users always perceive their universe in Datalog++, while the interface reduces Datalog++ programs and queries to CORAL programs (an extension of Datalogneg). The CORAL program is then evaluated and the answers are returned to the users. We do not include a discussion of the architecture of the Datalog++ System in this paper for want of space. However, a complete discussion of the Datalog++ System, its architecture and implementation techniques may be found in [10].

7 Conclusion and Future Works

Our work can be extended in several different ways. It seems appropriate for some applications to have a choice to override methods dynamically or statically based on the need of the application. For some applications, a method may call for monotonic inheritance and not override at all. We think that giving such a choice to application designers will result in a flexible design environment. While Datalogneg and CORAL have built-in optimization mechanisms, specific Datalog++ optimization techniques may be possible that utilize knowledge specific to the object-oriented paradigm. Finally, update can be accommodated as an orthogonal feature, as it was addressed in [4]. These are some of the issues we plan to investigate in our future research.

Acknowledgement: We express our gratitude to the anonymous referees for their helpful comments. In particular, their editing suggestions helped reshape the presentation of the material contained in this paper. We also thank the referee who identified a few important typographical errors. Finally, the author would like to thank Mohammad Ashrafuzzaman of the University of Saskatchewan, Canada for reading the initial version of this paper and giving helpful comments to improve the presentation of the final version.

References

[1] S. Abiteboul, G. Lausen, H. Uphoff, and E. Waller. Methods and Rules. In ACM SIGMOD Conference on Management of Data, pages 32–41, 1993.
[2] A. Albano, G. Ghelli, and R. Orsini. A relationship mechanism for a strongly typed object oriented database programming language. In Proceedings of the 17th International Conference on Very Large Data Bases, pages 565–575, Barcelona, 1991.
[3] F. Belli, O. Jack, and L. Naish. Object-oriented programming in Prolog: Rationale and a case study. Technical Report 92/2, Department of Electrical and Electronics Engineering, University of Paderborn, Paderborn, Germany, 1992.
[4] E. Bertino and D. Montesi. Towards a logical object-oriented programming language for databases. In A. Pirotte, C. Delobel, and G. Gottlob, editors, Proc. of the 3rd Intl. Conf. on EDBT, pages 168–183. Springer-Verlag, 1992. LNCS 580.
[5] A. H. Borning and D. H. Ingalls. A type declaration and inference system for Smalltalk. In Proc. of the ACM Symposium on Principles of Programming Languages, pages 133–141, 1982.
[6] M. Bugliesi. A declarative view of inheritance in logic programming. In K. Apt, editor, Proc. Joint Int. Conference and Symposium on Logic Programming, pages 113–130. The MIT Press, 1992.
[7] M. Bugliesi and H. M. Jamil. A logic for encapsulation in object oriented languages. In M. Hermenegildo and J. Penjam, editors, Proceedings of the 6th International Symposium on Programming Language Implementation and Logic Programming (PLILP), pages 213–229, Madrid, Spain, 1994. Springer-Verlag. LNCS 844.
[8] M. Dalal and D. Gangopadhyay. OOLP: A translation approach to object-oriented logic programming. In Proceedings of the First DOOD Conference, pages 593–606, 1990.
[9] G. Dobbie and R. Topor. A Model for Sets and Multiple Inheritance in Deductive Object-Oriented Systems. In Proc. 3rd Intl. DOOD Conf., pages 473–488, December 1993.
[10] H. M. Jamil. Architecture and implementation of the Visual Datalog++ system. Technical report, Department of Computing, Macquarie University, Sydney, Australia, February 1997. Submitted for publication.
[11] H. M. Jamil. Implementing abstract objects with inheritance in Datalogneg. In Proceedings of the 23rd International Conference on Very Large Databases (VLDB), Athens, Greece, 1997. To appear.
[12] H. M. Jamil and L. V. S. Lakshmanan. ORLog: A Logic for Semantic Object-Oriented Models. In Proc. 1st Int. Conference on Knowledge and Information Management, pages 584–592, 1992.
[13] H. M. Jamil and L. V. S. Lakshmanan. A declarative semantics for behavioral inheritance and conflict resolution. In John Lloyd, editor, Proceedings of the 12th International Logic Programming Symposium, pages 130–144, Portland, Oregon, December 1995. MIT Press.
[14] M. Kifer, G. Lausen, and J. Wu. Logical Foundations for Object-Oriented and Frame-Based Languages. Journal of the Association of Computing Machinery, 42(3):741–843, July 1995.
[15] W. Kim. A model of queries for object-oriented databases. Technical Report ACA-ST-365-88, MCC, 1988.
[16] M. J. Lawley. A Prolog interpreter for F-logic. Unpublished Manuscript. Griffith University, Brisbane, Australia, 1993.
[17] C. Lecluse, P. Richard, and F. Velez. O2, An Object-Oriented Data Model. ACM Press, 1987.
[18] D. Maier and J. Stein. Development and implementation of object-oriented DBMS. In B. Shriver and P. Wegner, editors, Research Directions in Object-Oriented Programming, pages 355–392, Cambridge, MA, 1987. MIT Press.
[19] J. Peckham and F. Maryanski. Semantic data models. ACM Computing Surveys, 20(3):153–189, September 1988.
[20] Raghu Ramakrishnan, Divesh Srivastava, S. Sudarshan, and Praveen Seshadri. The CORAL deductive system. The VLDB Journal, Special Issue on Prototypes of Deductive Database Systems, 3(2):161–210, April 1994.
[21] J. Rumbaugh. Relations as semantic constructs in an object-oriented language. In OOPSLA, pages 466–481, 1987.
[22] B. Stroustrup. The C++ Programming Language. Addison-Wesley, 1986.
[23] J. H. You, S. Ghosh, L. Y. Yuan, and R. Goebel. An introspective framework for paraconsistent logic programs. In John Lloyd, editor, Proceedings of the 12th International Logic Programming Symposium, pages 384–398, Portland, Oregon, December 1995. MIT Press.
