Practical Approaches to Maintaining Referential Integrity in Multidatabase Systems

Debajyoti Mukhopadhyay and Gomer Thomas
Bellcore, 444 Hoes Lane, Piscataway, New Jersey 08854
(debm, gomer)@ctt.bellcore.com

0-8186-3710-2/93 $3.00 © 1993 IEEE

Abstract

An important problem in multidatabase systems is maintaining semantic integrity constraints among the multiple databases in the system. This paper investigates practical approaches to enforcing referential integrity, one of the most important classes of constraints. Referential integrity is important in its own right and also provides a good starting point for studying more general classes of constraints. The semantics of referential integrity are relatively well known and understood, yet they have much in common with more general classes of constraints. This paper considers what enforcement mechanisms can be used in the context of the transaction models or interaction models which are widely found in practice today: the queued message model, the remote procedure call model, the distributed transaction processing model, and the local transaction dialog model.

1. Introduction

An important problem in multidatabases is maintaining data consistency. There has been a good deal of research on consistency of replicated data[19][20] and on managing semantic integrity in centralized databases (referencing only some of the extensive literature in these areas). Some work has also appeared on specification of general data interdependencies, including semantic integrity constraints, in multidatabases, and on some aspects of implementation. However, the implementations typically assume that the data managers can be chosen or designed or modified as needed to support the proposed algorithms and protocols. These solutions are very useful to designers of data management products. Application developers, however, are often faced with the need to maintain semantic integrity constraints in a multidatabase environment which is based on existing standard commercial products, so that the developer must work within the limitations of the existing data managers. This paper investigates how to maintain an important class of semantic integrity constraints, the so-called referential integrity constraints, in the context of the transaction models or interaction models which are most commonly encountered in practice in distributed systems: the queued message model, the remote procedure call model, the distributed transaction processing model, and the local transaction dialog model.

Referential integrity is a frequently encountered type of semantic integrity constraint. It applies in situations where data entities contain references to other data entities, and it requires that the entities referenced must actually exist. It was originally defined as one of the key integrity features of the relational data model, but in fact it is equally meaningful and important in other data models.[15] (See the generic definition in the next section.) The motivation for studying referential integrity constraints goes beyond their intrinsic importance. Their semantics are relatively well studied and understood, so there exists a firm foundation for analysis. At the same time, they embody many key features of other types of semantic integrity constraints, so that solutions developed for them can be generalized.

2. Referential integrity

The specification of a semantic integrity constraint should include the invariant to be preserved and the action(s) to be taken to restore the invariant when an operation affecting it is executed.[5] We give the referential integrity definitions in a generic form, to emphasize that referential integrity is not limited to the relational data model. We assume that we have an entity set E1 in a database DB1, and an entity set E2 in a database DB2, such that each entity instance in E1 references some entity instance in E2. (Our primary interest is the case when DB1 and DB2 are different databases.) We assume this is implemented by having a foreign key attribute FK of E1 and a primary key attribute PK of E2 with the following invariant:

Referential Integrity Condition: For each entity instance of E1, the value of FK is equal to the value of PK for some entity instance of E2.
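The invariant above can be stated as a short executable check. The following Python sketch is purely illustrative (the function name and the modeling of entity sets as plain collections are invented, not from the paper); it also covers the relaxed NULL form discussed later in this section.

```python
# Illustrative check of the Referential Integrity Condition:
# every FK value in E1 must match some PK value in E2.
def ri_condition_holds(e1_fk_values, e2_pk_values, allow_null=False):
    """True iff every FK value in E1 matches some PK value in E2.

    allow_null implements the relaxed form in which FK may be NULL (None).
    """
    pk_set = set(e2_pk_values)
    for fk in e1_fk_values:
        if fk is None:
            if not allow_null:
                return False
            continue
        if fk not in pk_set:
            return False  # dangling reference: RI violated
    return True
```

For example, `ri_condition_holds([4], [1, 2])` is false because FK value 4 references no PK instance.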

Often a slightly relaxed form of the constraint is allowed, in which entities of E1 may explicitly reference a NULL entity, i.e., in which the value of FK may be NULL. Note that in an object-oriented system the primary key would normally be the object identifier, and the foreign key would be a pointer or other form of reference to it. Each entity instance of E1 references at most one entity instance of E2, but an instance of E2 may be referenced by multiple instances of E1. Moreover, although we are focusing on a single pair of entity sets with a referential integrity constraint between them, in practice there may be multiple foreign keys in one or more DBs referencing the same primary key.

We now define the possible action rules for how to pass from one valid state of the database to another as updates are performed. There are two situations to consider:

(1) When an attempt is made to create/modify FK (i.e., create an instance of E1 or modify the FK value of an instance of E1), the usual action rule is the so-called

Restrict Rule: If the update to FK would lead to a violation of the referential integrity condition, the update is disallowed (rolled back).

(2) When an attempt is made to delete/modify PK (i.e., delete an instance of E2 or modify the PK value of an instance of E2), there are several possible action rules:

Restrict Rule: If the update to PK would lead to a violation of the referential integrity condition, the update is disallowed (rolled back).

Cascade Rule: Delete/modify any FK entities in E1 referencing the deleted/modified PK entity in E2.

Set NULL Rule: Set to NULL any FK values in E1 referencing the deleted or modified PK entity in E2.

Set Default Rule: Set to a designated default value any FK values in E1 referencing the deleted or modified PK entity in E2.

There are actually two forms of the PK restrict rule. In the so-called hard restrict rule, a check is made ahead of time whether any of the PK entities to be deleted or modified have matching FK entities, and if so the update is disallowed. In the so-called soft restrict rule, the check is not performed until after the update and all its associated cascaded actions have been performed, and if the check fails then all the actions are rolled back. Since a single update operation may affect multiple PK entities, and since cascaded actions may affect still more PK and/or FK entities, the soft restrict rule may allow desired updates which would not be allowed by the hard restrict rule.

Restrict rules are fundamentally easier to manage than cascade or set NULL/default rules. On the other hand, they impose considerably more burden on applications, since an application wanting to delete or update an entity may have to explicitly request a whole sequence of associated deletes or updates in precisely the right order. We will often refer to the original update action as the primary action and the actions resulting from action rules (which may be retrievals or updates) as secondary actions. Note that in a distributed system with semi-autonomous subsystems, it is always possible for some secondary actions to fail because of system failures or other factors. In such a situation, it may be necessary for the primary action and other secondary actions to be rolled back, in order to preserve data consistency.
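The four PK-side action rules can be made concrete with a minimal single-database sketch (the function and data structures below are invented for illustration; the paper's setting is of course multidatabase):

```python
# Illustrative dispatch over the Section 2 action rules for a PK delete.
# e1: dict mapping an entity id to its FK value; e2: set of PK values.
def delete_pk(pk_value, e1, e2, rule, default=None):
    referencing = [k for k, fk in e1.items() if fk == pk_value]
    if rule == "restrict":
        if referencing:
            raise ValueError("restrict: referencing FK entities exist")
    elif rule == "cascade":
        for k in referencing:
            del e1[k]          # secondary action: delete referencing entities
    elif rule == "set_null":
        for k in referencing:
            e1[k] = None       # secondary action: null out the references
    elif rule == "set_default":
        for k in referencing:
            e1[k] = default    # secondary action: redirect to a default PK
    e2.discard(pk_value)       # primary action: delete the PK entity
```

In a multidatabase the primary action and the loop over `referencing` would run at different DBs, which is precisely where the atomicity and isolation problems analyzed later arise.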

3. Interaction models and replication strategies

In this section we describe the interaction models and replication strategies which we will consider.

3.1. Interaction models

There are fundamentally four transaction models or interaction models between components of a distributed system which are commonly supported by current commercial products:

Queued message model (QM): Each system has an "input message queue" and an "output message queue." Messages are sent from clients to servers in a fail-safe store-and-forward manner, requesting the execution of operations. Each request initiates a new transaction at the server. The incoming request message, the processing associated with the request, and any outgoing response to the request are all under the umbrella of a common transaction, which is committed unilaterally by the server. In the event of a crash in the midst of processing a request, the recovery process not only rolls back any partial processing and any response on the output queue associated with the request, it also restores the request message to the input message queue so that it is ready to be processed again. Once the request message is sent, it will eventually get processed and the response returned, even in the event of multiple crashes by client and/or server. Typically no context is maintained at the server between messages.

Remote procedure call model (RPC): Messages are sent from client to server requesting the execution of operations. Each request initiates a new transaction at the server. The server commits transactions unilaterally, with no commit coordination between client and server. If the server crashes after the request is received but before the operation is executed and the response sent, it is up to the client to discover that fact and take appropriate recovery action. The server will typically have no memory after recovery that the request was ever received. No transaction semantics is maintained between client and server.

Distributed transaction processing model (DTP): Clients engage in dialogs with servers. Collections of operations at a server are executed as a subtransaction of a distributed transaction. Whenever a transaction is to be committed, each server involved in the transaction must enter into a distributed commit protocol (e.g., two-phase commit) with the client and the other servers. Full transaction semantics are preserved among all participating parties.

Local transactions dialog model (LTD): Clients engage in dialogs with servers. Sequences of operations at a server (within a dialog) are executed as stand-alone transactions, not part of a distributed transaction. Typically the client will tell the server when work should be committed, and the server will then attempt to commit its work without any further coordination with the client or with other servers.

The QM model is widely used by older transaction processing systems, a great many of which are still in heavy use in industry. It is still the model of choice even for new applications in certain situations. The RPC model is not widely used for database applications, although the mechanisms to support it are widely available. The DTP model is not yet widely used, but an increasing number of vendor products are supporting it, and for certain types of applications it is the model of choice. The LTD model is widely supported by vendors of relational database management systems and is relatively widely used. The ISO RDA (Remote Database Access) standard recognizes both the LTD and the DTP models (called the "RDA Basic Applications Context" and the "RDA TP Applications Context" in the standard).
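The QM model's recovery guarantee, that dequeueing the request, processing it, and enqueueing the response form one all-or-nothing unit, can be sketched as a toy in-process model. This is only an illustration under stated assumptions: in-memory deques stand in for persistent queues, and an exception stands in for a server crash.

```python
from collections import deque

def serve_one(input_q, output_q, handler, crash=False):
    """Process one queued request; dequeue, work, and response enqueue
    succeed or fail together, as in the QM model's common transaction."""
    request = input_q.popleft()
    try:
        if crash:
            raise RuntimeError("server crash mid-transaction")
        response = handler(request)
        output_q.append(response)      # same "transaction" as the dequeue
    except Exception:
        input_q.appendleft(request)    # recovery: restore the request message
        raise
```

After a simulated crash the request is back on the input queue, ready to be processed again, which is the behavior the QM recovery process provides.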

3.2. Replication strategies

In analyzing how to minimize the communications overhead and other problems associated with enforcement of referential integrity, there are several replication strategies to be considered:

No replication
PK replication
FK replication

In the no replication case, no data values are replicated. In the PK replication case, a record of all the PK entities in entity set E2 is kept in DB1 for use in checking the referential integrity condition for FK creates/modifies. (See Figure 1(a).) In the FK replication case, a record of all the FK instances in entity set E1 is kept in DB2 for use in checking the referential integrity condition for PK deletes/modifies. (See Figure 1(b).)

Figure 1. Replication strategies. (a) PK replication: DB1 keeps a local copy of PK from DB2. (b) FK replication: DB2 keeps a local copy of FK from DB1.

PK replication makes it efficient to create/update FK instances, since the checking for the existence of the referenced PK instance can be done locally. However, creation of a new PK instance must be propagated to the replicated copy, and deletion/update of a PK instance requires propagation to the replicated copy, as well as any other checking, cascading, etc. Similarly, FK replication makes it efficient to delete/update PK instances. However, deletion of an FK instance must be propagated to the replicated copy, and creation/update of an FK instance requires propagation to the replicated copy as well as checking.

A mixture of replication approaches may be used. For example, DB2 may keep replicated copies of some foreign keys referencing the same primary key, but no replicated copies of others. Some of the DBs containing foreign keys may keep replicated copies of the primary key; others may not.

4. Referential integrity under QM

The QM model has several features which pose problems for inter-DB referential integrity:

Transaction fragmentation: Any operation at one DB which requires communications with other DBs must take place as a sequence of transactions, since each incoming message initiates a new transaction.

Lack of distributed atomicity (commit protocol): Since each DB is unilaterally committing its transactions, it is possible for some of the actions of a distributed operation to succeed at some DBs while other associated actions fail at others.

Lack of distributed isolation (concurrency control): As transactions are committed at individual DBs, their locks are dropped. Thus partial effects of uncompleted distributed operations are visible to other operations.

The problems which result from these features include:

Difficulties with "soft" restrict rules: With no distributed atomicity it is not feasible to post-check at other DBs after performing actions at one DB, as needed for the "soft" restrict rule. Communicating with other DBs to check the invariant requires committing the performed work. If the check fails, it is too late to roll back. Therefore we assume in the queued message context that restrict rules are "hard" restrict rules.

Difficulties with "hard" restrict rules: The lack of distributed isolation means that even when pre-checking at other DBs, as needed to enforce the "hard" restrict rule, other transactions may change the state of the other DBs between the time of the check and the time of the operation, since no locks are being held by the data manager at the other DBs. This is a major consideration in the analysis which follows.

Difficulties with cascade and set NULL/default rules: The lack of distributed atomicity means that if some secondary actions fail, it is too late to roll back the primary action or any previous secondary actions. They have already been committed. This is another major consideration in the analysis which follows.
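The hard/soft distinction being given up here can be shown with a single-database toy. The sketch below is illustrative only (a self-referencing "manager" foreign key, invented names): a set-oriented delete of both rows succeeds under the soft rule but is disallowed by the hard rule, because the pre-check sees a reference that the delete itself would remove.

```python
def hard_restrict_delete(rows, keys):
    """rows: pk -> fk (a reference to another pk, or None).
    Hard restrict: pre-check for matching FK entities, then delete."""
    if any(fk in keys for fk in rows.values()):
        return False                      # some FK matches a PK being deleted
    for k in keys:
        del rows[k]
    return True

def soft_restrict_delete(rows, keys):
    """Soft restrict: delete first, post-check, roll back on failure."""
    saved = dict(rows)
    for k in keys:
        del rows[k]
    if any(fk is not None and fk not in rows for fk in rows.values()):
        rows.clear()
        rows.update(saved)                # roll back everything
        return False
    return True
```

Under QM the post-check of the soft rule would have to run after the delete has already committed, which is exactly why only the hard form is assumed above.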

4.1. The no replication case

If no replicated data is being maintained between DBs, then the execution of any referential integrity action rule requires on-line communications between DBs. Unfortunately, a straightforward implementation of the action rules can lead to data inconsistencies, in some situations because of a lack of distributed isolation and in some cases because of a lack of distributed atomicity. For example, if the restrict rule is in effect for both "create FK" and "delete PK", the sequences of actions at DB1 and DB2 shown in Figure 2 can occur (where each box represents a separate local transaction). Assume the instance of E1 being created references the instance of E2 being deleted. Both of the check steps would succeed, since neither the create nor the delete has been executed at the time the checks are made, so both the create and the delete would take place. This would result in a dangling reference, a violation of referential integrity.

Figure 2. Inconsistency scenario under QM with no replication. DB1 receives a request to create an instance of E1 and checks for the existence of the referenced value of PK; concurrently, DB2 receives a request to delete an instance of E2 and checks for the nonexistence of referencing FK values. Both checks succeed, then the create of the E1 instance and the delete of the E2 instance both proceed.

An important point is that this problem does not arise from a lack of atomicity, i.e., from some actions succeeding while others fail. It arises from a lack of isolation, i.e., from allowing these two conflicting operations to interleave with each other instead of following one after the other as would be required by a concurrency control protocol.

There is no truly satisfactory general solution to this problem. There are some possible solutions which may work in special circumstances, depending on application characteristics:

Administrative separation: It may be possible to ensure through administrative policies and procedures that PK deletes/modifies and FK creates/modifies never take place concurrently. For example, it may be that FK updates have to be applied during the normal working shift, but all PK updates can be deferred to overnight batch jobs.

Compensating transactions: If compensating transactions are available for the create FK and delete PK operations, then post-checking of the referential integrity condition can be used instead of pre-checking. This would avoid the problem above. However, one can then get a situation where both the create FK and the delete PK operation are rolled back (compensated), when in fact only one of them needs to be rolled back to preserve integrity. Also, the temporary inconsistencies which exist between the time an action is committed and the time it is compensated may not be acceptable for some applications.

Application level locking: To delete a PK entity, first mark it for deletion with an application level tag, then check for non-existence of referencing FK entities, then delete the PK entity (or just unmark it if the check is unsuccessful). To create an FK entity, first create a version marked "pending" with an application level tag, then check for existence of the referenced PK entity, then remove the tag (or delete the pending version if the check is unsuccessful). When checking for non-existence of referencing FK entities, marked entities are counted as existing. When checking for existence of a referenced PK entity, marked entities are counted as not existing. This solves the concurrency control problem. However, the problem arises of how to handle requests from other clients to access a marked entity. In theory one would just like to block such requests until the status of the marked entity is resolved. However, this could turn into a very long wait. Moreover, in a QM system there is no good mechanism to achieve such blocking, especially if one wants to allow the request to time out and return with a "not available" error code. The alternative is to allow requests to access marked entities, but to return an indication of the marked status. This is usually unsatisfactory, since it requires other applications to take account of an attribute which is nothing more than an artifact of the concurrency control mechanism.

If cascade or set NULL/default action rules are in effect, then the lack of a distributed commit protocol can lead to data inconsistencies because of some secondary actions failing after the primary action and perhaps some of the other secondary actions have already committed. Again, there is no truly satisfactory solution to this problem. Some possible solutions which may work in special circumstances include:

Avoidance: Avoid cascade and set NULL/default action rules. As noted earlier, this puts a greater burden on application programmers.

Guarantee constraint "safety": Take great care in the definition of all semantic integrity constraints to make sure that secondary actions will ultimately not fail (although they may be delayed sometimes because of system failures). This is usually not possible.

Compensating transactions: This solution is discussed above.

Moreover, even if all cascade and set NULL/default actions succeed, there will still be transient inconsistencies between the time that the primary action is executed and the time all associated secondary actions are executed.
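The application level locking discipline described in this section can be sketched as follows. This toy collapses both databases into one object purely to show the tag rules (marked PK entities count as not existing, pending FK entities count as existing); the class and method names are invented for illustration.

```python
class Db:
    """Illustrative application-level tagging for RI under QM."""

    def __init__(self, pks):
        self.pk = {v: "live" for v in pks}   # PK value -> "live" | "marked"
        self.fk = {}                          # FK id -> (value, state)

    def delete_pk(self, v):
        self.pk[v] = "marked"                 # step 1: mark for deletion
        # step 2: check; pending FK entities count as existing
        if any(val == v for val, _state in self.fk.values()):
            self.pk[v] = "live"               # unmark: delete disallowed
            return False
        del self.pk[v]                        # step 3: perform the delete
        return True

    def create_fk(self, fk_id, v):
        self.fk[fk_id] = (v, "pending")       # step 1: pending version
        # step 2: check; marked PK entities count as not existing
        if self.pk.get(v) != "live":
            del self.fk[fk_id]                # delete the pending version
            return False
        self.fk[fk_id] = (v, "live")          # step 3: remove the tag
        return True
```

In the real QM setting each step is a separate committed transaction at a different DB; the sketch shows only the tag logic, not the messaging.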

4.2. The PK replication case

The differences between the handling of updates in this case and the "no replication" case are:

Create/update FK: The local copy of PK at DB1 can be used to check for the existence of the referenced value, rather than having to communicate with DB2.

Create PK: The create actions must be applied to the local copy of PK at DB1, as well as to the original copy at DB2.

Delete/update PK: The delete/update actions must be applied to the local copy of PK at DB1, in addition to applying them to the original copy at DB2 and carrying out associated secondary actions.

Concurrency control problems can be avoided in this case by the following steps:

Delete/update PK with restrict rule: When DB2 checks for the non-existence of referencing FK entities at DB1, DB2 must notify DB1 of the pending delete/update actions. If the check is successful (no referencing FK values exist), then the pending delete/update actions are marked on the local PK copy at DB1. DB2 must later notify DB1 whether or not the PK delete/update actions succeeded, so that the pending actions can be applied or cancelled.

Delete/update PK with cascade/set NULL/default rule: The delete/update actions must be propagated to DB1, and DB1 must carry out the primary actions on the local PK copy and the secondary actions on FK as an atomic action.

Note that this essentially uses an application level lock. However, it is less objectionable in this case than in the "no replication" case because it is applied only to a local copy which is not accessible to clients. Thus, PK replication not only increases the efficiency of handling FK creates/updates, it also provides a solution to concurrency control problems.

Atomicity problems are not so easy to avoid. If there are any cascade or set NULL/default action rules, then data inconsistencies can arise because of the lack of distributed commit protocols, just as in the "no replication" case. The same approaches can be used to prevent them, with the same limitations. Transient inconsistencies can arise as well, because of the delays between the primary action and associated secondary actions.

4.3. The FK replication case

At first sight it appears that updates in this case can be handled in much the same way as in the "PK replication" case. However, the possible presence of cascade rules can create additional complexity. Consider the scenario below, in which the "cascade delete" rule is in effect, and a "create FK" operation is executed concurrently with a "delete PK" operation on the referenced PK entity. The actions appear in chronological order. C1, C2, C3, C4 are the steps of the create operation; D1, D2, D3 are the steps of the delete operation.

1. (C1) Receive request at DB1 to create FK instance.
2. (C2) Check for existence at DB2 of referenced value of PK. Create local "pending" instance of FK at DB2.
3. (D1) Receive request at DB2 to delete PK instance. Delete PK instance at DB2. Mark local "pending" copy of matching FK entity at DB2 for deletion (cascading).
4. (D2) Delete matching FK entity at DB1 (cascading, but the FK entity doesn't exist yet!).
5. (D3) Delete local "pending" copy of matching FK entity at DB2.
6. (C3) Create FK entity at DB1.
7. (C4) Remove the "pending" tag from the local copy of the FK entity at DB2.

The first problem here is that the cascaded delete of the FK entity may reach DB1 before the actual creation of the entity happens. If that invalid delete action is just discarded, then the final step will attempt to remove a "pending" tag from a local copy which no longer exists. If that invalid action is just discarded, the final result will be a violation of referential integrity. To avoid this, additional measures would have to be taken, such as DB1 saving the cascaded delete action on the FK entity (D2) and using it to cancel the create action (C3) when it arrived later. Even if these concurrency control problems are resolved, there will still be the same atomicity problem and the same transient inconsistency problems as in the "no replication" and "PK replication" cases.
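The counter-measure suggested above, saving an early-arriving cascaded delete so it can cancel the create when it arrives, can be sketched with a tombstone set. This is an illustrative sketch only; the class and names are invented, and it models just the FK store at DB1.

```python
class FkStore:
    """Illustrative FK store that remembers cascaded deletes which arrive
    before the entity they target exists (a 'tombstone')."""

    def __init__(self):
        self.entities = {}
        self.tombstones = set()

    def cascaded_delete(self, key):
        if key in self.entities:
            del self.entities[key]
        else:
            self.tombstones.add(key)      # delete arrived before the create

    def create(self, key, fk_value):
        if key in self.tombstones:
            self.tombstones.discard(key)  # cancel the create instead
            return False
        self.entities[key] = fk_value
        return True
```

With this in place, step D2 of the scenario leaves a tombstone, and step C3 is cancelled rather than creating a dangling reference.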

4.4. The mixed replication case

It is possible to use PK replication for some foreign keys, FK replication for others, and no replication for still others. No additional complications are introduced beyond the complications inherent in each of the replication strategies themselves.

5. Referential integrity under RPC

We assume here a client-server RPC model in which the client maintains local context between RPC calls, but the server does not. Thus each call to the server is a separate local transaction at the server, but there may be a single local transaction at the client spanning multiple calls to one or several servers.

Under this model, restrict rules can usually be correctly handled in practice. The primary action can be performed (but not committed), then checks for existence or non-existence of referenced entities can be performed, then the primary action can be committed or aborted as required. Since the locks on the primary action are being held throughout the time the checks are being performed, this will work as long as the local data managers all have the property that the order in which conflicting transactions logically serialize is the same as the order in which their operations occur chronologically, which is true for strict two-phase locking, used by nearly all commercial data managers. (Strict two-phase locking means that all locks acquired by a transaction are held until the transaction commits.)

However, deadlock can be a problem. For example, concurrent attempts to create an FK entity and delete the PK entity to be referenced by it could lead to the following situation: The FK entity is created (but not yet committed) at DB1. The PK entity is deleted (but not yet committed) at DB2. Then the check is initiated at DB1 for the non-existence of an FK entity referencing the PK value, but it is blocked by the lock on the newly created FK entity. Similarly, the check is initiated at DB2 for the existence of a PK entity referenced by the FK value, but it is blocked by the lock on the newly deleted PK entity. The two operations are now deadlocked. Neither can commit until its check is completed, and neither check can complete until the other commits. Many methods have been proposed for distributed deadlock prevention or detection. Unfortunately, nearly all require support from the local transaction managers, and very few transaction managers provide this support. In practice it is often necessary to use application level timeouts to detect deadlocks.

The atomicity problems of the QM model are also present in the RPC model, since each server unilaterally commits work when it is completed. The same approaches needed in the QM model would be required here to alleviate the problem.

6. Referential integrity under DTP

We assume here that we have a distributed transaction manager which guarantees both atomicity and serializability of distributed transactions. In this situation it is clear that for all replication strategies (no replication, PK replication, FK replication, mixed replication) the referential integrity action rules can be implemented in a straightforward way as distributed transactions, and data consistency will be achieved. However, there are some very important points to keep in mind:

Both atomicity and serializability are required. If the local data managers use strict two-phase locking for concurrency control, then both atomicity and serializability are guaranteed by a two-phase commit protocol, but it is essential that even read-only subtransactions hold their locks until the commit point.

Distributed deadlocks may arise. For example, consider the check-check, create-delete scenario of Figure 2. Let T1 be the distributed transaction required to create an instance of E1, and let T2 be the distributed transaction required to delete an instance of E2. Then T1 has a read-only subtransaction R1 at DB2 (checking for the existence of the referenced PK), and T2 has a read-only subtransaction R2 at DB1 (checking for the nonexistence of a referencing FK). It might be assumed that since these subtransactions do not modify the database, they do not need to hold their locks until the commit points of their respective distributed transactions. However, if these read-only subtransactions are allowed to drop their locks early, the same kind of inconsistency can arise as with the QM model.

Note moreover that the lock placed by the read-only subtransaction of T2 must be a predicate lock, not just a data item lock.[7] (A predicate lock not only locks the data items accessed by a query, it also ensures that no other transaction can create a data item which would have been accessed by the query had it existed at the time the lock was created.) The subtransaction R2 of T2 issues a query at DB1 to retrieve instances of E1 containing a particular FK value. No instances are returned. T2 then proceeds to delete the desired instance of E2 at DB2. Until T2 commits, no other transaction can be allowed to create an instance of E1 at DB1 which would have satisfied the query. If all locks are held to the commit point, then the scenario just described can lead to a deadlock.

7. Referential integrity under LTD

We assume here a client-server model in which a client carries on a conversation or dialog with one or more servers. At any time the client can choose to commit its own local transaction, and at any time it can request that some or all of the servers commit or roll back their local transactions. A server will not commit its local transaction unless it is requested to do so. It may or may not be able to commit when it is requested to do so. It will always roll back any uncommitted work if the dialog is aborted.

In this model three types of problems arise:

There is no distributed atomicity, so there is the possibility of inconsistencies arising from the fact that some secondary actions may succeed while others may fail. This is only a problem if cascade or set NULL/default action rules are in effect. Moreover, the window of vulnerability can be minimized by waiting until all secondary actions have been completed before requesting that any be committed.

If the local data managers use strict two-phase locking for concurrency control, then the strategy of waiting until all secondary actions have been completed before requesting any actions to be committed ensures distributed serializability.

Deadlocks may arise, just as with the DTP model.

Thus, if there is no need for cascade or set NULL/default referential action rules, this model allows straightforward implementation of referential integrity checking. The biggest problem is deadlock detection and resolution, which is essentially the same problem one would have in the DTP model. If cascade or set NULL/default rules are needed, the window of vulnerability to inconsistency caused by a failure can be made very small, but cannot be eliminated entirely.

8. Conclusions

This paper investigates mechanisms for, and limitations on, enforcement of referential integrity constraints in the context of the queued message, remote procedure call, distributed transaction processing, and local transactions dialog models. The point is not to advocate one interaction model over another. This choice is usually dictated by numerous considerations, of which referential integrity maintenance is only one. The point is to help understand what can be done and how to do it in the context of the model being used and the typical commercial products embodying that model.

In the queued message, remote procedure call and local transactions dialog models the lack of distributed atomicity (the lack of a distributed commit protocol) means that inconsistencies can arise if any cascade or set NULL/default action rules are in effect, although the window of vulnerability can be made fairly small in the local transactions dialog model. Thus in these models those referential action rules should be avoided, with a consequent burden on applications programmers to carry out operations on related entities in the correct order and to take corrective actions when anything fails. Even if only restrict rules are used, consistency can only be ensured in the queued message model by administrative separation of FK creates/modifies from PK deletes/modifies or by application level locking. Application level locking is really only satisfactory when PK or FK replication is used, so the locking can be done on a local replicated copy which is not visible to users.

In the remote procedure call, distributed transaction processing and local transactions dialog models concurrency control is a concern, but problems can be avoided when the local data managers are using a strict two-phase locking protocol for their local concurrency control. However, deadlock is a potential problem, for which there is no completely satisfactory remedy. It is not clear how serious a problem it is in practice.

9.

Acknowledgements

This paper grew out of work on a “data layer engineering” project, the objective of which was to develop practical guidelines for the design and implementation of “Data Layer Building Blocks” under the Bellcore OSCA™ architecture.[17][18] The authors gratefully acknowledge the many contributions of the other members of the project team: Bruce M. Horowitz, Haim Kilov, John A. Mills, and Hassan Srinidhi. The extensive expertise of Bruce Horowitz in the area of referential integrity was especially valuable.

9. References

[1] Y. Breitbart, A. Silberschatz: “Multidatabase Systems with a Decentralized Concurrency Control Scheme,” IEEE Distributed Processing TC Newsletter 10:2, Nov 1988.
[2] S. Ceri, J. Widom: “Deriving Production Rules for Constraint Maintenance,” Proc. 16th Int. Conf. on Very Large Data Bases, 1990.
[3] E.F. Codd: “Extending the Database Relational Model to Capture More Meaning,” ACM Trans. on Database Sys. 4:4, Dec 1979.
[4] C.J. Date: “Referential Integrity,” Proc. 7th Int. Conf. on Very Large Data Bases, Sep 1981.
[5] C.J. Date: An Introduction to Database Systems, vol. II, Addison-Wesley, 1983.
[6] A.K. Elmagarmid: “A Survey of Distributed Deadlock Detection Algorithms,” ACM SIGMOD Record 15:3, Sep 1986.
[7] K.P. Eswaran, J.N. Gray, R.A. Lorie, I.L. Traiger: “The Notions of Consistency and Predicate Locks in a Database System,” Communications of the ACM 19:11, Nov 1976.
[8] H. Garcia-Molina, D. Barbara: “How to Assign Votes in a Distributed System,” Journal of the ACM 32:4, Oct 1985.
[9] N. Gehani, H.V. Jagadish: “Ode as an Active Database: Constraints and Triggers,” Proc. 17th Int. Conf. on Very Large Data Bases, Sep 1991.
[10] D.K. Gifford: “Weighted Voting for Replicated Data,” Proc. 7th Symp. on Operating Sys. Prin., 1979.
[11] B.M. Horowitz: “A Run-Time Execution Model for Referential Integrity Maintenance,” Proc. 8th Int. Conf. on Data Eng., 1992.
[12] ISO/IEC JTC1/SC21/WG3 (Database): “Information Technology - Open Systems Interconnection - Remote Database Access - Part 1: Generic Model, Service and Protocol,” ISO/IEC IS 9579-1, Jan 1993.
[13] ISO/IEC JTC1/SC21/WG3 (Database): “Information Technology - Open Systems Interconnection - Remote Database Access - Part 2: SQL Specialization,” ISO/IEC IS 9579-2, Jan 1993.
[14] Q. Li, D. McLeod: “Managing Interdependencies among Objects in Federated Databases,” Proc. IFIP DS-5 Workshop (Semantics of Interoperable Database Systems), Nov 1992.
[15] V.M. Markowitz: “Referential Integrity Revisited: An Object-Oriented Perspective,” Proc. 16th Int. Conf. on Very Large Data Bases, 1990.
[16] D.R. McCarthy, U. Dayal: “The Architecture of an Active Data Base Management System,” Proc. 1989 ACM SIGMOD Int. Conf. on Management of Data, Jun 1989.
[17] J.A. Mills, L. Ruston: “The OSCA™ Architecture: Enabling Independent Product Software Maintenance,” Proc. EUROMICRO ’90 Workshop on Real Time, Jun 1990.
[18] J.A. Mills: “An OSCA™ Architecture Characterization of Network Functionality and Data,” J. Sys. Integration, Jul 1991.
[19] J. Paris, D. Long: “Efficient Dynamic Voting Algorithms,” Proc. 4th Int. Conf. on Data Eng., 1988.
[20] S. Rangarajan, S. Setia, S.K. Tripathi: “A Fault-Tolerant Algorithm for Replicated Data Management,” Proc. 8th Int. Conf. on Data Eng., 1992.
[21] M. Rusinkiewicz, A. Sheth, G. Karabatis: “Specifying Interdatabase Dependencies in a Multidatabase Environment,” IEEE Computer 24:12, Dec 1991.
[22] A. Sheth, M. Rusinkiewicz, G. Karabatis: “Using Polytransactions to Manage Interdependent Data,” Database Transaction Models for Advanced Applications (A. Elmagarmid, ed.), Morgan Kaufmann Pub., 1992.
[23] M. Stonebraker, E.N. Hanson, C.-H. Hong: “The Design of the POSTGRES Rules System,” Proc. 3rd Int. Conf. on Data Eng., 1987.
[24] S.D. Urban, A.P. Karadimce, R.B. Nannapaneni: “The Implementation and Evaluation of Integrity Maintenance Rules in an Object-Oriented Database,” Proc. 8th Int. Conf. on Data Eng., 1992.
[25] G. Wiederhold, X. Qian: “Modeling Asynchrony in Distributed Databases,” Proc. 3rd Int. Conf. on Data Eng., 1987.

OSCA is a trademark of Bell Communications Research, Inc. (Bellcore).
