DATABASE SYSTEMS OR DATABASE THEORY - OR “WHY DON’T YOU TEACH ORACLE”

M.J. Ridley
Dept of Computing, School of Informatics, University of Bradford
http://www.inf.brad.ac.uk/~mick/

ABSTRACT

In this paper the role of theory in database teaching is reviewed, particularly in light of recent developments in the SQL standard and the facilities available in modern DBMSs.

Keywords
Relational Database, Object-Relational Database, Object Database, XML Database, Database Theory

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Teaching, Learning and Assessment in Databases, Coventry 2003 © 2003 LTSN Centre for Information and Computer Sciences

1. INTRODUCTION

“Practice without Theory is blind, Theory without practice is sterile” Karl Marx [1]

or if you prefer

“Theory without Practice is Idle, Practice without Theory is Blind” Ancient Chinese saying [2]

There is a generally perceived relationship between theory and practice that I hope can be applied to the area of database studies in a form that most people would agree with, i.e. that we need a theoretical framework in which to talk about databases if the result is to be productive. That is, not a collection of facts which only apply to one system but theorems, rules and heuristics that can be of more general use over a number of situations and a period of time.

The modern age of databases may have started with relational databases which, despite their current ubiquitous nature, we should remember were looked on as of only academic interest at one time. What surely marks this era out is the sound theoretical basis of that database model and the progress that was made because of this. This is as opposed to the network and hierarchical systems that preceded it, whose “models” we only talk about with hindsight. But what age are we in now? Many of us may teach courses on “Database Systems”, even using texts called “…Database Systems...”. I do not think that in itself matters; referring back to the initial quote, we can be teaching the practical implementation of a theory (or implementations of theories). But is there still a firm theoretical foundation for what we teach? Does that matter? Are we in danger of training students in the features of a particular system (and a convenient web interface for it) rather than educating them about databases in a broader sense? What is the relationship of Database teaching to other aspects of the curriculum such as the study of data structures and object orientation? And are there theories of those subjects?

This paper will argue that we are in danger of losing sight of firm theoretical ground, especially as we are under pressure to include yet more additional facilities of DBMSs. This has become more so as object systems have failed to achieve commercial success commensurate with their early promise and the pressure of the web has shifted the focus of much database work.

The questions that seem to me to need answering are: What is that framework? Is it the simple relational model? How important is that framework? (In practical, educational terms, how much of the syllabus is this?) Are there competing theoretical frameworks that need considering?

2. DATABASE PARADIGMS

In Kuhn’s [3] terms relational theory is the “normal” science of databases and is the dominant paradigm. This follows the overthrow of the earlier network and hierarchical databases. This is not to say that the relational model is not and has not been under threat. This has been the case throughout the 1990s following:

• The Object-Oriented Database System Manifesto’s proposal of an alternative paradigm [4].

• The Third-Generation Database System Manifesto [5], which, although distinguishing itself from relational systems as the second generation, does not seem to offer such a clear change of paradigm and, it could be claimed, focuses more on practice than theory.

• and The Third Manifesto [6], which lays claim to a restatement of the correctness of the relational paradigm and places more emphasis on the problems of the existing practice built on that theory.

When we are so used to relational systems being the “normal” database it is hard to step back and recognise how long and hard it was for them to move from theory to practice. In a magazine interview with Chris Date [7], quoted by Joe Celko [8], reference is made to the “13 long years between Codd’s first external publication on the model and … the release of DB2 in 1983”. A key factor according to Date was “There was the technical problem of optimization to solve. The relational folks always believed it was solvable, but nobody had done it.” This seems to me to have very strong parallels with comments made by David Maier in a SIGMOD Record interview [9]. In response to questions about the results of object database research and whether object databases failed to be the “next big thing”, Maier comments “...we finally figured out after ten years how to optimise OQL…”. He also comments with regard to object-relational databases that “there’s not a good design theory the way there was [with] normal form relations and so forth for relational databases.” and “… [the new features] at least [double] the number of [design] choices you have in every instance”. Maier also feels that the work on object technology “seems very usable for XML [databases].” This comment seems to fit with comments made by Don Chamberlin, one of SQL’s original authors, who acknowledges the influence of OQL on XQuery, the XML Query language [10]. This seems to support the position that we are in the later days of relational theory as the dominant paradigm, which is still being challenged: in a gradualist way by object-relational systems (if not an object-relational theory) and in a more profound way by object-based alternatives.

The growth of complexity and range of topics covered is reflected in the growth in size of database textbooks. Date has grown from 638 to 934 pages from 4th [11] to 7th [12] editions, Connolly and Begg (and Strachan) from 838 to 1236 pages from 1st [13] to 3rd [14] editions. Similarly, the size of the SQL standard itself has increased massively with SQL:1999 [15].

3. THE WEB CHANGES EVERYTHING

This section heading from the Asilomar Report on Database Research [16] emphasises the positive side and appropriately focuses on research opportunities, but the web has changed everything for database use and database teaching too. The Web has made many more people database users and heightened awareness of databases generally. In the database syllabus it has helped by:

• giving much more emphasis to embedding SQL in applications,

• giving a context to raise the importance of areas like transactions,

• making issues like security and user identity and rights more “real” than when they are taught in the abstract.

It raises the issues of the semantic mismatch between the set oriented nature of SQL and most programming languages, but could also be seen as leading to a lot of work on JDBC as a relational-object “fix”. Does the importance of the Web lead directly to XML? Yes: “Theorem: the WEB changes everything. Corollary: XML is the means” [17], in the opinion of one leading database researcher, one of the authors of the Asilomar Report. But where then is our dominant relational paradigm left?

4. EDUCATIONAL CONSEQUENCES

I feel we are often under pressure to “teach” Oracle. Many students see Computing as an essentially practical subject and are increasingly market driven. It is not just about Oracle; I find an increasing number of students who want to use MySQL and PHP on final year projects. This is where, without an understanding of the theory, they treat material on MySQL-PHP (which is extensive) like a recipe book and cannot see how to transfer it to a different database or different situation. They often realise that they have normalisation problems late on and do not know how to solve them. The expansion of material in database textbooks includes much good material, but is it part of the core of what students must study? If there is limited space in the undergraduate syllabus to be devoted to database topics then it is crucial that the core theoretical topics get covered adequately and are not squeezed by the additions. This is true whether these are additions to the theory, such as consideration of object databases as well as relational databases, or additions in terms of applications such as web connectivity. If relational theory is only a small part of a large textbook, will it be perceived as a small topic by students even if we try to give it more prominence? If these additional subjects are worth studying then they deserve their own space. This is not easy; the computing syllabus is always under pressure to include more topics. Our solution has been to try to leave topics beyond basic relational material out of our core Database module and only
cover them (in options) in final year modules on Deploying Web Technologies and Advanced Database Systems. There still remain problems of how advanced features can be taught. There is a lack of clear theory and a lack of standardisation to tackle. Some features of SQL 92 (e.g. information schema) are not widely implemented, let alone a lot of SQL:1999. This means that it is harder to focus on the theoretical concepts. Here I can pick on PostgreSQL rather than Oracle: with inheritance there is a need to cover two forms in SQL:1999 and an implementation of only one form, but with a different syntax.

How does the database syllabus relate to the rest of the computing syllabus as a whole? With modularisation there is a strong tendency to put topics in unrelated compartments. We have considered trying to work against that by locating Databases as part of a wider Data Modelling stream encompassing other topics like Data Structures and Algorithms. This seems possible but also leads to the question of whether there is a more general object theory that informs programming and software development generally. If so, can relational theory be located as a subset of that mainstream, or does database stand out as an exception?

5. REFERENCES

[1] Marx, K. Letter to F. A. Sorge. Marx Engels Collected Works, Vol. 477 p. 531-32.

[2] http://www.tqj.de/English/02-2/theory.html

[3] Kuhn, T. S. The Structure of Scientific Revolutions. 3rd ed. Chicago, The University of Chicago Press, 1996.

[4] Atkinson, M. et al. The Object-Oriented Database Manifesto. In Deductive and Object Oriented Databases. Elsevier, 1990. p. 223-239.

[5] The Committee for Advanced DBMS Function. The Third-Generation Database Manifesto. SIGMOD Record, Vol. 19, No. 3, September 1990. p. 31-44.

[6] Darwen, H. and Date, C.J. The Third Manifesto. SIGMOD Record, Vol. 24, No. 1, March 1995. p. 39-49. http://www.acm.org/sigmod/record/issues/9503/

[7] Date, C.J. A Chat with the Great Explainer. DBMS, Sept. 1989. p. 26-29.

[8] Celko, J. Joe Celko’s Data and Databases: Concepts in Practice. San Francisco, Morgan Kaufmann, 1999. p. 124-126.

[9] Winslett, M. David Maier Speaks Out. SIGMOD Record, Vol. 31, No. 4. http://www.acm.org/sigmod/record/issues/0212/ [brackets in original]

[10] Chamberlin, D. XQuery: An XML query language. IBM Systems Journal, Vol. 41, No. 4, 2002. p. 597-615.

[11] Date, C.J. An Introduction to Database Systems, Vol. 1, 4th ed. Reading, Addison-Wesley, 1986.

[12] Date, C.J. An Introduction to Database Systems, 7th ed. Reading, Addison-Wesley, 2000.

[13] Connolly, T., Begg, C. and Strachan, A. Database Systems, 1st ed. Wokingham, Addison-Wesley, 1996.

[14] Connolly, T. and Begg, C. Database Systems, 3rd ed. Harlow, Addison-Wesley, 2002.

[15] Eaglestone, B. and Ridley, M. Web Database Systems. London, McGraw-Hill, 2001. p. 164.

[16] Bernstein, P. et al. The Asilomar Report on Database Research. SIGMOD Record, Vol. 27, No. 4, 1998. http://www.acm.org/sigmod/record/issues/9812/

[17] Ceri, S. et al. XML: Current Developments and Future Challenges for the Database Community. Proc. of the 7th Int. Conf. on Extending Database Technology (EDBT), Springer, LNCS 1777, Konstanz, March 2000.
