A Relational Model of Data for Large Shared Data Banks
E. F. CODD IBM Research Laboratory, San Jose, California
Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation). A prompting service which supplies such information is not a satisfactory solution. Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed. Changes in data representation will often be needed as a result of changes in query, update, and report traffic and natural growth in the types of stored information.

Existing noninferential, formatted data systems provide users with tree-structured files or slightly more general network models of the data. In Section 1, inadequacies of these models are discussed. A model based on n-ary relations, a normal form for data base relations, and the concept of a universal data sublanguage are introduced. In Section 2, certain operations on relations (other than logical inference) are discussed and applied to the problems of redundancy and consistency in the user's model.

KEY WORDS AND PHRASES: data bank, data base, data structure, data organization, hierarchies of data, networks of data, relations, derivability, redundancy, consistency, composition, join, retrieval language, predicate calculus, security, data integrity
CR CATEGORIES: 3.70, 3.73, 3.75, 4.20, 4.22, 4.29
The relational view (or model) of data described in Section 1 appears to be superior in several respects to the graph or network model [3,4] presently in vogue for noninferential systems. It provides a means of describing data with its natural structure only, that is, without superimposing any additional structure for machine representation purposes. Accordingly, it provides a basis for a high level data language which will yield maximal independence between programs on the one hand and machine representation and organization of data on the other. A further advantage of the relational view is that it forms a sound basis for treating derivability, redundancy, and consistency of relations (these are discussed in Section 2). The network model, on the other hand, has spawned a number of confusions, not the least of which is mistaking the derivation of connections for the derivation of relations (see remarks in Section 2 on the "connection trap"). Finally, the relational view permits a clearer evaluation of the scope and logical limitations of present formatted data systems, and also the relative merits (from a logical standpoint) of competing representations of data within a single system. Examples of this clearer perspective are cited in various parts of this paper. Implementations of systems to support the relational model are not discussed.

1.2. DATA DEPENDENCIES IN PRESENT SYSTEMS

The provision of data description tables in recently developed information systems represents a major advance toward the goal of data independence [5,6,7]. Such tables facilitate changing certain characteristics of the data representation stored in a data bank. However, the variety of data representation characteristics which can be changed without logically impairing some application programs is still quite limited.
Further, the model of data with which users interact is still cluttered with representational properties, particularly in regard to the representation of collections of data (as opposed to individual items). Three of the principal kinds of data dependencies which still need to be removed are: ordering dependence, indexing dependence, and access path dependence. In some systems these dependencies are not clearly separable from one another.

1.2.1. Ordering Dependence. Elements of data in a data bank may be stored in a variety of ways, some involving no concern for ordering, some permitting each element to participate in one ordering only, others permitting each element to participate in several orderings. Let us consider those existing systems which either require or...