Relational Inheritance Thoughts

From: Trent Shipley <tshipley(at)deru(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Relational Inheritance Thoughts
Date: 2006-01-06 17:23:36
Message-ID: 200601061023.37083.tshipley@deru.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Relational Inheritance Thoughts

The most fundamental property of relational inheritance is that it creates
hierarchies of relations that act as composite relations. That is,
relational inheritance produces a tree of relations (presumably tables) that
itself can be treated as a relation. The tree can be queried as if it were a
single table, and provided that the signature of the tuple is distinct,
inserts and updates can also be polymorphically executed against the
hierarchical composite relation.

Obviously, relational inheritance is one way to realize an object-relational
feature in a database. What is almost as obvious is that inheritance
introduces a non-relational database element into the underlying database
model. Depending on whether or not the implementation supports single or
multiple inheritance the resulting product supports not only a relational
database model but also either a hierarchical or an acyclic network [acdg].
At the risk of belaboring the point, relational single inheritance resembles
a hierarchical database, like a traditional computer file system. A
relational multiple inheritance database resembles a network database, like
the once promising CODASYL standard or the GROVE or DOM representation of an
XML hyper-document.

I recall reading somewhere that network databases can realize all the
functional power of a relational database and the reverse is also true (thus,
network and relational databases are functional equivalents). Translating
between the two models is, unfortunately, not trivial.

The reader will know from experience that normalized relational databases (and
more so, some strategic denormalized data warehouse forms) are relatively
easy to query. On the other hand, algorithmic data structures naturally
become trees and directed graphs. It is easy to do application programming
for a network database.

A database with relational multiple inheritance is a hybrid between a
relational database and network database that ought to support any mixture of
the data models without prejudice. The ability to handle a network data
model has the advantage of eliminating the need to provide a translation
layer that marshals data to and from each application that uses the database.
The networked object oriented data from an application can (theoretically) go
directly into the network composite relation without the translation needed
to put it in relational form.

More generally, the ability to work with a dual system supporting both
relational and network models simultaneously will give database designers
another degree of freedom in expressive power. A good designer will be that
much better with relational inheritance options. (On the other hand, novice
and poor designers will have more rope with which to hang themselves.)

On the downside, one expects that a data design cannot be optimized both for
write operations and for queries. Relational inheritance allows database to
be designed for the convenience of application programmers at the expense of
report writers. Given the typical design process for databases and the
relative power and prestige of application and report developers one indeed
expects that relational inheritance *will* be used to the advantage of
application programmers and the detriment of report writers.

Furthermore, relational multiple inheritance databases have to encompass the
complexity of Relational X Network. A relational multiple inheritance
database system should be a degree of magnitude more complex than its
relational (or network) relative. The reward will be more expressive
convenience for database designers. The resulting system, however, will have
no more functional power than either a relational or network database system.

Also interesting is that a table (or relation) may have an “attitude” toward
parenthood.

The table may be preapted. A preapted table is explicitly and completely
ready for use as the parent of a class hierarchy. It is fully class aware.
One should be able to explicitly make a preapted stand-alone table Adhamic,
ready for use as the first ancestor of a hierarchy. Of course, you should
also be able to declare that a preapted table is FINAL. Furthermore,
declaring any class property on a table should implicitly make the table
itself a member of a relational inheritance network. Note that strict
preaptation does not allow for opportunistically finding a purely relational
table and using the relational table as the first ancestor of a relational
inheritance class.

Another attitude toward parenthood is indifference. There are various ways
that a table may be purely relational, thus indifferent to being used as a
class ancestor. The table may be implicitly ignorant. As far as a found
Adhamic table is concerned there is no class. It is absolutely unconstrained
by any objects dependent on it and it would be utterly ignorant that it is
implicated in a class. No doubt, implicit indifference would make for a
problematic implementation.

In another variation on indifference, a relational table could be found and
quietly, implicitly bound to a class. This is arguably what Postgresql's
INHERITS clause is working toward though the implementation is incomplete.
The incomplete implementation leaves Postgresql with some annoying aspects
reminiscent of implicit ignorance.

In a third variation a designer can opportunistically inherit from a purely
relational table, but first the relational table must be made explicitly
aware that it will be used in a relational network. One would probably
extend ALTER TABLE to accommodate the new semantics.

It should go without saying, but some naïve suggestions for extending
Postgresql's INHERITS clause regarding class unique constraints warrant
mentioning that relational inheritance classes *must* be hierarchical. It
may be tempting to allow global declarations that let descendants constrain
the current state of an ancestor, but down that road lies chaos. Relational
inheritance should use a strict subclassing model wherein subclasses nest
within subclasses until one reaches to top of a hierarchy.

Two subclassing models present themselves. One is Java-like where all objects
must be derived from objects, ultimately tracing their ancestry to the
Ur-object. In a database this would mean that every table would have to
ultimately derive from ur-table. The Java-like system can be called an
“forced inheritance modality.” The other modality is more like C++.
Multiple hierarchies are allowed, tables can be relational and need not be
object aware. Object tables can derive from relational tables. This sort of
system can be called a “dual inheritance modality”. Postgresql's INHERITS
clause implements the more expressive dual inheritance modality. The major
detraction is that a class cannot be directly instantiated, but can only be
based on a preexisting purely relational table.

As you read this, remember that the relational inheritance classes are not
slavishly analogous to classes in object oriented programming languages. For
example, in well designed OO programming you avoid directly accessing data.
With relational inheritance, in contrast, you directly access data. The
tuple is the interface.

------

[acdg]: Of course, in asserting that a relational inheritance database with
multiple inheritance support is acyclic, one assumes that there is no way to
“forward declare” an inheritance vector. A DECLARE/DEFINE function might
provide mutual or recursive inheritance.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2006-01-06 17:26:26 Re: Indexes works only on miss
Previous Message mordicus 2006-01-06 17:08:48 Re: Problems building pg 8.1.1