On improving OO support in postgresql and relaxing oid bottleneck at the same time

From: "Maurice Gittens" <mgittens(at)gits(dot)nl>
To: <hackers(at)postgreSQL(dot)org>
Subject: On improving OO support in postgresql and relaxing oid bottleneck at the same time
Date: 1998-04-05 13:31:51
Message-ID: 009e01bd6097$31a58060$fcf3b2c2@caleb..gits.nl
Lists: pgsql-hackers

Hi,

I'm currently under the impression that the following change in the
postgresql system would benefit the overall performance and quality
of the system.

Tuples for a class and all its derived classes are stored in one file.

Advantages:
- Since all tuples for a given class hierarchy are stored in the same
physical file, oids now need only be unique within a single inheritance
hierarchy (instead of unique within each postgresql installation).

So there is no longer any _necessity_ for a systemwide unique oid.
(This necessity existed because all objects by definition (in OO
semantics) must possess the identity property ("this" in C++/Java,
sometimes also called "self"), and because instances of the same
hierarchy were stored in different files it was necessary to provide
the identity property in a "file independent" way.)

The bottleneck formed by the systemwide unique oid is replaced by a
bottleneck for each inheritance hierarchy within an installation.
If one doesn't use inheritance then it translates to a per table
bottleneck.
(Which is what we have now anyway isn't it?).

- Indices, triggers, constraints, etc. are automatically inherited,
so that we can showcase classic OO semantics (including polymorphism).

- Makes it possible to implement referential integrity for oids easily.

- It becomes possible to store more than 4 Giga tuples on 32 bit systems
(the limit of a 32 bit oid then applies per inheritance hierarchy rather
than per installation).

- Given an instance of a class identified by an oid, it is easy to
determine the most derived class it belongs to.
(This feature has been requested by a number of people on the
questions list.)

- It is the first step to support tables with no oids at all (not that this
is particularly interesting to me though). I'd suggest that system
catalogues keep their oids though, or we would be in for a major rewrite
I think.
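
To make the per-hierarchy oid idea concrete, here is a minimal sketch in
Python. It is only an illustration under my own assumptions, not postgresql
code: HierarchyFile, insert() and most_derived_class() are made-up names,
and the "file" is just a dictionary in memory.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class HierarchyFile:
    """Hypothetical stand-in for one physical file shared by a root class
    and all classes derived from it."""
    root: str
    # Each hierarchy has its own oid counter instead of a systemwide one.
    next_oid: count = field(default_factory=lambda: count(1))
    # oid -> (most derived class name, attribute values)
    tuples: dict = field(default_factory=dict)

    def insert(self, class_name, values):
        """Store a tuple tagged with its most derived class; return its oid."""
        oid = next(self.next_oid)
        self.tuples[oid] = (class_name, values)
        return oid

    def most_derived_class(self, oid):
        """Look up the class tag stored alongside the tuple."""
        return self.tuples[oid][0]


people = HierarchyFile("person")
orders = HierarchyFile("orders")

a = people.insert("person", {"name": "Ann"})
b = people.insert("student", {"name": "Bob", "school": "MIT"})
c = orders.insert("orders", {"item": "book"})
```

Here a and c receive the same oid value because they live in different
hierarchies, and people.most_derived_class(b) can be answered from the tag
alone, without searching several files.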

Disadvantages:
- Sequential heapscans for tables _with_ derived classes will be less
efficient in general, because now some tuples may have to be skipped
since they may belong to the wrong class. This is easily solved using
indices.

- Slight space overhead per tuple when not using inheritance.
The space is used to tag each tuple with the most derived class it
belongs to.
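
The scan cost can be sketched as follows, again in Python and again with
made-up names (heap, seq_scan, build_class_index and index_scan are
assumptions for illustration, not existing postgresql functions). The
sequential scan has to visit every tuple in the shared file and skip the
ones with the wrong tag, while an index on the class tag visits only the
matching tuples.

```python
from collections import defaultdict

# Hypothetical shared heap: each tuple carries a tag naming its most
# derived class, followed by its data.
heap = [
    ("person", "Ann"),
    ("student", "Bob"),
    ("person", "Cid"),
    ("employee", "Dee"),
]


def seq_scan(heap, wanted):
    """Visit every tuple in the file; keep only those tagged with a wanted class."""
    visited = 0
    rows = []
    for tag, value in heap:
        visited += 1
        if tag in wanted:
            rows.append(value)
    return rows, visited


def build_class_index(heap):
    """Map each class tag to the positions of its tuples in the heap."""
    index = defaultdict(list)
    for pos, (tag, _value) in enumerate(heap):
        index[tag].append(pos)
    return index


def index_scan(heap, index, wanted):
    """Visit only the tuples whose class tag is wanted, in heap order."""
    positions = sorted(p for tag in wanted for p in index.get(tag, []))
    rows = [heap[pos][1] for pos in positions]
    return rows, len(positions)


index = build_class_index(heap)
seq_rows, seq_visited = seq_scan(heap, {"student"})
idx_rows, idx_visited = index_scan(heap, index, {"student"})
```

Both scans return the same rows, but the sequential one touches all four
tuples to find the single student, whereas the indexed one touches one.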

To improve OO support the implementation plan is to:
1. Add a system attribute to each heap tuple which identifies the most
derived class the instance belongs to. (easy; I think)
2. Store instances of derived classes in the same physical file as the
topmost base class. I hope that hacking heapopen() to tell it in which
file it should look for tuples of a particular relation will be enough.
This might have implications for caching etc. which I don't understand.
(difficult?)
3. Modify the heap scanning functions to support the new scheme. (easy; I
think)
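
Steps 2 and 3 could look roughly like the Python sketch below. Everything
in it is an assumption made for illustration: PARENT is a made-up catalog,
open_heap() is not the real heapopen(), and only single inheritance is
handled (multiple inheritance would need a rule for choosing the file).

```python
# Hypothetical catalog: class name -> direct parent (None for a topmost
# base class).
PARENT = {
    "person": None,
    "student": "person",
    "grad_student": "student",
    "orders": None,
}


def root_class(relation):
    """Walk up the inheritance chain to the topmost base class."""
    while PARENT[relation] is not None:
        relation = PARENT[relation]
    return relation


def descendants(relation):
    """Return the relation itself plus every class derived from it."""
    result = {relation}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def open_heap(relation):
    """Return the file to read (named after the root class) and the set of
    class tags a scan of this relation and its subclasses should accept."""
    return root_class(relation), descendants(relation)


heap_file, accepted_tags = open_heap("student")
```

In this sketch a scan of student reads the file of person and accepts
tuples tagged student or grad_student; a scan that should exclude
subclasses would accept only the single tag student.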

Now for my questions.
- Is implementing the above major surgery?
- Am I missing something important?
- What do you guys think of this?

With regards from Maurice.
