Table Partitioning, Part 1

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: bizgres-general <bizgres-general(at)pgfoundry(dot)org>
Subject: Table Partitioning, Part 1
Date: 2005-05-09 21:02:53
Message-ID: 1115672573.3830.73.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Many people have been discussing Table Partitioning lately. I've also
been giving thought to how to implement Table Partitioning within
PostgreSQL, as part of the Bizgres project for Business Intelligence.

After some discussion on Bizgres, I've now posted the most important and
common Use Cases for Business Intelligence. Input from Hannu Krosing,
Josh Berkus, Luke Lonergan, Bricklen Anderson, as well as noting
comments from Stacey White, Yann Michel on other lists.
This can be found at:
http://www.bizgres.org/pages.php?pg=developers%7Cprojects%7Cbizgres%
7Cpartitioningrequirements

It's now time to look in detail at how to implement some of the
internals of Partitioning within PostgreSQL. I'd like to kick off with
some of the trickiest design decisions we need to make, and comment on
what the implications are for each choice. Rather than go for a hugely
long post, I'll start with a summary of the main questions...then
continue some more in a second post soon afterwards, going into more
detail. (Thats why the subject of this says "Part 1").

1. Embellish inheritance, or separate infrastructure?

2. Individual Relations explicitly in the plan or MultiRelation plan
nodes? (i.e. is a SeqScan of a Partitioned Table one Node or many
nodes?)

3. Partitions are relations? Or sub-relations?

4. Should all partitions have the same indexes?

5. Constraints or specific Partitioning syntax?

Those are the main questions, as I see them currently. I'll go through
them in order, though (2) turns out to be the biggest question. Some of
those questions have been raised by Neil Conway, Hannu Krosing, Jim
Nasby, Josh Berkus and Gavin Sherry. I answered many of those questions
at the time, though now I want to discuss these again since they are
such important questions.

Some of the reason that this is fairly complex is that the Partitioning
Use Cases (PPUC) published show that we need to take advantage of both:
PPUC1 Direct Restriction of the Partitioning Key
PPUC2 Joins on the Partitioning Key
More detail can be found at the link (same as above)...
http://www.bizgres.org/pages.php?pg=developers%7Cprojects%7Cbizgres%
7Cpartitioningrequirements

Many of us will have experience with similar applications, though
frequently with just one or other of those Use Cases. At first, I
mistakenly thought the Use Cases were identical, though their technical
solutions do seem to be fairly different for the db internals.

In terms of other implementations, it appears that Oracle was designed
for Direct Restriction of the Partitioning Key (PPUC1), whereas the
latest SQLServer implementation appears more suitable for cases
involving Joins on the Partitioning Key (PPUC2). Thats not too
important, just that if you know either of those database systems you
may be tempted into thinking that you know how partitioning should be
done by reference to one of those implementations (as I did initially).

I'm also striving to find a subset of this that can be completed for
PostgreSQL 8.1, with the majority being completed for 8.2. (Bizgres
releases *may* vary from that schedule)

Best Regards, Simon Riggs

Browse pgsql-hackers by date

  From Date Subject
Next Message Darren King 2005-05-09 21:12:07 Re: Views, views, views! (long)
Previous Message Tom Lane 2005-05-09 21:02:37 Re: Case insensitive usernames