Quick Links

Partitioning

From:	"J(dot) Andrew Rogers" <jrogers(at)neopolitan(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Partitioning
Date:	2004-09-15 21:09:31
Message-ID:	1095282571.25123.181.camel@vulture.corp.neopolitan.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Tue, 2004-09-14 at 21:30, Joe Conway wrote:
> That's exactly what we're doing, but using inherited tables instead of a
> union view. With inheritance, there is no need to rebuild the view each
> time a table is added or removed. Basically, in our application, tables
> are partitioned by either month or week, depending on the type of data
> involved, and queries are normally date qualified.

We do something very similar, also using table inheritance and a lot of
triggers to automatically generate partitions and so forth. It works
pretty well, but it is a custom job every time I want to implement a
partitioned table. You can save a lot on speed and space if you use it
to break up large tables with composite indexes, since you can drop
columns from the table depending on how you use it. A big part of
performance gain is that the resulting partitions end up being more
well-ordered than the non-partitioned version, since inserts are hashed
to different partition according to the key and hash function. It is
kind of like a cheap and dirty real-time CLUSTER operation. It also
lets you truncate, lock, and generally be heavy-handed with subsets of
the table without affecting the rest of the table.

I think generic table partitioning could pretty much be built on top of
existing capabilities with a small number of tweaks.

The main difference would be the ability to associate a partitioning
hash function with a table (probably defined inline at CREATE TABLE
time). Something with syntax like:

...PARTITION ON 'date_trunc(''hour'',ts)'...

There would also probably need to be some type of metadata table to
associate specific hashes with partition table names. Other than that,
the capabilities largely already exist, and managing the partition
hashing and association is the ugly part when rolling your own.
Intercepting DML when necessary and making it behave correctly is
already pretty easy, but could probably be streamlined.

j. andrew rogers

In response to

Re: Data Warehouse Reevaluation - MySQL vs Postgres -- at 2004-09-15 04:30:24 from Joe Conway

Responses

Re: Partitioning at 2004-09-16 03:55:24 from Greg Stark
Re: Partitioning at 2004-09-16 20:39:51 from Jim C. Nasby

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Greg Stark	2004-09-16 03:55:24	Re: Partitioning
Previous Message	Simon Riggs	2004-09-15 20:59:06	Re: Data Warehouse Reevaluation - MySQL vs Postgres --