Re: Monthly table partitioning for fast purges?

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Roger Hand <rhand(at)ragingnet(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Monthly table partitioning for fast purges?
Date: 2003-08-04 15:09:45
Message-ID: Pine.LNX.4.33.0308040905310.10259-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, 1 Aug 2003, Roger Hand wrote:

> We are moving an application from Oracle 8i to Postgres and I've run
> into a problem attempting to duplicate a feature we currently use.
>
> In Oracle you can divide a table into partitions. We use this feature
> to break up the data by month. Each month we store several tens of
> millions of rows in a particular table, and each month we drop the
> partition that's a year old. In other words, we always keep the last 12
> months of data (12 partitions). This is clean and fast. Since the
> partition is by a timestamp column, it also gives us a certain amount of
> automatic indexing.
>
> Postgres doesn't support table partitions (correct me if I'm wrong!) so
> the only option appears to be to dump everything into one big table.
> What I'm worried about is the purging of the data from 12 months ago
> ... I'm worried that this will be a slow and expensive operation.

There are two approaches I can think of, and you should test each one to
see how it holds up to your usage.

Approach the first: Using scripts, create a table for each month. Create
a view that combines all of these months. When a month goes out of date,
simply remove it from the view. Deleting the month can be done at your
leisure, as it only saves disk space at this point, but since it isn't in
the view, it doesn't slow you down to keep them.

Approach the second: Use partial indexes to make it look like the table
is partitioned. I.e. every month create a new partial index like:

create index on bigoldtable (datefield) where datefield >=1stofmonth and
datefield<=lastofmonth.

Then include the "where date >=firstofmonth AND date <= lastofmonth

This should then hit the partial index, which will be small compared to
the master table with all the rows, or the main index, which will index
all fields.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Robert Treat 2003-08-04 15:20:02 Re: plPHP -- sort of an announcement.. but not commercial
Previous Message Ian Harding 2003-08-04 14:22:25 Re: Apache - DBI - Postgresql: Cancelling queries