Quick Links

Re: Postgresql partitioning - single hot table or distributed

From:	Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To:	sam mulube <sam(dot)mulube(at)gmail(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Postgresql partitioning - single hot table or distributed
Date:	2010-07-02 13:59:46
Message-ID:	AANLkTinvwqST2A0fzmMnV8_tddeUR4WcbLbLNVIPtSF8@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Tue, Jun 29, 2010 at 4:00 PM, sam mulube <sam(dot)mulube(at)gmail(dot)com> wrote:
> Hi,
>
> we are considering database partitioning as a possible solution to
> some performance issues we are having with our database, and we are
> trying to decide on a partitioning scheme. We have a moderately write
> heavy application (approx 50 inserts per second, with writes
> outnumbering reads by roughly 5:1), and the table in question looks
> something like this:
>
> -------------------------------------------------------------------------------
> column name : id | value | server_id | created_at
> column type : integer | string | integer | timestamp
> with time zone
> other info : pk | | fk, indexed |
> indexed
> -------------------------------------------------------------------------------
>
> Or initial thoughts on partitioning was to partition by date using the
> created_at column, with a separate partition for each month; however
> the vast majority of our inserts would be for 'now', so we would be
> almost entirely writing to the partition for the current month. Other
> month partitions might get occasional updates, but this would be a
> relatively infrequent occurrence.
>
> Alternatively we wondered about partitioning by the server_id foreign
> key, using for example the modulo of the foreign key id. This would
> give us a finite number of partitions (rather than the potentially
> unbounded date option), and would likely cause writes to be much more
> evenly distributed between the partitions.
>
> Does anyone have any likely idea which would be the better choice. The
> single hot table getting most of the inserts, which might mean any
> indexes are fully in memory, or dividing the writes more evenly over
> all of our partitions?

Are most of your selects for now to now - 1 day or so as well? If so,
then look at having one big partition for historical data and one
small one for the last day. Every x hours run a cron job that moves
everything in the current partition to the old archive partition(s).

In response to

Postgresql partitioning - single hot table or distributed at 2010-06-29 20:00:21 from sam mulube

Browse pgsql-general by date

	From	Date	Subject
Next Message	Guillaume Lelarge	2010-07-02 14:08:06	Re: pgpool-II (max_pool and num_init_children)
Previous Message	Merlin Moncure	2010-07-02 13:54:02	Re: change array dimension