From: | Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com> |
---|---|
To: | "Rafal Pietrak" <rafal(at)zorro(dot)isa-geek(dot)com> |
Cc: | PostgreSQL <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: a performence question |
Date: | 2008-09-05 15:04:00 |
Message-ID: | 92869e660809050803o5340c4c2se4d7e593c5533d63@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
2008/9/4 Rafal Pietrak <rafal(at)zorro(dot)isa-geek(dot)com>:
> Hi,
>
> Maybe someone on this list actually have already tried this:
>
> I'm planning to make a partitioned database. From Postgres documentation
> I can see, that there are basically two methods to route INSERTS into
> partitioned table:
> one. is a TRIGGER
> other. is a RULE
>
> My Table will have over 1000 partitions. Some not so big, but
> significant number of them will be of multimillion rows. Partitioning
> will be done using a single column, on equality.... meaning:
>
> CREATE TABLE mainlog (sel int, tm timestamp, info text,...);
> CREATE TABLE mainlog_p1 (CHECK (sel=1)) INHERITS (mainlog);
> CREATE TABLE mainlog_p2 (CHECK (sel=2)) INHERITS (mainlog);
> ...etc.
>
> If I route INSERT with a TRIGGER, the function would look like:
> CREATE .... TRIGGER...AS $$ DECLARE x RECORD; BEGIN
> SELECT id INTO x FROM current_route; NEW.sel := x.id;
> IF NEW.sel = 1 THEN INSERT INTO mainlog_p1 VALUES (NEW.*);
> ELSE IF NEW.sel = 2 THEN INSERT INTO mainlog_p2 VALUES (NEW.*);
> ....
> END IF;
> RETURN NULL;
> $$;
>
> If I route INSETS with a RULE, I'd have something like 1000 rules hooked
> up to MAINLOG, all looking like:
> CREATE RULE .... ON INSERT ... WHERE EXISTS(SELECT 1 FROM current_route
> WHERE id = 1) DO INSTEAD INSERT INTO mainlog_p1 VALUES SELECT
> x.id,new.tm... FROM (SELECT id FROM current_route) x;
> ... and similar RULES for cases "WHERE id = 2", etc.
>
> My question is, where should I expect better performance on those
> INSERTS).
>
> I would prefer a set of RULES (as I wouldn't like to rewrite TRIGGER
> function every time I add a partition ... a thousand lines function),
> but since they all must make a select query on CURRENT_ROUTE table, may
> be that will not be particularly effective? The TRIGGER function does a
> single query - may be it'll be faster? I was planning to generate some
> dummy data and run a simulation, but may be someone already has that
> experience? Or maybe the TRIGGER should look differently? Or the set of
> RULES?
>
I had a bit spare time so I tested this
see http://filip.rembialkowski.net/postgres-partitioning-performance-rules-vs-triggers/
seems that in your scenario trigger will be better.
but If I had to do this, and if performance was very important, I
would move "partition selection" logic out of the INSERT phase. the
application can know this before the actual insert. unless you want to
shift selections very often...
--
Filip Rembiałkowski
From | Date | Subject | |
---|---|---|---|
Next Message | Sam Mason | 2008-09-05 15:10:47 | Re: large inserts and fsync |
Previous Message | Aaron Burnett | 2008-09-05 13:16:41 | large inserts and fsync |