From: | Ondrej Ivanič <ondrej(dot)ivanic(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Postgres for a "data warehouse", 5-10 TB |
Date: | 2011-09-12 02:56:37 |
Message-ID: | CAM6mieLr6rifjihkNVWFByNGgQYM+uMsKrxP00U6e=bfdjr_=w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi,
On 12 September 2011 12:28, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Once those are done, you can query against the 'parent' table with
> something like:
> select * from parent where date = '2010-01-01';
>
> And PG will realize it only has to look at table2 to get the results for
> that query. This means the partitioning can be more-or-less any check
> constraint that will be satisfied by the data in the table (and PG will
> check/enforce this) and that PG can figure out will eliminate a partition
> from possibly having the data that matches the request.
Theory is nice but there are few gotchas (in 8.4) :
- planner can use constant expressions only. You will get scans across
all partitions when you use function (like now(), immutable function
with constant arguments), sub query (like part_col = (select x from
...) .. ) or anything which can't be evaluated to constat during query
planning.
- partitions constraints are not "pushed to joins" (assuming tables
partitioned by primary key):
select ... from X left join Y on X.primary_key = Y.primary_key where
part_col >= ... and X.primary_key >= .,, and X.primary_key < ...
must be rewritten like
select ... from X
left join Y on X.primary_key = Y.primary_key and X.primary_key >= .,,
and Y.primary_key < ...
where X.primary_key >= .,, and X.primary_key < ...
in order to avoid scan entire Y table (not only relevant partitions)
- ORDER BY / LIMIT X issue fixed in 9.1 (Allow inheritance table scans
to return meaningfully-sorted results.
Moreover all queries should have 'WHERE' on column which is used for
partitioning otherwise partitioning is not very useful (yes, it could
simplify data management -- drop partition vs delete from X where
part_col between A and B)
--
Ondrej Ivanic
(ondrej(dot)ivanic(at)gmail(dot)com)
From | Date | Subject | |
---|---|---|---|
Next Message | mark | 2011-09-12 03:10:11 | Re: RAID Controller (HP P400) beat by SW-RAID? |
Previous Message | J Sisson | 2011-09-12 02:46:57 | Re: Databases optimization |