From: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Declarative partitioning |
Date: | 2016-05-20 08:37:23 |
Message-ID: | 573ECCC3.7080104@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Ildar,
On 2016/05/19 0:36, Ildar Musin wrote:
>
> Here is an experimental patch that optimizes planning time for range
> partitioned tables (it could be considered as a "proof of concept"). Patch
> should be applied on top of Amit's declarative partitioning patch. It
> handles only a very special case (often used though) where partitioning
> key consists of just a single attribute and doesn't contain expressions.
Great, thanks!
I understand that it's still PoC and the point may be just to consider
performance implications of excessive partdesc copying but I'm wondering
about a few things about the patch in general. See below.
> The main idea is the following:
> * we are looking for clauses like 'VAR OP CONST' (where VAR is
> partitioning key attribute, OP is a comparison operator);
> * using binary search find a partition (X) that fits CONST value;
> * based on OP operator determine which partitions are also covered by
> clause. There are possible cases:
> 1. If OP is '<' or '<=' then we need partitions standing left from X
> (including)
> 2. If OP is '>' or '>=' then we need partitions standing right from X
> (including)
> 3. If OP is '=' the we need only X partition
> (for '<' and '>' operators we also check if CONST value is equal to a
> lower or upper boundary (accordingly) and if it's true then exclude X).
>
> For boolean expressions we evaluate left and right sides accordingly to
> algorithm above and then based on boolean operator find intersection (for
> AND) or union (for OR).
Perhaps you're already aware but may I also suggest looking at how clauses
are matched to indexes? For example, consider how
match_clauses_to_index() in src/backend/optimizer/path/indxpath.c works.
Moreover, instead of pruning partitions in planner prep phase, might it
not be better to do that when considering paths for the (partitioned) rel?
IOW, instead of looking at parse->jointree, we should rather be working
with rel->baserestrictinfo. Although, that would require some revisions
to how append_rel_list, simple_rel_list, etc. are constructed and
manipulated in a given planner invocation. Maybe it's time for that...
Again, you may have already considered these things.
> I run some benchmarks on:
> 1. original constraint exclusion mechanism,
> 2. optimized version (this patch) and
> 3. optimized version using relation->rd_partdesc pointer instead of
> RelationGetPartitionDesc() function (see previous discussion).
>
> Initial conditions:
>
> CREATE TABLE abc (id SERIAL NOT NULL, a INT, dt TIMESTAMP) PARTITION BY
> RANGE (a);
> CREATE TABLE abc_1 PARTITION OF abc FOR VALUES START (0) END (1000);
> CREATE TABLE abc_2 PARTITION OF abc FOR VALUES START (1000) END (2000);
> ...
> etc
> INSERT INTO %s (a) SELECT generate_series(0, <partitions_count> * 1000);
>
> pgbench scripts:
> https://gist.github.com/zilder/872e634a8eeb405bd045465fc9527e53 (where
> :partitions is a number of partitions).
> The first script tests fetching a single row from the partitioned table.
> Results (tps):
>
> # of partitions | constraint excl. | optimized | optimized (using
> pointer)
> ----------------+------------------+---------------+----------------------------
>
> 100 | 658 | 2906 | 3079
> 1000 | 45 | 2174 | 3021
> 2000 | 22 | 1667 | 2919
>
>
> The second script tests fetching all data from a single partition. Results
> (tps):
>
> # of partitions | constraint excl. | optimized | optimized (using
> pointer)
> ----------------+------------------+---------------+----------------------------
>
> 100 | 317 | 1001 | 1051
> 1000 | 34 | 941 | 1023
> 2000 | 15 | 813 | 1016
>
> Optimized version works much faster on large amount of partitions and
> degradates slower than constraint exclusion. But still there is a
> noticeable performance degradation from copying PartitionDesc structure:
> with 2000 partitions RelationGetPartitionDesc() function spent more than
> 40% of all execution time on copying in first benchmark (measured with
> `perf`). Using reference counting as Amit suggests will allow to
> significantily decrease performance degradation.
Could you try with the attached updated set of patches? I changed
partition descriptor relcache code to eliminate excessive copying in
previous versions.
Thanks,
Amit
Attachment | Content-Type | Size |
---|---|---|
0001-Add-syntax-to-specify-partition-key-v5.patch | text/x-diff | 41.5 KB |
0002-Add-a-IGNORE-dependency-type-v5.patch | text/x-diff | 3.0 KB |
0003-Infrastructure-for-creation-of-partitioned-tables-v5.patch | text/x-diff | 86.9 KB |
0004-Add-syntax-to-create-partitions-v5.patch | text/x-diff | 77.8 KB |
0005-Infrastructure-for-partition-metadata-storage-and-ma-v5.patch | text/x-diff | 101.7 KB |
0006-Introduce-tuple-routing-for-partitioned-tables-v5.patch | text/x-diff | 29.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2016-05-20 12:12:24 | Re: It's seems that the function "do_text_output_multiline" does not suit for format "line1\nline2\n...lineN". |
Previous Message | Craig Ringer | 2016-05-20 07:35:17 | Re: foreign table batch inserts |