From: | Daniel Begin <jfd553(at)hotmail(dot)com> |
---|---|
To: | "'melvin6925'" <melvin6925(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: FW: Constraint exclusion in partitions |
Date: | 2015-05-23 21:23:21 |
Message-ID: | COL129-DS878FE5AF4920D9DA2EAA094CF0@phx.gbl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
I am working with postgresql 9.3 and I understand from the documentation that constraint_exclusion is set to “partition” by default. Looking at my postgres.conf file, the concerned line is “#constraint_exclusion = partition”.
Furthermore, the execution plan shows that constraint_exclusion was used at least for constant id and range of ids
Daniel
From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of melvin6925
Sent: May-23-15 15:15
To: Daniel Begin; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] FW: Constraint exclusion in partitions
Did you remember to set constraint_exclusion = on and reload the .conf ?
Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
-------- Original message --------
From: Daniel Begin <jfd553(at)hotmail(dot)com>
Date: 05/23/2015 14:37 (GMT-05:00)
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] FW: Constraint exclusion in partitions
Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process completed and all the resulting tables analyzed.
Here is what I got on both the original table and its partitioned counterpart while running the same queries. I tested them only for a couple of values but in summary...
Using a constant id:
All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id value1, id value2 ...)
Using a range of ids:
Surprisingly again, all the queries I tried took longer on the partitioned table!
Using a list of ids from a select clause:
More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key. Using an indexed field took so long compared to the old table that I cancelled the execution for the new one!
Guess what, I will get back to my old fat table unless someone tells me I missed something obvious!
Daniel
Note: Tables/indexes description, queries and execution plans are below.
Tables/indexes description ----------------------------------------------------------------------------------
The original table has 3870130000 records. Primary key/index on each partition queries are
ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);
The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition queries are
ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id);
Where xx is the partition's number suffix
constant id -------------------------------------------------------------------------------------------------------
select * from oldtable where id=123456789;
"Index Scan using oldtable_idversion_pk on oldtable (cost=0.70..4437.15 rows=1682 width=66)"
" Index Cond: (id = 123456789::bigint)"
--Total query runtime: 62 ms. 1 rows retrieved
select * from newtable where id=123456789;
"Append (cost=0.00..20.19 rows=5 width=66)"
" -> Seq Scan on newtable (cost=0.00..0.00 rows=1 width=66)"
" Filter: (id = 123456789::bigint)"
" -> Index Scan using newtable72_idversion_pk on newtable_72 (cost=0.56..20.19 rows=4 width=66)"
" Index Cond: (id = 123456789::bigint)"
--Total query runtime: 156 ms. 1 rows retrieved
I got similar results for multiple records...
select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000);
"Index Scan using oldtable_idversion_pk on oldtable (cost=0.71..17739.18 rows=6726 width=66)"
" Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 187 ms. 4 rows retrieved
select * from newtable where id IN(10000000,1000000000,2000000000,3000000000);
"Append (cost=0.00..933.40 rows=223 width=66)"
" -> Seq Scan on newtable (cost=0.00..0.00 rows=1 width=66)"
" Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
" -> Index Scan using newtable01_idversion_pk on newtable_01 (cost=0.57..622.78 rows=156 width=66)"
" Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
...
" -> Index Scan using newtable85_idversion_pk on newtable_85 (cost=0.57..53.37 rows=9 width=66)"
" Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 421 ms. 4 rows retrieved
range of ids -------------------------------------------------------------------------------------------------------
select * from oldtable where id between 1522999949 and 1523000049;
"Index Scan using oldtable_idversion_pk on oldtable (cost=0.70..383.51 rows=144 width=66)"
" Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 47 ms. 53 rows retrieved.
select * from newtable where id between 1522999949 and 1523000049;
"Append (cost=0.00..408.16 rows=104 width=66)"
" -> Seq Scan on newtable (cost=0.00..0.00 rows=1 width=66)"
" Filter: ((id >= 1522999949) AND (id <= 1523000049))"
" -> Index Scan using newtable51_idversion_pk on newtable_51 (cost=0.56..183.52 rows=46 width=66)"
" Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
" -> Index Scan using newtable52_idversion_pk on newtable_52 (cost=0.56..224.64 rows=57 width=66)"
" Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 78 ms. 53 rows retrieved.
list of ids from a select clause -------------------------------------------------------------------------------------------------------
--Subset provides 4 ids similar but not identical to the previous query
select * from oldtable where id IN (select * from subset);
"Nested Loop (cost=37.45..886298.00 rows=2028512050 width=66)"
" -> HashAggregate (cost=36.75..38.75 rows=200 width=8)"
" -> Seq Scan on subset (cost=0.00..31.40 rows=2140 width=8)"
" -> Index Scan using oldtable_idversion_pk on oldtable (cost=0.70..4414.37 rows=1693 width=66)"
" Index Cond: (id = subset.id)"
Total query runtime: 171 ms. 4 rows retrieved.
select * from newtable where id IN (select * from subset)
"Nested Loop (cost=36.75..1407672.76 rows=1935067087 width=66)"
" -> HashAggregate (cost=36.75..38.75 rows=200 width=8)"
" -> Seq Scan on subset (cost=0.00..31.40 rows=2140 width=8)"
" -> Append (cost=0.00..7020.68 rows=1749 width=66)"
" -> Seq Scan on newtable (cost=0.00..0.00 rows=1 width=66)"
" Filter: (subset.id = id)"
" -> Index Scan using newtable01_idversion_pk on newtable_01 (cost=0.56..151.97 rows=39 width=66)"
" Index Cond: (id = subset.id)"
...
" -> Index Scan using newtable86_idversion_pk on newtable_86 (cost=0.56..12.42 rows=2 width=66)"
" Index Cond: (id = subset.id)"
Total query runtime: 140 ms. 4 rows retrieved.
Using an index, not the primary key ------------------------------------------------------------------------------
--Subset provides 58 group_id pointing to 5978 records in the concerned tables
select * from oldtable where group_id IN (select * from subset)
"Nested Loop (cost=37.33..21575715.89 rows=2028512050 width=66)"
" -> HashAggregate (cost=36.75..38.75 rows=200 width=8)"
" -> Seq Scan on subset (cost=0.00..31.40 rows=2140 width=8)"
" -> Index Scan using oldtable_groupid_idx on oldtable (cost=0.58..107364.99 rows=51340 width=66)"
" Index Cond: (group_id = subset.id)"
Total query runtime: 3986 ms. 5978 rows retrieved.
select * from newtable where group_id IN (select * from subset)
"Hash Join (cost=41.25..138092255.85 rows=1935067087 width=66)"
" Hash Cond: (newtable.group_id = subset.id)"
" -> Append (cost=0.00..84877869.72 rows=3870134173 width=66)"
" -> Seq Scan on newtable (cost=0.00..0.00 rows=1 width=66)"
" -> Seq Scan on newtable_01 (cost=0.00..946235.96 rows=46526896 width=66)"
...
" -> Seq Scan on newtable_86 (cost=0.00..986527.64 rows=44269664 width=66)"
" -> Hash (cost=38.75..38.75 rows=200 width=8)"
" -> HashAggregate (cost=36.75..38.75 rows=200 width=8)"
" -> Seq Scan on subset (cost=0.00..31.40 rows=2140 width=8)"
Execution Cancelled after 766702 ms !
I tried the same with "SET enable_seqscan = OFF" and got an index scan of all tables;
From | Date | Subject | |
---|---|---|---|
Next Message | Ravi Krishna | 2015-05-23 21:37:53 | Re: PG and undo logging |
Previous Message | Jan de Visser | 2015-05-23 21:04:22 | Re: PG and undo logging |