From: | "Spiegelberg, Greg" <gspiegelberg(at)cranel(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: DELETE with filter on ctid |
Date: | 2007-04-10 13:35:14 |
Message-ID: | 82E74D266CB9B44390D3CCE44A781ED90B6ADD@POSTOFFICE.cranel.local |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Tom et al,
Sometimes it takes a look from someone on the outside to get the job
done right.
Below is, I believe, everything pertinent to this problem. First is the
table in question, second is the problematic and original query, and
final is the transaction that I have working today with the CTID
implementation.
I would welcome any feedback.
TIA,
Greg
cranel=# \d sid2.data_id_table
Table "sid2.data_id_table"
Column | Type | Modifiers
-------------+---------+---------------
point_id | bigint |
dtype_id | bigint |
segment_id | bigint |
key1_id | bigint | not null
key2_id | bigint |
data_id | bigint | not null
deleted | boolean | default false
removed | boolean | default false
added | boolean | default false
persist | boolean | default false
Indexes:
"data_id_table_data_id_indx" btree (data_id)
"data_id_table_dtype_id_indx" btree (dtype_id)
"data_id_table_dtype_ss_id_indx" btree (dtype_id, point_id)
"data_id_table_key1_id_indx" btree (key1_id)
"data_id_table_key2_id_indx" btree (key2_id)
"data_id_table_mod_dtype_ss_id_indx" btree (segment_id, dtype_id,
point_id)
"data_id_table_segment_id_indx" btree (segment_id)
"data_id_table_point_id_indx" btree (point_id)
cranel=# explain analyze
DELETE FROM sid2.data_id_table AS dd
USING public.points AS ss,
(SELECT markeddel.*
FROM (SELECT d.*
FROM sid2.data_id_table d,public.points s
WHERE s.systems_id=2 AND s.id<2 AND s.permpoint=FALSE
AND s.id=d.point_id AND d.persist=FALSE
AND d.dtype_id=3) AS markeddel
JOIN
(SELECT DISTINCT ON (d.key1_id,d.key2_id) d.*
FROM sid2.data_id_table d,public.points s
WHERE s.systems_id=2 AND s.id<=2 AND s.id=d.point_id
AND d.dtype_id=3
ORDER BY d.key1_id,d.key2_id,d.point_id DESC) AS rollup
ON
(markeddel.key1_id,markeddel.key2_id)=(rollup.key1_id,rollup.key2_id)
WHERE markeddel.point_id<>rollup.point_id) ru
WHERE ss.systems_id=2 AND ss.id<2 AND ss.permpoint=FALSE AND
ss.id=dd.point_id
AND dd.persist=FALSE AND dd.dtype_id=3
AND
(dd.point_id,dd.key1_id,dd.key2_id)=(ru.point_id,ru.key1_id,ru.key2_id);
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------
Nested Loop (cost=1037.06..1130.46 rows=1 width=6) (actual
time=33291.639..678047.543 rows=564 loops=1)
Join Filter: ((dd.point_id = d.point_id) AND (d.point_id <>
rollup.point_id))
-> Merge Join (cost=1028.10..1117.47 rows=1 width=70) (actual
time=1775.971..3721.991 rows=156750 loops=1)
Merge Cond: ((rollup.key1_id = dd.key1_id) AND (rollup.key2_id
= dd.key2_id))
-> Unique (cost=629.66..659.24 rows=3944 width=52) (actual
time=896.293..1571.591 rows=156779 loops=1)
-> Sort (cost=629.66..639.52 rows=3944 width=52)
(actual time=896.285..1080.444 rows=157342 loops=1)
Sort Key: d.key1_id, d.key2_id, d.point_id
-> Nested Loop (cost=0.00..394.10 rows=3944
width=52) (actual time=8.846..529.901 rows=157352 loops=1)
-> Seq Scan on points s (cost=0.00..1.72
rows=1 width=8) (actual time=0.064..0.096 rows=2 loops=1)
Filter: ((systems_id = 2) AND (id <=
2))
-> Index Scan using
data_id_table_point_id_indx on data_id_table d (cost=0.00..339.79
rows=4207 width=52) (actual time=4.649..155.174 rows=78676 loops=2)
Index Cond: (s.id = d.point_id)
Filter: (dtype_id = 3)
-> Sort (cost=398.44..398.64 rows=82 width=46) (actual
time=879.658..1109.830 rows=156750 loops=1)
Sort Key: dd.key1_id, dd.key2_id
-> Nested Loop (cost=0.00..395.83 rows=82 width=46)
(actual time=5.197..549.873 rows=156750 loops=1)
-> Nested Loop (cost=0.00..3.45 rows=1 width=16)
(actual time=0.055..0.107 rows=1 loops=1)
Join Filter: (ss.id = s.id)
-> Seq Scan on points ss (cost=0.00..1.72
rows=1 width=8) (actual time=0.037..0.052 rows=1 loops=1)
Filter: ((systems_id = 2) AND (id < 2)
AND (NOT permpoint))
-> Seq Scan on points s (cost=0.00..1.72
rows=1 width=8) (actual time=0.006..0.039 rows=1 loops=1)
Filter: ((systems_id = 2) AND (id < 2)
AND (NOT permpoint))
-> Index Scan using data_id_table_point_id_indx on
data_id_table dd (cost=0.00..339.79 rows=4207 width=30) (actual
time=5.135..342.406 rows=156750 loops=1)
Index Cond: (ss.id = dd.point_id)
Filter: ((NOT persist) AND (dtype_id = 3))
-> Bitmap Heap Scan on data_id_table d (cost=8.96..12.97 rows=1
width=24) (actual time=4.289..4.290 rows=1 loops=156750)
Recheck Cond: ((d.key1_id = rollup.key1_id) AND (d.key2_id =
rollup.key2_id))
Filter: ((NOT persist) AND (dtype_id = 3))
-> BitmapAnd (cost=8.96..8.96 rows=1 width=0) (actual
time=4.280..4.280 rows=0 loops=156750)
-> Bitmap Index Scan on data_id_table_key1_id_indx
(cost=0.00..4.32 rows=4 width=0) (actual time=0.020..0.020 rows=31
loops=156750)
Index Cond: (d.key1_id = rollup.key1_id)
-> Bitmap Index Scan on data_id_table_key2_id_indx
(cost=0.00..4.38 rows=13 width=0) (actual time=4.254..4.254 rows=26187
loops=156750)
Index Cond: (d.key2_id = rollup.key2_id)
Total runtime: 678063.873 ms
(34 rows)
cranel=# \timing
Timing is on.
cranel=# BEGIN;
BEGIN
Time: 0.340 ms
cranel=# CREATE INDEX data_id_table_ctid_idx ON
sid2.data_id_table(ctid);
CREATE INDEX
Time: 648.911 ms
cranel=# explain analyze
DELETE FROM sid2.data_id_table AS dd
USING public.points AS ss,
(SELECT markeddel.ctid
FROM (SELECT d.ctid,d.*
FROM sid2.data_id_table d,public.points s
WHERE s.systems_id=2
AND s.id<2
AND s.permpoint=FALSE
AND s.id=d.point_id
AND d.persist=FALSE
AND d.dtype_id=3) AS markeddel
JOIN
(SELECT DISTINCT ON (d.key1_id,d.key2_id) d.*
FROM sid2.data_id_table d,public.points s
WHERE s.systems_id=2
AND s.id<=2
AND s.id=d.point_id
AND d.dtype_id=3
ORDER BY d.key1_id,d.key2_id,d.point_id DESC) AS rollup
ON
(markeddel.key1_id,markeddel.key2_id)=(rollup.key1_id,rollup.key2_id)
WHERE markeddel.point_id<>rollup.point_id) ru
WHERE ss.systems_id=2
AND ss.id<2
AND ss.permpoint=FALSE
AND ss.id=dd.point_id
AND dd.persist=FALSE
AND dd.dtype_id=3
AND dd.ctid=ru.ctid;
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------
Nested Loop (cost=1259.33..1378.37 rows=1 width=6) (actual
time=1807.429..2625.722 rows=562 loops=1)
-> Nested Loop (cost=1259.33..1378.08 rows=1 width=14) (actual
time=1807.372..2619.592 rows=562 loops=1)
-> Merge Join (cost=1259.33..1377.66 rows=1 width=6) (actual
time=1807.228..2606.901 rows=562 loops=1)
Merge Cond: ((rollup.key1_id = d.key1_id) AND
(rollup.key2_id = d.key2_id))
Join Filter: (d.point_id <> rollup.point_id)
-> Unique (cost=629.66..659.24 rows=3944 width=52)
(actual time=911.409..1271.121 rows=156779 loops=1)
-> Sort (cost=629.66..639.52 rows=3944 width=52)
(actual time=911.403..1024.775 rows=157342 loops=1)
Sort Key: d.key1_id, d.key2_id, d.point_id
-> Nested Loop (cost=0.00..394.10 rows=3944
width=52) (actual time=6.036..548.119 rows=157352 loops=1)
-> Seq Scan on points s
(cost=0.00..1.72 rows=1 width=8) (actual time=0.114..0.137 rows=2
loops=1)
Filter: ((systems_id = 2) AND (id
<= 2))
-> Index Scan using
data_id_table_point_id_indx on data_id_table d (cost=0.00..339.79
rows=4207 width=52) (actual time=3.216..155.284
rows=78676 loops=2)
Index Cond: (s.id = d.point_id)
Filter: (dtype_id = 3)
-> Sort (cost=629.66..639.52 rows=3944 width=30)
(actual time=875.213..980.618 rows=156750 loops=1)
Sort Key: d.key1_id, d.key2_id
-> Nested Loop (cost=0.00..394.10 rows=3944
width=30) (actual time=5.864..553.290 rows=156750 loops=1)
-> Seq Scan on points s (cost=0.00..1.72
rows=1 width=8) (actual time=0.022..0.053 rows=1 loops=1)
Filter: ((systems_id = 2) AND (id < 2)
AND (NOT permpoint))
-> Index Scan using
data_id_table_point_id_indx on data_id_table d (cost=0.00..339.79
rows=4207 width=30) (actual time=5.831..355.139 rows=156750 loops=1)
Index Cond: (s.id = d.point_id)
Filter: ((NOT persist) AND (dtype_id =
3))
-> Index Scan using data_id_table_ctid_idx on data_id_table dd
(cost=0.00..0.41 rows=1 width=14) (actual time=0.017..0.019 rows=1
loops=562)
Index Cond: (dd.ctid = d.ctid)
Filter: ((NOT persist) AND (dtype_id = 3))
-> Index Scan using points_pkey on points ss (cost=0.00..0.28
rows=1 width=8) (actual time=0.005..0.007 rows=1 loops=562)
Index Cond: ((ss.id < 2) AND (ss.id = dd.point_id))
Filter: ((systems_id = 2) AND (NOT permpoint))
Total runtime: 2641.820 ms
(29 rows)
Time: 2652.940 ms
cranel=# DROP INDEX data_id_table_ctid_idx;
DROP INDEX
Time: 33.653 ms
cranel=# DELETE FROM sid2.data_id_table AS dd WHERE dd.point_id=2 AND
dd.dtype_id=3 AND dd.deleted AND NOT dd.persist;
DELETE 0
Time: 0.960 ms
cranel=# COMMIT;
Time: 20.500 ms
-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Monday, April 09, 2007 4:55 PM
To: Spiegelberg, Greg
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] DELETE with filter on ctid
"Spiegelberg, Greg" <gspiegelberg(at)cranel(dot)com> writes:
> We have a query which generates a small set of rows (~1,000) which are
> to be used in a DELETE on the same table. The problem we have is that
> we need to join on 5 different columns and it takes far too long. I
> have a solution but I'm not sure it's the right one. Instead of
joining
> on 5 columns in the DELETE the join uses the ctid column.
> BEGIN;
> CREATE INDEX gregs_table_ctid_idx ON gregs_table(ctid);
> DELETE FROM gregs_table gt
> USING (SELECT ctid FROM gregs_table WHERE ...) as s
> WHERE gt.ctid=s.ctid;
> DROP INDEX gregs_table_ctid_idx;
> COMMIT;
Forget the index, it's useless here (hint: ctid is a physical address).
I'm wondering though why you don't just transpose the subquery's WHERE
condition into the DELETE's WHERE? Or is this example oversimplified?
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Spiegelberg, Greg | 2007-04-10 13:37:34 | Re: DELETE with filter on ctid |
Previous Message | Mike Gargano | 2007-04-10 13:15:33 | Re: Beginner Question |