Re: Deleting more efficiently from large partitions

From: Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
To: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Deleting more efficiently from large partitions
Date: 2020-06-16 16:47:47
Message-ID: CAOC+FBVn_n5_QQKfFygrwv0SD4rN7neWRRa2xJNMaUVRfXTCWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

So the PK is date + uuid + two other columns. I also have an index (btree)
on uuid. When I try to delete via DELETE FROM mpd WHERE uuid IN (SELECT
uuid FROM temptable), it takes a long long time as it scans every
partition. The explain output is huge. Here is some of it.

The full explain is here:
https://www.dropbox.com/s/a6rowvalbu0vdiq/explain.txt?dl=0

This will take hours if I put a couple of thousand uuid records in the temp
table. It's untenable. The delete here does NOT seem to be using the index
on uuid

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Delete on mpd (cost=0.42..264953274.12 rows=5828371861 width=12) (actual
time=286791.526..286791.526 rows=0 loops=1)
Delete on mpd_2015_01
...
Delete on mpd_2021_12
Buffers: shared hit=834134 read=4148792 dirtied=16844, local read=463956
written=1023
-> Nested Loop Semi Join (cost=0.42..526.20 rows=550 width=12) (actual
time=0.010..0.010 rows=0 loops=1)
-> Seq Scan on mpd_2015_01 (cost=0.00..21.00 rows=1100 width=22)
(actual time=0.008..0.008 rows=0 loops=1)
-> Index Scan using uuid_idx on temptable (cost=0.42..120.11
rows=4088 width=22) (never executed)
Index Cond: (uuid = mpd_2015_01.uuid)
-> Nested Loop Semi Join (cost=0.42..526.20 rows=550 width=12) (actual
time=0.005..0.005 rows=0 loops=1)
-> Seq Scan on mpd_2015_02 (cost=0.00..21.00 rows=1100 width=22)
(actual time=0.005..0.005 rows=0 loops=1)
-> Index Scan using uuid_idx on temptable (cost=0.42..120.11
rows=4088 width=22) (never executed)
Index Cond: (uuid = mpd_2015_02.uuid)
-> Nested Loop Semi Join (cost=0.42..526.20 rows=550 width=12) (actual
time=0.002..0.003 rows=0 loops=1)
-> Seq Scan on mpd_2015_03 (cost=0.00..21.00 rows=1100 width=22)
(actual time=0.002..0.002 rows=0 loops=1)
-> Index Scan using uuid_idx on temptable (cost=0.42..120.11
rows=4088 width=22) (never executed)
Index Cond: (uuid = mpd_2015_03.uuid)
-> Nested Loop (cost=20606.19..7121506.57 rows=149367744 width=12)
(actual time=7934.416..7934.416 rows=0 loops=1)
Buffers: shared hit=478 read=558, local read=10086 written=1023
-> HashAggregate (cost=20306.34..20308.34 rows=200 width=22)
(actual time=7555.857..7556.264 rows=259 loops=1)
Group Key: temptable.uuid
Buffers: local read=10086 written=1023
-> Seq Scan on temptable (cost=0.00..18262.27 rows=817627
width=22) (actual time=7257.016..7399.478 rows=817627 loops=1)
Buffers: local read=10086 written=1023
-> Bitmap Heap Scan on mpd_2015_04 (cost=299.85..35376.78
rows=12921 width=22) (actual time=1.457..1.457 rows=0 loops=259)
Recheck Cond: (uuid = temptable.uuid)
Buffers: shared hit=478 read=558
-> Bitmap Index Scan on mpd_2015_04_uuid_idx
(cost=0.00..296.62 rows=12921 width=0) (actual time=1.455..1.455 rows=0
loops=259)
Index Cond: (uuid = temptable.uuid)
Buffers: shared hit=478 read=558
...
-> Hash Join (cost=20310.84..646068.95 rows=10984718 width=12) (actual
time=13374.218..13374.219 rows=0 loops=1)
Hash Cond: (mpd_2020_02.uuid = temptable.uuid)
Buffers: shared read=226189, local read=10086
-> Seq Scan on mpd_2020_02 (cost=0.00..445883.36 rows=21969436
width=22) (actual time=2.129..10268.765 rows=21969107 loops=1)
Buffers: shared read=226189
-> Hash (cost=20308.34..20308.34 rows=200 width=22) (actual
time=292.617..292.617 rows=259 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 22kB
Buffers: local read=10086
-> HashAggregate (cost=20306.34..20308.34 rows=200
width=22) (actual time=292.501..292.552 rows=259 loops=1)
Group Key: temptable.uuid
Buffers: local read=10086
-> Seq Scan on temptable (cost=0.00..18262.27
rows=817627 width=22) (actual time=0.023..136.102 rows=817627 loops=1)
Buffers: local read=10086
-> Hash Join (cost=20310.84..390240.09 rows=6494055 width=12) (actual
time=8221.586..8221.586 rows=0 loops=1)
Hash Cond: (mpd_2020_03.uuid = temptable.uuid)
Buffers: shared read=133708, local read=10086
-> Seq Scan on mpd_2020_03 (cost=0.00..263589.10 rows=12988110
width=22) (actual time=2.118..6207.373 rows=12987440 loops=1)
Buffers: shared read=133708
-> Hash (cost=20308.34..20308.34 rows=200 width=22) (actual
time=289.169..289.169 rows=259 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 22kB
Buffers: local read=10086
-> HashAggregate (cost=20306.34..20308.34 rows=200
width=22) (actual time=289.054..289.103 rows=259 loops=1)
Group Key: temptable.uuid
Buffers: local read=10086
-> Seq Scan on temptable (cost=0.00..18262.27
rows=817627 width=22) (actual time=0.018..135.729 rows=817627 loops=1)
Buffers: local read=10086
-> Nested Loop Semi Join (cost=0.42..526.20 rows=550 width=12) (actual
time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on mpd_2020_04 (cost=0.00..21.00 rows=1100 width=22)
(actual time=0.006..0.006 rows=0 loops=1)
-> Index Scan using uuid_idx on temptable (cost=0.42..120.11
rows=4088 width=22) (never executed)
Index Cond: (uuid = mpd_2020_04.uuid)
...
-> Nested Loop Semi Join (cost=0.42..526.20 rows=550 width=12) (actual
time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on mpd_2021_12 (cost=0.00..21.00 rows=1100 width=22)
(actual time=0.002..0.002 rows=0 loops=1)
-> Index Scan using uuid_idx on temptable (cost=0.42..120.11
rows=4088 width=22) (never executed)
Index Cond: (uuid = mpd_2021_12.uuid)
Planning Time: 473.612 ms
JIT:
Functions: 932
Options: Inlining true, Optimization true, Expressions true, Deforming
true
Timing: Generation 99.333 ms, Inlining 9.228 ms, Optimization 4079.459
ms, Emission 3159.415 ms, Total 7347.435 ms
Execution Time: 286901.250 ms
(843 rows)

On Tue, Jun 16, 2020 at 6:57 AM Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
wrote:

> On Mon, Jun 15, 2020 at 7:39 PM Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
> wrote:
>
>> Hi all. I have a partitioned table (by month from a date column), where
>> each partition contains something like 400m rows.
>>
>> Each partition is defined by a PK with a uuid and date field (the parent
>> table is partitioned by range on the date), and two other columns.
>>
>> In doing a delete for a specific date, e.g. DELETE FROM t WHERE date =
>> '2019-09-01' AND uuid IN (SELECT uuid FROM temptable), it runs very
>> efficiently.
>>
>> I am trying to write a processing script that deletes for potentially
>> multiple dates & uuid values, and it just takes hours, trying:
>>
>> DELETE FROM t WHERE date = (SELECT DISTINCT date from temp) AND uuid IN
>> (select uuid from tempuuds) -- no go, hours.
>>
>> Tried USING, e.g. DELETE FROM t USING temp WHERE t.date = temp.date AND
>> t.uuid = temp.uuid -- no go, hours.
>>
>> I just can't delete from this table without an explicit date and a set of
>> uuids using a WHERE IN approach, but I need to.I was thinking of making a
>> plpgsql function or something that loops through dates and makes a more
>> explicit DELETE statement, but I'm thinking there must be some better way
>> using indexing or something.
>>
>> Appreciate any tips.
>>
>
> Have you considered partitioning by day instead of month? Could eliminate
> an index you may have on the date column.
> How many days are in the many-days DELETE? Could you simply wrap it in a
> transaction and do one DELETE per day?
>
> You could potentially get better performance removing the JOIN/sub-SELECT
> using
> DELETE FROM mytable WHERE date_col = ANY( ARRAY['2020-01-01',
> '2020-01-13']::date[] );
>
> HTH
> -Greg
>

--
Wells Oliver
wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Bruce Momjian 2020-06-16 17:00:44 Re: Upgrade streaming replication and log-shipping standby servers
Previous Message Achilleas Mantzios 2020-06-16 16:04:05 please help! losing my subscriber node