Deleting more efficiently from large partitions

From: Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
To: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Deleting more efficiently from large partitions
Date: 2020-06-16 01:39:02
Message-ID: CAOC+FBWgX0D=WJBT43wTHUp=t=7xvBZB_ryVgK-QpURNrTJQ=w@mail.gmail.com
Lists: pgsql-admin

Hi all. I have a partitioned table (by month from a date column), where
each partition contains something like 400m rows.

Each partition is defined by a PK with a uuid and date field (the parent
table is partitioned by range on the date), and two other columns.

In doing a delete for a specific date, e.g. DELETE FROM t WHERE date =
'2019-09-01' AND uuid IN (SELECT uuid FROM temptable), it runs very
efficiently.

I am trying to write a processing script that deletes for potentially
multiple dates & uuid values, and it just takes hours, trying:

DELETE FROM t WHERE date = (SELECT DISTINCT date from temp) AND uuid IN
(select uuid from tempuuds) -- no go, hours.

Tried USING, e.g. DELETE FROM t USING temp WHERE t.date = temp.date AND
t.uuid = temp.uuid -- no go, hours.

I just can't delete from this table without an explicit date and a set of
uuids using a WHERE IN approach, but I need to. I was thinking of making a
plpgsql function or something that loops through dates and makes a more
explicit DELETE statement, but I'm thinking there must be some better way
using indexing or something.
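
For illustration, the loop idea might look roughly like the sketch below.
It is only a sketch under assumptions: it assumes temp holds both the date
and uuid columns (as in the USING attempt), and it simply issues one DELETE
per distinct date with the date inlined as a literal, which is the form
that already runs efficiently. Whether it is fast enough on 400m-row
partitions would need to be checked with EXPLAIN.

```sql
-- Sketch only: t, temp, date and uuid are the names used in this message;
-- temp is assumed to contain one row per (date, uuid) pair to delete.
DO $$
DECLARE
    d date;    -- the date currently being processed
    n bigint;  -- rows deleted for that date
BEGIN
    FOR d IN SELECT DISTINCT date FROM temp LOOP
        -- Build the statement dynamically so the date appears as a literal
        -- (%L quotes it safely), matching the explicit-date DELETE above.
        EXECUTE format(
            'DELETE FROM t WHERE date = %L '
            'AND uuid IN (SELECT uuid FROM temp WHERE date = %L)',
            d, d);
        GET DIAGNOSTICS n = ROW_COUNT;
        RAISE NOTICE 'deleted % rows for %', n, d;
    END LOOP;
END
$$;
```

The DO block runs as a single transaction, so all of the per-date deletes
commit or roll back together.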

Appreciate any tips.

--
Wells Oliver
wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>
