From: | Greg Spiegelberg <gspiegelberg(at)gmail(dot)com> |
---|---|
To: | Wells Oliver <wells(dot)oliver(at)gmail(dot)com> |
Cc: | pgsql-admin <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: Deleting more efficiently from large partitions |
Date: | 2020-06-16 13:57:10 |
Message-ID: | CAEtnbpWYqwSSmbZHPe5OKfx2zqEmTypv_4Y=QrFHdPKPfMKiOw@mail.gmail.com |
Lists: | pgsql-admin |
On Mon, Jun 15, 2020 at 7:39 PM Wells Oliver <wells(dot)oliver(at)gmail(dot)com> wrote:
> Hi all. I have a partitioned table (by month from a date column), where
> each partition contains something like 400m rows.
>
> Each partition is defined by a PK with a uuid and date field (the parent
> table is partitioned by range on the date), and two other columns.
>
> In doing a delete for a specific date, e.g. DELETE FROM t WHERE date =
> '2019-09-01' AND uuid IN (SELECT uuid FROM temptable), it runs very
> efficiently.
>
> I am trying to write a processing script that deletes for potentially
> multiple dates & uuid values, and it just takes hours, trying:
>
> DELETE FROM t WHERE date = (SELECT DISTINCT date from temp) AND uuid IN
> (select uuid from tempuuds) -- no go, hours.
>
> Tried USING, e.g. DELETE FROM t USING temp WHERE t.date = temp.date AND
> t.uuid = temp.uuid -- no go, hours.
>
> I just can't delete from this table without an explicit date and a set of
> uuids using a WHERE IN approach, but I need to. I was thinking of making a
> plpgsql function or something that loops through dates and makes a more
> explicit DELETE statement, but I'm thinking there must be some better way
> using indexing or something.
>
> Appreciate any tips.
>
Have you considered partitioning by day instead of month? That could
eliminate an index you may have on the date column.
How many days does the multi-day DELETE cover? Could you simply wrap it in a
transaction and issue one DELETE per day?
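A minimal sketch of the one-DELETE-per-day idea, using the names from the original question (table t, staging table temp with date and uuid columns). The layout of temp is an assumption, and this has not been run against the real schema. The date is embedded as a literal with format() so that each statement looks like the explicit single-date DELETE that already runs efficiently:

```sql
-- Assumed names: t(date, uuid, ...) is the partitioned table,
-- temp(date, uuid) holds the rows to remove.
DO $$
DECLARE
    d date;    -- the day currently being processed
    n bigint;  -- rows removed for that day
BEGIN
    FOR d IN SELECT DISTINCT date FROM temp ORDER BY date LOOP
        -- %L quotes the date as a literal, so the planner sees a constant
        -- and can restrict the DELETE to the matching partition.
        EXECUTE format(
            'DELETE FROM t WHERE date = %L
               AND uuid IN (SELECT uuid FROM temp WHERE date = %L)',
            d, d);
        GET DIAGNOSTICS n = ROW_COUNT;
        RAISE NOTICE 'deleted % rows for %', n, d;
    END LOOP;
END
$$;
```

The DO block runs as a single statement, so all of the per-day deletes commit or roll back together.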
You could potentially get better performance by removing the JOIN/sub-SELECT
on the date and passing the dates as a literal array instead:
DELETE FROM mytable
WHERE date_col = ANY( ARRAY['2020-01-01', '2020-01-13']::date[] );
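To check whether a constant array really limits the DELETE to the expected partitions, it may help to look at the plan first. The sketch below uses the names from the original question (t, temp, date, uuid); the dates are placeholders, and the extra (date, uuid) condition is only an assumption about how the uuid filter would be combined with the array:

```sql
-- EXPLAIN without ANALYZE only plans the statement; nothing is deleted.
-- Partitions the planner was able to prune should not appear in the output.
EXPLAIN
DELETE FROM t
WHERE date = ANY (ARRAY['2019-09-01', '2019-09-02']::date[])
  AND (date, uuid) IN (SELECT date, uuid FROM temp);

-- If the plan touches only the expected monthly partitions, run the same
-- statement without EXPLAIN.
DELETE FROM t
WHERE date = ANY (ARRAY['2019-09-01', '2019-09-02']::date[])
  AND (date, uuid) IN (SELECT date, uuid FROM temp);
```

The array of dates can be produced beforehand with something like SELECT array_agg(DISTINCT date) FROM temp and pasted in by the processing script.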
HTH
-Greg