Re: Deletion Challenge

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Berend Tober <btober(at)computer(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Deletion Challenge
Date: 2015-12-09 20:31:35
Message-ID: 56688FA7.9090800@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 12/09/2015 12:24 AM, Berend Tober wrote:
> Adrian Klaver wrote:
>> On 12/05/2015 08:08 AM, Berend Tober wrote:
>>> /*
>>>
>>> Deletion Challenge
>>>
>>> I want to delete all but the most recent transaction, per person, from a
>>> table that records a transaction history because at some point the
>>> transaction history grows large enough to adversely effect performance,
>>> and also becomes less relevant for retention.
>>>
>>> ...
>>>
>>
>> test=> delete from cash_journal where ARRAY[click, cash_journal_id]
>> NOT in (select max(ARRAY[click,cash_journal_id]) from cash_journal
>> group by fairian_id);
>> DELETE 7
>>
>> test=> SELECT * FROM cash_journal order by fairian_id, click,
>> cash_journal_id;
>> click | cash_journal_id | fairian_id | debit | credit | balance
>> | description
>> -------+-----------------+------------+-------+--------+---------+----------------------------------
>>
>> 412 | 1 | 7 | 5 | | 14 |
>> Sold food quantity 7 units.
>> 37 | 7 | 8 | 8 | | 8 |
>> Ratified contract f1abd670358e03
>> 37 | 9 | 9 | 7 | | 7 |
>> Ratified contract 1574bddb75c78a
>> 36 | 14 | 18 | 0 | 0 | 0 |
>> initial cash balance
>> 413 | 1 | 25 | | 995 | 0 |
>> Redeemed bond 7719a1c782a1ba
>> (5 rows)
>>
>
> Nice.
>
> The idea of a NOT IN query had occurred to me briefly, but I failed to
> pursue it because at some point in the distant past I had gained the
> impression that NOT IN queries were not computationally efficient.
> During one round of testing I had like a million rows. I'll have to run
> some EXPLAIN query testing with a larger data sample for comparison.
> Thanks!

Plan B:

WITH d AS
(SELECT * FROM
cash_journal
LEFT JOIN
(SELECT
MAX(ARRAY[click,cash_journal_id]) AS mx
FROM
cash_journal
GROUP BY
fairian_id)
AS
mxa
ON
mxa.mx=ARRAY[click, cash_journal_id]
WHERE
mx IS NULL)
DELETE FROM
cash_journal
USING
d
WHERE
d.click = cash_journal.click
AND
d.cash_journal_id = cash_journal.cash_journal_id;

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2015-12-09 20:43:11 Re: Deletion Challenge
Previous Message John R Pierce 2015-12-09 19:38:18 Re: HELP!!! The WAL Archive is taking up all space