Re: Deleting Rows From Large Tables

From: Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
To: Sékine Coulibaly <scoulibaly(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Deleting Rows From Large Tables
Date: 2013-05-18 07:15:26
Message-ID: CABWW-d3H5yq6FxdYk951OekwEUOSZNwSxH-iWSWtqSdBe8WRuA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Analyze your temp tables after filling and before using!
17 трав. 2013 17:27, "Sékine Coulibaly" <scoulibaly(at)gmail(dot)com> напис.

> Oh, sorry, overlooked that part.
> Maybe refreshing stats with VACUUM FULL ?
>
>
> 2013/5/17 Robert Emery <robertemery(at)codeweavers(dot)net>
>
>> Hi Sékine,
>>
>> Unfortunately I'm not trying to empty the table completely, just
>> delete about 10-15% of the data in it.
>>
>> Thanks,
>>
>> On 17 May 2013 14:11, Sékine Coulibaly <scoulibaly(at)gmail(dot)com> wrote:
>> > Rob,
>> >
>> > Did you tried TRUNCATE ?
>> > http://www.postgresql.org/docs/8.4/static/sql-truncate.html
>> >
>> > This is is supposed to be quicker since it does scan the table.
>> >
>> > Regards
>> >
>> >
>> > 2013/5/17 Rob Emery <re-pgsql(at)codeweavers(dot)net>
>> >>
>> >> Hi All,
>> >>
>> >> We've got 3 quite large tables that due to an unexpected surge in
>> >> usage (!) have grown to about 10GB each, with 72, 32 and 31 million
>> >> rows in. I've been tasked with cleaning out about half of them, the
>> >> problem I've got is that even deleting the first 1,000,000 rows seems
>> >> to take an unreasonable amount of time. Unfortunately this is on quite
>> >> an old server (Dell 2950 with a RAID-10 over 6 disks) running Postgres
>> >> 8.4; which serves other things like our logging systems.
>> >>
>> >> If I run a sustained (more than about 5 minutes) delete it'll have a
>> >> detrimental effect on the other services. I'm trying to batch up the
>> >> deletes into small chunks of approximately 1 month of data ; even this
>> >> seems to take too long, I originally reduced this down to a single
>> >> day's data and had the same problem. I can keep decreasing the size of
>> >> the window I'm deleting but I feel I must be doing something either
>> >> fundamentally wrong or over-complicating this enormously. I've
>> >> switched over to retrieving a list of IDs to delete, storing them in
>> >> temporary tables and deleting based on the primary keys on each of the
>> >> tables with something similar to this:
>> >>
>> >> BEGIN TRANSACTION;
>> >>
>> >> CREATE TEMPORARY TABLE table_a_ids_to_delete (id INT);
>> >> CREATE TEMPORARY TABLE table_b_ids_to_delete (id INT);
>> >>
>> >> INSERT INTO table_a_ids_to_delete
>> >> SELECT id FROM table_a WHERE purchased ='-infinity' AND created_at
>> >> < '2007-01-01T00:00:00';
>> >>
>> >> INSERT INTO table_b_ids_to_delete
>> >> SELECT table_b_id FROM table_a_table_b_xref
>> >> INNER JOIN table_a_ids_to_delete ON (table_a_ids_to_delete.id =
>> >> table_a_table_b.quote_id);
>> >>
>> >> DELETE FROM table_a_table_b_xref USING table_a_ids_to_delete
>> >> WHERE table_a_table_b_xref.table_a_id = table_a_ids_to_delete.id;
>> >>
>> >> DELETE FROM table_b USING table_b_ids_to_delete
>> >> WHERE table_b.id = table_b_ids_to_delete.id;
>> >>
>> >> DELETE FROM table_a USING table_a_ids_to_delete
>> >> WHERE table_a.id = table_a_ids_to_delete.id;
>> >>
>> >> COMMIT;
>> >>
>> >> There're indices on table_a on the queried columns, table_b's primary
>> >> key is it's id, and table_a_table_b_xref has an index on (table_a_id,
>> >> table_b_id). There're FK defined on the xref table, hence why I'm
>> >> deleting from it first.
>> >>
>> >> Does anyone have any ideas as to what I can do to make the deletes any
>> >> faster? I'm running out of ideas!
>> >>
>> >> Thanks in advance,
>> >>
>> >> --
>> >> Rob Emery
>> >>
>> >>
>> >> --
>> >> Sent via pgsql-performance mailing list (
>> pgsql-performance(at)postgresql(dot)org)
>> >> To make changes to your subscription:
>> >> http://www.postgresql.org/mailpref/pgsql-performance
>> >
>> >
>>
>>
>>
>> --
>> Robert Emery
>> Database Administrator
>>
>> | T: 0800 021 0888 | www.codeweavers.net |
>> | Codeweavers Limited | Barn 4 | Dunston Business Village | Dunston |
>> ST18 9AB |
>> | Registered in England and Wales No. 04092394 | VAT registration no.
>> 974 9705 63 |
>>
>> CUSTOMERS' BLOG TWITTER FACEBOOK LINKED IN
>> DEVELOPERS' BLOG YOUTUBE
>>
>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message ach 2013-05-18 15:35:51 Re: statistics target for columns in unique constraint?
Previous Message Sékine Coulibaly 2013-05-17 14:26:00 Re: Deleting Rows From Large Tables