Re: Extremely Slow Cascade Delete Operation

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Yan Cheng Cheok <yccheok(at)yahoo(dot)com>
Cc: Grzegorz Jaśkiewicz <gryzman(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Extremely Slow Cascade Delete Operation
Date: 2010-01-22 04:51:18
Message-ID: 4B592EC6.6080709@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Yan Cheng Cheok wrote:
> I try to create a following simple scenario, to demonstrate cascade delete is rather slow in PostgreSQL.
>
> Can anyone help me to confirm? Is this my only machine problem, or every PostgreSQL users problem?
>
> I create 1 lot.
> every lot is having 10000 unit
> every unit is having 100 measurement.

101 measurements per unit by the looks. But it doesn't much matter.

test=> CREATE INDEX fk_unit_id_idx ON measurement (fk_unit_id);
CREATE INDEX
Time: 3072.635 ms

Now suddenly everything is much faster:

test=> delete from lot;
DELETE 1
Time: 8066.140 ms

Before that index creation, every deletion of a unit required a seqscan
of `measurement' to find referenced measurements. At 200ms apiece, it
would've taken about half an hour to `delete from lot' on my machine,
and smaller deletes took a proportional amount of time (ie 20s for 100
units). Now it takes 8 seconds to delete the lot.

You just forgot to create an index on one of the foreign key
relationships that you do a cascade delete on.

BTW, Pg doesn't force you to do this because sometimes you'd prefer to
wait. For example, you might do the deletes very rarely, and not way to
pay the cost of maintaining the index the rest of the time.

(What I was personally surprised by is that it's no faster to DELETE
FROM measurement; directly than to delete via LOT. I would've expected a
seqscan delete of the table to be MUCH faster than all the index-hopping
required to delete via lot. I guess the reason there's no real
difference is because the whole dataset fits in cache, so there's no
seek penalty. )

AFAIK, Pg isn't clever enough to batch foreign key deletes together and
then plan them as a single operation. That means it can't use something
other than a bunch of little index lookups where doing a sequential scan
or a hash join might be faster. Adding this facility would certainly be
an "interesting" project. Most of the time, though, you get on fine
using index-based delete cascading, and you can generally pre-delete
rows using a join on those rare occasions it is a problem.

--
Craig Ringer

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Scott Marlowe 2010-01-22 05:46:50 Re: Slow Query / Check Point Segments
Previous Message Alex - 2010-01-22 04:13:17 Slow Query / Check Point Segments