Re: Extremely Slow Cascade Delete Operation

From: Yan Cheng Cheok <yccheok(at)yahoo(dot)com>
To: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
Cc: Grzegorz Jaśkiewicz <gryzman(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Extremely Slow Cascade Delete Operation
Date: 2010-01-22 06:16:06
Message-ID: 162102.5849.qm@web65708.mail.ac4.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Craig Ringer,

Really appreciate a lot for your advice! This at least has cleared my doubt, which had been confused me for quite some time.

Thanks and Regards
Yan Cheng CHEOK

--- On Fri, 1/22/10, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> wrote:

> From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
> Subject: Re: [GENERAL] Extremely Slow Cascade Delete Operation
> To: "Yan Cheng Cheok" <yccheok(at)yahoo(dot)com>
> Cc: "Grzegorz Jaśkiewicz" <gryzman(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
> Date: Friday, January 22, 2010, 12:51 PM
> Yan Cheng Cheok wrote:
> > I try to create a following simple scenario, to
> demonstrate cascade delete is rather slow in PostgreSQL.
> >
> > Can anyone help me to confirm? Is this my only machine
> problem, or every PostgreSQL users problem?
> >
> > I create 1 lot.
> > every lot is having 10000 unit
> > every unit is having 100 measurement.
>
> 101 measurements per unit by the looks. But it doesn't much
> matter.
>
>
> test=> CREATE INDEX fk_unit_id_idx ON measurement
> (fk_unit_id);
> CREATE INDEX
> Time: 3072.635 ms
>
>
> Now suddenly everything is much faster:
>
> test=> delete from lot;
> DELETE 1
> Time: 8066.140 ms
>
>
> Before that index creation, every deletion of a unit
> required a seqscan
> of `measurement' to find referenced measurements. At 200ms
> apiece, it
> would've taken about half an hour to `delete from lot' on
> my machine,
> and smaller deletes took a proportional amount of time (ie
> 20s for 100
> units). Now it takes 8 seconds to delete the lot.
>
> You just forgot to create an index on one of the foreign
> key
> relationships that you do a cascade delete on.
>
> BTW, Pg doesn't force you to do this because sometimes
> you'd prefer to
> wait. For example, you might do the deletes very rarely,
> and not way to
> pay the cost of maintaining the index the rest of the
> time.
>
> (What I was personally surprised by is that it's no faster
> to DELETE
> FROM measurement; directly than to delete via LOT. I
> would've expected a
> seqscan delete of the table to be MUCH faster than all the
> index-hopping
> required to delete via lot. I guess the reason there's no
> real
> difference is because the whole dataset fits in cache, so
> there's no
> seek penalty. )
>
> AFAIK, Pg isn't clever enough to batch foreign key deletes
> together and
> then plan them as a single operation. That means it can't
> use something
> other than a bunch of little index lookups where doing a
> sequential scan
> or a hash join might be faster. Adding this facility would
> certainly be
> an "interesting" project. Most of the time, though, you get
> on fine
> using index-based delete cascading, and you can generally
> pre-delete
> rows using a join on those rare occasions it is a problem.
>
> --
> Craig Ringer
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Greg Smith 2010-01-22 06:21:59 Re: Slow Query / Check Point Segments
Previous Message Scott Marlowe 2010-01-22 05:49:09 Re: Slow Query / Check Point Segments