From: | Yan Cheng Cheok <yccheok(at)yahoo(dot)com> |
---|---|
To: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> |
Cc: | Grzegorz Jaśkiewicz <gryzman(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Extremely Slow Cascade Delete Operation |
Date: | 2010-01-22 06:16:06 |
Message-ID: | 162102.5849.qm@web65708.mail.ac4.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi Craig Ringer,
Really appreciate a lot for your advice! This at least has cleared my doubt, which had been confused me for quite some time.
Thanks and Regards
Yan Cheng CHEOK
--- On Fri, 1/22/10, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> wrote:
> From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
> Subject: Re: [GENERAL] Extremely Slow Cascade Delete Operation
> To: "Yan Cheng Cheok" <yccheok(at)yahoo(dot)com>
> Cc: "Grzegorz Jaśkiewicz" <gryzman(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
> Date: Friday, January 22, 2010, 12:51 PM
> Yan Cheng Cheok wrote:
> > I try to create a following simple scenario, to
> demonstrate cascade delete is rather slow in PostgreSQL.
> >
> > Can anyone help me to confirm? Is this my only machine
> problem, or every PostgreSQL users problem?
> >
> > I create 1 lot.
> > every lot is having 10000 unit
> > every unit is having 100 measurement.
>
> 101 measurements per unit by the looks. But it doesn't much
> matter.
>
>
> test=> CREATE INDEX fk_unit_id_idx ON measurement
> (fk_unit_id);
> CREATE INDEX
> Time: 3072.635 ms
>
>
> Now suddenly everything is much faster:
>
> test=> delete from lot;
> DELETE 1
> Time: 8066.140 ms
>
>
> Before that index creation, every deletion of a unit
> required a seqscan
> of `measurement' to find referenced measurements. At 200ms
> apiece, it
> would've taken about half an hour to `delete from lot' on
> my machine,
> and smaller deletes took a proportional amount of time (ie
> 20s for 100
> units). Now it takes 8 seconds to delete the lot.
>
> You just forgot to create an index on one of the foreign
> key
> relationships that you do a cascade delete on.
>
> BTW, Pg doesn't force you to do this because sometimes
> you'd prefer to
> wait. For example, you might do the deletes very rarely,
> and not way to
> pay the cost of maintaining the index the rest of the
> time.
>
> (What I was personally surprised by is that it's no faster
> to DELETE
> FROM measurement; directly than to delete via LOT. I
> would've expected a
> seqscan delete of the table to be MUCH faster than all the
> index-hopping
> required to delete via lot. I guess the reason there's no
> real
> difference is because the whole dataset fits in cache, so
> there's no
> seek penalty. )
>
> AFAIK, Pg isn't clever enough to batch foreign key deletes
> together and
> then plan them as a single operation. That means it can't
> use something
> other than a bunch of little index lookups where doing a
> sequential scan
> or a hash join might be faster. Adding this facility would
> certainly be
> an "interesting" project. Most of the time, though, you get
> on fine
> using index-based delete cascading, and you can generally
> pre-delete
> rows using a join on those rare occasions it is a problem.
>
> --
> Craig Ringer
>
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Smith | 2010-01-22 06:21:59 | Re: Slow Query / Check Point Segments |
Previous Message | Scott Marlowe | 2010-01-22 05:49:09 | Re: Slow Query / Check Point Segments |