| From: | Steve Wampler <swampler(at)noao(dot)edu> | 
|---|---|
| To: | Scott Lamb <slamb(at)slamb(dot)org> | 
| Cc: | Postgres-performance <pgsql-performance(at)postgresql(dot)org> | 
| Subject: | Re: Help speeding up delete | 
| Date: | 2005-11-14 23:52:53 | 
| Message-ID: | 43792355.1020501@noao.edu | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-performance pgsql-www | 
Scott Lamb wrote:
> On Nov 14, 2005, at 2:07 PM, Steve Wampler wrote:
> 
>> # SELECT at.id FROM "tmp_table2" at, "tmp_tabl2e" a
>> #   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505';
> 
> 
> Isn't this equivalent?
> 
> select id from tmp_table2 where name = 'obsid' and value = 'oid080505';
Probably, the user based the above on a query designed to find
all rows with the same id as those rows that have a.name='obsid' and
a.value='oid080505'.  However, I think the above would work to locate
all the ids, which is all we need for the delete (see below)
>> # DELETE FROM "tmp_table2" WHERE id IN
>> # (SELECT at.id FROM "tmp_table2" at, "tmp_table2" a
>> #   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505');
> 
> 
> and this?
> 
> delete from tmp_table2 where name = 'obsid' and value = 'oid080505';
> 
> Why are you doing a self-join using id, which I assume is a primary key?
Because I think we need to.  The above would only delete rows that have
name = 'obsid' and value = 'oid080505'.  We need to delete all rows that
have the same ids as those rows.  However, from what you note, I bet
we could do:
   DELETE FROM "tmp_table2" WHERE id IN
      (SELECT id FROM "temp_table2" WHERE name = 'obsid' and value= 'oid080505');
However, even that seems to have a much higher cost than I'd expect:
   lab.devel.configdb=# explain delete from "tmp_table2" where id in
        (select id from tmp_table2 where name='obsid' and value = 'oid080505');
   NOTICE:  QUERY PLAN:
   Seq Scan on tmp_table2  (cost=0.00..65705177237.26 rows=769844 width=6)
     SubPlan
       ->  Materialize  (cost=42674.32..42674.32 rows=38 width=50)
             ->  Seq Scan on tmp_table2  (cost=0.00..42674.32 rows=38 width=50)
EXPLAIN
And, sure enough, is taking an extrordinarily long time to run (more than
10 minutes so far, compared to < 10seconds for the select).  Is this
really typical of deletes?  It appears (to me) to be the Seq Scan on tmp_table2
that is the killer here.  If we put an index on, would it help?  (The user
claims she tried that and it's EXPLAIN cost went even higher, but I haven't
checked that...)
Thanks!
-- 
Steve Wampler -- swampler(at)noao(dot)edu
The gods that smiled on your birth are now laughing out loud.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Steve Wampler | 2005-11-15 00:00:35 | Re: Help speeding up delete | 
| Previous Message | Dave Cramer | 2005-11-14 23:51:09 | Re: Performance PG 8.0 on dual opteron / 4GB / 3ware Raid5 / Debian?? | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Steve Wampler | 2005-11-15 00:00:35 | Re: Help speeding up delete | 
| Previous Message | Tom Lane | 2005-11-14 23:42:43 | Re: Help speeding up delete |