improving performance for a delete

From: kevin kempter <kevin(at)kevinkempterllc(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: improving performance for a delete
Date: 2008-05-20 19:51:45
Message-ID: 1F6D96EA-31C8-47D0-9D72-664639C387C0@kevinkempterllc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi all;

I have 2 tables where I basically want to delete from the first table
(seg_id_tmp7) any rows where the entire row already exists in the
second table (sl_cd_segment_dim)

I have a query that looks like this (and it's slow):

delete from seg_id_tmp7
where
customer_srcid::text ||
show_srcid::text ||
show_name::text ||
season_srcid::text ||
season_name::text ||
episode_srcid::text ||
episode_name::text ||
segment_type_id::text ||
segment_type::text ||
segment_srcid::text ||
segment_name::text
in
( select
customer_srcid::text ||
show_srcid::text ||
show_name::text ||
season_srcid::text ||
season_name::text ||
episode_srcid::text ||
episode_name::text ||
segment_type_id::text ||
segment_type::text ||
segment_srcid::text ||
segment_name::text
from sl_cd_location_dim )
;

Here's the query plan for it:

QUERY PLAN
-----------------------------------------------------------------------------------
Seq Scan on seg_id_tmp7 (cost=0.00..138870701.56 rows=2136 width=6)
Filter: (subplan)
SubPlan
-> Seq Scan on sl_cd_location_dim (cost=0.00..63931.60
rows=433040 width=8)
(4 rows)

I also tried this:

delete from seg_id_tmp7
where
( customer_srcid ,
show_srcid ,
show_name ,
season_srcid ,
season_name ,
episode_srcid ,
episode_name ,
segment_type_id ,
segment_type ,
segment_srcid ,
segment_name )
in
( select
customer_srcid ,
show_srcid ,
show_name ,
season_srcid ,
season_name ,
episode_srcid ,
episode_name ,
segment_type_id ,
segment_type ,
segment_srcid ,
segment_name
from sl_cd_location_dim )
;

and I get this query plan:

QUERY PLAN
-----------------------------------------------------------------------------------
Seq Scan on seg_id_tmp7 (cost=0.00..87997034.20 rows=2136 width=6)
Filter: (subplan)
SubPlan
-> Seq Scan on sl_cd_location_dim (cost=0.00..40114.40
rows=433040 width=8)
(4 rows)

If it helps here's the describe's (including indexes) for both tables:

# \d seg_id_tmp7
Table "public.seg_id_tmp7"
Column | Type | Modifiers
-----------------+-----------------------------+-----------
customer_srcid | bigint |
show_srcid | bigint |
show_name | character varying |
season_srcid | bigint |
season_name | character varying |
episode_srcid | bigint |
episode_name | character varying |
segment_type_id | bigint |
segment_type | character varying |
segment_srcid | bigint |
segment_name | character varying |
create_dt | timestamp without time zone |

# \d sl_cd_segment_dim
Table
"public.sl_cd_segment_dim"
Column | Type
| Modifiers
----------------------+-----------------------------
+-------------------------------------------------------------
sl_cd_segment_dim_id | bigint | not null
default nextval('sl_cd_segment_dim_seq'::regclass)
customer_srcid | bigint | not null
show_srcid | bigint | not null
show_name | character varying(500) | not null
season_srcid | bigint | not null
season_name | character varying(500) | not null
episode_srcid | bigint | not null
episode_name | character varying(500) | not null
segment_type_id | integer |
segment_type | character varying(500) |
segment_srcid | bigint |
segment_name | character varying(500) |
effective_dt | timestamp without time zone | not null
default now()
inactive_dt | timestamp without time zone |
last_update_dt | timestamp without time zone | not null
default now()
Indexes:
"sl_cd_segment_dim_pk" PRIMARY KEY, btree (sl_cd_segment_dim_id)
"seg1" btree (customer_srcid)
"seg2" btree (show_srcid)
"seg3" btree (season_srcid)
"seg4" btree (episode_srcid)
"seg5" btree (segment_srcid)
"sl_cd_segment_dim_ix1" btree (customer_srcid)

Any thoughts, suggestions, etc on how to improve performance for this
delete ?

Thanks in advance..

/Kevin

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message kevin kempter 2008-05-20 20:03:30 Re: improving performance for a delete
Previous Message Scott Marlowe 2008-05-20 17:47:13 Re: slow update