From: | Drew Wilson <drewmwilson(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: how to efficiently update tuple in many-to-many relationship? |
Date: | 2007-04-10 01:29:41 |
Message-ID: | 22512E5C-A60F-43E7-A370-257F7AF956FD@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
My apologies. That function call was some test code to verify that my
subselect was only being called once.
Let me try again, please.
Here's the query plan for a SELECT statement that returns 1,207,161
rows in 6 seconds.
MatchBox=# explain select count(translation_pair_id) from
translation_pair_data
join instance i using(translation_pair_id)
join loc_submission ls using(loc_submission_id)
where ls.is_public = true;
QUERY PLAN
------------------------------------------------------------------------
-----------------------------------
Aggregate (cost=299276.72..299276.73 rows=1 width=4)
-> Hash Join (cost=59962.72..294036.83 rows=2095954 width=4)
Hash Cond: (i.translation_pair_id =
translation_pair_data.translation_pair_id)
-> Hash Join (cost=369.15..177405.01 rows=2095954 width=4)
Hash Cond: (i.loc_submission_id = ls.loc_submission_id)
-> Seq Scan on instance i (cost=0.00..99016.16
rows=5706016 width=8)
-> Hash (cost=296.92..296.92 rows=5778 width=4)
-> Index Scan using loc_submission_is_public
on loc_submission ls (cost=0.00..296.92 rows=5778 width=4)
Index Cond: (is_public = true)
Filter: is_public
-> Hash (cost=31861.92..31861.92 rows=1690292 width=4)
-> Seq Scan on translation_pair_data
(cost=0.00..31861.92 rows=1690292 width=4)
And here's the query plan for the UPDATE query that seems to never
complete. (Execution time > 30 minutes.)
MatchBox=# explain update translation_pair_data set is_public = true
where translation_pair_id in
(select translation_pair_id from translation_pair_data
join instance i using(translation_pair_id)
join loc_submission ls using(loc_submission_id)
where ls.is_public = true);
QUERY PLAN
------------------------------------------------------------------------
-------------------------------------------------------------
Hash IN Join (cost=328000.49..453415.65 rows=1690282 width=90)
Hash Cond: (public.translation_pair_data.translation_pair_id =
public.translation_pair_data.translation_pair_id)
-> Seq Scan on translation_pair_data (cost=0.00..31861.82
rows=1690282 width=90)
-> Hash (cost=293067.74..293067.74 rows=2067660 width=8)
-> Hash Join (cost=59958.35..293067.74 rows=2067660 width=8)
Hash Cond: (i.translation_pair_id =
public.translation_pair_data.translation_pair_id)
-> Hash Join (cost=365.00..177117.92 rows=2067660
width=4)
Hash Cond: (i.loc_submission_id =
ls.loc_submission_id)
-> Seq Scan on instance i
(cost=0.00..99016.16 rows=5706016 width=8)
-> Hash (cost=293.75..293.75 rows=5700 width=4)
-> Index Scan using
loc_submission_is_public on loc_submission ls (cost=0.00..293.75
rows=5700 width=4)
Index Cond: (is_public = true)
Filter: is_public
-> Hash (cost=31861.82..31861.82 rows=1690282 width=4)
-> Seq Scan on translation_pair_data
(cost=0.00..31861.82 rows=1690282 width=4)
I figure I must be doing something wrong here. Thanks for the help,
Drew
On Apr 9, 2007, at 2:43 PM, Tom Lane wrote:
> Drew Wilson <drewmwilson(at)gmail(dot)com> writes:
>> I have 2 tables (A,B) joined in a many-to-many relationship via a
>> join table ("membership"), where updating table A based on table B
>> takes a very long time.
>> ...
>> -> Function Scan on a (cost=0.00..12.50 rows=1000 width=4)
>
> I think you've left out some relevant details ... there's nothing
> in what you said about a set-returning function ...
>
> regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2007-04-10 02:13:05 | Re: how to efficiently update tuple in many-to-many relationship? |
Previous Message | s d | 2007-04-10 00:45:53 | Re: Beginner Question |