From: | Li Jin <ljin(at)tripadvisor(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Performance penalty when using WITH |
Date: | 2011-07-28 21:00:06 |
Message-ID: | 339959BC-3A27-4269-8A3E-4C157EBABB86@tripadvisor.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi guys,
I met with the problem that when I was using WITH clause to reuse a subquery, I got a huge performance penalty because of query planner.
Here are the details, the original query is
EXPLAIN ANALYZE WITH latest_identities AS
(
SELECT DISTINCT ON (memberid) memberid, username, changedate
FROM t_username_history
WHERE memberid IN (SELECT memberid FROM t_member WHERE firstname || ' ' || substring(lastname,1,1) = 'Eddie T')
ORDER BY memberid, changedate DESC
)
SELECT t_member.email as email, t_member.username as username, t_member.location as location, t_member.locale as locale, t_member.status as status, t_member.creationdate as creationdate, t_forum_member.pos\
ts as posts, t_forum_member.expertlocations as expertlocations, t_nexus_member.pages_created as pages_created, t_nexus_member.pages_edited as pages_edited, t_member_contributions.hotel_reviews as hotel_rev\
iews, t_member_contributions.restaurant_reviews as restaurant_reviews, t_member_contributions.attraction_reviews as attraction_reviews, t_member_contributions.geo_reviews as geo_reviews, t_member_contribut\
ions.photos as photos, t_member_contributions.videos as videos, t_recent_contribution.recent_contribution_date as recent_contribution_date, t_recent_contribution.recent_contribution_type as recent_contribu\
tion_type, t_owner_member.memberid as owner_memberid, t_member_interaction.flags as interaction_flags, t_media.path as ta_avatar_path, t_external_member.externalid as facebookid, latest_identities.username\
as latest_identity
FROM t_member left join t_forum_member on (t_member.memberid = t_forum_member.memberid) left join t_nexus_member on (t_member.memberid = t_nexus_member.memberid) left join t_member_contributions on (t_memb\
er.memberid = t_member_contributions.memberid) left join t_recent_contribution on (t_member.memberid = t_recent_contribution.memberid) left join t_owner_member on (t_member.memberid = t_owner_member.member\
id) left join t_member_interaction on (t_member.memberid = t_member_interaction.memberid) left join t_media on (t_member.avatar = t_media.id) left join t_external_member on (t_member.memberid = t_external_\
member.memberid AND t_external_member.idtype = 'FB') left join latest_identities on (t_member.memberid = latest_identities.memberid)
WHERE t_member.firstname || ' ' || substring(t_member.lastname,1,1) = 'Eddie T';
The may seems scary, but what it really does is searching for members with certain name and joining with a bunch of other tables on memberid. The t_username_history table has multiple rows for a memberid therefore I just get the most recent record for each memberid that I am interested in before the join.
Here is the link to explain:
http://explain.depesz.com/s/ZKb
Since the red part looks suboptimal to me, I changed it using WITH subquery:
EXPLAIN WITH memberids AS
(
SELECT memberid FROM t_member WHERE firstname || ' ' || substring(lastname,1,1) = 'Eddie T'
),
latest_identities AS
(
SELECT DISTINCT ON (memberid) memberid, username, changedate
FROM t_username_history
WHERE memberid IN (SELECT memberid FROM memberids)
ORDER BY memberid, changedate DESC
)
SELECT t_member.email as email, t_member.username as username, t_member.location as location, t_member.locale as locale, t_member.status as status, t_member.creationdate as creationdate, t_forum_member.pos\
ts as posts, t_forum_member.expertlocations as expertlocations, t_nexus_member.pages_created as pages_created, t_nexus_member.pages_edited as pages_edited, t_member_contributions.hotel_reviews as hotel_rev\
iews, t_member_contributions.restaurant_reviews as restaurant_reviews, t_member_contributions.attraction_reviews as attraction_reviews, t_member_contributions.geo_reviews as geo_reviews, t_member_contribut\
ions.photos as photos, t_member_contributions.videos as videos, t_recent_contribution.recent_contribution_date as recent_contribution_date, t_recent_contribution.recent_contribution_type as recent_contribu\
tion_type, t_owner_member.memberid as owner_memberid, t_member_interaction.flags as interaction_flags, t_media.path as ta_avatar_path, t_external_member.externalid as facebookid, latest_identities.username\
as latest_identity
FROM t_member left join t_forum_member on (t_member.memberid = t_forum_member.memberid) left join t_nexus_member on (t_member.memberid = t_nexus_member.memberid) left join t_member_contributions on (t_memb\
er.memberid = t_member_contributions.memberid) left join t_recent_contribution on (t_member.memberid = t_recent_contribution.memberid) left join t_owner_member on (t_member.memberid = t_owner_member.member\
id) left join t_member_interaction on (t_member.memberid = t_member_interaction.memberid) left join t_media on (t_member.avatar = t_media.id) left join t_external_member on (t_member.memberid = t_external_\
member.memberid AND t_external_member.idtype = 'FB') left join latest_identities on (t_member.memberid = latest_identities.memberid)
WHERE t_member.memberid IN (SELECT memberid FROM memberids)
However, this query runs forever because (I think) the planner join the tables before filter by where clause.
Here is the explain link:
http://explain.depesz.com/s/v2K
Anyone knows why the planner is doing this?
Regards,
Li
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2011-07-29 00:25:20 | Re: very large record sizes and ressource usage |
Previous Message | Terry Schmitt | 2011-07-27 16:02:53 | Re: Hardware advice for scalable warehouse db |