Strange query plan invloving a view

From: Rich Doughty <rich(at)opusvl(dot)com>
To: Pgsql-Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Strange query plan invloving a view
Date: 2005-11-17 01:06:42
Message-ID: 437BD7A2.608@opusvl.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

i have the following query involving a view that i really need to optimise:

SELECT *
FROM
tokens.ta_tokenhist h INNER JOIN
tokens.vw_tokens t ON h.token_id = t.token_id
WHERE
h.sarreport_id = 9
;

where vw_tokens is defined as

CREATE VIEW tokens.vw_tokens AS SELECT
-- too many columns to mention
FROM
tokens.ta_tokens t LEFT JOIN
tokens.ta_tokenhist i ON t.token_id = i.token_id AND
i.status = 'issued' LEFT JOIN
tokens.ta_tokenhist s ON t.token_id = s.token_id AND
s.status = 'sold' LEFT JOIN
tokens.ta_tokenhist r ON t.token_id = r.token_id AND
r.status = 'redeemed'
;

this gives me the following query plan:

Merge Join (cost=18276278.45..31793043.16 rows=55727 width=322)
Merge Cond: (("outer".token_id)::integer = "inner"."?column23?")
-> Merge Left Join (cost=18043163.64..31639175.71 rows=4228018 width=76)
Merge Cond: (("outer".token_id)::integer = "inner"."?column3?")
-> Merge Left Join (cost=13649584.94..27194793.37 rows=4228018 width=48)
Merge Cond: (("outer".token_id)::integer = "inner"."?column3?")
-> Merge Left Join (cost=7179372.62..20653326.29 rows=4228018 width=44)
Merge Cond: (("outer".token_id)::integer = "inner"."?column3?")
-> Index Scan using ta_tokens_pkey on ta_tokens t (cost=0.00..13400398.89 rows=4053805 width=27)
-> Sort (cost=7179372.62..7189942.67 rows=4228018 width=21)
Sort Key: (i.token_id)::integer
-> Index Scan using fkx_tokenhist__status on ta_tokenhist i (cost=0.00..6315961.47 rows=4228018 width=21)
Index Cond: ((status)::text = 'issued'::text)
-> Sort (cost=6470212.32..6479909.69 rows=3878949 width=8)
Sort Key: (s.token_id)::integer
-> Index Scan using fkx_tokenhist__status on ta_tokenhist s (cost=0.00..5794509.99 rows=3878949 width=8)
Index Cond: ((status)::text = 'sold'::text)
-> Sort (cost=4393578.70..4400008.00 rows=2571718 width=32)
Sort Key: (r.token_id)::integer
-> Index Scan using fkx_tokenhist__status on ta_tokenhist r (cost=0.00..3841724.02 rows=2571718 width=32)
Index Cond: ((status)::text = 'redeemed'::text)
-> Sort (cost=233114.81..233248.38 rows=53430 width=246)
Sort Key: (h.token_id)::integer
-> Index Scan using fkx_tokenhist__sarreports on ta_tokenhist h (cost=0.00..213909.12 rows=53430 width=246)
Index Cond: ((sarreport_id)::integer = 9)

However, the following query (which i believe should be equivalent)

SELECT *
FROM
tokens.ta_tokenhist h INNER JOIN
tokens.ta_tokens t ON h.token_id = t.token_id LEFT JOIN
tokens.ta_tokenhist i ON t.token_id = i.token_id AND
i.status = 'issued' LEFT JOIN
tokens.ta_tokenhist s ON t.token_id = s.token_id AND
s.status = 'sold' LEFT JOIN
tokens.ta_tokenhist r ON t.token_id = r.token_id AND
r.status = 'redeemed'
WHERE
h.sarreport_id = 9
;

gives the following query plan:

Nested Loop Left Join (cost=0.00..3475785.52 rows=55727 width=1011)
-> Nested Loop Left Join (cost=0.00..2474425.17 rows=55727 width=765)
-> Nested Loop Left Join (cost=0.00..1472368.23 rows=55727 width=519)
-> Nested Loop (cost=0.00..511614.87 rows=53430 width=273)
-> Index Scan using fkx_tokenhist__sarreports on ta_tokenhist h (cost=0.00..213909.12 rows=53430 width=246)
Index Cond: ((sarreport_id)::integer = 9)
-> Index Scan using ta_tokens_pkey on ta_tokens t (cost=0.00..5.56 rows=1 width=27)
Index Cond: (("outer".token_id)::integer = (t.token_id)::integer)
-> Index Scan using fkx_tokenhist__tokens on ta_tokenhist i (cost=0.00..17.96 rows=2 width=246)
Index Cond: (("outer".token_id)::integer = (i.token_id)::integer)
Filter: ((status)::text = 'issued'::text)
-> Index Scan using fkx_tokenhist__tokens on ta_tokenhist s (cost=0.00..17.96 rows=2 width=246)
Index Cond: (("outer".token_id)::integer = (s.token_id)::integer)
Filter: ((status)::text = 'sold'::text)
-> Index Scan using fkx_tokenhist__tokens on ta_tokenhist r (cost=0.00..17.96 rows=1 width=246)
Index Cond: (("outer".token_id)::integer = (r.token_id)::integer)
Filter: ((status)::text = 'redeemed'::text)

This query returns a lot quicker than the plan would suggest, as the
planner is over-estimating the amount of rows where
((sarreport_id)::integer = 9). it thinks there are 53430 when in fact
there are only 7 (despite a vacuum and analyse).

Can anyone give me any suggestions? are the index stats the cause of
my problem, or is it the rewrite of the query?

Cheers

Version: PostgreSQL 8.0.3 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)

--

- Rich Doughty

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Christopher Kings-Lynne 2005-11-17 01:38:42 Re: [PERFORM] Help speeding up delete
Previous Message Michael Fuhr 2005-11-17 00:23:45 Re: Perl DBD and an alarming problem