Planner problem

From: Mikael Kjellström <mikael(dot)kjellstrom(at)mksoft(dot)nu>
To: pgsql-performance(at)postgresql(dot)org
Subject: Planner problem
Date: 2004-06-04 12:10:41
Message-ID: 40C066C1.2090808@mksoft.nu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi!

I am having a bit of problem with the plan that the planner produces.

First a little info:

OS: Linux Debian
Version: PostgreSQL 7.4.2 on i686-pc-linux-gnu, compiled by GCC gcc GCC)
3.3.3 (Debian 20040401)
CPU: AMD Athlon XP 2000+
Memory: 1GB
Disk: 4 SCSI-3 UW 10000rpm in RAID 1 (mirror) mode

shared_mem: 20000
Sort mem: 8192
effective_cache_size: 80000

Row count for the tables involved:
priceavailable: 8564
product: 3
productCodeAlias: 7
locationIATA: 3402
locationConnection: 64
price: 4402

I have runned both vacuum full and vacuum analyze before I run my query
and shared_mem is set to 20000

Here is the result I am getting:

EXPLAIN ANALYZE
SELECT
CASE WHEN 'D' = 'A' THEN price.priceArrival ELSE price.priceDeparture
END AS price,
price.vatPercent AS vatPercent
FROM
priceavailable pa,
product product,
productCodeAlias productAlias,
locationiata la,
locationconnection lc,
price price
WHERE
pa.direction IN('D', 'B')
AND pa.productId = product.productId
AND product.productId = productAlias.productId
AND productAlias.productCode = 'TAXI'
AND pa.locationConnectionId = lc.locationConnectionId
AND lc.locationConnectionCode = 'RNB'
AND pa.locationId = la.locationId
AND la.iataCode = 'KB8'
AND price.pricegroupId = pa.priceGroupId
AND price.productId = product.productId
AND '2004-06-01 05:30:00.000000+02' BETWEEN price.validstartdate AND
price.validstopdate
AND product.organizationId = 1
AND price.organizationId = product.organizationId
AND pa.deletionDate IS NULL
AND product.deletionDate IS NULL
AND la.deletionDate IS NULL
AND lc.deletionDate IS NULL
AND price.deletionDate IS NULL
;

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=2.14..17.16 rows=1 width=8) (actual
time=3274.196..4620.299 rows=1 loops=1)
-> Nested Loop (cost=2.14..13.49 rows=1 width=12) (actual
time=2643.441..4617.246 rows=47 loops=1)
Join Filter: (("outer".productid = "inner".productid) AND
("inner".pricegroupid = "outer".pricegroupid))
-> Nested Loop (cost=0.00..7.72 rows=1 width=12) (actual
time=0.238..5.455 rows=111 loops=1)
-> Seq Scan on locationconnection lc (cost=0.00..1.80
rows=1 width=4) (actual time=0.153..0.245 rows=1 loops=1)
Filter: ((locationconnectioncode = 'RNB'::text)
AND (deletiondate IS NULL))
-> Index Scan using priceavailableix1 on priceavailable
pa (cost=0.00..5.91 rows=1 width=16) (actual time=0.067..4.182 rows=111
loops=1)
Index Cond: (pa.locationconnectionid =
"outer".locationconnectionid)
Filter: (((direction = 'D'::text) OR (direction =
'B'::text)) AND (deletiondate IS NULL))
-> Hash Join (cost=2.14..5.74 rows=2 width=24) (actual
time=0.058..39.116 rows=1243 loops=111)
Hash Cond: ("outer".productid = "inner".productid)
-> Index Scan using priceix2 on price (cost=0.00..3.41
rows=23 width=20) (actual time=0.042..25.811 rows=4402 loops=111)
Index Cond: ('2004-06-01'::date <= validstopdate)
Filter: (('2004-06-01'::date >= validstartdate)
AND (deletiondate IS NULL) AND (organizationid = 1))
-> Hash (cost=2.14..2.14 rows=1 width=12) (actual
time=0.132..0.132 rows=0 loops=1)
-> Nested Loop (cost=0.00..2.14 rows=1 width=12)
(actual time=0.088..0.123 rows=1 loops=1)
Join Filter: ("outer".productid =
"inner".productid)
-> Seq Scan on product (cost=0.00..1.04
rows=1 width=8) (actual time=0.013..0.022 rows=3 loops=1)
Filter: ((organizationid = 1) AND
(deletiondate IS NULL))
-> Seq Scan on productcodealias
productalias (cost=0.00..1.09 rows=1 width=4) (actual time=0.022..0.024
rows=1 loops=3)
Filter: (productcode = 'TAXI'::text)
-> Index Scan using locationiataix4 on locationiata la
(cost=0.00..3.66 rows=1 width=4) (actual time=0.052..0.052 rows=0 loops=47)
Index Cond: ("outer".locationid = la.locationid)
Filter: ((iatacode = 'KB8'::text) AND (deletiondate IS NULL))
Total runtime: 4620.852 ms

If I do an "set enable_nestloop = 0" and run the exact same question I get:

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=150.55..154.20 rows=1 width=8) (actual
time=68.095..103.006 rows=1 loops=1)
Hash Cond: ("outer".locationid = "inner".locationid)
-> Hash Join (cost=145.65..149.29 rows=1 width=12) (actual
time=66.304..102.709 rows=47 loops=1)
Hash Cond: (("outer".productid = "inner".productid) AND
("outer".pricegroupid = "inner".pricegroupid))
-> Hash Join (cost=2.15..5.75 rows=2 width=24) (actual
time=0.343..38.844 rows=1243 loops=1)
Hash Cond: ("outer".productid = "inner".productid)
-> Index Scan using priceix2 on price (cost=0.00..3.41
rows=23 width=20) (actual time=0.053..26.225 rows=4402 loops=1)
Index Cond: ('2004-06-01'::date <= validstopdate)
Filter: (('2004-06-01'::date >= validstartdate)
AND (deletiondate IS NULL) AND (organizationid = 1))
-> Hash (cost=2.14..2.14 rows=1 width=12) (actual
time=0.191..0.191 rows=0 loops=1)
-> Hash Join (cost=1.09..2.14 rows=1 width=12)
(actual time=0.174..0.184 rows=1 loops=1)
Hash Cond: ("outer".productid =
"inner".productid)
-> Seq Scan on product (cost=0.00..1.04
rows=1 width=8) (actual time=0.017..0.025 rows=3 loops=1)
Filter: ((organizationid = 1) AND
(deletiondate IS NULL))
-> Hash (cost=1.09..1.09 rows=1 width=4)
(actual time=0.073..0.073 rows=0 loops=1)
-> Seq Scan on productcodealias
productalias (cost=0.00..1.09 rows=1 width=4) (actual time=0.042..0.045
rows=1 loops=1)
Filter: (productcode = 'TAXI'::text)
-> Hash (cost=143.50..143.50 rows=1 width=12) (actual
time=61.650..61.650 rows=0 loops=1)
-> Merge Join (cost=1.81..143.50 rows=1 width=12)
(actual time=59.914..61.419 rows=111 loops=1)
Merge Cond: ("outer".locationconnectionid =
"inner".locationconnectionid)
-> Index Scan using priceavailableix1 on
priceavailable pa (cost=0.00..141.57 rows=43 width=16) (actual
time=0.048..48.739 rows=6525 loops=1)
Filter: (((direction = 'D'::text) OR
(direction = 'B'::text)) AND (deletiondate IS NULL))
-> Sort (cost=1.81..1.81 rows=1 width=4) (actual
time=0.215..0.290 rows=1 loops=1)
Sort Key: lc.locationconnectionid
-> Seq Scan on locationconnection lc
(cost=0.00..1.80 rows=1 width=4) (actual time=0.125..0.194 rows=1 loops=1)
Filter: ((locationconnectioncode =
'RNB'::text) AND (deletiondate IS NULL))
-> Hash (cost=4.89..4.89 rows=1 width=4) (actual time=0.122..0.122
rows=0 loops=1)
-> Index Scan using locationiataix1 on locationiata la
(cost=0.00..4.89 rows=1 width=4) (actual time=0.108..0.112 rows=1 loops=1)
Index Cond: (iatacode = 'KB8'::text)
Filter: (deletiondate IS NULL)
Total runtime: 103.563 ms

Which is a lot faster than the original question.

If I then enable nestloop again and lower the geqo_threshold from 11
(default) to 5 and run the same query again I get:


QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=7.72..17.19 rows=1 width=8) (actual
time=9.752..42.278 rows=1 loops=1)
-> Nested Loop (cost=7.72..13.52 rows=1 width=12) (actual
time=5.698..41.079 rows=47 loops=1)
Join Filter: ("outer".productid = "inner".productid)
-> Nested Loop (cost=7.72..12.47 rows=1 width=28) (actual
time=5.663..39.992 rows=47 loops=1)
Join Filter: ("inner".productid = "outer".productid)
-> Hash Join (cost=7.72..11.37 rows=1 width=24)
(actual time=5.470..36.848 rows=111 loops=1)
Hash Cond: (("outer".pricegroupid =
"inner".pricegroupid) AND ("outer".productid = "inner".productid))
-> Index Scan using priceix2 on price
(cost=0.00..3.41 rows=23 width=20) (actual time=0.050..26.475 rows=4402
loops=1)
Index Cond: ('2004-06-01'::date <=
validstopdate)
Filter: (('2004-06-01'::date >=
validstartdate) AND (deletiondate IS NULL) AND (organizationid = 1))
-> Hash (cost=7.72..7.72 rows=1 width=12)
(actual time=1.809..1.809 rows=0 loops=1)
-> Nested Loop (cost=0.00..7.72 rows=1
width=12) (actual time=0.253..1.587 rows=111 loops=1)
-> Seq Scan on locationconnection lc
(cost=0.00..1.80 rows=1 width=4) (actual time=0.163..0.234 rows=1 loops=1)
Filter: ((locationconnectioncode
= 'RNB'::text) AND (deletiondate IS NULL))
-> Index Scan using priceavailableix1
on priceavailable pa (cost=0.00..5.91 rows=1 width=16) (actual
time=0.070..0.965 rows=111 loops=1)
Index Cond:
(pa.locationconnectionid = "outer".locationconnectionid)
Filter: (((direction =
'D'::text) OR (direction = 'B'::text)) AND (deletiondate IS NULL))
-> Seq Scan on productcodealias productalias
(cost=0.00..1.09 rows=1 width=4) (actual time=0.020..0.022 rows=1 loops=111)
Filter: (productcode = 'TAXI'::text)
-> Seq Scan on product (cost=0.00..1.04 rows=1 width=8)
(actual time=0.005..0.013 rows=3 loops=47)
Filter: ((organizationid = 1) AND (deletiondate IS NULL))
-> Index Scan using locationiataix4 on locationiata la
(cost=0.00..3.66 rows=1 width=4) (actual time=0.022..0.022 rows=0 loops=47)
Index Cond: ("outer".locationid = la.locationid)
Filter: ((iatacode = 'KB8'::text) AND (deletiondate IS NULL))
Total runtime: 42.852 ms

Which is even faster than disable:ing nestloop.

So my question is why is the planner making such a bad choice and how
can I make it choose a better plan?

/Mikael

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message CH 2004-06-04 14:21:11 Re: filesystem option tuning
Previous Message Greg Stark 2004-06-04 05:57:43 Re: Most transactions per second on largest box?