From: | Richard Neill <rn214(at)cam(dot)ac(dot)uk> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Should the optimiser convert a CASE into a WHERE if it can? |
Date: | 2010-01-26 17:10:26 |
Message-ID: | 4B5F2202.3050107@cam.ac.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Dear All,
Just wondering whether there is a missing scope for the query planner
(on 8.4.2) to be cleverer than it currently is.
Specifically, I wonder whether the optimiser should know that by
converting a CASE condition into a WHERE condition, it can use an index.
Have I found a possible enhancement, or is this simply too hard to do?
Best wishes,
Richard
Example:
--------
In this example, tbl_tracker has 255751 rows, with a primary key "id",
whose values lie uniformly in the range 1...1255750.
If one is trying to count multiple conditions, the following query seems
to be the most obvious way to do it:
SELECT
SUM (case when id > 1200000 and id < 1210000 then 1 else 0 end) AS c1,
SUM (case when id > 1210000 and id < 1220000 then 1 else 0 end) AS c2,
SUM (case when id > 1220000 and id < 1230000 then 1 else 0 end) AS c3,
SUM (case when id > 1230000 and id < 1240000 then 1 else 0 end) AS c4,
SUM (case when id > 1240000 and id < 1250000 then 1 else 0 end) AS c5
FROM tbl_tracker;
c1 | c2 | c3 | c4 | c5
------+------+------+------+------
2009 | 2018 | 2099 | 2051 | 2030
Time: 361.666 ms
This can be manually optimised into a far uglier (but much much faster)
query:
SELECT * FROM
(SELECT COUNT (1) AS c1 FROM tbl_tracker
WHERE id > 1200000 and id < 1210000) AS s1,
(SELECT COUNT (1) AS c2 FROM tbl_tracker
WHERE id > 1210000 and id < 1220000) AS s2,
(SELECT COUNT (1) AS c3 FROM tbl_tracker
WHERE id > 1220000 and id < 1230000) AS s3,
(SELECT COUNT (1) AS c4 FROM tbl_tracker
WHERE id > 1230000 and id < 1240000) AS s4,
(SELECT COUNT (1) AS c5 FROM tbl_tracker
WHERE id > 1240000 and id < 1250000) AS s5
c1 | c2 | c3 | c4 | c5
------+------+------+------+------
2009 | 2018 | 2099 | 2051 | 2030
(1 row)
Time: 21.091 ms
Debugging
---------
The simple queries are:
SELECT SUM (case when id > 1200000 and id < 1210000 then 1 else 0 end)
from tbl_tracker;
Time: 174.804 ms
Explain shows that this does a sequential scan.
SELECT COUNT(1) from tbl_tracker WHERE id > 1200000 and id < 1210000;
Time: 4.153 ms
Explain shows that this uses the index, as expected.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2010-01-26 17:21:38 | Re: Should the optimiser convert a CASE into a WHERE if it can? |
Previous Message | Tom Lane | 2010-01-26 16:41:54 | Re: Poor query plan across OR operator |