From: | Shaun Thomas <sthomas(at)optionshouse(dot)com> |
---|---|
To: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Loose Index Scans by Planner? |
Date: | 2012-08-24 16:20:21 |
Message-ID: | 5037A9C5.4030701@optionshouse.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
Maybe I should post this in Hackers instead, but I figured I'd start
here to avoid cluttering up that list.
So, we know we have a way of doing a loose index scan with CTEs:
http://wiki.postgresql.org/wiki/Loose_indexscan
But that got me wondering. The planner knows from pg_stats that col1
could have low cardinality. If that's the case, and a WHERE clause uses
a two column index, and col2 is specified, why can't it walk each
individual bucket in the two-column index, and use col2? So I forced
such a beast with a CTE:
WITH RECURSIVE t AS (
SELECT min(col1) AS col1
FROM tablename
UNION ALL
SELECT (SELECT min(col1)
FROM tablename
WHERE col1 > t.col1)
FROM t
WHERE t.col1 IS NOT NULL
)
SELECT p.*
FROM t
JOIN tablename p USING (col1)
where p.col2 = 12345
I ask, because while the long-term fix would be to re-order the index to
(col2, col1), this seems like a situation the planner could easily
detect and compensate for. In our particular example, execution time
went from 160ms to 2ms with the CTE rewrite. This is a contrived
example, but it seems like loose index scans would be useful in other
ways. Heck, this:
SELECT DISTINCT col1
FROM tablename;
Has terrible performance because it always seems to revert to a sequence
scan, but it's something people do *all the time*. I can't reasonably
expect all of my devs to switch to that admittedly gross CTE to get a
faster effect, so I'm just thinking out loud.
Until PG puts in something to fix this, I plan on writing a stored
procedure that writes a dynamic CTE and returns a corresponding result
set. It's not ideal, but it would solve our particular itch. Really,
this should be possible with any indexed column, so I might abstract it.
--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas(at)optionshouse(dot)com
______________________________________________
See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2012-08-24 17:13:45 | Re: default_isolation_level='serializable' crashes on Windows |
Previous Message | Heikki Linnakangas | 2012-08-24 15:51:20 | Re: Statistics and selectivity estimation for ranges |
From | Date | Subject | |
---|---|---|---|
Next Message | Artur Zając | 2012-08-24 18:46:42 | NOTIFY performance |
Previous Message | Felix Schubert | 2012-08-24 09:47:45 | Slow Performance on a XEON E5504 |