| From: | Bill Moran <wmoran(at)potentialtech(dot)com> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | SELECT DISTINCT never uses an index? |
| Date: | 2016-07-07 20:56:19 |
| Message-ID: | 20160707165619.037e6730c53316c263343895@potentialtech.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Take the following table as an example:
CREATE TABLE grue (
id SERIAL PRIMARY KEY,
size VARCHAR(255)
);
CREATE INDEX grue_size ON grue(size);
Now insert approximately eleventy jillion rows, but ensure
that there are only about 20 distinct values for size.
SELECT DISTINCT size FROM grue;
Always does a seq scan on Postgres 9.5.2. (Yes, I know we're
a patch behind, the upgrade is on the schedule) on
Ubuntu 14.
I would expect it to be possible, and significantly more
efficient to do an index scan for that query. Is this
a bug, an optimization that is simply waiting for someone
to take the time to implement, or is there some underlying
reason why this isn't possible that I'm not seeing.
And, yes, I know that's not a normalized table and that
properly normalizing it makes the problem disappear. And
yes, this is repeatable (I'm working with about 6 tables
with similar structure that all exhibit the same
behavior) and yes I've done ANALYZE and VACUUM and the
behavior doesn't change.
--
Bill Moran
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Robert Haas | 2016-07-07 21:49:27 | Re: SELECT DISTINCT never uses an index? |
| Previous Message | Robert Haas | 2016-07-07 20:28:07 | Re: strange explain in upstream - subplan 1 twice - is it bug? |