Re: index problems (again)

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Geoff Winkless <pgsqladmin(at)geoff(dot)dj>
Cc: Victor Yegorov <vyegorov(at)gmail(dot)com>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: index problems (again)
Date: 2016-03-07 20:23:12
Message-ID: CAMkU=1w3kbT6k6-4u19j=i5BncMs5Rcy3dq_3N9cLt=-5UTbtw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Mar 7, 2016 at 5:01 AM, Geoff Winkless <pgsqladmin(at)geoff(dot)dj> wrote:
> On 7 March 2016 at 11:48, Victor Yegorov <vyegorov(at)gmail(dot)com> wrote:
>> 2016-03-07 13:38 GMT+02:00 Geoff Winkless <pgsqladmin(at)geoff(dot)dj>:
>>>
>>> # EXPLAIN (ANALYZE,BUFFERS) SELECT MIN(sc_id) FROM legs WHERE scdate
>>> BETWEEN 20160219 AND 20160221;
>>
>>
>> Will it help if you'll add `count(*)` to your query like this:
>>
>> SELECT min(sc_id), count(*) FROM legs WHERE scdate BETWEEN 20160219 AND
>> 20160221;
>
> Thanks for the reply.
>
> Yes, that does work around the problem, sort-of (although it's only
> using the scdate-only index, since it needs all the data):

You could also do "min(sc_id+0)" rather than adding a count(*) column.
Although that is not as future proof, as someday the planner might
recognize that '+0' is a no-op.

If your table is well-vacuumed such that pg_class.relallvisible is
high, then it should use the (scdate,sc_id) index in an index-only
scan. But if relallvisible is low, it has to check the table itself
for visibility information which destroys most of the benefit of an
index-only scan, and thus would prefer to use the smaller index
instead.

> Even given that, I still don't see why the (scdate,sc_id) index isn't
> perfect for this; it allows the planner to use sc_id for MIN() while
> using scdate to restrict the values. Three values to look up from the
> index-only.
>
> If I manually change the query to do what I hoped the planner would do for me:
>
> SELECT LEAST(( SELECT MIN(sc_id) FROM legs WHERE scdate =20160219), (
> SELECT MIN(sc_id) FROM legs WHERE scdate =20160220), ( SELECT
> MIN(sc_id) FROM legs WHERE scdate =20160221));

PostgreSQL does not (yet) implement "loose" index scans or "skip
scans", which is what you are asking for. You can roll your own using
the techniques described here:
https://wiki.postgresql.org/wiki/Loose_indexscan, which has the
benefit over your example code in that you don't need to enumerate all
possible values, it effectively does it for you.

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Janes 2016-03-07 20:31:02 Re: index problems (again)
Previous Message Melvin Davidson 2016-03-07 20:18:57 Re: pg_restore man page question