From: | Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com> |
---|---|
To: | Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
Cc: | pg(at)bowt(dot)ie, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> |
Subject: | Re: Index Skip Scan |
Date: | 2018-11-16 15:06:09 |
Message-ID: | b4835174-f383-c912-dd7f-2961b69eed1a@redhat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 11/15/18 6:41 AM, Alexander Kuzmenkov wrote:
>>> But having this logic inside _bt_next means that it will make a
>>> non-skip index
>>> only scan a bit slower, am I right?
>>
>> Correct.
>
> Well, it depends on how the skip scan is implemented. We don't have to
> make normal scans slower, because skip scan is just a separate thing.
>
> My main point was that current implementation is good as a proof of
> concept, but it is inefficient for some data and needs some unreliable
> planner logic to work around this inefficiency. And now we also have
> execution-time fallback because planning-time fallback isn't good
> enough. This looks overly complicated. Let's try to design an algorithm
> that works well regardless of the particular data and doesn't need these
> heuristics. It should be possible to do so for btree.
>
> Of course, I understand the reluctance to implement an entire new type
> of btree traversal. Anastasia Lubennikova suggested a tweak for the
> current method that may improve the performance for small groups of
> equal values. When we search for the next unique key, first check if it
> is contained on the current btree page using its 'high key'. If it is,
> find it on the current page. If not, search from the root of the tree
> like we do now. This should improve the performance for small equal
> groups, because there are going to be several such groups on the page.
> And this is exactly where we have the regression compared to unique +
> index scan.
>
Robert suggested something similar in [1]. I'll try and look at that
when I'm back from my holiday.
> By the way, what is the data for which we intend this feature to work?
> Obviously a non-unique btree index, but how wide are the tuples, and how
> big the equal groups? It would be good to have some real-world examples.
>
Although my primary use-case is int I agree that we should test the
different data types, and tuple widths.
Best regards,
Jesper
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2018-11-16 15:11:19 | Re: Convert MAX_SAOP_ARRAY_SIZE to new guc |
Previous Message | Robert Haas | 2018-11-16 15:05:44 | Re: Refactoring the checkpointer's fsync request queue |