From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
Cc: | Floris Van Nee <florisvannee(at)optiver(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> |
Subject: | Re: Index Skip Scan |
Date: | 2020-01-22 23:13:21 |
Message-ID: | CAH2-Wz=PUY9MJs-9cL9vEvYThBRatmkY=wmP-A0oESt_TZxLZg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jan 22, 2020 at 1:35 PM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
> Oh, that's what you mean. Yes, I was somehow tricked by the name of this
> function and didn't notice that it checks only one boundary, so in case
> of backward scan it returns wrong result. I think in the situation
> you've describe it would actually not find any item on the current page
> and restart from the root, but nevertheless we need to check for both
> keys in _bt_scankey_within_page.
I suggest reading the nbtree README file's description of backwards
scans. Read the paragraph that begins with 'We support the notion of
an ordered "scan" of an index...'. I also suggest that you read a bit
of the stuff in the large section on page deletion. Certainly read the
paragraph that begins with 'Moving left in a backward scan is
complicated because...'.
It's important to grok why it's okay that we don't "couple" or "crab"
buffer locks as we descend the tree with Lehman & Yao's design -- we
can get away with having *no* interlock against page splits (e.g.,
pin, buffer lock) when we are "between" levels of the tree. This is
safe, since the page that we land on must still be "substantively the
same page", no matter how much time passes. That is, it must at least
cover the leftmost portion of the keyspace covered by the original
version of the page that we saw that we needed to descend to within
the parent page. The worst that can happen is that we have to recover
from a concurrent page split by moving right one or more times.
(Actually, page deletion can change the contents of a page entirely,
but that's not really an exception to the general rule -- page
deletion is careful about recycling pages that an in flight index scan
might land on.)
Lehman & Yao don't have backwards scans (or left links, or page
deletion). Unlike nbtree. This is why the usual Lehman & Yao
guarantees don't quite work with backward scans. We must therefore
compensate as described by the README file (basically, we check and
re-check for races, possibly returning to the original page when we
think that we might have overlooked something and need to make sure).
It's an exception to the general rule, you could say.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2020-01-22 23:17:26 | Re: error context for vacuum to include block number |
Previous Message | Tom Lane | 2020-01-22 23:00:14 | Re: Index-only scan for "f(x)" without "x" |