Re: index prefetching

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: index prefetching
Date: 2024-02-14 21:02:29
Message-ID: CAAKRu_btSd9taRfwthMNKYxyAGw_AG6mF46OyQ42hBOACxGMLw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 14, 2024 at 11:40 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Tue, Feb 13, 2024 at 2:01 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> > On 2/7/24 22:48, Melanie Plageman wrote:
> > > ...
> > > - switching scan directions
> > >
> > > If the index scan switches directions on a given invocation of
> > > IndexNext(), heap blocks may have already been prefetched and read for
> > > blocks containing tuples beyond the point at which we want to switch
> > > directions.
> > >
> > > We could fix this by having some kind of streaming read "reset"
> > > callback to drop all of the buffers which have been prefetched which
> > > are now no longer needed. We'd have to go backwards from the last TID
> > > which was yielded to the caller and figure out which buffers in the
> > > pgsr buffer ranges are associated with all of the TIDs which were
> > > prefetched after that TID. The TIDs are in the per_buffer_data
> > > associated with each buffer in pgsr. The issue would be searching
> > > through those efficiently.
> > >
> >
> > Yeah, that's roughly what I envisioned in one of my previous messages
> > about this issue - walking back the TIDs read from the index and added
> > to the prefetch queue.
> >
> > > The other issue is that the streaming read API does not currently
> > > support backwards scans. So, if we switch to a backwards scan from a
> > > forwards scan, we would need to fallback to the non streaming read
> > > method. We could do this by just setting the TID queue size to 1
> > > (which is what I have currently implemented). Or we could add
> > > backwards scan support to the streaming read API.
> > >
> >
> > What do you mean by "support for backwards scans" in the streaming read
> > API? I imagined it naively as
> >
> > 1) drop all requests in the streaming read API queue
> >
> > 2) walk back all "future" requests in the TID queue
> >
> > 3) start prefetching as if from scratch
> >
> > Maybe there's a way to optimize this and reuse some of the work more
> > efficiently, but my assumption is that the scan direction does not
> > change very often, and that we process many items in between.
>
> Yes, the steps you mention for resetting the queues make sense. What I
> meant by "backwards scan is not supported by the streaming read API"
> is that Thomas/Andres had mentioned that the streaming read API does
> not support backwards scans right now. Though, since the callback just
> returns a block number, I don't know how it would break.
>
> When switching between a forwards and backwards scan, does it go
> backwards from the current position or start at the end (or beginning)
> of the relation?

Okay, well I answered this question for myself, by, um, trying it :).
FETCH backward will go backwards from the current cursor position. So,
I don't see exactly why this would be an issue.

> If it is the former, then the blocks would most
> likely be in shared buffers -- which the streaming read API handles.
> It is not obvious to me from looking at the code what the gap is, so
> perhaps Thomas could weigh in.

I have the same problem with the sequential scan streaming read user,
so I am going to try and figure this backwards scan and switching scan
direction thing there (where we don't have other issues).

- Melanie

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-02-14 21:37:11 Re: Refactoring backend fork+exec code
Previous Message Daniel Gustafsson 2024-02-14 20:48:43 Re: Fix a typo in pg_rotate_logfile