Re: index prefetching

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>
Subject:	Re: index prefetching
Date:	2023-12-21 13:27:42
Message-ID:	20231221132742.kqxt3iujna3z33ab@alap3.anarazel.de
Lists:	pgsql-hackers

Hi,

On 2023-12-09 19:08:20 +0100, Tomas Vondra wrote:
> But there's a layering problem that I don't know how to solve - I don't
> see how we could make indexam.c entirely oblivious to the prefetching,
> and move it entirely to the executor. Because how else would you know
> what to prefetch?

> With index_getnext_tid() I can imagine fetching TIDs ahead, stashing
> them into a queue, and prefetching based on that. That's kinda what the
> patch does, except that it does it from inside index_getnext_tid(). But
> that does not work for index_getnext_slot(), because that already reads
> the heap tuples.

> We could say prefetching only works for index_getnext_tid(), but that
> seems a bit weird because that's what regular index scans do. (There's a
> patch to evaluate filters on index, which switches index scans to
> index_getnext_tid(), so that'd make prefetching work too, but I'd ignore
> that here.)

I think we should just switch plain index scans to index_getnext_tid(). It's
one of the primary places triggering index scans, so a few additional lines
don't seem problematic.
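
A minimal sketch of what reading TIDs ahead on the executor side could look
like, under stated assumptions. Every name here (SketchTid, SketchIndexScan,
SketchPrefetcher, LOOKAHEAD, the sketch_* functions) is hypothetical and not
part of PostgreSQL; sketch_getnext_tid() merely plays the role of
index_getnext_tid(), and the prefetch call only counts requests where real
code would issue something like PrefetchBuffer().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOOKAHEAD 4             /* hypothetical prefetch distance */

/* Simplified stand-in for a heap TID: block number plus line pointer. */
typedef struct SketchTid { uint32_t block; uint16_t offset; } SketchTid;

/* Stand-in for the index AM: hands out TIDs from a fixed array. */
typedef struct SketchIndexScan
{
    const SketchTid *tids;
    size_t      ntids;
    size_t      next;
} SketchIndexScan;

/* Plays the role of index_getnext_tid(): false once the scan is done. */
static bool
sketch_getnext_tid(SketchIndexScan *scan, SketchTid *tid)
{
    if (scan->next >= scan->ntids)
        return false;
    *tid = scan->tids[scan->next++];
    return true;
}

/* Executor-side state: a ring buffer of TIDs read ahead of the consumer. */
typedef struct SketchPrefetcher
{
    SketchIndexScan *scan;
    SketchTid   queue[LOOKAHEAD];
    int         head;           /* slot of the oldest queued TID */
    int         count;          /* number of queued TIDs */
    bool        have_last;      /* is last_block valid? */
    uint32_t    last_block;     /* most recently prefetched heap block */
    int         prefetches_issued;
} SketchPrefetcher;

/* Stand-in for a real prefetch request; here it only counts requests. */
static void
sketch_prefetch_block(SketchPrefetcher *p, uint32_t block)
{
    /* Skip consecutive TIDs that point into the same heap block. */
    if (p->have_last && p->last_block == block)
        return;
    p->have_last = true;
    p->last_block = block;
    p->prefetches_issued++;
}

/* Top up the queue, prefetching the heap block of each TID read ahead. */
static void
sketch_fill(SketchPrefetcher *p)
{
    SketchTid   tid;

    while (p->count < LOOKAHEAD && sketch_getnext_tid(p->scan, &tid))
    {
        p->queue[(p->head + p->count) % LOOKAHEAD] = tid;
        p->count++;
        sketch_prefetch_block(p, tid.block);
    }
}

/* Return the next TID whose heap tuple the caller should fetch. */
static bool
sketch_next(SketchPrefetcher *p, SketchTid *tid)
{
    sketch_fill(p);
    if (p->count == 0)
        return false;
    assert(p->head >= 0 && p->head < LOOKAHEAD);
    *tid = p->queue[p->head];
    p->head = (p->head + 1) % LOOKAHEAD;
    p->count--;
    return true;
}
```

In this sketch the caller loops on sketch_next() and does the heap fetch
itself, so the index side only ever hands out TIDs and the look-ahead queue
lives entirely in the caller.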

I continue to think that we should not have split plain and index only scans
into separate files...

> There are other index_getnext_slot() callers, and I don't
> think we should accept "does not work" for those places (e.g.
> execIndexing/execReplication would benefit from prefetching, I think).

I don't think it'd be a problem to have to opt into supporting
prefetching. There are plenty of places where it doesn't really seem likely to
be useful, e.g. doing prefetching during syscache lookups is very likely just
a waste of time.

I don't think e.g. execReplication is likely to benefit from prefetching -
you're just fetching a single row after all. You'd need a lot of dead rows to
make it beneficial. I think it's similar in execIndexing.c.

I suspect we should work on providing executor nodes with some estimates about
the number of rows that are likely to be consumed. If an index scan is under a
LIMIT 1, we shouldn't prefetch. The same goes for sequential scans with the
infrastructure in
https://postgr.es/m/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com
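
One way such an estimate could feed into prefetching is sketched below. This
is only an illustration under assumptions: sketch_prefetch_distance(),
MAX_PREFETCH_DISTANCE, and the convention that a negative estimate means
"unknown" are all hypothetical, and nothing like this is claimed to exist in
PostgreSQL.

```c
#include <assert.h>

#define MAX_PREFETCH_DISTANCE 32    /* hypothetical upper bound */

/*
 * Pick a prefetch distance from an estimate of how many rows the parent
 * node will consume.  A negative estimate means "unknown, assume all rows".
 */
static int
sketch_prefetch_distance(double rows_needed)
{
    int         distance;

    if (rows_needed < 0)
        distance = MAX_PREFETCH_DISTANCE;
    else if (rows_needed <= 1)
        distance = 0;           /* e.g. under LIMIT 1: nothing to read ahead */
    else if (rows_needed - 1 < MAX_PREFETCH_DISTANCE)
        distance = (int) (rows_needed - 1);     /* never read past the need */
    else
        distance = MAX_PREFETCH_DISTANCE;

    assert(distance >= 0 && distance <= MAX_PREFETCH_DISTANCE);
    return distance;
}
```

With these assumptions, an estimate of 1 row yields a distance of 0, an
estimate of 5 rows yields 4, and a large or unknown estimate is capped at
MAX_PREFETCH_DISTANCE.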

Greetings,

Andres Freund

In response to

Re: index prefetching at 2023-12-09 18:08:20 from Tomas Vondra

Responses

Re: index prefetching at 2023-12-21 15:32:51 from Tomas Vondra