From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com> |
Subject: | Re: index prefetching |
Date: | 2023-12-21 13:27:42 |
Message-ID: | 20231221132742.kqxt3iujna3z33ab@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2023-12-09 19:08:20 +0100, Tomas Vondra wrote:
> But there's a layering problem that I don't know how to solve - I don't
> see how we could make indexam.c entirely oblivious to the prefetching,
> and move it entirely to the executor. Because how else would you know
> what to prefetch?
> With index_getnext_tid() I can imagine fetching XIDs ahead, stashing
> them into a queue, and prefetching based on that. That's kinda what the
> patch does, except that it does it from inside index_getnext_tid(). But
> that does not work for index_getnext_slot(), because that already reads
> the heap tuples.
> We could say prefetching only works for index_getnext_tid(), but that
> seems a bit weird because that's what regular index scans do. (There's a
> patch to evaluate filters on index, which switches index scans to
> index_getnext_tid(), so that'd make prefetching work too, but I'd ignore
> that here.
I think we should just switch plain index scans to index_getnext_tid(). It's
one of the primary places triggering index scans, so a few additional lines
don't seem problematic.
I continue to think that we should not have split plain and index only scans
into separate files...
> There are other index_getnext_slot() callers, and I don't
> think we should accept does not work for those places seems wrong (e.g.
> execIndexing/execReplication would benefit from prefetching, I think).
I don't think it'd be a problem to have to opt into supporting
prefetching. There's plenty places where it doesn't really seem likely to be
useful, e.g. doing prefetching during syscache lookups is very likely just a
waste of time.
I don't think e.g. execReplication is likely to benefit from prefetching -
you're just fetching a single row after all. You'd need a lot of dead rows to
make it beneficial. I think it's similar in execIndexing.c.
I suspect we should work on providing executor nodes with some estimates about
the number of rows that are likely to be consumed. If an index scan is under a
LIMIT 1, we shoulnd't prefetch. Similar for sequential scan with the
infrastructure in
https://postgr.es/m/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Laurenz Albe | 2023-12-21 13:29:05 | Set log_lock_waits=on by default |
Previous Message | Peter Eisentraut | 2023-12-21 13:24:00 | Re: GUC names in messages |