Re: PoC: prefetching data between executor nodes (e.g. nestloop + indexscan)

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC: prefetching data between executor nodes (e.g. nestloop + indexscan)
Date: 2024-08-27 22:44:50
Message-ID: 3c5d98f2-abce-4225-b3b3-f63078dd3c07@vondra.me
Lists: pgsql-hackers

On 8/27/24 20:53, Peter Geoghegan wrote:
> On Tue, Aug 27, 2024 at 2:40 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>> Have you considered instead expanding the parameterized scan logic? Right now
>> nestloop passes down values one-by-one via PARAM_EXEC. What if we expanded
>> that to allow nodes, e.g. nestloop in this case, to pass down multiple values
>> in one parameter? That'd e.g. allow passing down multiple rows to fetch from
>> nodeNestloop.c to nodeIndexscan.c without needing to iterate over the executor
>> state tree.
>
> This sounds a bit like block nested loop join.
>

Yeah.

>> And it might be more powerful than just doing prefetching -
>> e.g. we could use one ScalarArrayOps scan in the index instead of doing a
>> separate scan for each of the to-be-prefetched values.
>
> ScalarArrayOps within nbtree are virtually the same thing as regular
> index scans these days. That could make a big difference (perhaps this
> is obvious).
>
> One reason to do it this way is because it cuts down on index descent
> costs, and other executor overheads. But it is likely that it will
> also make prefetching itself more effective, too -- just because
> prefetching will naturally end up with fewer, larger batches of
> logically related work.
>

Perhaps. So nestloop would pass down multiple values, the inner subplan
would do whatever it wants (including prefetching), and then return the
matching rows, somehow? It's not very clear to me how we would return
the tuples when many outer values have matches, but it does seem to
shift the prefetching closer to the "normal" index prefetching discussed
elsewhere.
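
To make the question a bit more concrete, here is a rough standalone
sketch (plain C, not executor code) of one way the inner side might
return matches for a whole batch: each result is tagged with the
position of the outer row it belongs to, so the caller can reassemble
the join rows. Everything here is an assumption for illustration only.
The names Match, lower_bound and batch_probe are made up, and a sorted
int array stands in for the index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result entry: which outer row matched which inner row. */
typedef struct Match
{
    size_t outer_idx;   /* position of the outer row within the batch */
    size_t inner_idx;   /* position of the match in the inner "index" */
} Match;

/* Return the first position in sorted[0..n) whose value is >= key. */
size_t
lower_bound(const int *sorted, size_t n, int key)
{
    size_t lo = 0;
    size_t hi = n;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (sorted[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * Probe the inner side once for a whole batch of outer keys.
 *
 * Matches are written to out[] in outer-row order. The return value is
 * the number of matches written; the scan stops early if out_cap is
 * reached, so a caller needing all matches must size out[] accordingly.
 */
size_t
batch_probe(const int *inner_sorted, size_t ninner,
            const int *outer_keys, size_t nouter,
            Match *out, size_t out_cap)
{
    size_t nout = 0;

    assert(out != NULL || out_cap == 0);

    for (size_t o = 0; o < nouter; o++)
    {
        size_t pos = lower_bound(inner_sorted, ninner, outer_keys[o]);

        /* Emit every inner entry equal to this outer key. */
        for (; pos < ninner && inner_sorted[pos] == outer_keys[o]; pos++)
        {
            if (nout == out_cap)
                return nout;
            out[nout].outer_idx = o;
            out[nout].inner_idx = pos;
            nout++;
        }
    }
    return nout;
}
```

Because the inner side sees all the keys of the batch up front, it is
free to compute every probe position first and prefetch before reading
anything, which is the part that looks like the regular index
prefetching. Whether tagging results with the outer position is the
right interface for the executor is a separate question.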

--
Tomas Vondra
