Re: PoC: prefetching data between executor nodes (e.g. nestloop + indexscan)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC: prefetching data between executor nodes (e.g. nestloop + indexscan)
Date: 2024-08-27 23:17:45
Message-ID: CAH2-Wz=TKjMa9GGNXJg_sz-meYiNA2giMjmwNw5jw5LryO3Qsg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 27, 2024 at 6:44 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> > One reason to do it this way is because it cuts down on index descent
> > costs, and other executor overheads. But it is likely that it will
> > also make prefetching itself more effective, too -- just because
> > prefetching will naturally end up with fewer, larger batches of
> > logically related work.
> >
>
> Perhaps.

I expect this to be particularly effective whenever there is naturally
occurring locality. I think that's fairly common. We'll sort the SAOP
array on the nbtree side, as we always do.

> So nestloop would pass down multiple values, the inner subplan
> would do whatever it wants (including prefetching), and then return the
> matching rows, somehow?

Right.

> It's not very clear to me how would we return
> the tuples for many matches, but it seems to shift the prefetching
> closer to the "normal" index prefetching discussed elsewhere.

It'll be necessary to keep track of which outer side rows relate to
which inner-side array values (within a given batch/block). Some new
data structure will be needed to manage that bookkeeping.

Currently, we deduplicate arrays for SAOP scans. I suppose that it
works that way because it's not really clear what it would mean for
the scan to have duplicate array keys. I don't see any need to change
that for block nested loop join/whatever this is. We would have to use
the new data structure to "pair up" outer side tuples with their
associated inner side result sets, at the end of processing each
batch/block. That way we avoid repeating the same inner index scan
within a given block/batch -- a little like with a memoize node.
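
To make the bookkeeping concrete, here is a rough sketch in Python rather
than executor C. Nothing here is an existing PostgreSQL API: the names
batched_nested_loop, process_batch, key_of and inner_index_lookup are all
made up for illustration, and I'm assuming plain inner join semantics and
that output should follow outer row order within each batch. It only shows
the shape of the idea: collect a batch of outer rows, build a sorted,
deduplicated array of join keys, probe the inner side once per distinct
key, then pair the results back up with the outer rows.

```python
from collections import defaultdict


def process_batch(batch, key_of, inner_index_lookup):
    """Join one batch of outer rows, probing once per distinct key."""
    # Bookkeeping structure: join key -> positions of the outer rows
    # in this batch that carry that key.
    outer_by_key = defaultdict(list)
    for pos, row in enumerate(batch):
        outer_by_key[key_of(row)].append(pos)

    # Sorted, deduplicated array keys, similar in spirit to what a
    # SAOP scan would be handed.
    array_keys = sorted(outer_by_key)

    # One inner probe per distinct key, however many outer rows share it.
    inner_results = {key: list(inner_index_lookup(key)) for key in array_keys}

    # "Pair up" outer rows with their inner result sets at the end of
    # the batch. Emitting in outer order is an assumption of this sketch.
    for row in batch:
        for inner_row in inner_results[key_of(row)]:
            yield (row, inner_row)


def batched_nested_loop(outer_rows, key_of, inner_index_lookup, batch_size=64):
    """Yield (outer_row, inner_row) pairs, one batch of outer rows at a time."""
    batch = []
    for row in outer_rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield from process_batch(batch, key_of, inner_index_lookup)
            batch = []
    if batch:
        # Flush the final, possibly short, batch.
        yield from process_batch(batch, key_of, inner_index_lookup)
```

Duplicate keys are only combined within a batch, so a key that shows up in
two different batches is still probed twice. That is the sense in which
this is only a little like a memoize node.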

Obviously, that's the case where we can exploit naturally occurring
locality most effectively -- the case where multiple duplicate inner
index scans are literally combined into only one. But, as I already
touched on, locality will be important in a variety of cases, not just
this one.

--
Peter Geoghegan
