Re: Retrieving unused tuple attributes in ExecScan

From: "Finnerty, Jim" <jfinnert(at)amazon(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, "Ma, Marcus" <marcjma(at)amazon(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Retrieving unused tuple attributes in ExecScan
Date: 2022-06-27 20:12:41
Message-ID: 19452DD3-4DF1-4B85-8894-2A8131F04A4C@amazon.com
Lists: pgsql-hackers

Re: So I'm actually using the columns during merge join, basically I'm building a bloom filter on the outer relation and filtering out data on the inner relation of the join. I'm building the filter on the join keys

We had a complete implementation of Bloom filtering for hash inner joins, including costing and pushdown of the Bloom filter from the build side to the execution tree on the probe side. That is, a Bloom filter was built on the inner side of the join at the conclusion of the hash join's build phase, then pushed down as a semi-join filter to the probe side of the join, where it could potentially be applied to multiple scans. After the community made a large change to that same area of the code, our implementation was commented out, and it has been in that state ever since. It's a good example of the sort of change that really ought to be developed with the community, because the merge burden is too high otherwise.
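
For readers who have not seen the technique, the mechanism described above can be sketched roughly as follows. This is a toy illustration in Python, not the code we had: the `BloomFilter` class, the `hash_join_with_bloom` function, the filter sizing, and the hashing scheme are all hypothetical, and costing and planner integration are left out entirely.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; the size and hash count are arbitrary assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive k bit positions from two base hashes (double hashing).
        digest = hashlib.sha256(repr(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def hash_join_with_bloom(inner_rows, outer_rows, inner_key, outer_key):
    # Build phase: load the inner (build) side into a hash table.
    hash_table = {}
    for row in inner_rows:
        hash_table.setdefault(row[inner_key], []).append(row)

    # At the conclusion of the build phase, build the Bloom filter
    # over the distinct join keys.
    bloom = BloomFilter()
    for key in hash_table:
        bloom.add(key)

    # "Pushed-down" semi-join filter: the probe-side scan discards rows
    # whose join key cannot match before they reach the join.
    def filtered_probe_scan():
        for row in outer_rows:
            if bloom.might_contain(row[outer_key]):
                yield row

    # Probe phase: false positives from the filter are resolved here,
    # because the hash table lookup is exact.
    results = []
    for outer in filtered_probe_scan():
        for inner in hash_table.get(outer[outer_key], []):
            results.append((inner, outer))
    return results
```

The filter can only produce false positives, never false negatives, so the join result is unchanged; the benefit comes from discarding non-matching probe rows early, lower in the plan.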

It was a pretty effective optimization in some cases, though. Most commercial systems have an optimization like this, sometimes with special optimizations when the number of distinct join keys is very small. If there is interest in reviving this functionality, we could probably extract some patches and work with the community to try to get it running again.

/Jim
