Re: Support run-time partition pruning for hash join

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support run-time partition pruning for hash join
Date: 2023-08-23 01:19:34
Message-ID: CAKU4AWp3rBakJL+ACXT57M0Sxj2+veFJuaEvR1RazYCsrygPeA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 22, 2023 at 5:43 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:

>
> On Mon, Aug 21, 2023 at 8:34 PM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
>> This feature looks good, but is it possible to know if we can prune
>> any subnodes before we pay the extra effort (building the Hash
>> table, for each row... stuff)?
>>
>
> It might be possible if we take the partition prunning into
> consideration when estimating costs. But it seems not easy to calculate
> the costs accurately.
>

This is a real place I am worried about the future of this patch.
Personally, I do like this patch, but not sure what if this issue can't be
fixed to make everyone happy, and fixing this perfectly looks hopeless
for me. However, let's see what will happen.

>
>
>> Maybe at least, if we have found no subnodes can be skipped
>> during the hashing, we can stop doing such work anymore.
>>
>
> Yeah, this is what we can do.
>

cool.

>
>
>> In my current knowledge, we have to build the inner table first for this
>> optimization? so hash join and sort merge should be OK, but nestloop
>> should
>> be impossible unless I missed something.
>>
>
> For nestloop and mergejoin, we'd always execute the outer side first.
> So the Append/MergeAppend nodes need to be on the inner side for the
> join partition prunning to take effect. For a mergejoin that will
> explicitly sort the outer side, the sort node would process all the
> outer rows before scanning the inner side, so we can do the join
> partition prunning with that. For a nestloop, if we have a Material
> node on the outer side, we can do that too, but I wonder if we'd have
> such a plan in real world, because we only add Material to the inner
> side of nestloop.
>

This is more interesting than I expected,thanks for the explaination.

--
Best Regards
Andy Fan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2023-08-23 01:28:25 pg_upgrade - a function parameter shadows global 'new_cluster'
Previous Message Andres Freund 2023-08-23 01:17:45 Re: initdb caching during tests