Re: Support run-time partition pruning for hash join

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Richard Guo <guofenglinux(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support run-time partition pruning for hash join
Date: 2023-08-23 01:41:41
Message-ID: CAKU4AWq=CdOOX0FwRKmV2x+870Uyp6gO-3bUOPTULAPXEBPRnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> > fwiw, the current master totally ignores the cost reduction for run-time
> > partition prune, even for init partition prune. So in some real cases,
> > pg chooses a hash join just because the cost of nest loop join is
> > highly over estimated.
>
> This is true about the existing code. It's a very tricky thing to cost
> given that the parameter values are always unknown to the planner.
> The best we have for these today is the various hardcoded constants in
> selfuncs.h. While I do agree that it's not great that the costing code
> knows nothing about run-time pruning, I also think that run-time
> pruning during execution with parameterised nested loops is much more
> likely to be able to prune partitions and save actual work than the
> equivalent with Hash Joins. It's more common for the planner to
> choose to Nested Loop when there are fewer outer rows, so the pruning
> code is likely to be called fewer times with Nested Loop than with
> Hash Join.
>

Yes, I agree with this. In my 4 years of PostgresSQL, I just run into
2 cases of this issue and 1 of them is joining 12+ tables with run-time
partition prune for every join. But this situation causes more issues than
generating a wrong plan, like for a simple SELECT * FROM p WHERE
partkey = $1; generic plan will never win so we have to pay the expensive
planning cost for partitioned table.

If we don't require very accurate costing for every case, like we only
care about '=' operator which is the most common case, it should be
easier than the case here since we just need to know if only 1 partition
will survive after pruning, but don't care about which one it is. I'd like
to discuss in another thread, and leave this thread for Richard's patch
only.

--
Best Regards
Andy Fan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2023-08-23 02:23:40 Re: PG 16 draft release notes ready
Previous Message Peter Smith 2023-08-23 01:28:25 pg_upgrade - a function parameter shadows global 'new_cluster'