Re: Asymmetric partition-wise JOIN

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, KaiGai Kohei <kaigai(at)heterodb(dot)com>, sulamul(at)gmail(dot)com, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Aleksander Alekseev <afiskon(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>
Subject: Re: Asymmetric partition-wise JOIN
Date: 2024-08-01 18:56:12
Message-ID: CAPpHfdt0NevYsYSy_QuJAJmk34+9qJY3Cc34Qi6LbdTugPBTmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Tue, Apr 2, 2024 at 6:07 AM Andrei Lepikhov
<a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> On 15/10/2023 13:25, Alexander Korotkov wrote:
> > Great! I'm looking forward to the revised patch.
> Revising the code and opinions before restarting this work, I found two
> different possible strategies mentioned in the thread:
> 1. 'Common Resources' shares the materialised result of the inner table
> scan (a hash table in the case of HashJoin) to join each partition one
> by one. It gives us a profit in the case of parallel append and possibly
> other cases, like the one shown in the initial message.
> 2. 'Individual strategies' - By limiting the AJ feature to cases when
> the JOIN clause contains a partitioning expression, we can push an
> additional scan clause into each copy of the inner table scan, reduce
> the number of tuples scanned, and even prune something because of proven
> zero input.
>
> I see the pros and cons of both approaches. The first option is more
> straightforward, and its outcome is obvious in the case of parallel
> append. But how can we guarantee the same join type for each join? Why
> should we ignore the positive effect of different strategies for
> different partitions?
> The second strategy is more expensive for the optimiser, especially in
> the multipartition case. But as I can predict, it is easier to implement
> and looks more natural for the architecture. What do you think about that?

Actually, the idea I tried to express is the combination of #1 and #2:
to build individual plan for every partition, but consider the 'Common
Resources'. Let me explain this a bit more.

Right now, we basically we consider the following properties during
selection of paths.
1) Cost. The cheaper path wins. There a two criteria though: startup
cost and total cost. So, we can keep both paths with cheaper startup
costs and paths with cheaper total cost.
2) Pathkeys. We can keep a path with more expensive path, which has
pathkeys potentially useful in future.

My idea is to introduce a new property for paths selection.
3) Usage of common resources. The common resource can be: hash
representation of relation, memoize over relation scan, etc. We can
exclude the cost of common resource generation from the path cost, but
keep the reference for the common resource with its generation cost.
If one path uses more common resources than another path, it could
cost-dominate another one only if its cheaper together with its extra
common resources cost. If one path uses less or equal common
resources than another, it could normally cost-dominate another one.

Using these rules, we can gather the the plurality of paths for each
child join taking common resources into account. After that we can
apply some global optimization finding generation of which common
resources can reduce the global cost.

However, I understand this is huge amount of work given we have to
introduce new basic optimizer concepts. I get that the main
application of this patch is sharding. If we have global tables
residing each shard, we can push down any joins with them. Given this
patch gives some optimization for non-sharded case, I think we
*probably* can accept its concept even that it this optimization is
obviously not perfect.

------
Regards,
Alexander Korotkov
Supabase

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Reshke 2024-08-01 19:13:18 Re: Inconsistency with EXPLAIN ANALYZE CREATE MATERIALIZED VIEW
Previous Message Kirill Reshke 2024-08-01 18:41:18 Re: Inconsistency with EXPLAIN ANALYZE CREATE MATERIALIZED VIEW