Re: Do not scan index in right table if condition for left join evaluates to false using columns in left table

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Илья Жарков <izharkov1243(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, p(dot)petrov(at)postgrespro(dot)ru
Subject: Re: Do not scan index in right table if condition for left join evaluates to false using columns in left table
Date: 2024-12-09 02:35:57
Message-ID: 50b7ecd9-7512-479b-8096-2bda4e0e7ac1@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/12/2024 23:37, Andres Freund wrote:
> On 2024-12-08 15:44:23 +0700, Andrei Lepikhov wrote:
>> On 8/12/2024 09:52, Andres Freund wrote:
>>>> I think avoiding touching a hash table and an index under MergeJoin can also
>>>> be beneficial.
>>>
>>> How would you get significant wins for mergejoins? You need to go through both
>>> inner and outer anyway?
>
>> In my mind, this trick can be designed for specific cases like sales tables,
>> as illustrated before and used by well-rounded developers.
>
> I'm not saying that the optimization isn't useful, just that i don't think it
> makes much sense for mergeoins.
>
> Besides, at least as far as I can see, fundamentally not optimizing mergejoins
> in any substantial manner, it also just doesn't seem likely that queries where
> this optimization would come up are likely to be planned as mergejoins. If you
> have a leftjoin to a category-style table, you're very rarely going to scan
> the "main" table with an ordered index scan on the category key, which would
> be required for a merge join. And sorting the main table once for each
> to-be-joined-category-table isn't a desirable path most of the time either.
>
> I don't know what it means that it's to be "used by well-rounded
> developers". We have to write the implementation in way it works regardless of
> what kind of developer is using postgres.
I got it, thanks. I agree that the profit for MJ & HJ cases looks
modest. When designing the prototype, I realised that it seemed more
"symmetrical" and logical to separate such 'one-sided' clauses during
the planning for all types of join.
But I don't insist on this statement. It would be OK even in the case of
NestLoop (remember, we should change the NL cost model too).

>> I'm not sure that such optimisation would be profitable in general.
>
> Are you suggesting it would only be enabled by a GUC?
No, such a feature probably won't generate much overhead; I don't see
any reason to cover it under a GUC.

--
regards, Andrei Lepikhov

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-12-09 03:19:10 Re: Parallel heap vacuum
Previous Message Michael Paquier 2024-12-09 01:48:42 Re: Missing LWLock protection in pgstat_reset_replslot()