Quick Links

Re: Converting NOT IN to anti-joins during planning

From:	"Li, Zheng" <zhelli(at)amazon(dot)com>
To:	David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	"Finnerty, Jim" <jfinnert(at)amazon(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Converting NOT IN to anti-joins during planning
Date:	2019-06-14 16:35:21
Message-ID:	AB147063-56E3-49A9-82B6-E8E57C1363BD@amazon.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

-----
The big "IF" here is if we can calculate the size of the subplan to
know if it'll be hashed or not at the point in planning where this
conversion is done. I personally can't quite see how that'll work
reliably without actually planning the subquery, which I really doubt
is something we'd consider doing just for a cost estimate. Remember
the subquery may not just be a single relation scan, it could be a
complex query containing many joins and UNION / GROUP BY / DISTINCT /
HAVING clauses etc.
-----

In our latest patch, we plan the subquery right before conversion, we only
proceed with the ANTI JOIN conversion if subplan_is_hashable(subplan) is
false. To avoid re-planning the subquery again in a later phase, I think we can
keep a pointer to the subplan in SubLink.

-----------
Zheng Li
AWS, Amazon Aurora PostgreSQL

On 6/14/19, 5:51 AM, "David Rowley" <david(dot)rowley(at)2ndquadrant(dot)com> wrote:

On Fri, 14 Jun 2019 at 20:41, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On Wed, 6 Mar 2019 at 04:11, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>>
>> Hi Jim,
>>
>> Thanks for replying here.
>>
>> On Wed, 6 Mar 2019 at 16:37, Jim Finnerty <jfinnert(at)amazon(dot)com> wrote:
>> >
>> > Actually, we're working hard to integrate the two approaches. I haven't had
>> > time since I returned to review your patch, but I understand that you were
>> > checking for strict predicates as part of the nullness checking criteria,
>> > and we definitely must have that. Zheng tells me that he has combined your
>> > patch with ours, but before we put out a new patch, we're trying to figure
>> > out how to preserve the existing NOT IN execution plan in the case where the
>> > materialized subplan fits in memory. This (good) plan is effectively an
>> > in-memory hash anti-join.
>> >
>> > This is tricky to do because the NOT IN Subplan to anti-join transformation
>> > currently happens early in the planning process, whereas the decision to
>> > materialize is made much later, when the best path is being converted into a
>> > Plan.
>>
>> I guess you're still going with the OR ... IS NULL in your patch then?
>> otherwise, you'd likely find that the transformation (when NULLs are
>> not possible) is always a win since it'll allow hash anti-joins. (see
>> #2 in the original email on this thread) FWIW I mentioned in [1] and
>> Tom confirmed in [2] that we both think hacking the join condition to
>> add an OR .. IS NULL is a bad idea. I guess you're not deterred by
>> that?
>
>
> Surely we want both?
>
> 1. Transform when we can
> 2. Else apply some other approach if the cost can be reduced by doing it

Maybe. If the scope for the conversion is reduced to only add the OR
.. IS NULL join clause when the subplan could not be hashed then it's
maybe less likely to cause performance regressions. Remember that this
forces the planner to use a nested loop join since no other join
algorithms support OR clauses. I think Jim and Zheng have now changed
their patch to do that. If we can perform a parameterised nested loop
join then that has a good chance of being better than scanning the
subquery multiple times, however, if there's no index to do a
parameterized nested loop, then we need to do a normal nested loop
which will perform poorly, but so will the non-hashed subplan...

# create table t1 (a int);
CREATE TABLE
# create table t2 (a int);
CREATE TABLE
# set work_mem = '64kB';
SET
# insert into t1 select generate_Series(1,10000);
INSERT 0 10000
# insert into t2 select generate_Series(1,10000);
INSERT 0 10000
# explain analyze select count(*) from t1 where a not in(select a from t2);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1668739.50..1668739.51 rows=1 width=8) (actual
time=7079.077..7079.077 rows=1 loops=1)
-> Seq Scan on t1 (cost=0.00..1668725.16 rows=5738 width=0)
(actual time=7079.072..7079.072 rows=0 loops=1)
Filter: (NOT (SubPlan 1))
Rows Removed by Filter: 10000
SubPlan 1
-> Materialize (cost=0.00..262.12 rows=11475 width=4)
(actual time=0.004..0.397 rows=5000 loops=10000)
-> Seq Scan on t2 (cost=0.00..159.75 rows=11475
width=4) (actual time=0.019..4.921 rows=10000 loops=1)
Planning Time: 0.348 ms
Execution Time: 7079.309 ms
(9 rows)

# explain analyze select count(*) from t1 where not exists(select 1
from t2 where t1.a = t2.a or t2.a is null);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1250873.25..1250873.26 rows=1 width=8) (actual
time=7263.980..7263.980 rows=1 loops=1)
-> Nested Loop Anti Join (cost=0.00..1250858.97 rows=5709
width=0) (actual time=7263.976..7263.976 rows=0 loops=1)
Join Filter: ((t1.a = t2.a) OR (t2.a IS NULL))
Rows Removed by Join Filter: 49995000
-> Seq Scan on t1 (cost=0.00..159.75 rows=11475 width=4)
(actual time=0.013..2.350 rows=10000 loops=1)
-> Materialize (cost=0.00..262.12 rows=11475 width=4)
(actual time=0.004..0.396 rows=5000 loops=10000)
-> Seq Scan on t2 (cost=0.00..159.75 rows=11475
width=4) (actual time=0.007..4.075 rows=10000 loops=1)
Planning Time: 0.086 ms
Execution Time: 7264.141 ms
(9 rows)

When the index exists the transformation is certainly much better.

# create index on t2(a);
CREATE INDEX
# explain analyze select count(*) from t1 where not exists(select 1
from t2 where t1.a = t2.a or t2.a is null);
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=111342.50..111342.51 rows=1 width=8) (actual
time=29.580..29.581 rows=1 loops=1)
-> Nested Loop Anti Join (cost=7.10..111342.50 rows=1 width=0)
(actual time=29.574..29.622 rows=0 loops=1)
-> Seq Scan on t1 (cost=0.00..145.00 rows=10000 width=4)
(actual time=0.010..0.883 rows=10000 loops=1)
-> Bitmap Heap Scan on t2 (cost=7.10..11.11 rows=1 width=4)
(actual time=0.002..0.002 rows=1 loops=10000)
Recheck Cond: ((t1.a = a) OR (a IS NULL))
Heap Blocks: exact=10000
-> BitmapOr (cost=7.10..7.10 rows=1 width=0) (actual
time=0.002..0.002 rows=0 loops=10000)
-> Bitmap Index Scan on t2_a_idx
(cost=0.00..0.30 rows=1 width=0) (actual time=0.001..0.001 rows=1
loops=10000)
Index Cond: (a = t1.a)
-> Bitmap Index Scan on t2_a_idx
(cost=0.00..4.29 rows=1 width=0) (actual time=0.001..0.001 rows=0
loops=10000)
Index Cond: (a IS NULL)
Planning Time: 0.311 ms
Execution Time: 29.670 ms
(13 rows)

The big "IF" here is if we can calculate the size of the subplan to
know if it'll be hashed or not at the point in planning where this
conversion is done. I personally can't quite see how that'll work
reliably without actually planning the subquery, which I really doubt
is something we'd consider doing just for a cost estimate. Remember
the subquery may not just be a single relation scan, it could be a
complex query containing many joins and UNION / GROUP BY / DISTINCT /
HAVING clauses etc.

However, if there turns out to be some good way to do that that I
can't see then I think that each patch should be separate so that they
can be accepted or rejected on their own merits. The problem, for now,
is that the patches conflict with each other. I don't really want to
base mine on Jim and Zheng's patch, perhaps they feel the same about
basing theirs on mine.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: Converting NOT IN to anti-joins during planning at 2019-06-14 09:50:40 from David Rowley

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Fabien COELHO	2019-06-14 17:01:19	Re: pg_dump multi VALUES INSERT
Previous Message	Stephen Frost	2019-06-14 16:08:35	Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions