I downloaded the latest version,and the cost of "not in" is much higher than that of "not exist". please see attachment for detail.
As the time of query is very long,I didn't get the explain analyze result.
I think the id columns of table a and b are not null, so the query of "not in" and "not exists" are equal,they should use similar plans.
I notice there is a "Materialize" step in my bad plans of "not in",but there isn't in your following plan. I wonder how to get the plan you posted here, are there any configure argument or command?
egress=# explain select max(a.info)from sli_test a where a.id not in(select
regress(# b.id from sli_test2 b where b.id<50000);
QUERY PLAN
---------------------------------------------------------------------------------
Aggregate (cost=38050.82..38050.83 rows=1 width=12)
-> Seq Scan on sli_test a (cost=18026.82..36800.82 rows=500000 width=12)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on sli_test2 b (cost=0.00..17906.00 rows=48329 width=4)
Filter: (id < 50000)
(6 rows)
It runs in about 500ms here.
and I got following plan on kingbase 7.1,a modified version of postgres product.
TEST=# explain select max(a.name)from a where a.id not in(select b.id from b where b.id<50000);
QUERY PLAN
--------------------------------------------------------------------------------
-----------------------
Aggregate (cost=72541.71..72541.72 rows=1 width=12)
-> Merge Anti Join (cost=0.00..70041.71 rows=1000000 width=12)
Merge Cond: (A.ID = B.ID)
-> Index Scan using A_PKEY on A (cost=0.00..32257.36 rows=1000000 width=16)
-> Index Scan using B_ID on B (cost=0.00..31117.69 rows=333333 width=4)
Index Cond: (ID < 50000)
(6 行)
时间: 20.095 ms