From: | Dmitry Lazurkin <dilaz03(at)gmail(dot)com> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Perfomance of IN-clause with many elements and possible solutions |
Date: | 2017-07-24 22:44:10 |
Message-ID: | 3059a9df-dcf7-8ea0-a452-9d4db4783b67@gmail.com |
Lists: | pgsql-general |
On 25.07.2017 01:25, David G. Johnston wrote:
> On Mon, Jul 24, 2017 at 3:22 PM, Dmitry Lazurkin <dilaz03(at)gmail(dot)com> wrote:
>
> ALTER TABLE ids ALTER COLUMN id SET NOT NULL;
> EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN :values_clause;
>
> Aggregate (cost=245006.46..245006.47 rows=1 width=8) (actual time=3824.095..3824.095 rows=1 loops=1)
>   Buffers: shared hit=44248
>   -> Hash Join (cost=7.50..235006.42 rows=4000019 width=0) (actual time=1.108..3327.112 rows=3998646 loops=1)
>   ...
>
>
> You haven't constrained the outer relation (i.e., :values_clause) to
> be non-null which is what I believe is required for the semi-join
> algorithm to be considered.
>
> David J.
CREATE TABLE second_ids (i bigint);
INSERT INTO second_ids :values_clause;
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN (select i from second_ids);

Aggregate (cost=225004.36..225004.37 rows=1 width=8) (actual time=3826.641..3826.641 rows=1 loops=1)
  Buffers: shared hit=44249
  -> Hash Semi Join (cost=5.50..215004.32 rows=4000019 width=0) (actual time=0.352..3338.601 rows=3998646 loops=1)
        Hash Cond: (ids.id = second_ids.i)
        Buffers: shared hit=44249
        -> Seq Scan on ids (cost=0.00..144248.48 rows=10000048 width=8) (actual time=0.040..1069.006 rows=10000000 loops=1)
              Buffers: shared hit=44248
        -> Hash (cost=3.00..3.00 rows=200 width=8) (actual time=0.288..0.288 rows=200 loops=1)
              Buckets: 1024 Batches: 1 Memory Usage: 16kB
              Buffers: shared hit=1
              -> Seq Scan on second_ids (cost=0.00..3.00 rows=200 width=8) (actual time=0.024..0.115 rows=200 loops=1)
                    Buffers: shared hit=1
Planning time: 0.413 ms
Execution time: 3826.752 ms

So the planner chooses a Hash Semi Join even without a NOT NULL constraint on the second table.
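
For readers unfamiliar with the plan node, here is a minimal Python sketch of what a hash semi join does conceptually. It is an illustration only, not PostgreSQL's implementation; the function name and the sample data are hypothetical. It also suggests why a NULL in the inner relation should not matter for IN: a NULL can never compare equal to anything, so it can never produce a match.

```python
def hash_semi_join_count(outer_rows, inner_rows):
    """Count outer rows that have at least one equal value in inner_rows."""
    # Build phase: hash the (small) inner relation once.
    # None stands in for SQL NULL, which never satisfies an equality
    # condition, so it is simply left out of the hash table.
    hash_table = {value for value in inner_rows if value is not None}

    # Probe phase: scan the outer relation once. Each outer row is counted
    # at most once, no matter how many duplicates the inner relation holds.
    matches = 0
    for value in outer_rows:
        if value is not None and value in hash_table:
            matches += 1
    return matches


# Hypothetical data: the inner side contains a duplicate and a NULL.
ids = range(1, 1001)
second_ids = [5, 5, 10, None, 2000]
print(hash_semi_join_count(ids, second_ids))
```

The duplicate 5 does not inflate the count, which is the difference between a semi join and a plain inner join on unfiltered input.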