From: | Vijaykumar Jain <vjain(at)opentable(dot)com> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | simple query on why a merge join plan got selected |
Date: | 2018-12-15 20:13:56 |
Message-ID: | CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
Hey Guys,
I was just playing with exploring joins and plans i came across this
create table t1(a int);
create table t2(a int);
insert into t1 select (x % 10) from generate_series(1, 100000) x;
insert into t2 select (x % 100) from generate_series(1, 100000) x;
pgtesting=> analyze t1;
ANALYZE
pgtesting=> analyze t2;
ANALYZE
this is reproducible
the below query by default makes use of merge join (which takes way
longer to return rows as compared to when i explicitly disable merge
join it returns in half the time taken by merge join) but i am not
able to figure out why, although i have run analyze on the tables.
pgtesting=> explain (analyze, buffers) select * from t1 join t2 using (a);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Merge Join (cost=19495.64..1039705.09 rows=97241600 width=4) (actual
time=124.153..22243.262 rows=100000000 loops=1)
Merge Cond: (t1.a = t2.a)
Buffers: shared hit=886, temp read=320384 written=616
-> Sort (cost=9747.82..9997.82 rows=100000 width=4) (actual
time=56.442..81.611 rows=100000 loops=1)
Sort Key: t1.a
Sort Method: external merge Disk: 1376kB
Buffers: shared hit=443, temp read=172 written=173
-> Seq Scan on t1 (cost=0.00..1443.00 rows=100000 width=4)
(actual time=0.030..10.003 rows=100000 loops=1)
Buffers: shared hit=443
-> Sort (cost=9747.82..9997.82 rows=100000 width=4) (actual
time=67.702..9469.366 rows=100000001 loops=1)
Sort Key: t2.a
Sort Method: external sort Disk: 1768kB
Buffers: shared hit=443, temp read=220222 written=443
-> Seq Scan on t2 (cost=0.00..1443.00 rows=100000 width=4)
(actual time=0.013..8.186 rows=100000 loops=1)
Buffers: shared hit=443
Planning time: 0.402 ms
Execution time: 26093.192 ms
(17 rows)
pgtesting=> set enable_mergejoin TO FALSE;
SET
pgtesting=> explain (analyze, buffers) select * from t1 join t2 using (a);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Hash Join (cost=3084.00..1117491.00 rows=97241600 width=4) (actual
time=26.893..10229.924 rows=100000000 loops=1)
Hash Cond: (t1.a = t2.a)
Buffers: shared hit=889, temp read=273 written=271
-> Seq Scan on t1 (cost=0.00..1443.00 rows=100000 width=4)
(actual time=0.028..18.123 rows=100000 loops=1)
Buffers: shared hit=443
-> Hash (cost=1443.00..1443.00 rows=100000 width=4) (actual
time=26.255..26.255 rows=100000 loops=1)
Buckets: 131072 Batches: 2 Memory Usage: 2713kB
Buffers: shared hit=443, temp written=152
-> Seq Scan on t2 (cost=0.00..1443.00 rows=100000 width=4)
(actual time=0.017..9.163 rows=100000 loops=1)
Buffers: shared hit=443
Planning time: 0.099 ms
Execution time: 14095.975 ms
(12 rows)
pgtesting=> show work_mem;
work_mem
----------
4MB
(1 row)
pgtesting=> show shared_buffers;
shared_buffers
----------------
1GB
(1 row)
pgtesting=> select version();
version
---------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 10.5 (Ubuntu 10.5-1.pgdg16.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609,
64-bit
(1 row)
May be i am missing something way obvious :) but my only concern being
high cardinality joins may not use the statistics correctly?
Regards,
Vijay
From | Date | Subject | |
---|---|---|---|
Next Message | Ron | 2018-12-15 22:01:13 | Re: simple query on why a merge join plan got selected |
Previous Message | Pavel Stehule | 2018-12-15 18:02:55 | Re: new stored procedure with OUT parameters |
From | Date | Subject | |
---|---|---|---|
Next Message | Hugh Ranalli | 2018-12-15 21:03:33 | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Previous Message | Tom Lane | 2018-12-15 20:02:37 | Re: 'infinity'::Interval should be added |