From: | armand pirvu <armand(dot)pirvu(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: hash join performance question |
Date: | 2017-07-20 19:07:02 |
Message-ID: | 2335B68C-C97A-4ACB-A283-ED723B7EFF07@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
> On Jul 18, 2017, at 10:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> armand pirvu <armand(dot)pirvu(at)gmail(dot)com> writes:
>> testdb3=# explain analyze SELECT a.company_id FROM csischema.dim_company a, woc.dim_company b
>> testdb3-# WHERE a.company_id = b.company_id;
>> QUERY PLAN
>> -------------------------------------------------------------------------------------------------------------------------------
>> Hash Join (cost=711.05..54938.35 rows=18980 width=4) (actual time=34.067..1118.603 rows=18980 loops=1)
>> Hash Cond: (a.company_id = b.company_id)
>> -> Seq Scan on dim_company a (cost=0.00..47097.82 rows=1850582 width=4) (actual time=0.013..523.249 rows=1786376 loops=1)
>> -> Hash (cost=473.80..473.80 rows=18980 width=4) (actual time=20.203..20.203 rows=18980 loops=1)
>> Buckets: 32768 Batches: 1 Memory Usage: 924kB
>> -> Seq Scan on dim_company b (cost=0.00..473.80 rows=18980 width=4) (actual time=0.007..10.076 rows=18980 loops=1)
>> Planning time: 0.511 ms
>> Execution time: 1121.068 ms
>> (8 rows)
>
>> I was expecting at least the PK of csischema.dim_company to be used . In another DBMS that was the case. The larger table , csischema.dim_company used the PK.
>
> That looks like a perfectly reasonable plan to me. If you think it isn't,
> perhaps because you're assuming that both tables are fully cached in RAM,
> then you should reduce random_page_cost to teach the planner that that's
> the execution scenario you're expecting. Everything always in RAM would
> correspond to random_page_cost = 1, and some rough calculations suggest
> that that would reduce the estimated cost of a
> nestloop-with-inner-indexscan enough to make the planner choose that way.
>
> regards, tom lane
Thank you Tom
Made a bit reading about the random_page_cost value
I understand not all optimizers are equal
But for example in Ingres world
K Join(col1)
Heap
Pages 57 Tups 18981
D696 C1139
/ \
Proj-rest $tk1
Sorted(col1) I(a)
Pages 76 Tups 18981 B-Tree(col1)
D25 C190 Pages 2140 Tups 426435
/
$tk2
I(b)
B-Tree(NU)
Pages 98 Tups 18981
ds8(armandp):/u1/sys_admin/armandp> time sql -uirs testdb <foo.sql > /dev/null
real 0m0.37s
user 0m0.04s
sys 0m0.01s
And that is pretty constant, whether pages are in the cache or not
More important IMHO , rather than scan the smaller table , I just scan it’s PK which is an index at the end of the day, which then I join with the larger table PK
Now granted I have hash joins disabled on Ingres so not sure this is a true apple to apple .
And that what made me raise the question
I would like to know why in Postgres smaller table gets scanned as opposed to use it’s PK
After all , one column is far less expensive to traverse top to bottom than all columns
Thank you
Armand
From | Date | Subject | |
---|---|---|---|
Next Message | Leonardo M. Ramé | 2017-07-20 19:46:09 | Streaming Replication archive_command is really needed? |
Previous Message | greigwise | 2017-07-20 17:46:22 | Bug in postgres 9.6.2? |