From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | armand pirvu <armand(dot)pirvu(at)gmail(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: hash join performance question |
Date: | 2017-07-19 03:30:30 |
Message-ID: | 7811.1500435030@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
armand pirvu <armand(dot)pirvu(at)gmail(dot)com> writes:
> testdb3=# explain analyze SELECT a.company_id FROM csischema.dim_company a, woc.dim_company b
> testdb3-# WHERE a.company_id = b.company_id;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------
> Hash Join (cost=711.05..54938.35 rows=18980 width=4) (actual time=34.067..1118.603 rows=18980 loops=1)
> Hash Cond: (a.company_id = b.company_id)
> -> Seq Scan on dim_company a (cost=0.00..47097.82 rows=1850582 width=4) (actual time=0.013..523.249 rows=1786376 loops=1)
> -> Hash (cost=473.80..473.80 rows=18980 width=4) (actual time=20.203..20.203 rows=18980 loops=1)
> Buckets: 32768 Batches: 1 Memory Usage: 924kB
> -> Seq Scan on dim_company b (cost=0.00..473.80 rows=18980 width=4) (actual time=0.007..10.076 rows=18980 loops=1)
> Planning time: 0.511 ms
> Execution time: 1121.068 ms
> (8 rows)
> I was expecting at least the PK of csischema.dim_company to be used . In another DBMS that was the case. The larger table , csischema.dim_company used the PK.
That looks like a perfectly reasonable plan to me. If you think it isn't,
perhaps because you're assuming that both tables are fully cached in RAM,
then you should reduce random_page_cost to teach the planner that that's
the execution scenario you're expecting. Everything always in RAM would
correspond to random_page_cost = 1, and some rough calculations suggest
that that would reduce the estimated cost of a
nestloop-with-inner-indexscan enough to make the planner choose that way.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Glen Huang | 2017-07-19 11:37:57 | Is it possible to define a constraint based on the values in other rows in the current table? |
Previous Message | armand pirvu | 2017-07-19 03:05:24 | hash join performance question |