From: | Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com> |
---|---|
To: | Daniel Westermann <Daniel(dot)Westermann(at)lcsystems(dot)ch> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Subject: | Re: FW: performance issue with a 2.5gb joinded table |
Date: | 2013-01-09 09:30:05 |
Message-ID: | CAB=Je-H+Ci9JPSza+M1XBker2YbVSeOUE-N2BmBtRnNSTevb0w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Daniel,
>>Somehow oracle seems to know that a right join is the better way to go.
In fact, PostgreSQL is just doing the same thing: it hashes smaller table
and scans the bigger one.
Could you please clarify how do you consume 25M rows?
It could be the difference of response times comes not from the PostgreSQL
itself, but from the client code.
Could you please add the following information?
1) Execution time of simple query that selects MAX of all the required
columns "select max(test1.slsales_batch) , max(test1.slsales_checksum),
...".
I mean not explain (analyze, buffers), but simple execution.
The purpose of MAX is to split overhead of consuming of the resultset from
the overhead of producing it.
2) explain (analyze, buffers) for the same query with maxes. That should
reveal the overhead of explain analyze itself.
3) The output of the following SQLPlus script (from Oracle):
set linesize 1000 pagesize 10000 trimout on trimspool on time on timing on
spool slow_query.lst
select /*+ gather_plan_statistics */ max(test1.slsales_batch) ,
max(test1.slsales_checksum), ..;
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS
LAST'));
spool off
That would display detailed statistics on execution time similar to the
explain (analyze, buffers).
4) Could you please clarify how did you migrate test1 table?
I guess the order of rows in that table might affect overall execution time.
Sorted table would be more CPU cache friendly, thus giving speedup. (see
[1] for similar example).
As far as I understand, simple create table as select * from test1 order by
slsales_date_id, slsales_prod_id should improve cache locality.
--
Regards,
Vladimir Sitnikov
From | Date | Subject | |
---|---|---|---|
Next Message | Alex Vinnik | 2013-01-09 15:49:43 | Re: Simple join doesn't use index |
Previous Message | Jeff Janes | 2013-01-09 04:34:11 | Re: Simple join doesn't use index |