From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "andremachado" <andremachado(at)techforce(dot)com(dot)br> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows) |
Date: | 2006-05-12 22:19:38 |
Message-ID: | 23876.1147472378@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
"andremachado" <andremachado(at)techforce(dot)com(dot)br> writes:
> continuing the saga,
> http://archives.postgresql.org/pgsql-performance/2006-04/msg00558.php ,
> my coleague created a test database with fake data (see below).
Thanks. I played around with this a bit, and got results like these:
original query, 8.1 branch from a couple weeks back: 945 sec
original query, 8.1 branch tip: 184 sec
modified query, 8.1 branch tip: 15 sec
The first differential is because of this patch:
http://archives.postgresql.org/pgsql-committers/2006-04/msg00355.php
viz
Remove the restriction originally coded into
optimize_minmax_aggregates() that MIN/MAX not be converted to
use an index if the query WHERE clause contains any volatile
functions or subplans.
Allowing the max(DEC2.AM_REFERENCIA) subquery to be converted to an
indexscan makes for about a 5X reduction in the number of times the
EXISTS sub-subquery is executed. But the real problem is that Postgres
isn't excessively smart about EXISTS subqueries. I manually changed it
into an IN to get the 15-second runtime: instead of
(select max(DEC2.AM_REFERENCIA) from DECLARACAO DEC2
where DEC2.IN_FOI_RETIFICADA=0 and
exists (select CAD3.ID_CADASTRO from CADASTRO CAD3 where
CAD3.ID_DECLARACAO=DEC2.ID_DECLARACAO and
CAD3.ID_EMPRESA=CADASTRO.ID_EMPRESA ) )
write
(select max(DEC2.AM_REFERENCIA) from DECLARACAO DEC2
where DEC2.IN_FOI_RETIFICADA=0 and DEC2.ID_DECLARACAO in
(select CAD3.ID_DECLARACAO from CADASTRO CAD3 where
CAD3.ID_EMPRESA=CADASTRO.ID_EMPRESA ) )
I'm not clear on how Firebird is managing to do this query in under
a second --- I can believe that they know how to do EXISTS as a join
but it still seems like the subqueries need to be done many thousand
times. I thought maybe they were caching the results of the overall
subquery for specific values of CADASTRO.ID_EMPRESA, but now that I
see your test data, there are several thousand distinct values of
that, so there's not a lot of traction to be gained that way.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | André Felipe Machado | 2006-05-13 09:58:02 | Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows) |
Previous Message | Joshua D. Drake | 2006-05-12 19:54:11 | Re: Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows) |