Re: Slow HashAggregate/cache access

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Andreas Joseph Krogh <andreas(at)visena(dot)com>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow HashAggregate/cache access
Date: 2015-08-10 03:16:16
Message-ID: CAKJS1f9TNE4QZaFBkZCOqNC+inaHhnhJmzG3z=SRpoPYpRYDzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 6 August 2015 at 22:05, Andreas Joseph Krogh <andreas(at)visena(dot)com> wrote:

> På torsdag 06. august 2015 kl. 03:09:55, skrev Alexandre de Arruda Paes <
> adaldeia(at)gmail(dot)com>:
>
>
>> Notice the seq scan on t1 instead of the index scan on t1_a_idx.
>>
>> A way around this is to manually push the predicate down into the
>> subquery:
>>
>> explain select t2.a,s.sumv from (select a,sum(v) sumv from t1 where t1.a
>> <= 1 group by a) s inner join t2 on t2.a = s.a where t2.a <= 1;
>> QUERY PLAN
>>
>> -------------------------------------------------------------------------------
>> Nested Loop (cost=0.42..21.98 rows=1 width=12)
>> Join Filter: (t1.a = t2.a)
>> -> GroupAggregate (cost=0.42..4.46 rows=1 width=8)
>> Group Key: t1.a
>> -> Index Scan using t1_a_idx on t1 (cost=0.42..4.44 rows=1
>> width=8)
>> Index Cond: (a <= 1)
>> -> Seq Scan on t2 (cost=0.00..17.50 rows=1 width=4)
>> Filter: (a <= 1)
>> (8 rows)
>>
>>
>>
>
> Hi David,
>
> You are right. If the subquery includes the same filters of the main
> select (of the existing fields, sure), the times down to the floor (50 ms
> in the first execution and *18* ms by cache. Superb! ):
>
> (...) (SELECT SUM(fr13quant) AS fr13TotQtd, fr01codemp, fr02codigo,
> fr13dtlanc, SUM(COALESCE( fr13quant, 0) * CAST(COALESCE( fr13preco, 0) AS
> NUMERIC(18,10))) AS fr13VrTot
> FROM FR13T1 *WHERE (fr01codemp = '1' and fr13dtlanc >= '01/05/2014') AND
> (fr02codigo >= '0' and fr02codigo <= '9999999999') AND (fr13dtlanc <=
> '31/05/2014') *GROUP BY fr01codemp, fr02codigo, fr13dtlanc ) T4 ON
> T4.fr01codemp = T1.fr01codemp AND T4.fr02codigo = T1.fr02codigo AND
> T4.fr13dtlanc = T1.fr13dtlanc)
> (...)
>
>
> QUERY PLAN
>
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop Left Join (cost=5770.32..7894.70 rows=1 width=130) (actual
> time=13.715..18.366 rows=2 loops=1)
> Join Filter: ((fr13t3.fr01codemp = t1.fr01codemp) AND
> (fr13t3.fr02codigo = t1.fr02codigo) AND (fr13t3.fr13dtlanc = t1.fr13dtlanc))
> Rows Removed by Join Filter: 368
> Buffers: shared hit=5920
> -> Nested Loop Left Join (cost=5764.18..7887.47 rows=1 width=98)
> (actual time=13.529..18.108 rows=2 loops=1)
> Join Filter: (t3.fr01codemp = t1.fr01codemp)
> Buffers: shared hit=5918
> -> Nested Loop Left Join (cost=5764.04..7887.30 rows=1
> width=87) (actual time=13.519..18.094 rows=2 loops=1)
> Join Filter: ((fr13t1.fr01codemp = t1.fr01codemp) AND
> (fr13t1.fr02codigo = t1.fr02codigo) AND (fr13t1.fr13dtlanc = t1.fr13dtlanc))
> Rows Removed by Join Filter: 11144
> Buffers: shared hit=5914
> -> Nested Loop Left Join (cost=0.70..2098.42 rows=1
> width=23) (actual time=0.796..2.071 rows=2 loops=1)
> Buffers: shared hit=181
> -> Index Scan using ufr13t2 on fr13t t1
> (cost=0.42..2094.11 rows=1 width=19) (actual time=0.787..2.054 rows=2
> loops=1)
> Index Cond: ((fr01codemp = 1::smallint) AND
> (fr13dtlanc >= '2014-05-01'::date) AND (fr13dtlanc <= '2014-05-31'::date))
> Filter: ((fr02codigo >= 0::numeric) AND
> (fr02codigo <= 9999999999::numeric) AND (fr13codpr = 60732))
> Rows Removed by Filter: 5621
> Buffers: shared hit=175
> -> Index Scan using fr02t_pkey on fr02t t2
> (cost=0.28..4.30 rows=1 width=12) (actual time=0.005..0.005 rows=1 loops=2)
> Index Cond: ((fr01codemp = t1.fr01codemp) AND
> (fr01codemp = 1::smallint) AND (fr02codigo = t1.fr02codigo))
> Buffers: shared hit=6
> -> HashAggregate (cost=5763.34..5770.15 rows=681
> width=21) (actual time=5.576..6.787 rows=5573 loops=2)
> Buffers: shared hit=5733
> -> Index Scan using ufr13t15 on fr13t1
> (cost=0.42..5644.31 rows=6802 width=21) (actual time=0.020..3.371
> rows=7053 loops=1)
> Index Cond: ((fr01codemp = 1::smallint) AND
> (fr13dtlanc >= '2014-05-01'::date) AND (fr13dtlanc <= '2014-05-31'::date)
> AND (fr02codigo >= 0::numeric) AND (fr02codigo <= 9999999999::numeric))
> Buffers: shared hit=5733
> -> Index Scan using fr09t_pkey on fr09t t3 (cost=0.14..0.16
> rows=1 width=15) (actual time=0.005..0.005 rows=1 loops=2)
> Index Cond: ((fr01codemp = 1::smallint) AND (fr09cod =
> t2.fr09cod))
> Buffers: shared hit=4
> -> HashAggregate (cost=6.14..6.43 rows=29 width=17) (actual
> time=0.056..0.086 rows=184 loops=2)
> Buffers: shared hit=2
> -> Seq Scan on fr13t3 (cost=0.00..4.30 rows=184 width=17)
> (actual time=0.003..0.027 rows=184 loops=1)
> Filter: (fr01codemp = 1::smallint)
> Buffers: shared hit=2
> Total runtime: 18.528 ms
> (35 rows)
>
>
> Tomorrow I will try to do the same with the other slow query, reporting
> here.
>
>
> It will be interesting to see how Oracle and SQL-Server perform with the
> re-written query too.
> Thanks.
>
>
Glad that's looking better for you.

I'd guess that they're likely already pushing down those predicates into
the subquery going by the execution times that you posted.

I can't imagine Oracle can perform a seq scan / table scan that much faster
than Postgres

Interested to hear the results of your tests though.

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Rural Hunter 2015-08-11 08:46:30 Query Plan Performance on Partitioned Table
Previous Message Andreas Joseph Krogh 2015-08-06 10:05:46 Re: Slow HashAggregate/cache access