Re: Aggregate and many LEFT JOIN

From: Michael Lewis <mlewis(at)entrata(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: kimaidou <kimaidou(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Aggregate and many LEFT JOIN
Date: 2019-02-22 16:14:17
Message-ID: CAHOFxGpQ+RDdbfaVTPC8mHhRu5mQ48q3CTac3Vtfzo=7yn4U9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Curious- Is geqo_threshold still set to 12? Is increasing
join_collapse_limit to be higher than geqo_threshold going to have a
noticeable impact?

The disk sorts are the killer as Justin says. I wonder how it performs with
that increased significantly. Is the storage SSD or traditional hard disks?

*Michael Lewis*

On Fri, Feb 22, 2019 at 8:54 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> On Fri, Feb 22, 2019 at 04:14:05PM +0100, kimaidou wrote:
> > Explain shows that the GROUP AGGREGATE and needed sort kill the
> performance.
> > Do you have any hint how to optimize this ?
> > https://explain.depesz.com/s/6nf
>
> This is writing 2GB tempfile, perhaps the query would benefit from larger
> work_mem:
> |Sort (cost=3,014,498.66..3,016,923.15 rows=969,796 width=1,818) (actual
> time=21,745.193..22,446.561 rows=1,212,419 loops=1)
> | Sort Method: external sort Disk: 1782200kB
> | Buffers: shared hit=5882951, temp read=230958 written=230958
>
> This is apparently joining without indices:
> |Nested Loop Left Join (cost=1.76..360,977.37 rows=321,583 width=1,404)
> (actual time=0.080..1,953.007 rows=321,849 loops=1)
> | Join Filter: (tgc1.groupe_nom = t.group1_inpn)
> | Rows Removed by Join Filter: 965547
> | Buffers: shared hit=1486327
>
> This perhaps should have an index on tgc2.groupe_type ?
> |Index Scan using t_group_categorie_pkey on taxon.t_group_categorie tgc2
> (cost=0.14..0.42 rows=1 width=28) (actual time=0.002..0.002 rows=1
> loops=321,849)
> | Index Cond: (tgc2.groupe_nom = t.group2_inpn)
> | Filter: (tgc2.groupe_type = 'group2_inpn'::text)
> | Buffers: shared hit=643687
>
> This would perhaps benefit from an index on tv.cd_ref ?
> |Index Scan using taxref_consolide_non_filtre_cd_nom_idx on
> taxon.taxref_consolide_non_filtre tv (cost=0.42..0.63 rows=1 width=94)
> (actual time=0.002..0.002 rows=1 loops=690,785)
> | Index Cond: (tv.cd_nom = t.cd_ref)
> | Filter: (tv.cd_nom = tv.cd_ref)
> | Buffers: shared hit=2764875
>
> I don't think it's causing a significant fraction of the issue, but for
> some
> reason this is overestimating rowcount by 2000. Do you need to VACUUM
> ANALYZE
> the table ?
> |Seq Scan on occtax.personne p_1 (cost=0.00..78.04 ROWS=2,204 width=56)
> (actual time=0.011..0.011 ROWS=1 loops=1)
>
> Justin
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message kimaidou 2019-02-22 16:33:11 Re: Aggregate and many LEFT JOIN
Previous Message Justin Pryzby 2019-02-22 15:54:15 Re: Aggregate and many LEFT JOIN