Re: Hash Join over Nested Loop

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Luís Roberto Weck <luisroberto(at)siscobra(dot)com(dot)br>
Cc: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Hash Join over Nested Loop
Date: 2019-11-22 17:55:34
Message-ID: CAFj8pRD=4DMG98UEdqMQrOc=24Z-ucBSOJ0qgbO2uner9Z6Qrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

pá 22. 11. 2019 v 18:37 odesílatel Luís Roberto Weck <
luisroberto(at)siscobra(dot)com(dot)br> napsal:

> Hey,
>
> I'm trying to figure out why Postgres is choosing a Hash Join over a
> Nested Loop in this query:
>
> SELECT T1.PesID, T1.PesNom, T1.PesValSal, T1.PesCPFCNPJ, T2.CarAti,
> T1.CarCod, T1.EmpCod,
> T2.CarFan, T1.PesDatAge, T1.PesCod, COALESCE( T3.PesDatAnt, DATE
> '00010101') AS PesDatAnt
> FROM ((public.Pessoa T1
> INNER JOIN public.Carteira T2 ON T2.EmpCod = T1.EmpCod AND
> T2.CarCod = T1.CarCod)
> LEFT JOIN (SELECT MIN(COALESCE( T5.ConVenAnt, DATE
> '00010101')) AS PesDatAnt, T4.EmpCod, T4.CarCod, T4.ConPesCod AS ConPesCod
> FROM (public.Contrato T4
> LEFT JOIN (SELECT MIN(ConParDatVen) AS
> ConVenAnt, EmpCod, CarCod, ConPesCod, ConSeq
> FROM public.ContratoParcela
> T5
> WHERE ConParAti = true
> AND ConParValSal > 0
> GROUP BY EmpCod, CarCod,
> ConPesCod, ConSeq ) T5 ON T5.EmpCod = T4.EmpCod AND
>
> T5.CarCod = T4.CarCod AND
>
> T5.ConPesCod = T4.ConPesCod AND
>
> T5.ConSeq = T4.ConSeq)
> WHERE T4.ConAti = TRUE
> GROUP BY T4.EmpCod, T4.CarCod, T4.ConPesCod )
> T3 ON t3.EmpCod = T1.EmpCod AND
>
> t3.CarCod = T1.CarCod AND
>
> t3.ConPesCod = T1.PesCod)
> WHERE (T2.CarAti = true)
> AND (T1.EmpCod = 112)
> and (UPPER(T1.PesNom) like UPPER('%MARIA%'))
> ORDER BY T1.EmpCod, T1.CarCod, T1.PesCod
>
> Here the Hash Join[1] plan takes ~700ms, and if I change the first LEFT
> JOIN to a LEFT JOIN LATERAL, forcing a nested loop, the query[2] runs in
> 3ms.
>
> [1] https://explain.depesz.com/s/8IL3
> [2] https://explain.depesz.com/s/f8Q9
>
>
Maybe I am wrong, but probably you have to do more than just change LEFT
JOIN to LATERAL JOIN. Lateral join is based on correlated subquery - so you
had to push some predicates to subquery - and then the query can be much
more effective.

Regards

Pavel

> PostgreSQL version is 11.5, I have run analyze on all the tables.
>
> PG settings:
>
> name |setting |unit|
> -------------------------------|---------|----|
> autovacuum |on | |
> default_statistics_target |250 | |
> effective_cache_size |983040 |8kB |
> effective_io_concurrency |200 | |
> max_parallel_workers |6 | |
> max_parallel_workers_per_gather|3 | |
> random_page_cost |1.1 | |
> work_mem |51200 |kB |
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Michael Lewis 2019-11-22 18:13:30 Re: Re[4]: Postgresql planning time too high
Previous Message Luís Roberto Weck 2019-11-22 17:43:04 Re: Hash Join over Nested Loop