Re: Sort is generating rows

From: Nicolas Seinlet <nicolas(at)seinlet(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Sort is generating rows
Date: 2018-06-05 14:42:46
Message-ID: CAMEUSaR9-dvpzzUfQBRg5q9duYwq=UYC13Ht1=4TnLRnWGi3oQ@mail.gmail.com
Lists: pgsql-general

Hi,

you're right, I've also changed the lateral join. The lateral join removes
the row generation and uses a loop instead.
I've removed the lateral join and, on the same dataset as the one used for
the provided explain plan, the execution time grows to 18s:
SELECT min(l.id) AS id,
l.product_id,
t.uom_id AS product_uom,
sum(l.product_uom_qty / u.factor * u2.factor) AS product_uom_qty,
sum(l.qty_delivered / u.factor * u2.factor) AS qty_delivered,
sum(l.qty_invoiced / u.factor * u2.factor) AS qty_invoiced,
sum(l.qty_to_invoice / u.factor * u2.factor) AS qty_to_invoice,
sum(l.price_total / COALESCE(cr.rate, 1.0)) AS price_total,
sum(l.price_subtotal / COALESCE(cr.rate, 1.0)) AS price_subtotal,
sum(l.price_reduce * l.qty_to_invoice / COALESCE(cr.rate, 1.0)) AS
amount_to_invoice,
sum(l.price_reduce * l.qty_invoiced / COALESCE(cr.rate, 1.0)) AS
amount_invoiced,
count(*) AS nbr,
s.name,
s.date_order AS date,
s.confirmation_date,
s.state,
s.partner_id,
s.user_id,
s.company_id,
date_part('epoch'::text, avg(date_trunc('day'::text, s.date_order) -
date_trunc('day'::text, s.create_date))) / (24 * 60 *
60)::numeric(16,2)::double precision AS delay,
t.categ_id,
s.pricelist_id,
s.analytic_account_id,
s.team_id,
p.product_tmpl_id,
partner.country_id,
partner.commercial_partner_id,
sum(p.weight * l.product_uom_qty / u.factor * u2.factor) AS weight,
sum(p.volume * l.product_uom_qty::double precision / u.factor::double
precision * u2.factor::double precision) AS volume
FROM sale_order_line l
JOIN sale_order s ON l.order_id = s.id
JOIN res_partner partner ON s.partner_id = partner.id
LEFT JOIN product_product p ON l.product_id = p.id
LEFT JOIN product_template t ON p.product_tmpl_id = t.id
LEFT JOIN uom_uom u ON u.id = l.product_uom
LEFT JOIN uom_uom u2 ON u2.id = t.uom_id
JOIN product_pricelist pp ON s.pricelist_id = pp.id
LEFT JOIN ( SELECT r.currency_id,
COALESCE(r.company_id, c.id) AS company_id,
r.rate,
r.name AS date_start,
r.date_end
FROM res_currency_rate r
JOIN res_company c ON r.company_id IS NULL OR r.company_id =
c.id) cr ON cr.currency_id = pp.currency_id AND (cr.company_id =
s.company_id OR cr.company_id IS NULL) AND daterange(cr.date_start,
COALESCE(cr.date_end, now()::date)) @> COALESCE(s.date_order::timestamp
with time zone, now())::date
GROUP BY l.product_id, l.order_id, t.uom_id, t.categ_id, s.name,
s.date_order, s.confirmation_date, s.partner_id, s.user_id, s.state,
s.company_id, s.pricelist_id, s.analytic_account_id, s.team_id,
p.product_tmpl_id, partner.country_id, partner.commercial_partner_id;

I've posted the new explain plan as an optimisation of the previous one:
https://explain.depesz.com/s/zrn
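
As an aside, here is a miniature sketch of the two join shapes (hypothetical
tables, not the Odoo schema, and untested against the real data). It only
illustrates the difference mentioned above: the plain LEFT JOIN emits one row
per matching rate, so rows can multiply before the GROUP BY, while the
LATERAL ... LIMIT 1 form fetches at most one rate per outer row, which is why
its plan loops instead of sorting a huge intermediate result:

CREATE TEMP TABLE orders(id int, currency_id int, date_order date);
CREATE TEMP TABLE rates(currency_id int, rate numeric,
                        date_start date, date_end date);

-- plain LEFT JOIN: every rate whose range contains the order date is
-- joined, so one order can expand into several rows before aggregation
SELECT o.id, r.rate
FROM orders o
LEFT JOIN rates r
  ON r.currency_id = o.currency_id
 AND daterange(r.date_start, COALESCE(r.date_end, now()::date)) @> o.date_order;

-- LEFT JOIN LATERAL ... LIMIT 1: at most one rate per order row,
-- evaluated once per outer row (one loop per row in the plan)
SELECT o.id, r.rate
FROM orders o
LEFT JOIN LATERAL (
    SELECT rate
    FROM rates
    WHERE currency_id = o.currency_id
      AND daterange(date_start, COALESCE(date_end, now()::date)) @> o.date_order
    LIMIT 1
) r ON true;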

If I remove the daterange and use the original query, the execution time is
better and falls to 13s:
https://explain.depesz.com/s/OYWBN

SELECT min(l.id) AS id,
l.product_id,
t.uom_id AS product_uom,
sum(l.product_uom_qty / u.factor * u2.factor) AS product_uom_qty,
sum(l.qty_delivered / u.factor * u2.factor) AS qty_delivered,
sum(l.qty_invoiced / u.factor * u2.factor) AS qty_invoiced,
sum(l.qty_to_invoice / u.factor * u2.factor) AS qty_to_invoice,
sum(l.price_total / COALESCE(cr.rate, 1.0)) AS price_total,
sum(l.price_subtotal / COALESCE(cr.rate, 1.0)) AS price_subtotal,
sum(l.price_reduce * l.qty_to_invoice / COALESCE(cr.rate, 1.0)) AS
amount_to_invoice,
sum(l.price_reduce * l.qty_invoiced / COALESCE(cr.rate, 1.0)) AS
amount_invoiced,
count(*) AS nbr,
s.name,
s.date_order AS date,
s.confirmation_date,
s.state,
s.partner_id,
s.user_id,
s.company_id,
date_part('epoch'::text, avg(date_trunc('day'::text, s.date_order) -
date_trunc('day'::text, s.create_date))) / (24 * 60 *
60)::numeric(16,2)::double precision AS delay,
t.categ_id,
s.pricelist_id,
s.analytic_account_id,
s.team_id,
p.product_tmpl_id,
partner.country_id,
partner.commercial_partner_id,
sum(p.weight * l.product_uom_qty / u.factor * u2.factor) AS weight,
sum(p.volume * l.product_uom_qty::double precision / u.factor::double
precision * u2.factor::double precision) AS volume
FROM sale_order_line l
JOIN sale_order s ON l.order_id = s.id
JOIN res_partner partner ON s.partner_id = partner.id
LEFT JOIN product_product p ON l.product_id = p.id
LEFT JOIN product_template t ON p.product_tmpl_id = t.id
LEFT JOIN uom_uom u ON u.id = l.product_uom
LEFT JOIN uom_uom u2 ON u2.id = t.uom_id
JOIN product_pricelist pp ON s.pricelist_id = pp.id
LEFT JOIN ( SELECT r.currency_id,
COALESCE(r.company_id, c.id) AS company_id,
r.rate,
r.name AS date_start,
r.date_end
FROM res_currency_rate r
JOIN res_company c ON r.company_id IS NULL OR r.company_id =
c.id) cr ON cr.currency_id = pp.currency_id AND (cr.company_id =
s.company_id OR cr.company_id IS NULL) AND cr.date_start <=
COALESCE(s.date_order::timestamp with time zone, now()) AND (cr.date_end IS
NULL OR cr.date_end > COALESCE(s.date_order::timestamp with time zone,
now()))
GROUP BY l.product_id, l.order_id, t.uom_id, t.categ_id, s.name,
s.date_order, s.confirmation_date, s.partner_id, s.user_id, s.state,
s.company_id, s.pricelist_id, s.analytic_account_id, s.team_id,
p.product_tmpl_id, partner.country_id, partner.commercial_partner_id;
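
As a side note, an untested sketch of an index that might help the currency
rate lookup with this explicit range form (the column names are the ones used
above; whether the planner actually picks it up depends on the data):

-- untested sketch: resolve the currency filter and the date upper bound in
-- the index itself, instead of filtering the extra rows afterwards
CREATE INDEX res_currency_rate_currency_id_name_idx
    ON res_currency_rate (currency_id, name);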

And thanks again for the help.

Have a nice day,

Nicolas

2018-06-01 16:33 GMT+02:00 Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>:

> On 06/01/2018 02:36 AM, Nicolas Seinlet wrote:
>
>> Hi,
>>
>> thanks for the answer. The query is based on a view, so here are the
>> view, the query as well as the query plan.
>> I've already taken into account remarks like date ranges.
>>
>
> You changed the query from the original: besides adding the daterange, I
> see the addition of a LATERAL. Were there other changes?
>
> The changes did eliminate the 300 million row sort, from what I can see.
>
> The new query takes ~9 secs; is that an improvement over the old one?
>
> I took the liberty of running the EXPLAIN ANALYZE through
> explain.depesz.com:
>
> https://explain.depesz.com/s/9thl
>
> The largest amount of time was in the Index Scan (8,706.712 ms), and that was
> because the scan was looped 32,732 times. I have not used LATERAL in my own
> code, so I looked it up:
>
> https://www.postgresql.org/docs/10/static/sql-select.html
>
> LATERAL
>
> "
> ...
>
> When a FROM item contains LATERAL cross-references, evaluation proceeds as
> follows: for each row of the FROM item providing the cross-referenced
> column(s), or set of rows of multiple FROM items providing the columns, the
> LATERAL item is evaluated using that row or row set's values of the
> columns. The resulting row(s) are joined as usual with the rows they were
> computed from. This is repeated for each row or set of rows from the column
> source table(s).
> ...
> "
>
> If I am following correctly, that might explain some of the looping seen above.
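>
> A tiny self-contained sketch of that looping behaviour (hypothetical
> tables, not the schema in this thread): the inner scan runs once per
> outer row, which is what shows up as loops=N on its node in EXPLAIN
> ANALYZE, much like the 32,732 loops above:
>
> CREATE TEMP TABLE r(id int PRIMARY KEY, v int);
> INSERT INTO r SELECT g, g FROM generate_series(1, 100) g;
>
> EXPLAIN ANALYZE
> SELECT o.n, x.v
> FROM generate_series(1, 1000) AS o(n)
> LEFT JOIN LATERAL (
>     SELECT v FROM r WHERE r.id = o.n LIMIT 1
> ) x ON true;
> -- the scan on r reports loops=1000, i.e. one execution per outer row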
>
>
>> SELECT min(l.id) AS id,
>> l.product_id,
>> t.uom_id AS product_uom,
>> sum(l.product_uom_qty / u.factor * u2.factor) AS product_uom_qty,
>> sum(l.qty_delivered / u.factor * u2.factor) AS qty_delivered,
>> sum(l.qty_invoiced / u.factor * u2.factor) AS qty_invoiced,
>> sum(l.qty_to_invoice / u.factor * u2.factor) AS qty_to_invoice,
>> sum(l.price_total / COALESCE(cr.rate, 1.0)) AS price_total,
>> sum(l.price_subtotal / COALESCE(cr.rate, 1.0)) AS price_subtotal,
>> sum(l.price_reduce * l.qty_to_invoice / COALESCE(cr.rate, 1.0)) AS
>> amount_to_invoice,
>> sum(l.price_reduce * l.qty_invoiced / COALESCE(cr.rate, 1.0)) AS
>> amount_invoiced,
>> count(*) AS nbr,
>> s.name,
>> s.date_order AS date,
>> s.confirmation_date,
>> s.state,
>> s.partner_id,
>> s.user_id,
>> s.company_id,
>> date_part('epoch'::text, avg(date_trunc('day'::text, s.date_order) -
>> date_trunc('day'::text, s.create_date))) / (24 * 60 *
>> 60)::numeric(16,2)::double precision AS delay,
>> t.categ_id,
>> s.pricelist_id,
>> s.analytic_account_id,
>> s.team_id,
>> p.product_tmpl_id,
>> partner.country_id,
>> partner.commercial_partner_id,
>> sum(p.weight * l.product_uom_qty / u.factor * u2.factor) AS weight,
>> sum(p.volume * l.product_uom_qty::double precision /
>> u.factor::double precision * u2.factor::double precision) AS volume
>> FROM sale_order_line l
>> JOIN sale_order s ON l.order_id = s.id
>> JOIN res_partner partner ON s.partner_id = partner.id
>> LEFT JOIN product_product p ON l.product_id = p.id
>> LEFT JOIN product_template t ON p.product_tmpl_id = t.id
>> LEFT JOIN uom_uom u ON u.id = l.product_uom
>> LEFT JOIN uom_uom u2 ON u2.id = t.uom_id
>> JOIN product_pricelist pp ON s.pricelist_id = pp.id
>> LEFT JOIN LATERAL ( SELECT res_currency_rate.rate
>> FROM res_currency_rate
>> WHERE res_currency_rate.currency_id = pp.currency_id AND
>> (res_currency_rate.company_id = s.company_id OR
>> res_currency_rate.company_id IS NULL) AND
>> daterange(res_currency_rate.name, COALESCE(res_currency_rate.date_end,
>> now()::date)) @> COALESCE(s.date_order::timestamp with time zone,
>> now())::date
>> LIMIT 1) cr ON true
>> GROUP BY l.product_id, l.order_id, t.uom_id, t.categ_id, s.name,
>> s.date_order, s.confirmation_date, s.partner_id,
>> s.user_id, s.state, s.company_id, s.pricelist_id, s.analytic_account_id,
>> s.team_id, p.product_tmpl_id, partner.country_id,
>> partner.commercial_partner_id;
>>
>> explain analyse select team_id,partner_id,sum(price_total) from
>> sale_report group by team_id,partner_id;
>>
>>
>> QUERY PLAN
>> ------------------------------------------------------------
>> GroupAggregate (cost=1344575.91..1344986.97 rows=3654 width=40)
>> (actual time=8934.915..8944.487 rows=43 loops=1)
>> Group Key: sale_report.team_id, sale_report.partner_id
>> -> Sort (cost=1344575.91..1344667.26 rows=36539 width=40) (actual
>> time=8934.686..8937.833 rows=32732 loops=1)
>> Sort Key: sale_report.team_id, sale_report.partner_id
>> Sort Method: quicksort Memory: 3323kB
>> -> Subquery Scan on sale_report (cost=1339157.70..1341806.77
>> rows=36539 width=40) (actual time=8870.269..8923.114 rows=32732 loops=1)
>> -> GroupAggregate (cost=1339157.70..1341441.38
>> rows=36539 width=395) (actual time=8870.268..8920.155 rows=32732 loops=1)
>> Group Key: l.product_id, l.order_id, t.uom_id,
>> t.categ_id, s.name, s.date_order, s.confirmation_date,
>> s.partner_id, s.user_id, s.state, s.company_id, s.pricelist_id,
>> s.analytic_account_id, s.team_id, p.product_tmpl_id, partner.country_id,
>> partner.commercial_partner_id
>> -> Sort (cost=1339157.70..1339249.04 rows=36539
>> width=92) (actual time=8870.247..8875.191 rows=32732 loops=1)
>> Sort Key: l.product_id, l.order_id, t.uom_id,
>> t.categ_id, s.name, s.date_order, s.confirmation_date,
>> s.partner_id, s.user_id, s.state, s.company_id, s.pricelist_id,
>> s.analytic_account_id, s.team_id, p.product_tmpl_id, partner.country_id,
>> partner.commercial_partner_id
>> Sort Method: quicksort Memory: 5371kB
>> -> Nested Loop Left Join
>> (cost=695.71..1336388.56 rows=36539 width=92) (actual time=13.468..8797.655
>> rows=32732 loops=1)
>> -> Hash Left Join
>> (cost=695.43..3338.19 rows=36539 width=88) (actual time=13.323..65.600
>> rows=32732 loops=1)
>> Hash Cond: (l.product_id = p.id)
>> -> Hash Join
>> (cost=656.36..2796.71 rows=36539 width=76) (actual time=13.236..49.047
>> rows=32732 loops=1)
>> Hash Cond: (l.order_id = s.id)
>> -> Seq Scan on
>> sale_order_line l (cost=0.00..1673.39 rows=36539 width=17) (actual
>> time=0.019..7.338 rows=32732 loops=1)
>> -> Hash
>> (cost=550.72..550.72 rows=8451 width=67) (actual time=13.184..13.184
>> rows=8382 loops=1)
>> Buckets: 16384
>> Batches: 1 Memory Usage: 947kB
>> -> Hash Join
>> (cost=37.69..550.72 rows=8451 width=67) (actual time=0.164..10.135
>> rows=8382 loops=1)
>> Hash Cond: (s.pricelist_id = pp.id)
>> -> Hash Join
>> (cost=13.97..420.42 rows=8451 width=63) (actual time=0.151..7.064 rows=8382
>> loops=1)
>> Hash Cond: (s.partner_id = partner.id)
>> -> Seq
>> Scan on sale_order s (cost=0.00..301.51 rows=8451 width=55) (actual
>> time=0.005..1.807 rows=8382 loops=1)
>> -> Hash
>> (cost=13.43..13.43 rows=43 width=12) (actual time=0.136..0.136 rows=43
>> loops=1)
>> Buckets: 1024 Batches: 1 Memory Usage: 10kB
>> -> Seq Scan on res_partner partner (cost=0.00..13.43 rows=43 width=12)
>> (actual time=0.013..0.112 rows=43 loops=1)
>> -> Hash
>> (cost=16.10..16.10 rows=610 width=8) (actual time=0.007..0.007 rows=1
>> loops=1)
>> Buckets:
>> 1024 Batches: 1 Memory Usage: 9kB
>> -> Seq
>> Scan on product_pricelist pp (cost=0.00..16.10 rows=610 width=8) (actual
>> time=0.005..0.005 rows=1 loops=1)
>> -> Hash (cost=32.95..32.95
>> rows=490 width=16) (actual time=0.076..0.076 rows=43 loops=1)
>> Buckets: 1024 Batches: 1
>> Memory Usage: 11kB
>> -> Hash Left Join
>> (cost=11.88..32.95 rows=490 width=16) (actual time=0.051..0.068 rows=43
>> loops=1)
>> Hash Cond: (p.product_tmpl_id = t.id)
>> -> Seq Scan on
>> product_product p (cost=0.00..14.90 rows=490 width=8) (actual
>> time=0.007..0.010 rows=43 loops=1)
>> -> Hash
>> (cost=11.39..11.39 rows=39 width=12) (actual time=0.039..0.039 rows=39
>> loops=1)
>> Buckets: 1024
>> Batches: 1 Memory Usage: 10kB
>> -> Seq Scan on
>> product_template t (cost=0.00..11.39 rows=39 width=12) (actual
>> time=0.006..0.030 rows=39 loops=1)
>> -> Limit (cost=0.28..36.46 rows=1
>> width=8) (actual time=0.266..0.266 rows=1 loops=32732)
>> -> Index Scan using
>> res_currency_rate_currency_id_index on res_currency_rate
>> (cost=0.28..36.46 rows=1 width=8) (actual time=0.266..0.266 rows=1
>> loops=32732)
>> Index Cond: (currency_id =
>> pp.currency_id)
>> Filter: (((company_id =
>> s.company_id) OR (company_id IS NULL)) AND (daterange(name,
>> COALESCE(date_end, (now())::date)) @> (COALESCE((s.date_order)::timestamp
>> with time zone, now()))::date))
>> Rows Removed by Filter: 502
>> Planning time: 5.731 ms
>> Execution time: 8944.950 ms
>> (45 rows)
>>
>>
>> Have a nice day,
>>
>> Nicolas.
>>
>>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
