Re: Optimizing execution of expensive subqueries

From: Hellmuth Vargas <hivs77(at)gmail(dot)com>
To: mathieu(dot)fenniak(at)replicon(dot)com
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Optimizing execution of expensive subqueries
Date: 2018-07-11 19:44:03
Message-ID: CAN3Qy4o=GVcUAssU=AVVtTRK7DGRQDrQAVCrfYXh=u4nYD1jtw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi

Try this way:

SELECT
tbl.field1, tbl.field2, tbl.field3, ...,
b.Thingy1Sum,
... repeat for multiply thingies ...
FROM
tbl
LATERAL JOIN (
SELECT anothertbl.UserId,SUM(Duration) as Thingy1Sum
FROM anothertbl
WHERE anothertbl.UserId = tbl.UserId AND anothertbl.ThingyId = 1
group by 1) as b on tbl.UserId=b.UserId
ORDER BY tbl.field1 LIMIT 20

El mié., 11 de jul. de 2018 a la(s) 09:25, Mathieu Fenniak (
mathieu(dot)fenniak(at)replicon(dot)com) escribió:

> Hi pgsql-general!
>
> I'm currently looking at a query that is generally selecting a bunch of
> simple columns from a table, and also performing some subqueries to
> aggregate related data, and then sorting by one of the simple columns and
> paginating the result.
>
> eg.
>
> SELECT
> tbl.field1, tbl.field2, tbl.field3, ...,
> (SELECT SUM(Duration) FROM anothertbl WHERE anothertbl.UserId = tbl.UserId
> AND anothertbl.ThingyId = 1) as Thingy1Sum,
> ... repeat for multiply thingies ...
> FROM
> tbl
> ORDER BY tbl.field1 LIMIT 20
>
> I'm finding that if "tbl" contains hundreds of thousands of rows, the
> subqueries are being executed hundreds of thousands of times. Because of
> the sorting and pagination, this is appears to be unnecessary, and the
> result is slow performance. (PostgreSQL 9.5.9 server)
>
> I've only found one solution so far, which is to perform the sort &
> pagination in a CTE, and the subqueries externally. Are there any other
> approaches that can be taken to optimize this and prevent the unnecessary
> computation?
>
> CTE rewrite:
>
> WITH cte AS (
> SELECT
> tbl.field1, tbl.field2, tbl.field3
> FROM
> tbl
> ORDER BY tbl.field1 LIMIT 20
> )
> SELECT cte.*,
> (SELECT SUM(Duration) FROM anothertbl WHERE anothertbl.UserId
> = tbl.UserId AND anothertbl.ThingyId = 1) as Thingy1Sum,
> ... repeat for multiply thingies ...
> FROM cte;
>
> Thanks for any thoughts you have,
>
> Mathieu Fenniak
>

--
Cordialmente,

Ing. Hellmuth I. Vargas S.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2018-07-11 19:45:21 Re: timestamp (military) at time zone without the suffix
Previous Message David Gauthier 2018-07-11 19:36:01 timestamp (military) at time zone without the suffix