Re: query patterns for multipass aggregating

From: Ondřej Bouda <obouda(at)email(dot)cz>
To: Rob Nikander <rob(dot)nikander(at)gmail(dot)com>, Postgres General <pgsql-general(at)postgresql(dot)org>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: query patterns for multipass aggregating
Date: 2018-11-11 18:57:52
Message-ID: 51a95440-c9c8-7240-e7eb-4b1a62ebf6b4@email.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Dne 11.11.2018 v 17:20 Rob Nikander napsal(a):
> I have tables `items`, `colors`, and `images`. Items have many
colors, and many images.
>
> I want a query to list items, with their colors and images. Each
result row is an item, so the colors and images must be aggregated into
arrays or json.

What about subqueries?

SELECT
items.*,
(SELECT array_agg(color_name) FROM colors WHERE item_id = items.id)
AS color_names,
(SELECT array_agg(image_file) FROM images WHERE item_id = items.id)
AS image_files
FROM items

According to my experience, not only the code is readable (no questions
about how many rows are there for each item), but it also leads to a
better query plan.

Dne 11.11.2018 v 19:23 Rob Nikander napsal(a):
> Could the CTE’s theoretically be optimized in a future version of PG, to work like the subqueries? I like to use them to give names to intermediate results, but I’ll stay away from them for now.

It is on the todo list (for quite some time already):
https://blog.2ndquadrant.com/postgresql-ctes-are-optimization-fences/
Until then, using CTEs in situations as yours leads to poor query plans
as Postgres will compute the whole CTE separately (i.e., all rows
involved) and only then picks some of them to the inner table.

Regards,
Ondřej Bouda

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Florian Bachmann 2018-11-11 19:10:35 Re: Copy data from DB2 (Linux) to PG
Previous Message Rob Nikander 2018-11-11 18:23:18 Re: query patterns for multipass aggregating