Re: Avoid sorting when doing an array_agg

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Kiriakos Georgiou <kg(dot)postgresql(at)olympiakos(dot)com>, Alexis Woo <awoo2611(at)gmail(dot)com>, "Psql_General (E-mail)" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Avoid sorting when doing an array_agg
Date: 2016-12-05 00:57:18
Message-ID: CAH2-WzkuFj7u-WC60gnwkQn37bcswyi+3=AgTyHu5dK5n2FJ=Q@mail.gmail.com
Lists: pgsql-general

On Sun, Dec 4, 2016 at 4:09 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Of course, we would also have to teach cost_sort or someplace near there
> that non-C sorting is much more expensive than C-collation sorting. Not
> sure about exactly how to set that up without it being a kluge.

We've talked about that before, in the context of parallel query. At
the 2014 developer meeting, IIRC.

> A related problem is that if you have "GROUP BY x,y" and no particular
> ORDER BY requirement, you could sort by either x,y or y,x before the
> GroupAgg. This would matter if, say, there was an index matching one
> but not the other. Right now we're very stupid and only consider x,y,
> but if there were room to consider more than one set of target pathkeys
> it would be fairly simple to make that better.

That sounds valuable, especially because it seems natural to make the
leading group-on var the least selective within a GROUP BY; having a
matching index that you can thereby use might be less common than that
in practice, unless and until the partial sort patch is committed.
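
To illustrate why either sort order is acceptable for the grouping step in Tom's example, here is a minimal Python sketch (not PostgreSQL code; the rows, column positions, and the helper name sorted_group_sum are made up for illustration). It sorts on (x, y) and on (y, x), aggregates adjacent equal keys the way a sort-based GroupAgg would, and ends up with the same groups either way:

```python
from itertools import groupby

# Hypothetical rows: (x, y, value). Columns 0 and 1 are the GROUP BY keys.
rows = [
    ("CA", "ON", 3),
    ("US", "NY", 5),
    ("CA", "QC", 2),
    ("US", "NY", 1),
    ("CA", "ON", 4),
]

def sorted_group_sum(rows, key_order):
    """Sort on the given column order, then sum adjacent rows with equal keys."""
    keyfunc = lambda r: tuple(r[i] for i in key_order)
    out = {}
    for key, grp in groupby(sorted(rows, key=keyfunc), key=keyfunc):
        # Put the key back into (x, y) order so both results are comparable.
        normalized = tuple(key[key_order.index(i)] for i in (0, 1))
        out[normalized] = sum(r[2] for r in grp)
    return out

by_xy = sorted_group_sum(rows, (0, 1))  # sort by x, y
by_yx = sorted_group_sum(rows, (1, 0))  # sort by y, x
print(by_xy == by_yx)
```

Only the order in which groups are emitted differs, which does not matter when there is no ORDER BY requirement.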

I will tend to write "GROUP BY country, province, city" -- never
"GROUP BY city, province, country". I speak a language that is written
left-to-right, but I bet the habit would be the same, just mirrored, if
I spoke a language written right-to-left. Same difference. This might
be a very prevalent habit. In general, a tuplesort will be faster with
a high-cardinality leading attribute, because more comparisons are
resolved by the first column alone, so this habit works against
tuplesort. (Assuming a leading attribute of pass-by-value type, or with
abbreviated key support.)
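
A rough way to see the cardinality effect, sketched in Python rather than with tuplesort itself: the comparator below counts how often a comparison ties on the leading attribute and has to fall through to the second one. The data, the cardinalities (5 vs. 5000), and the helper name count_tiebreaks are assumptions chosen for illustration, and Python's sort is not PostgreSQL's, so treat the counts as indicative only.

```python
import random
from functools import cmp_to_key

def count_tiebreaks(rows):
    """Sort 2-tuples; return how many comparisons had to look at the second attribute."""
    tiebreaks = 0

    def cmp(a, b):
        nonlocal tiebreaks
        if a[0] != b[0]:
            # Resolved by the leading attribute alone (the cheap case).
            return -1 if a[0] < b[0] else 1
        tiebreaks += 1  # leading attributes tie; fall through to the next column
        return (a[1] > b[1]) - (a[1] < b[1])

    sorted(rows, key=cmp_to_key(cmp))
    return tiebreaks

rng = random.Random(0)
n = 10000
# (low-cardinality "country"-like value, high-cardinality "city"-like value)
data = [(rng.randrange(5), rng.randrange(5000)) for _ in range(n)]

low_first = count_tiebreaks(data)                           # country, city
high_first = count_tiebreaks([(c, a) for a, c in data])     # city, country
print(low_first, high_first)
```

With the low-cardinality column leading, far more comparisons need the second column than with the high-cardinality column leading, which is the effect described above.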

--
Peter Geoghegan
