From: | Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com> |
Cc: | Gerardo Herzig <gherzig(at)fmed(dot)uba(dot)ar>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: select with max functions |
Date: | 2017-10-03 21:16:37 |
Message-ID: | 002ad4b4-ec5f-4e9f-fa48-5c19405292ee@catalyst.net.nz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 03/10/17 04:29, Tom Lane wrote:
> Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com> writes:
>> explain analyze SELECT Ma.User_Id,
>> COUNT(*) COUNT
>> FROM Manuim Ma
>> WHERE Ma.Bb_Open_Date =
>> (SELECT Bb_Open_Date
>> FROM Manuim Man
>> WHERE Man.User_Id = Ma.User_Id order
>> by bb_open_date desc limit 1
>> )
>> GROUP BY Ma.User_Id
>> HAVING COUNT(*) > 1;
> The core problem with this query is that the sub-select has to be done
> over again for each row of the outer table, since it's a correlated
> sub-select (ie, it refers to Ma.User_Id from the outer table). Replacing
> a max() call with handmade logic doesn't do anything to help that.
> I'd try refactoring it so that you calculate the max Bb_Open_Date just
> once for each user id, perhaps along the lines of
>
> SELECT Ma.User_Id,
> COUNT(*) COUNT
> FROM Manuim Ma,
> (SELECT User_Id, max(Bb_Open_Date) as max
> FROM Manuim Man
> GROUP BY User_Id) ss
> WHERE Ma.User_Id = ss.User_Id AND
> Ma.Bb_Open_Date = ss.max
> GROUP BY Ma.User_Id
> HAVING COUNT(*) > 1;
>
> This is still not going to be instantaneous, but it might be better.
>
> It's possible that an index on (User_Id, Bb_Open_Date) would help,
> but I'm not sure.
>
> regards, tom lane
>
>
Further ideas based on Tom's rewrite: If that MAX is still expensive it
might be worth breaking
SELECT User_Id, max(Bb_Open_Date) as max
FROM Manuim Man
GROUP BY User_Id
out into a VIEW, and considering making it MATERIALIZED, or creating an
equivalent trigger based summary table (there are examples in the docs
of how to do this).
Cheers
Mark
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2017-10-03 21:44:16 | Re: How does max_parallel_workers_per_gather change load averages? |
Previous Message | Ben Nachtrieb | 2017-10-03 19:48:39 | How does max_parallel_workers_per_gather change load averages? |