Re: Trying to eliminate union and sort

From: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>
To: Brian Fehrle <brianf(at)consistentstate(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Trying to eliminate union and sort
Date: 2013-07-15 18:26:35
Message-ID: B6F6FD62F2624C4C9916AC0175D56D880CE09933@jenmbs01.ad.intershop.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,

you may want to try starting with some CTE that first retrieve required subsets.
adding ordering within those CTE might also improve the timing of following sort/join operations.
(sorry for the top posting)

Something like:

WITH T1 AS (
SELECT id, typeid, backup_id, mycolumn1, mycolumn2
FROM table1 WHERE t.external_id IN ('6544', '2234', '2', '4536')
ORDER BY mycolumn2, id ?
),

TYPES AS (SELECT DISTINCT typeid FROM T1),

T3_OTHERS AS ( SELECT id, otherid FROM table3 JOIN TYPES ON table3.id = TYPES.typeid
-- Order BY id ?
),

SELECT
T1.id,
T1.mycolumn1,
T3_OTHERS.otherid,
T3_2.otherid,
T1.mycolumn2 AS mycolumn2

FROM T1
LEFT OUTER JOIN T3_OTHERS ON T1.typeid = T3_OTHERS.id
LEFT OUTER JOIN table2 t2 ON (t2.real_id = T1.backup_id OR t2.real_id = t.id
LEFT OUTER JOIN table3 T3_2 ON t2.third_id = T3_2.id
ORDER BY T1.mycolumn2,T1.id

regards,
Marc Mamin

________________________________________
Von: pgsql-performance-owner(at)postgresql(dot)org [pgsql-performance-owner(at)postgresql(dot)org]&quot; im Auftrag von &quot;Brian Fehrle [brianf(at)consistentstate(dot)com]
Gesendet: Montag, 15. Juli 2013 18:12
An: pgsql-performance(at)postgresql(dot)org
Betreff: Re: [PERFORM] Trying to eliminate union and sort

On 07/12/2013 04:43 PM, Josh Berkus wrote:
>> As for the counts on the tables:
>> table1 3,653,472
>> table2 2,191,314
>> table3 25,676,589
>>
>> I think it's safe to assume right now that any resulting joins are not
>> one-to-one
> Hmmm? How is doing a subselect in the SELECT clause even working, then?
>
Oh my, this is sad. the query in all returns 9,955,729 rows, so the sub
queries are run on each of these resulting rows, however in this entire
result set, subquery 1 returns 16 distinct rows, subquery 2 returns 63
different rows, but those sub queries are run over 10 million times to
return these few distinct rows. So it's running many times, but
returning the same small set of data over and over again.

- Brian F

--
Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Niels Kristian Schjødt 2013-07-16 15:05:07 Hstore VS. JSON
Previous Message Marcos Luis 2013-07-15 17:00:06 Thought you'd find this interesting