Re: Subselect performance question

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Subselect performance question
Date: 2002-05-02 20:47:13
Message-ID: n083du4g2eue067lrc3eqqdq01aiod5dqu@4ax.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Thu, 02 May 2002 09:37:56 -0700, Josh Berkus wrote:

>Folks,
>
>I have a view which can be designed with eithher a sub-select in the
>SELECT clause or the FROM clause. The main query data set is small
>(1000 records) but the table being filtered in the sub-select is large
>(110,000 records).

Josh,

I don't know, if it matters, but your queries are not equivalent. If
for a client_id there is no row with case_status = 'ACT', active_count
is ...

>
>Under those circumstances, is there a hard-and-fast rule for which
>query I should use? Most of the time, users will only look at one
>record at a time from the main data set.
>
>SELECT client_id, client_name,
> (SELECT count(*) FROM case_clients WHERE client_id = clients.client_id
>AND case_status = 'ACT') as active_count,
> (SELECT count(*) FROM case_clients WHERE client_id = clients.client_id
>AND case_status = 'STL') as settled_count,
>FROM clients;

... 0 (zero), because the sub-select counts 0 rows

>
>OR:
>
>
>SELECT client_id, client_name, active_count, settled_count,
>FROM clients
> LEFT OUTER JOIN (SELECT client_id, count(*) as active_count FROM
>case_clients WHERE case_status = 'ACT' GROUP BY client_id) ac ON
>ac.client_id = clients.client_id
> LEFT OUTER JOIN (SELECT client_id, count(*) as settled_count FROM
>case_clients WHERE case_status = 'STL' GROUP BY client_id) sc ON
>sc.client_id = clients.client_id;
>

NULL, because you do a LEFT JOIN to an empty sub-select.

SELECT c.client_id,
c.client_name,
count(ac.client_id) active_count,
count(sc.client_id) settled_count
FROM clients c
LEFT JOIN case_clients ac ON ac.client_id=c.client_id AND
ac.case_status = 'ACT'
LEFT JOIN case_clients sc ON sc.client_id=c.client_id AND
sc.case_status = 'STL'
GROUP BY c.client_id, c.client_name;

should match your first query and it doesn't use sub-selects at all.
I haven't tested it though, so watch out for typos ...
My personal hard-and-fast rule is "avoid sub-selects, use joins", but
YMMV.

>-Josh Berkus

Servus
Manfred

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2002-05-02 20:56:13 Re: Subselect performance question
Previous Message Josh Berkus 2002-05-02 16:37:56 Subselect performance question