Re: Subselect performance question

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Subselect performance question
Date: 2002-05-02 20:56:13
Message-ID: 18063.1020372973@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

"Josh Berkus" <josh(at)agliodbs(dot)com> writes:
> Under those circumstances, is there a hard-and-fast rule for which
> query I should use? Most of the time, users will only look at one
> record at a time from the main data set.

> SELECT client_id, client_name,
> (SELECT count(*) FROM case_clients WHERE client_id = clients.client_id
> AND case_status = 'ACT') as active_count,
> (SELECT count(*) FROM case_clients WHERE client_id = clients.client_id
> AND case_status = 'STL') as settled_count,
> FROM clients;

> OR:

> SELECT client_id, client_name, active_count, settled_count,
> FROM clients
> LEFT OUTER JOIN (SELECT client_id, count(*) as active_count FROM
> case_clients WHERE case_status = 'ACT' GROUP BY client_id) ac ON
> ac.client_id = clients.client_id
> LEFT OUTER JOIN (SELECT client_id, count(*) as settled_count FROM
> case_clients WHERE case_status = 'STL' GROUP BY client_id) sc ON
> sc.client_id = clients.client_id;

The second one looks like a sure loser to me, because of the GROUP BYs.

If you were expecting queries to retrieve many different client_ids,
it *might* be better to use the second form. But I think for a small
number of client_ids the first will be quicker.

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Josh Berkus 2002-05-02 22:38:22 Re: Subselect performance question
Previous Message Manfred Koizar 2002-05-02 20:47:13 Re: Subselect performance question