Re: Subselect performance question

From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: Manfred Koizar <mkoi-pg(at)aon(dot)at>, pgsql-sql(at)postgresql(dot)org
Subject: Re: Subselect performance question
Date: 2002-05-02 22:38:22
Message-ID: web-1392296@davinci.ethosmedia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Manfred,

> I don't know, if it matters, but your queries are not equivalent. If
> for a client_id there is no row with case_status = 'ACT',
> active_count
> is ...

Thanks. Would have noticed that, and done a COALESCE.

> should match your first query and it doesn't use sub-selects at all.
> I haven't tested it though, so watch out for typos ...
> My personal hard-and-fast rule is "avoid sub-selects, use joins",
> but
> YMMV.

Yes, usually. There's other reasons in how I'm going to use the view,
though, that make me want to use a subselect.

Actually, somebody reminded me that if I have a fixed list of
categories, the fastest appoach performance-wise is to use a crosstab
identity martix. What I did is to create and populate a table as
follows:

case_status_crosstab
status_group act_count stl_count dis_count oth_count
---------------- --------- --------- --------- ---------
Closed\ Cases 0 0 0 0
Active\ Cases 1 0 0 0
Settled\ Cases 0 1 0 0
Dismissed\ Cases 0 0 1 0
Other\ Cases 0 0 0 1
Special\ Cases 0 0 0 1

Then I can count them as follows:

CREATE VIEW bv_clients AS
SELECT client_id, client_name, short_name, ljo.description as
office_location,
contact_name, phone_no, fax_no, status_label,
address_1, address_2, address_3, address_4, address_5, address_6,
city, state, zip_code, comments,
active_count, settled_count, dismissed_count, other_count
FROM clients JOIN status ON (clients.status = status.status AND
status.relation = 'clients')
LEFT OUTER JOIN text_list_values ljo ON (clients.jw_office =
ljo.list_value and ljo.list_name ILIKE 'office location')
LEFT OUTER JOIN (SELECT client_id, sum(act_count) as active_count,
sum(stl_count) as settled_count,
sum(dis_count) as dismissed_count, sum(oth_count) as other_count
FROM case_clients cc JOIN text_list_values tvl ON (cc.case_status =
tvl.list_value and list_name ilike 'case status')
JOIN case_status_crosstab cst ON tvl.rollup1 = cst.status_group
GROUP BY client_id) as c_count USING (client_id);

This actually runs pretty fast, once everything is indexed.

-Josh Berkus

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Joseph Barillari 2002-05-03 20:38:14 Odd behavior with timestamp/interval arithmetic
Previous Message Tom Lane 2002-05-02 20:56:13 Re: Subselect performance question