Re: Whats the most efficient query for this result?

From: "David Johnston" <polobo(at)yahoo(dot)com>
To: "'Nick'" <nboutelier(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Whats the most efficient query for this result?
Date: 2012-01-17 19:54:14
Message-ID: 002201ccd551$cc62dc40$652894c0$@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Nick
Sent: Tuesday, January 17, 2012 12:59 PM
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Whats the most efficient query for this result?

On Jan 17, 3:33 am, t(dot)(dot)(dot)(at)audioboundary(dot)com (Tom Molesworth) wrote:
> Hi Nick,
>
> On 17/01/12 00:18, Nick wrote:
>
> > I have three tables (users, books, pencils) and would like to get a
> > list of all users with a count and total price of their books and
> > pencils for 2012-01-01...
>
> > So with this data...
>
> > users (user_id)
> > 1
> > 2
> > 3
>
> > books (user_id, price, created)
> > 1 | $10 | 2012-01-01
> > 1 | $10 | 2012-01-01
> > 3 | $10 | 2012-01-01
>
> > pencils
> > 1 | $.50 | 2012-01-02
> > 3 | $.50 | 2012-01-01
> > 3 | $.50 | 2012-01-02
>
> > What is the most efficient way to get this result...
>
> > query_result (user_id, book_count, book_price_total, pencil_count,
> > pencil_price_total)
> > 1 | 2 | $20 | 0 | $0
> > 2 | 0 | $0 | 0 | $0
> > 3 | 1 | $10 | 1 | $.50
>
> Seems straightforward enough - left join the tables, group the result
> on user_id - so I'd write it as:
>
> select u.user_id, count(b.user_id) as "book_count",
> coalesce(sum(b.price), 0) as "book_price_total", count(p.user_id) as
> "pencil_count", coalesce(sum(b.price), 0) as "pencil_price_total"
> from users u
> left join books b on b.user_id = u.user_id and b.created = '2012-01-01'
> left join pencils p on p.user_id = u.user_id and p.created = '2012-01-01'
> group by u.user_id
> order by u.user_id
>
> If you need something more efficient, summary tables may help - hard
> to say without knowing more about the real data.
>
> Tom
>
> --
> Sent via pgsql-general mailing list (pgsql-gene(dot)(dot)(dot)(at)postgresql(dot)org) To
> make changes to your
> subscription:http://www.postgresql.org/mailpref/pgsql-general

I take that back, now im incorrectly not getting user 2's results

---------------------------------------------------------------------

Not Tested & Psuedo Code But...

SELECT user_id, COALESCE(books_agg.count,0), COALESCE(books_agg.sum,0.00),
... [same for pencils]
FROM user u
LEFT JOIN ( SELECT user_id, COUNT(*), SUM(price) FROM book GROUP BY user_id
) books_agg ON ( books_agg.user_id = u.user_id )
LEFT JOIN ( SELECT user_id, COUNT(*), SUM(price) FROM pencils GROUP BY
user_id ) pencils_agg USING ( pencils_agg.user_id = u.user_id )
-- NOTE: THERE IS NO GROUP BY IN THIS PART OF THE QUERY; only in the
sub-queries

Basically do all your INDEPENDENT calculations first then simply JOIN the
various results to each other while replacing missing JOINs with reasonable
default values.

David J.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jerry Richards 2012-01-17 20:10:35 Does Version 9.1 Streaming Replication Supports Multi-Master?
Previous Message Jeroen Van Dongen 2012-01-17 19:10:26 Re: Possible bug in PL/Python?