Re: 2 simple SQL questions: optimizing aggegate query

From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: Alex Rice <alrice(at)ARCplanning(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: 2 simple SQL questions: optimizing aggegate query
Date: 2003-05-06 01:28:18
Message-ID: 20030506012818.GA8384@wolff.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Mon, May 05, 2003 at 14:03:18 -0600,
Alex Rice <alrice(at)ARCplanning(dot)com> wrote:
>
> On Monday, May 5, 2003, at 01:45 PM, Bruno Wolff III wrote:
>
> >>SELECT url.rec_id, url, title, content_type, txt,
> >>sum(
> >> case
> >> when dict.word = 'wordx' then 1
> >> when dict.word = 'wordx' then 1
> >> when dict.word = 'wordy' then 1
> >> else 0
> >> end
> >>) as rank
> >>FROM dict, url
> >>WHERE url.rec_id = dict.url_id
> >>GROUP BY rec_id, url, title, content_type, txt
> >>ORDER BY rank DESC
> >
> >Is there some reason you can't put the word list in the where clause?
> >(You can use coalesce to set NULL sums to 0.)
>
> Can you give an example of what you mean use coalesce and how to use it
> to produce a count of the matched words, the rank?

You don't really need coalesce. I mentioned that because if you take
the sum of zero rows (the case where no words match for one of the returned
rows) then you get NULL instead of 0 (which may or may not be a problem
for you).

> I am converting some existing query examples that use the form
> SUM( dict.word IN ( word list...))
> which doesn't work with pgsql.

You want to do this outside the sum, not inside (and change sum to
count). For example:
select count(dict.word) from dict where word in (word list ...);
(If word list is a select rather than a list of constants, you currently
want to use exists instead for performance reasons. In 7.4 that won't
be a problem.)

By putting things in the where close you don't have to examine all of
the rows for each word - url pair. This may cut out a lot of rows that
need to be examined if there is an index on dict.word and the number of
words in a url is large relative to the number being searched on.

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Rajesh Kumar Mallah 2003-05-06 06:37:54 Re: mnogosearch examples
Previous Message Jeff Eckermann 2003-05-05 22:39:11 Re: Replication for a large database