Re: efficient query help

From: "Joel Burton" <joel(at)joelburton(dot)com>
To: "Joel Burton" <joel(at)joelburton(dot)com>, "Laurette Cisneros" <laurette(at)nextbus(dot)com>, "Pgsql-Sql(at)Postgresql(dot) Org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: efficient query help
Date: 2002-04-24 14:47:12
Message-ID: JGEPJNMCKODMDHGOBKDNEEAICLAA.joel@joelburton.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

P.S. It might be faster if your change the second half of the inner query to
"SELECT DISTINCT ON(id, f.flag)" and case the numbers to int2 (assuming your
data won't overflow this); this improves things about 20% with my dataset
(flags n=26, log n=896).

By my tests, this is about 4-5x faster than your original query.

Is there anything better than this?

Joel BURTON | joel(at)joelburton(dot)com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

> -----Original Message-----
> From: Joel Burton [mailto:joel(at)joelburton(dot)com]
> Sent: Wednesday, April 24, 2002 10:06 AM
> To: Laurette Cisneros; Pgsql-Sql(at)Postgresql(dot) Org
> Subject: RE: [SQL] efficient query help
>
>
> Hmmm... Is this better? Please let me know; it looks better with
> a small set of sample data, but I'd be curious how it does with
> your real table:
>
> select id,
> flag,
> sum(count)
>
> from ( select id,
> flag,
> 1::int2 as count <- change here
> from log
> where flgtime='2002-04-16'
> and flag is not null <- change here
>
> union all
>
> select DISTINCT ON (id, f.flag) id, <- change here
> f.flag,
> 0::int2 as count <- change here
> from log l,
> flags f
>
> /* don't think this is neccessary, but might make it faster: where
flgtime <> '2002-04-16' and l.flag is not null */
>
> ) as t
>
> where flag is not null
>
> group by id, flag;
>
>
> [ for others: the point of the query is to count the number of
> unique id/flag combos on 4/16/2002, and union this with all
> possible combos of ids/flags that aren't present on non-4/16/2002 ]
>
>
> Joel BURTON | joel(at)joelburton(dot)com | joelburton.com | aim: wjoelburton
> Knowledge Management & Technology Consultant
>
> > -----Original Message-----
> > From: pgsql-sql-owner(at)postgresql(dot)org
> > [mailto:pgsql-sql-owner(at)postgresql(dot)org]On Behalf Of Laurette Cisneros
> > Sent: Monday, April 22, 2002 2:11 PM
> > To: pgsql-sql(at)postgresql(dot)org
> > Subject: [SQL] efficient query help
> >
> >
> >
> > Hi all,
> >
> > I have crafted the following query which returns the results I want but
> > runs extremely slow (I'm sure it's the "not in" part that does it) since
> > the tables can contain large amounts of data.
> >
> > Is there a better, more efficient way to build this query?
> >
> > It's important that I return zero counts for id flag combinations
> > that do not
> > have an entry in the log table hence the union with the "not in":
> >
> > id is a text field and flgtime is a timestamp. flags is a table
> > that contains
> > a single column with a row for each unique flag (text).
> >
> > select id, flag, count(*) as count
> > from log
> > where date(flgtime) = '2002-04-16'
> > and flag is not null
> > group by id, flag
> > union
> > select distinct l.id, f.flag, 0 as count
> > from log l, flags f
> > where (l.id, f.flag) not in
> > (select id, flag
> > from log
> > where date(flgtime) = '2002-04-16'
> > and fag is not null)
> > group by l.id, f.flag
> > ;
> >
> > Thanks for any suggestions.
> >
> > --
> > Laurette Cisneros
> > Database Roadie
> > (510) 420-3137
> > NextBus Information Systems, Inc.
> > www.nextbus.com
> > Where's my....bus?
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> >

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Ian Morgan 2002-04-24 18:10:18 Re: BLOB Feature Limits
Previous Message Joel Burton 2002-04-24 14:06:16 Re: efficient query help