From: | "Joel Burton" <joel(at)joelburton(dot)com> |
---|---|
To: | "Joel Burton" <joel(at)joelburton(dot)com>, "Laurette Cisneros" <laurette(at)nextbus(dot)com>, "Pgsql-Sql(at)Postgresql(dot) Org" <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: efficient query help |
Date: | 2002-04-24 14:47:12 |
Message-ID: | JGEPJNMCKODMDHGOBKDNEEAICLAA.joel@joelburton.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
P.S. It might be faster if your change the second half of the inner query to
"SELECT DISTINCT ON(id, f.flag)" and case the numbers to int2 (assuming your
data won't overflow this); this improves things about 20% with my dataset
(flags n=26, log n=896).
By my tests, this is about 4-5x faster than your original query.
Is there anything better than this?
Joel BURTON | joel(at)joelburton(dot)com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant
> -----Original Message-----
> From: Joel Burton [mailto:joel(at)joelburton(dot)com]
> Sent: Wednesday, April 24, 2002 10:06 AM
> To: Laurette Cisneros; Pgsql-Sql(at)Postgresql(dot) Org
> Subject: RE: [SQL] efficient query help
>
>
> Hmmm... Is this better? Please let me know; it looks better with
> a small set of sample data, but I'd be curious how it does with
> your real table:
>
> select id,
> flag,
> sum(count)
>
> from ( select id,
> flag,
> 1::int2 as count <- change here
> from log
> where flgtime='2002-04-16'
> and flag is not null <- change here
>
> union all
>
> select DISTINCT ON (id, f.flag) id, <- change here
> f.flag,
> 0::int2 as count <- change here
> from log l,
> flags f
>
> /* don't think this is neccessary, but might make it faster: where
flgtime <> '2002-04-16' and l.flag is not null */
>
> ) as t
>
> where flag is not null
>
> group by id, flag;
>
>
> [ for others: the point of the query is to count the number of
> unique id/flag combos on 4/16/2002, and union this with all
> possible combos of ids/flags that aren't present on non-4/16/2002 ]
>
>
> Joel BURTON | joel(at)joelburton(dot)com | joelburton.com | aim: wjoelburton
> Knowledge Management & Technology Consultant
>
> > -----Original Message-----
> > From: pgsql-sql-owner(at)postgresql(dot)org
> > [mailto:pgsql-sql-owner(at)postgresql(dot)org]On Behalf Of Laurette Cisneros
> > Sent: Monday, April 22, 2002 2:11 PM
> > To: pgsql-sql(at)postgresql(dot)org
> > Subject: [SQL] efficient query help
> >
> >
> >
> > Hi all,
> >
> > I have crafted the following query which returns the results I want but
> > runs extremely slow (I'm sure it's the "not in" part that does it) since
> > the tables can contain large amounts of data.
> >
> > Is there a better, more efficient way to build this query?
> >
> > It's important that I return zero counts for id flag combinations
> > that do not
> > have an entry in the log table hence the union with the "not in":
> >
> > id is a text field and flgtime is a timestamp. flags is a table
> > that contains
> > a single column with a row for each unique flag (text).
> >
> > select id, flag, count(*) as count
> > from log
> > where date(flgtime) = '2002-04-16'
> > and flag is not null
> > group by id, flag
> > union
> > select distinct l.id, f.flag, 0 as count
> > from log l, flags f
> > where (l.id, f.flag) not in
> > (select id, flag
> > from log
> > where date(flgtime) = '2002-04-16'
> > and fag is not null)
> > group by l.id, f.flag
> > ;
> >
> > Thanks for any suggestions.
> >
> > --
> > Laurette Cisneros
> > Database Roadie
> > (510) 420-3137
> > NextBus Information Systems, Inc.
> > www.nextbus.com
> > Where's my....bus?
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> >
From | Date | Subject | |
---|---|---|---|
Next Message | Ian Morgan | 2002-04-24 18:10:18 | Re: BLOB Feature Limits |
Previous Message | Joel Burton | 2002-04-24 14:06:16 | Re: efficient query help |