Re: Two optimization questions

From: "Aaron Bono" <postgresql(at)aranya(dot)com>
To: Mezei Zoltán <mezei(dot)zoltan(at)telefor(dot)hu>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Two optimization questions
Date: 2006-09-12 18:46:30
Message-ID: bf05e51c0609121146m52f7899br8dc12140fdbdd05@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On 9/12/06, Mezei Zoltán <mezei(dot)zoltan(at)telefor(dot)hu> wrote:
>
> Hi,
>
> I think it can be done better than I did and I want to learn...
>
>
> 1. I have a table that registers the history of messages:
> output_message_history(id, event_type, event_time)
> I need those ID-s from the table where there is one 'MESSAGE SENT' event
> and one 'MESSAGE SUBMITTED' event and there are no more events on that
> message.
>
> select id
> from output_message_history
> group by content_id
> having
> sum(case when event_type='MESSAGE SENT' then 1 else 0 end) = 1
> and
> sum(case when event_type='MESSAGE SUBMITTED' then 1 else 0 end) = 1
> and count(*) = 2
>
> Can it be done another, more optimal way?

You could split it into sub-queries but would that make the performance
better or worse? I guess it depends on how much data is there, and what
frequency you have ot the event_type's but indexing the event_type column
would help. This may be worth a try - use EXPLAIN to see which is better.

SELECT id
FROM output_message_history
WHERE NOT content_id IN (
SELECT content_id -- distinct(content_id)
FROM output_message_history
WHERE NOT event_type IN ('MESSAGE SENT', 'MESSAGE SUBMITTED')
)
GROUP BY content_id
HAVING
sum(case when event_type='MESSAGE SENT' then 1 else 0 end) = 1
AND
sum(case when event_type='MESSAGE SUBMITTED' then 1 else 0 end) = 1

2. I have to create reports like this one:
>
> time count
> 12:00 3
> 12:01 2
> 12:02 4
> 12:03 0 <--- now this one is problematic for me
> 12:04 5
>
> So I need something like this:
>
> select date_trunc('minute', crd), count(*) from subscriber
> where crd between '2006-09-08' and '2006-09-12'
> group by date_trunc('minute', crd)
>
> But the output of this query won't show minutes with 0 count. I searched
> the archives and found an example using a view to solve this problem,
> but creating such a view for grouping by minutes or seconds doesn't seem
> to be nice solution. Any ideas how to solve this problem?

Questions like this come up frequently and there are some nice solutions:

See if this gives you some insight:

select
'2006-01-15'::date + s.inc
from
generate_series(0, ('2006-02-20'::date - '2006-01-15'::date)::integer)
as s(inc)

See http://www.postgresql.org/docs/8.1/static/functions-srf.html and
http://www.postgresql.org/docs/8.0/interactive/functions-datetime.html for
more.

==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
http://codeelixir.com
==================================================================

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Mezei Zoltán 2006-09-12 19:06:03 Re: Two optimization questions
Previous Message Volkan YAZICI 2006-09-12 18:24:39 Re: Sorting items in aggregate function