difficulties with time based queries

From: "Rainer Mager" <rainer(at)vanten(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: difficulties with time based queries
Date: 2009-04-05 23:26:11
Message-ID: 003001c9b645$e8a7cec0$b9f76c40$@com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have a somewhat large table (more than 100 million rows) that contains log
data with start_time and end_time columns. When I try to do queries on this
table I always find them slower than what I need and what I believe should
be possible.

For example, I limited the following query to just a single day and it still
is much slower than what I would expect. In reality I need to do queries
that span a few weeks.

explain analyze select * from ad_log where date(start_time) <
date('2009-03-31') and date(start_time) >= date('2009-03-30');

Bitmap Heap Scan on ad_log (cost=73372.57..3699152.24 rows=2488252
width=32) (actual time=49792.862..64611.255 rows=2268490 loops=1)

Recheck Cond: ((date(start_time) < '2009-03-31'::date) AND
(date(start_time) >= '2009-03-30'::date))

-> Bitmap Index Scan on ad_log_date_all (cost=0.00..72750.51
rows=2488252 width=0) (actual time=49776.332..49776.332 rows=2268490
loops=1)

Index Cond: ((date(start_time) < '2009-03-31'::date) AND
(date(start_time) >= '2009-03-30'::date))

Total runtime: 65279.352 ms

The definition of the table is:

Column | Type |
Modifiers

------------+-----------------------------+---------------------------------
---------------------------

ad_log_id | integer | not null default
nextval('ad_log_ad_log_id_seq'::regclass)

channel | integer | not null

player | integer | not null

ad | integer | not null

start_time | timestamp without time zone |

end_time | timestamp without time zone |

Indexes:

"ad_log_pkey" PRIMARY KEY, btree (ad_log_id)

"ad_log_unique" UNIQUE, btree (channel, player, ad, start_time,
end_time)

"ad_log_ad" btree (ad)

"ad_log_ad_date" btree (ad, date(start_time))

"ad_log_channel" btree (channel)

"ad_log_channel_date" btree (channel, date(start_time))

"ad_log_date_all" btree (date(start_time), channel, player, ad)

"ad_log_player" btree (player)

"ad_log_player_date" btree (player, date(start_time))

Foreign-key constraints:

"ad_log_ad_fkey" FOREIGN KEY (ad) REFERENCES ads(id)

"ad_log_channel_fkey" FOREIGN KEY (channel) REFERENCES channels(id)

"ad_log_player_fkey" FOREIGN KEY (player) REFERENCES players_history(id)

Triggers:

rollup_ad_logs_daily AFTER INSERT ON ad_log FOR EACH ROW EXECUTE
PROCEDURE rollup_ad_logs_daily()

Any suggestions would be appreciated.

--Rainer

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Wilson 2009-04-05 23:46:34 Re: difficulties with time based queries
Previous Message Lists 2009-04-05 21:57:17 Re: Best replication solution?