Re: Slow query and indexes...

From: Jim Nasby <decibel(at)decibel(dot)org>
To: Jonas Henriksen <jonas(dot)f(dot)henriksen(at)gmail(dot)com>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: Slow query and indexes...
Date: 2007-05-08 15:42:24
Message-ID: D21FC8AF-1C6F-47FB-B44C-76978B612C26@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

There are other ways to influence the selection of a seqscan, notably
effective_cache_size and random_page_cost.

First, you need to find out at what point a seqscan is actually
faster than an index scan. That's going to be a trial and error
search, but eventually if you're going back far enough in time the
seqscan will be faster. EXPLAIN ANALYZE has it's own overhead, so a
better way to test this is with psql's timing command, and wrap the
query into a count so you're not shoving a bunch of data across to psql:

SELECT count(*) FROM (... your query goes here ...) a;

(SELECT 1 might work too and would be more accurate)

Once you've found the break even point, you can tweak all the cost
estimates. Start by making sure that effective_cache_size is set
approximately to how much memory you have. Increasing that will favor
an index scan. Decreasing random_page_cost will also favor an index
scan, though I'd try not to go below 2 and definitely not below 1.
You can also tweak the CPU cost estimates (lower numbers will favor
indexes). But keep in mind that doing that at a system level will
impact every query running in the system. You may have no choice but
to explicitly set custom parameters for just this statement. SET
LOCAL and wrapping the SELECT in a transaction is a less painful way
to do that.

On May 7, 2007, at 10:47 AM, Jonas Henriksen wrote:

> Well thanks, but that don't help me much.
>
> I've tried setting an extra condition using datetime>(now() - '14
> weeks'::interval)
>
> explain analyze
> SELECT max(date_time) FROM data_values
> where date_time > (now() - '14 weeks'::interval)
> GROUP BY data_logger_id;
>
> HashAggregate (cost=23264.52..23264.55 rows=2 width=12) (actual
> time=1691.447..1691.454 rows=3 loops=1)
> -> Bitmap Heap Scan on data_values (cost=7922.08..21787.31
> rows=295442 width=12) (actual time=320.643..951.043 rows=298589
> loops=1)
> Recheck Cond: (date_time > (now() - '98 days'::interval))
> -> Bitmap Index Scan on data_values_data_date_time_index
> (cost=0.00..7848.22 rows=295442 width=0) (actual time=319.708..319.708
> rows=298589 loops=1)
> Index Cond: (date_time > (now() - '98 days'::interval))
> Total runtime: 1691.598 ms
>
> However, when I switch to using datetime>(now() - '15
> weeks'::interval) I get:
> explain analyze
> SELECT max(date_time) FROM data_values
> where date_time > (now() - '15 weeks'::interval)
> GROUP BY data_logger_id;
>
> HashAggregate (cost=23798.26..23798.28 rows=2 width=12) (actual
> time=3237.816..3237.823 rows=3 loops=1)
> -> Seq Scan on data_values (cost=0.00..22084.62 rows=342728
> width=12) (actual time=0.037..2409.234 rows=344111 loops=1)
> Filter: (date_time > (now() - '105 days'::interval))
> Total runtime: 3237.944 ms
>
> Doing "SET enable_seqscan=off" speeds up the query and forces the use
> of the index, but I dont really love that solution...
>
>
> regards Jonas:))
>
>
>
>
> On 5/7/07, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> Am Montag, 7. Mai 2007 15:53 schrieb Jonas Henriksen:
>> > while if I add a GROUP BY data_logger the query uses a seq scan
>> and a
>> >
>> > lot of time:
>> > >> explain analyze SELECT max(date_time) FROM data_values GROUP BY
>> > data_logger_id;
>>
>> I don't think there is anything you can do about this.
>>
>> --
>> Peter Eisentraut
>> http://developer.postgresql.org/~petere/
>>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>

--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Simon Riggs 2007-05-08 16:01:30 Re: Continuous Archiving for Multiple Warm StandbyServers
Previous Message Jim Nasby 2007-05-08 15:35:33 Re: Slow query and indexes...