Re: Query not using index, please explain.

From: Matthew Hagerty <mhagerty(at)voyager(dot)net>
To: Richard Poole <richard(dot)poole(at)vi(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Query not using index, please explain.
Date: 2001-03-08 19:43:54
Message-ID: 5.0.2.1.2.20010308143430.02c6b048@pop.voyager.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Richard,

Thanks for the response, I guess I should have included a little more
information. The table contains 3.5 million rows. The indexes were
created after the data was imported into the table and I had just run
vacuum and vacuum analyze on the database before trying the queries and
sending this question to hackers.

When I turned the seqscan variable off and ran the query with the
'04-01-2000' date the results were literally instantaneous. Turn the
seqscan back on and it takes right around 3 minutes. Also, the query for
any date older than the '04-01-2000' returns zero rows. The actual number
of rows for the '04-01-2000' select is right around 8300.

Here is the table for more information:

pglog=# \d history_entries
Table "history_entries"
Attribute | Type | Modifier
------------+-------------+----------
domain | varchar(80) |
time_stamp | timestamp |
response | integer |
transfered | integer |
reqtime | integer |
entry | text |
Indices: hist_entries_domain,
hist_entries_timestamp

I'm also having problems with this query:

select domain from history_entries group by domain;

To me, since there is an index on domain, it seems like this should be a
rather fast thing to do? It takes a *very* long time, no matter if I turn
seqscan on or off.

pglog=# select version();
version
-------------------------------------------------------------------------
PostgreSQL 7.0.3 on i386-unknown-freebsdelf3.4, compiled by gcc 2.7.2.3
(1 row)

Thanks,
Matthew

At 07:18 PM 3/8/2001 +0000, you wrote:
>On Thu, Mar 08, 2001 at 01:49:42PM -0500, Matthew Hagerty wrote:
> > Greetings,
> >
> > I have a real simple table with a timestamp field. The timestamp field
> has
> > an index on it. But, the index does not seem to be taken into account for
> > selects that return rows:
> >
> > pglog=# explain select time_stamp from history_entries where time_stamp <
> > '03-01-2000';
> > NOTICE: QUERY PLAN:
> >
> > Index Scan using hist_entries_timestamp on
> > history_entries (cost=0.00..12810.36 rows=3246 width=8)
> >
> > EXPLAIN
> > pglog=# explain select time_stamp from history_entries where time_stamp <
> > '04-01-2000';
> > NOTICE: QUERY PLAN:
> >
> > Seq Scan on history_entries (cost=0.00..160289.71 rows=138215 width=8)
> >
> > EXPLAIN
> > pglog=# set enable_seqscan to off;
> > SET VARIABLE
> > pglog=# explain select time_stamp from history_entries where time_stamp <
> > '04-01-2000';
> > NOTICE: QUERY PLAN:
> >
> > Index Scan using hist_entries_timestamp on
> > history_entries (cost=0.00..368241.51 rows=138215 width=8)
> >
> > EXPLAIN
> > pglog=# set enable_seqscan to on;
> > SET VARIABLE
> > pglog=#
> >
> > The query where the time_stamp < '03-01-2000' does not return any rows,
> the
> > 04-01-2000 date does return rows. When I disable seqscan the query is
> > almost instant, but with it on, it takes about 3 or 4 minutes. Why can't
> > the query planner use the index in the later case?
>
>Well, it can, it just chooses not to. Your second EXPLAIN shows that
>it thinks it's going to get 138215 rows from that select; it then
>calculates that it would be more expensive to use the index than simply
>to scan the table. Presumably it actually returns many fewer rows than
>that. Have you done a VACUUM ANALYZE recently? If you get plans this
>badly wrong immediately after a VACUUM ANALYZE, *then*'s the time to
>ask -hackers about it (FAQ item 4.9).
>
>Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2001-03-08 19:45:48 Re: Checkpoint process signal handling seems wrong
Previous Message Mikheev, Vadim 2001-03-08 19:34:47 RE: Proposed WAL changes