From: | The Hermit Hacker <scrappy(at)hub(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: [HACKERS] Costs: Index vs Non-Index |
Date: | 2000-01-08 18:58:54 |
Message-ID: | Pine.BSF.4.21.0001081457420.18498-100000@thelab.hub.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Okay, I had remembered to VACUUM, but I always forget to VACUUM ANALYZE :(
results come out much better now:
udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
NOTICE: QUERY PLAN:
Seq Scan on url (cost=3368.58 rows=12623 width=4)
EXPLAIN
udmsearch=> select (next_index_time) from url where next_index_time <= 947317073;
next_index_time
---------------
(0 rows)
udmsearch=> create index url_next_index_time on url using btree ( next_index_time);
CREATE
udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
NOTICE: QUERY PLAN:
Index Scan using url_next_index_time on url (cost=1364.10 rows=12623 width=4)
EXPLAIN
On Sat, 8 Jan 2000, Tom Lane wrote:
> The Hermit Hacker <scrappy(at)hub(dot)org> writes:
> > Hrmmm...if I'm reading this right, its more costly to create an index then
> > to leave it as a sequential scan, but it returns more rows? Yet, it
> > returns, if I do the query with a count() around the return value, 288
> > rows, not 334 or 1154...
>
> This doesn't have anything to do with index vs sequential scan, but it
> does have to do with whether you've done a VACUUM ANALYZE lately.
> You haven't ;-)
>
> > udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
> > NOTICE: QUERY PLAN:
> > Seq Scan on url (cost=43.00 rows=334 width=4)
>
> IIRC, rows=334 is the default estimate of result rows you will get for
> this query in the absence of any information whatever. (Default table
> size guess is 1000 rows, and default selectivity guess for <= is 1/3,
> so...) If you have not vacuumed, it's sheer coincidence that this is
> even within hailing distance of the correct figure of 288.
>
> > udmsearch=> create index url_next_index_time on url using btree ( next_index_time);
> > CREATE
> > udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
> > NOTICE: QUERY PLAN:
> > Index Scan using url_next_index_time on url (cost=271.68 rows=1154 width=4)
>
> I believe that a side-effect of CREATE INDEX is to update the
> number-of-pages-and-rows statistics in pg_class for the target table.
> So after you do that, the optimizer has a correct idea of the table's
> size, but still no more info about the selectivity of the WHERE clause.
> (I infer that your table has size 1154*3 rows.) If you now drop the
> index and repeat EXPLAIN, it'll go back to a seq scan, but it will now
> say 1154 rows --- and the cost estimate will be higher, too.
>
> If you do VACUUM ANALYZE, then the optimizer will also know the min and
> max values of next_index_time, and will have some shot at making a
> correct estimate of the output row count. I'd be interested to know
> what it predicts then...
>
> regards, tom lane
>
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org
From | Date | Subject | |
---|---|---|---|
Next Message | The Hermit Hacker | 2000-01-08 19:05:12 | Re: [HACKERS] Another index "buglet"? |
Previous Message | Ed Loehr | 2000-01-08 18:29:40 | Re: [HACKERS] Re: ERROR: out of free buffers: time to abort ! |