From: | "Zeugswetter Andreas SB SD" <ZeugswetterA(at)spardat(dot)at> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Index of a table is not used (in any case) |
Date: | 2001-10-25 14:24:25 |
Message-ID: | 46C15C39FEB2C44BA555E356FBCD6FA41EB3D9@m0114.s-mxs.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Tom Lane writes:
> "Zeugswetter Andreas SB SD" <ZeugswetterA(at)spardat(dot)at> writes:
> > Imho one of the biggest sources for problems is people creating new
> > indexes on populated tables when the rest of the db/table has badly
> > outdated statistics or even only default statistics in place.
> > In this situation the optimizer is badly misguided, because it now
> > sees completely inconsistent statistics to work on.
> > (e.g. old indexes on that table may seem way too cheap compared
> > to table scan)
>
> I don't think any of this is correct. We don't have per-index
> statistics. The only stats updated by CREATE INDEX are the same ones
> updated by plain VACUUM, viz the number-of-tuples and number-of-pages
> counts in pg_class.
1. Have I said anything about other stats, than relpages and reltuples ?
2. There is only limited use in the most accurate pg_statistics if
reltuples
and relpages is completely off. In the current behavior you eg get:
rel1: pages = 100000 -- updated from "create index"
index1 pages = 2 -- outdated
index2 pages = 2000 -- current
rel2: pages = 1 -- outdated
--> Optimizer will prefer join order: rel2, rel1
> I believe it's reasonable to update those stats
> more often than the pg_statistic stats (in fact, if we could keep them
> constantly up-to-date at a reasonable cost, we'd do so).
There is a whole lot of difference between keeping them constantly up to
date and modifying (part of) them in the "create index" command, so I do
not counter your above sentence, but imho the conclusion is wrong.
> The
> pg_statistic stats are designed as much as possible to be independent
> of the absolute number of rows in the table, so that it's okay if they
> are out of sync with the pg_class stats.
Independently, they can only be good for choosing whether to use an
index or seq scan. They are not sufficient to choose a good join order.
> The major reason why "you vacuumed but you never analyzed" is such a
> killer is that in the absence of any pg_statistic data, the default
> selectivity estimates are such that you may get either an index or seq
> scan depending on how big the table is. The cost estimates are
> nonlinear (correctly so, IMHO, though I wouldn't necessarily
> defend the
> exact shape of the curve) and ye olde default 0.01 will give you an
> indexscan for a small table but not for a big one. In 7.2 I have
> reduced the default selectivity estimate to 0.005, for a number of
> reasons but mostly to get it out of the range where the decision will
> flip-flop.
Yes, the new selectivity is better, imho even still too high.
Imho the strategy should be to assume a good selectivity
of values in absence of pg_statistics evidence.
If the index was not selective enough for an average query, the
dba should not have created the index in the first place.
> test71=# create table foo (f1 int);
> test71=# create index fooi on foo(f1);
> test71=# explain select * from foo where f1 = 42;
> Index Scan using fooi on foo (cost=0.00..8.14 rows=10 width=4)
> test71=# update pg_class set reltuples = 100000, relpages =
> 1000 where relname = 'foo';
> Index Scan using fooi on foo (cost=0.00..1399.04 rows=1000 width=4)
> test71=# update pg_class set reltuples = 1000000, relpages =
> 10000 where relname = 'foo';
> Seq Scan on foo (cost=0.00..22500.00 rows=10000 width=4)
> In current sources you keep getting an indexscan as you increase the
> number of tuples...
As you can see it toppeled at 10 Mio rows :-(
Andreas
From | Date | Subject | |
---|---|---|---|
Next Message | tweekie | 2001-10-25 14:46:58 | java virtual machine |
Previous Message | Hannu Krosing | 2001-10-25 14:13:58 | Re: timeout for "idle in transaction" |