From: Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: AW: AW: Call for alpha testing: planner statistics revisions
Date: 2001-06-18 15:41:31
Message-ID: 11C1E6749A55D411A9670001FA687963368330@sdexcsrv1.f000.d0188.sd.spardat.at
Lists: pgsql-hackers
> > 3. if at all, an automatic analyze should do the samples on small tables,
> > and accurate stats on large tables
>
> Other way 'round, surely? It already does that: if your table has fewer
> rows than the sampling target, they all get used.
I mean that it is probably not useful to maintain distribution statistics
at all for a table that small (e.g. <= 3000 rows and less than 512 kB in size).
So let me reword: do the sampling for medium-sized tables.
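To illustrate the thresholds I have in mind (only a sketch, with made-up
table and column names, and assuming the per-column statistics target
that the new code exposes via ALTER TABLE ... SET STATISTICS):

    -- tiny lookup table: distribution stats buy almost nothing
    ALTER TABLE tiny_lookup ALTER COLUMN code SET STATISTICS 1;
    -- medium-sized table: the default sampled statistics are fine
    ANALYZE medium_orders;
    -- huge table: raise the target so ANALYZE reads a larger sample
    -- and builds more accurate histograms
    ALTER TABLE huge_facts ALTER COLUMN customer_id SET STATISTICS 300;
    ANALYZE huge_facts;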
> > When on the other hand the optimizer does a "mistake" on a huge table
> > the difference is easily a matter of hours, thus you want accurate stats.
>
> Not if it takes hours to get the stats. I'm more interested in keeping
> ANALYZE cheap and encouraging DBAs to run it frequently, so that the
> stats stay up-to-date. It doesn't matter how perfect the stats were
> when they were made, if the table has changed since then.
That is true, but this is certainly a tradeoff situation. For a huge table
that is quite static, you would certainly want the most accurate statistics,
even if they take hours to compute once a month.
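In practice that could look like this (again only a sketch with made-up
names): run the cheap sampled ANALYZE frequently on the volatile tables,
and schedule the expensive accurate pass rarely on the static giants:

    -- nightly: cheap sampled stats for tables that change constantly
    ANALYZE volatile_orders;
    -- monthly: an expensive, accurate pass over the huge static table,
    -- e.g. with a high statistics target set as shown above
    ANALYZE huge_archive;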
My comments are based on practice, not theory :-) Of course, current
state-of-the-art optimizer implementations might lag well behind
state-of-the-art theory from ACM SIGMOD :-)
Andreas