AW: AW: Call for alpha testing: planner statistics revisions

From: Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: AW: AW: Call for alpha testing: planner statistics revisions
Date: 2001-06-18 15:41:31
Message-ID: 11C1E6749A55D411A9670001FA687963368330@sdexcsrv1.f000.d0188.sd.spardat.at
Lists: pgsql-hackers


> > 3. if at all, an automatic analyze should do the samples on small tables,
> > and accurate stats on large tables
>
> Other way 'round, surely? It already does that: if your table has fewer
> rows than the sampling target, they all get used.

I mean that it is probably not useful to maintain distribution statistics
at all for a table that small (e.g. <= 3000 rows and less than 512 kB in size).
So let me reword: do the sampling for medium-sized tables.
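
For illustration, with the per-column statistics target that this revision
exposes, one could simply switch distribution stats off for such a tiny
table. This is only a sketch: the table and column names are made up, and
it assumes the ALTER TABLE ... SET STATISTICS syntax where a target of 0
means "collect no statistics" for that column:

    -- hypothetical small lookup table: skip distribution stats entirely
    ALTER TABLE tiny_lookup ALTER COLUMN code SET STATISTICS 0;
    ANALYZE tiny_lookup;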

> > When on the other hand the optimizer does a "mistake" on a huge table
> > the difference is easily a matter of hours, thus you want accurate stats.
>
> Not if it takes hours to get the stats. I'm more interested in keeping
> ANALYZE cheap and encouraging DBAs to run it frequently, so that the
> stats stay up-to-date. It doesn't matter how perfect the stats were
> when they were made, if the table has changed since then.

That is true, but it is certainly a tradeoff. For a huge table that is
quite static, you would certainly want the most accurate statistics, even
if they take hours to compute once a month.
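
Conversely, for the huge-but-static case one could raise the target on the
critical columns and pay the ANALYZE cost only on the monthly run. Again a
sketch with hypothetical names (big_facts, order_date), assuming ANALYZE
accepts an optional column list:

    -- large, mostly static table: sample much more, but only once a month
    ALTER TABLE big_facts ALTER COLUMN order_date SET STATISTICS 1000;
    ANALYZE big_facts (order_date);

That would keep the day-to-day ANALYZE of the volatile tables cheap, which
is Tom's point, while still buying precision where a planner mistake costs
hours.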

My comments are based on practice, not theory :-) Of course, current
state-of-the-art optimizer implementations may well lag behind the
state-of-the-art theory from ACM SIGMOD :-)

Andreas
