Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Well, the problem Josh has got is exactly that a constant high
> bound doesn't work.
I thought the problem was that the high bound in the statistics fell
too far below the actual high end in the data. This tends (in my
experience) to be much more painful than an artificially extended
high end in the statistics. (YMMV, of course.)
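To make that asymmetry concrete, here is a toy sketch of a histogram-based estimate for "value > x". It is not PostgreSQL's actual estimator: the function name estimate_fraction_above, the equal-share bucket model, and all the numbers are assumptions chosen only for illustration. It compares a stale high bound against an artificially extended one on a column that has kept growing since statistics were gathered.

```python
from bisect import bisect_right


def estimate_fraction_above(bounds, x):
    """Estimate the fraction of rows with value > x from histogram bounds.

    Toy model (an assumption, not PostgreSQL's code): `bounds` is a sorted
    list of bucket boundaries and every bucket holds an equal share of rows.
    """
    if x < bounds[0]:
        return 1.0
    if x >= bounds[-1]:
        # Nothing is believed to exist beyond the recorded high bound.
        return 0.0
    buckets = len(bounds) - 1
    i = bisect_right(bounds, x) - 1        # bucket that contains x
    lo, hi = bounds[i], bounds[i + 1]
    within = (x - lo) / (hi - lo)          # linear interpolation inside bucket
    return (buckets - i - within) / buckets


# Hypothetical statistics gathered when the column held values 0..999.
stale_bounds = [0, 250, 500, 750, 1000]

# Same statistics, but with the high end artificially pushed out.
extended_bounds = [0, 250, 500, 750, 4000]

# Since then, values 1000..1999 have been appended (an ever-increasing key).
actual_values = range(0, 2000)

x = 1500
estimated_stale = estimate_fraction_above(stale_bounds, x)
estimated_extended = estimate_fraction_above(extended_bounds, x)
actual = sum(1 for v in actual_values if v > x) / len(actual_values)

print("stale high bound:   ", estimated_stale)
print("extended high bound:", estimated_extended)
print("actual fraction:    ", actual)
```

In this made-up case the stale bound yields an estimate of zero matching rows while roughly a quarter of the table actually qualifies; the extended bound is off too, but only by a modest factor. How closely a real planner follows this pattern depends on details the toy model leaves out.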
> What I'm wondering about is why he finds that re-running ANALYZE
> isn't an acceptable solution. It's supposed to be a reasonably
> cheap thing to do.
Good point. We haven't hit this problem in PostgreSQL precisely
because we can run ANALYZE often enough to prevent the skew from
becoming pathological.
> I think the cleanest solution to this would be to make ANALYZE
> cheaper, perhaps by finding some way for it to work incrementally.
Yeah, though as you say above, it'd be good to know why frequent
ANALYZE is a problem as it stands.
-Kevin