| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
| Cc: | pgsql-general(at)postgresql(dot)org |
| Subject: | Re: strange row count estimates with conditions on multiple column |
| Date: | 2010-11-17 05:58:39 |
| Message-ID: | 16724.1289973519@sss.pgh.pa.us |
| Lists: | pgsql-general |
Tomas Vondra <tv(at)fuzzy(dot)cz> writes:
> Yes, I understand why MCV is not used in case of col_b, and I do
> understand that the estimate may not be precise. But I'm wondering
> what's a better estimate in such cases - 1, 5000, any constant, or
> something related to the histogram?
The estimate is computed from the histogram. I think the logic is
actually quite good for cases where the data granularity is small
compared to the histogram bucket width. For cases like the one we have
here, the assumption of a continuous distribution fails rather badly,
but it's pretty hard to see how to improve it without inserting a lot
of type-specific assumptions.
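
To illustrate the point about granularity, here is a minimal sketch of a histogram-based range estimate that interpolates linearly inside a bucket. It is not the planner's code: the function names, the equal-fraction bucket layout, and the sample bounds are assumptions made up for this example.

```python
from bisect import bisect_right


def histogram_lt_selectivity(bounds, value):
    """Estimate the fraction of rows with col < value.

    bounds is a sorted list of bucket boundaries. Each of the
    len(bounds) - 1 buckets is assumed to hold the same fraction of rows,
    and values are assumed to be spread continuously inside a bucket.
    """
    if value <= bounds[0]:
        return 0.0
    if value >= bounds[-1]:
        return 1.0
    nbuckets = len(bounds) - 1
    i = bisect_right(bounds, value) - 1      # bucket that contains value
    lo, hi = bounds[i], bounds[i + 1]
    frac = (value - lo) / (hi - lo)          # linear interpolation in bucket
    return (i + frac) / nbuckets


def range_selectivity(bounds, low, high):
    """Estimate the fraction of rows with low < col < high."""
    sel = histogram_lt_selectivity(bounds, high) - histogram_lt_selectivity(bounds, low)
    return max(sel, 0.0)


if __name__ == "__main__":
    # Hypothetical histogram: 10 buckets, each 1000 wide.
    bounds = list(range(0, 10001, 1000))

    # Fine-grained data (say, every integer occurs): about 1% of the rows
    # really do fall in this range, so interpolation is close.
    print(range_selectivity(bounds, 2500, 2600))

    # Coarse data (say, only multiples of 1000 occur): no row falls in this
    # range, yet the continuous assumption still yields a sizable fraction.
    print(range_selectivity(bounds, 2100, 2900))
```

With coarse data the second query matches no rows at all, but the interpolation still reports roughly 8% of the table, which is the kind of failure described above.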
> BTW I think the default estimate used to be 1000, so it was changed in
> one of the 8.x releases? Can you point me to the docs? I've even tried
> to find that in the sources, but unsuccessfully.
It's DEFAULT_RANGE_INEQ_SEL, and AFAIR it hasn't changed in quite a while.
But I wouldn't be surprised if the behavior of this example changed when
we boosted the default statistics target.
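
For reference, a small sketch of how a fallback selectivity turns into a row estimate. The value 0.005 is what I recall DEFAULT_RANGE_INEQ_SEL being defined as in selfuncs.h, so treat it as an assumption and check your source tree. The helper name, the rounding, and the table sizes are hypothetical.

```python
# Assumed value of DEFAULT_RANGE_INEQ_SEL; verify against selfuncs.h.
DEFAULT_RANGE_INEQ_SEL = 0.005


def default_range_rows(reltuples, selectivity=DEFAULT_RANGE_INEQ_SEL):
    """Row estimate when no usable statistics exist for a range condition.

    The estimate is the table's row count times the fallback selectivity,
    rounded and kept at 1 or more (a simplification for this sketch).
    """
    return max(1, round(reltuples * selectivity))


if __name__ == "__main__":
    # Hypothetical table sizes, not taken from the thread.
    for reltuples in (100, 200_000, 1_000_000):
        print(reltuples, default_range_rows(reltuples))
```

Under that assumption the fallback estimate scales with table size, so a table of one million rows would give 5000 and a table of 200,000 rows would give 1000.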
regards, tom lane