From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Tomas Vondra <tv(at)fuzzy(dot)cz>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: adaptive ndistinct estimator v3 (WAS: Re: [PERFORM] Yet another abort-early plan disaster on 9.3) |
Date: | 2014-12-23 10:28:42 |
Message-ID: | 549943DA.4090603@vmware.com |
Lists: | pgsql-hackers pgsql-performance |
On 12/07/2014 03:54 AM, Tomas Vondra wrote:
> The one interesting case is the 'step skew' with statistics_target=10,
> i.e. estimates based on a mere 3000 rows. In that case, the adaptive
> estimator significantly overestimates:
>
>   values   current   adaptive
>   ------------------------------
>      106        99        107
>      106         8    6449190
>     1006        38    6449190
>    10006       327      42441
>
> I don't know why I didn't get these errors in the previous runs, because
> when I repeat the tests with the old patches I get similar results with
> a 'good' result from time to time. Apparently I had a lucky day back
> then :-/
>
> I've been messing with the code for a few hours, and I haven't found any
> significant error in the implementation, so it seems that the estimator
> does not perform terribly well for very small samples (in this case it's
> 3000 rows out of 10.000.000, i.e. ~0.03%).
The paper [1] gives an equation for an upper bound of the error of this
GEE estimator. How do the above numbers compare with that bound?
[1]
http://ftp.cse.buffalo.edu/users/azhang/disc/disc01/cd1/out/papers/pods/towardsestimatimosur.pdf
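As a rough way to frame that comparison, here is a minimal Python sketch. It assumes the GEE form from the paper, sqrt(n/r) * f1 + sum of f_j for j >= 2, and uses sqrt(n/r) only as an order-of-magnitude yardstick for the ratio error; the exact constants in the paper's bound are not reproduced here. It also assumes the "values" column in the quoted table is the true number of distinct values. The function names are made up for illustration and are not from the patch.

```python
import math

def gee_estimate(n_rows, sample_size, freq_counts):
    """GEE as assumed here: sqrt(n/r) * f1 + sum_{j>=2} f_j.

    freq_counts maps j -> number of values seen exactly j times in the sample.
    """
    f1 = freq_counts.get(1, 0)
    seen_repeatedly = sum(c for j, c in freq_counts.items() if j >= 2)
    return math.sqrt(n_rows / sample_size) * f1 + seen_repeatedly

def ratio_error(estimate, actual):
    """Ratio error: max(estimate/actual, actual/estimate), always >= 1."""
    return max(estimate / actual, actual / estimate)

# Sample size and table size from the quoted message.
n_rows, sample_size = 10_000_000, 3000
scale = math.sqrt(n_rows / sample_size)  # yardstick only, constants omitted

# (values, current, adaptive) rows copied from the quoted table;
# "values" is assumed to be the actual ndistinct.
rows = [
    (106, 99, 107),
    (106, 8, 6449190),
    (1006, 38, 6449190),
    (10006, 327, 42441),
]

print(f"sqrt(n/r) = {scale:.1f}")
for actual, current, adaptive in rows:
    print(
        f"actual={actual:6d}  "
        f"current ratio error={ratio_error(current, actual):10.1f}  "
        f"adaptive ratio error={ratio_error(adaptive, actual):10.1f}"
    )
```

Any row whose ratio error is far above sqrt(n/r) would be worth a closer look, keeping in mind that the bound in the paper is stated for GEE and may not carry over unchanged to the adaptive estimator.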
- Heikki