From: | Nathan Boley <npboley(at)gmail(dot)com> |
---|---|
To: | Greg Stark <gsstark(at)mit(dot)edu> |
Cc: | Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: default_statistics_target WAS: max_wal_senders must die |
Date: | 2010-10-21 01:53:13 |
Message-ID: | AANLkTikVAH5dM-CD0YJacCXuEg7Tz9eCUs0V0-eN9DvO@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> Robert explained why having more MCVs might be useful because we use
> the frequency of the least common MCV as an upper bound on the
> frequency of any value in the MCV.
Where is that being used? The only non-MCV frequency estimate that I
recall seeing is ( nrows - n_ndistinct_rows )/ndistinct. Obviously
changing the number of mcv's affects this by lowering
n_ndistinct_rows, but it's always pretty coarse estimate.
> Binding the length of the MCV list to the size of the histogram is
> arbitrary but so would any other value
Wouldn't the best approach be to stop adding MCV's/histogram buckets
when adding new ones doesn't decrease your prediction error
'substantially'?
One very hacky threshold heuristic is to stop adding MCV's when a
simple equality select ( SELECT col FROM table WHERE col == VALUE )
would switch the plan from an index to a sequential scan ( or vice
versa, although with the current code this would never happen ). ie,
if the non_mcv frequency estimate is 0.1% ( producing an index scan ),
but adding the MCV gives us an estimate of 5% ( pbly producing a seq
scan ) then add that value as an mcv. More sophisticated variations
might also consider plan changes to very suboptimal joins; even more
sophisticated would be to stop when the MAX( curr - optimal plan /
optimal plan ) was below some threshold, say 20%, over a bunch of
recently executed queries.
A similar approach would work for histogram bins, except the queries
of interest are inequality rather than equality selections.
-Nathan
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2010-10-21 01:54:19 | Re: max_wal_senders must die |
Previous Message | Robert Haas | 2010-10-21 01:49:23 | Re: default_statistics_target WAS: max_wal_senders must die |