From: | Mischa Sandberg <mischa(dot)sandberg(at)telus(dot)net> |
---|---|
To: | Markus Schaber <schabi(at)logix-tt(dot)com> |
Cc: | pgsql-perform <pgsql-performance(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [PERFORM] Bad n_distinct estimation; hacks suggested? |
Date: | 2005-05-03 21:33:10 |
Message-ID: | 1115155990.4277ee16aba34@webmail.telus.net |
Lists: | pgsql-hackers pgsql-performance |
Quoting Markus Schaber <schabi(at)logix-tt(dot)com>:
> Hi, Josh,
>
> Josh Berkus wrote:
>
> > Yes, actually. We need 3 different estimation methods:
> > 1 for tables where we can sample a large % of pages (say, >= 0.1)
> > 1 for tables where we sample a small % of pages but are "easily estimated"
> > 1 for tables which are not easily estimated but we can't afford to sample a
> > large % of pages.
> >
> > If we're doing sampling-based estimation, I really don't want people to lose
> > sight of the fact that page-based random sampling is much less expensive than
> > row-based random sampling. We should really be focusing on methods which
> > are page-based.
Okay, although given the track record of page-based sampling for
n-distinct, it's a bit like looking for your keys under the streetlight,
rather than in the alley where you dropped them :-)
How about applying the distinct-sampling filter on a small extra data
stream to the stats collector?
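
To make the suggestion concrete, here is a rough sketch of what such a filter could look like. It assumes "distinct-sampling" means a hash-based, bounded-memory distinct sample whose sampling rate halves whenever the sample overflows. The class name DistinctSampler, the max_size parameter, and the use of SHA-1 are all illustrative choices, not anything taken from PostgreSQL or its stats collector.

```python
import hashlib


class DistinctSampler:
    """Bounded-memory n_distinct estimator (illustrative sketch only)."""

    def __init__(self, max_size=1024):
        self.max_size = max_size  # hypothetical cap on retained hashes
        self.level = 0            # a hash is kept if its low `level` bits are zero
        self.sample = set()       # hashes of the distinct values currently kept

    @staticmethod
    def _hash(value):
        # Stable 64-bit hash of the value's repr; any uniform hash would do.
        digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def _keep(self, h):
        # True for roughly 1 in 2**level of all distinct values.
        return (h & ((1 << self.level) - 1)) == 0

    def add(self, value):
        h = self._hash(value)
        if not self._keep(h):
            return
        self.sample.add(h)  # duplicates collapse, so row counts do not matter
        while len(self.sample) > self.max_size:
            # Overflow: halve the sampling rate and evict hashes that no longer pass.
            self.level += 1
            self.sample = {x for x in self.sample if self._keep(x)}

    def estimate(self):
        # Each retained hash stands for about 2**level distinct values.
        return len(self.sample) * (1 << self.level)
```

Because a value is kept or dropped purely on its hash, the result does not depend on how often the value repeats or where its rows sit on disk, which is exactly where page-based samples go wrong. The catch is that every value has to pass through add(), hence the idea of feeding it from a small side stream instead of a table scan. How that stream would be wired up is not shown here.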
--
Engineers think equations approximate reality.
Physicists think reality approximates the equations.
Mathematicians never make the connection.