From:       Josh Berkus <josh(at)agliodbs(dot)com>
To:         pgsql-perform <pgsql-performance(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject:    Re: [HACKERS] Bad n_distinct estimation; hacks suggested?
Date:       2005-04-25 19:13:18
Message-ID: 200504251213.18565.josh@agliodbs.com
Lists:      pgsql-hackers pgsql-performance
Simon, Tom:
While it's not possible to get accurate estimates from a fixed size sample, I
think it would be possible from a small but scalable sample: say, 0.1% of all
data pages on large tables, up to the limit of maintenance_work_mem.
Setting up these samples as a % of data pages, rather than a pure random sort,
makes this more feasible; for example, a 70GB table would only need to sample
about 9000 data pages (or 70MB). Of course, larger samples would lead to
better accuracy, and this could be set through a revised GUC (e.g.,
maximum_sample_size, minimum_sample_size).
I just need a little help doing the math ... please?
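For concreteness, a minimal sketch of the arithmetic behind that 0.1% rule, clamped below by a minimum sample and above by what fits in maintenance_work_mem. The 0.1% fraction and the idea of min/max GUCs come from the proposal above; the 8 KB page size, the 3000-page minimum, and the 128 MB memory cap used in the example are illustrative assumptions, not part of the proposal.

```c
#include <stdio.h>

/*
 * Sketch of the proposed scaling rule: sample 0.1% of a table's data
 * pages, clamped below by a minimum sample size and above by what fits
 * in maintenance_work_mem.  BLCKSZ is the default PostgreSQL block
 * size; the exact clamping behaviour is an assumption for illustration.
 */
#define BLCKSZ          8192        /* default PostgreSQL block size */
#define SAMPLE_FRACTION 0.001       /* 0.1% of data pages */

static long long
proposed_sample_pages(long long rel_pages,
                      long long min_sample_pages,
                      long long maint_work_mem_kb)
{
    long long sample = (long long) (rel_pages * SAMPLE_FRACTION);
    long long mem_cap = (maint_work_mem_kb * 1024LL) / BLCKSZ;

    if (sample < min_sample_pages)
        sample = min_sample_pages;
    if (sample > mem_cap)
        sample = mem_cap;
    if (sample > rel_pages)
        sample = rel_pages;
    return sample;
}

int
main(void)
{
    /* ~70 GB table: about 9.2 million 8 KB pages */
    long long rel_pages = 70LL * 1024 * 1024 * 1024 / BLCKSZ;
    long long pages = proposed_sample_pages(rel_pages, 3000,
                                            131072 /* 128 MB in KB */);

    printf("pages sampled: %lld (~%lld MB read)\n",
           pages, pages * BLCKSZ / (1024 * 1024));
    return 0;
}
```

For a 70GB table this works out to roughly 9000 pages, i.e. about 70MB of reads, matching the figure above; the memory cap only kicks in once the table grows well past that size.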
--
--Josh
Josh Berkus
Aglio Database Solutions
San Francisco
| | From | Date | Subject |
|---|---|---|---|
| Next Message | Josh Berkus | 2005-04-25 19:18:26 | Re: [HACKERS] Bad n_distinct estimation; hacks suggested? |
| Previous Message | Simon Riggs | 2005-04-25 18:49:01 | Re: [HACKERS] Bad n_distinct estimation; hacks suggested? |