| From: | Josh Berkus <josh(at)agliodbs(dot)com> |
|---|---|
| To: | Mischa Sandberg <mischa(dot)sandberg(at)telus(dot)net> |
| Cc: | Markus Schaber <schabi(at)logix-tt(dot)com>, pgsql-perform <pgsql-performance(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: [PERFORM] Bad n_distinct estimation; hacks suggested? |
| Date: | 2005-05-03 21:43:44 |
| Message-ID: | 200505031443.44859.josh@agliodbs.com |
| Lists: | pgsql-hackers pgsql-performance |
Mischa,
> Okay, although given the track record of page-based sampling for
> n-distinct, it's a bit like looking for your keys under the streetlight,
> rather than in the alley where you dropped them :-)
Bad analogy, but funny.
The issue with page-based vs. pure random sampling is that sampling, for example,
10% of rows purely at random would actually mean loading 50% of pages. With
20% of rows, you might as well scan the whole table.
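To make the arithmetic behind that claim concrete, here is a rough sketch. It assumes each row is picked independently with the same probability and that every page holds the same number of rows. Neither assumption comes from the message above, and the function name and the rows-per-page values are only illustrative.

```python
def expected_page_fraction(row_fraction: float, rows_per_page: int) -> float:
    """Expected fraction of pages touched by a pure random row sample.

    Assumption (illustrative, not from the original message): each row is
    sampled independently with probability `row_fraction`, and every page
    holds exactly `rows_per_page` rows. A page is skipped only if none of
    its rows is picked, which happens with probability
    (1 - row_fraction) ** rows_per_page.
    """
    if not 0.0 <= row_fraction <= 1.0:
        raise ValueError("row_fraction must be between 0 and 1")
    if rows_per_page < 1:
        raise ValueError("rows_per_page must be at least 1")
    return 1.0 - (1.0 - row_fraction) ** rows_per_page


# Hypothetical page densities; real tables vary with row width.
for rows_per_page in (7, 20, 100):
    for row_fraction in (0.10, 0.20):
        pages = expected_page_fraction(row_fraction, rows_per_page)
        print(
            f"{rows_per_page:>3} rows/page, {row_fraction:.0%} of rows "
            f"-> {pages:.1%} of pages"
        )
```

Under this model, about 7 rows per page already gives roughly 52% of pages for a 10% row sample, and denser pages push the figure toward a full scan.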
Unless, of course, we use indexes for sampling, which seems like a *really
good* idea to me ....
--
--Josh
Josh Berkus
Aglio Database Solutions
San Francisco