From: | "Dave Held" <dave(dot)held(at)arrayservicesgrp(dot)com> |
---|---|
To: | <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PERFORM] Bad n_distinct estimation; hacks suggested? |
Date: | 2005-04-25 16:15:22 |
Message-ID: | 49E94D0CFCD4DB43AFBA928DDD20C8F9026184D7@asg002.asg.local |
Lists: pgsql-hackers
> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Monday, April 25, 2005 10:23 AM
> To: Simon Riggs
> Cc: josh(at)agliodbs(dot)com; Greg Stark; Marko Ristola; pgsql-perform;
> pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks
> suggested?
>
> [...]
> It's not just the scan --- you also have to sort, or something like
> that, if you want to count distinct values. I doubt anyone is really
> going to consider this a feasible answer for large tables.
How about an option to maintain a statistics hashmap for a column that
maps each distinct value to its number of occurrences? Obviously the map
would need to be updated on INSERT/DELETE/UPDATE, but if the table is
dominated by reads and an accurate n_distinct is important enough, there
may be people willing to pay the extra time and space cost.
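
For illustration, here is a minimal, self-contained sketch of that idea
(plain C, not PostgreSQL internals): a per-column hash map from value to
occurrence count, bumped on insert and decremented on delete (an UPDATE
would be a delete of the old value plus an insert of the new one), with
n_distinct read off as the number of live entries. All names and the
fixed bucket count are assumptions invented for this sketch.

#include <stdio.h>
#include <stdlib.h>

#define NBUCKETS 1024                  /* illustrative fixed size */

typedef struct Entry {
    long            value;             /* column value (integers for simplicity) */
    long            count;             /* number of rows holding this value */
    struct Entry   *next;
} Entry;

typedef struct {
    Entry  *buckets[NBUCKETS];
    long    ndistinct;                 /* live entries == distinct values */
} ValueCounter;

static unsigned hash_value(long v)
{
    return (unsigned) (v * 2654435761u) % NBUCKETS;
}

/* Called for each inserted row (or the new value of an UPDATE). */
static void counter_insert(ValueCounter *vc, long v)
{
    unsigned h = hash_value(v);
    for (Entry *e = vc->buckets[h]; e != NULL; e = e->next)
        if (e->value == v) { e->count++; return; }

    Entry *e = malloc(sizeof(Entry));
    e->value = v;
    e->count = 1;
    e->next = vc->buckets[h];
    vc->buckets[h] = e;
    vc->ndistinct++;                   /* first occurrence of this value */
}

/* Called for each deleted row (or the old value of an UPDATE). */
static void counter_delete(ValueCounter *vc, long v)
{
    unsigned h = hash_value(v);
    Entry **prev = &vc->buckets[h];
    for (Entry *e = *prev; e != NULL; prev = &e->next, e = e->next)
    {
        if (e->value == v)
        {
            if (--e->count == 0)
            {
                *prev = e->next;
                free(e);
                vc->ndistinct--;       /* value no longer present in the table */
            }
            return;
        }
    }
}

int main(void)
{
    ValueCounter vc = {0};
    long sample[] = {1, 2, 2, 3, 3, 3};

    for (size_t i = 0; i < sizeof(sample) / sizeof(sample[0]); i++)
        counter_insert(&vc, sample[i]);
    counter_delete(&vc, 1);            /* last row with value 1 removed */

    printf("n_distinct = %ld\n", vc.ndistinct);   /* prints 2 */
    return 0;
}

The point is only that the exact n_distinct falls out for free once the
per-value counts are maintained; the cost is the extra write-path work
and the map's storage, which is why it would have to be opt-in.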
__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129