From: | "Dave Held" <dave(dot)held(at)arrayservicesgrp(dot)com> |
---|---|
To: | <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PERFORM] Bad n_distinct estimation; hacks suggested? |
Date: | 2005-04-25 16:15:22 |
Message-ID: | 49E94D0CFCD4DB43AFBA928DDD20C8F9026184D7@asg002.asg.local |
Lists: pgsql-hackers
> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Monday, April 25, 2005 10:23 AM
> To: Simon Riggs
> Cc: josh(at)agliodbs(dot)com; Greg Stark; Marko Ristola; pgsql-perform;
> pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks
> suggested?
>
> [...]
> It's not just the scan --- you also have to sort, or something like
> that, if you want to count distinct values. I doubt anyone is really
> going to consider this a feasible answer for large tables.
How about an option to maintain a statistics hashmap for a column that
maps each distinct value to its number of occurrences? Obviously the map
would need to be updated on INSERT/DELETE/UPDATE, but if the table is
dominated by reads and an accurate n_distinct is important enough, there
may be people willing to pay the extra time and space cost.
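
For illustration, here is a minimal, self-contained sketch of that idea
(plain C, not PostgreSQL internals): a per-column hash map from value to
occurrence count, bumped on insert and decremented on delete (an UPDATE
would be a delete of the old value plus an insert of the new one), with
n_distinct read off as the number of live entries. All names and the
fixed bucket count are assumptions invented for this sketch.

#include <stdio.h>
#include <stdlib.h>

#define NBUCKETS 1024                  /* illustrative fixed size */

typedef struct Entry {
    long            value;             /* column value (integers for simplicity) */
    long            count;             /* number of rows holding this value */
    struct Entry   *next;
} Entry;

typedef struct {
    Entry  *buckets[NBUCKETS];
    long    ndistinct;                 /* live entries == distinct values */
} ValueCounter;

static unsigned hash_value(long v)
{
    return (unsigned) (v * 2654435761u) % NBUCKETS;
}

/* Called for each inserted row (or the new value of an UPDATE). */
static void counter_insert(ValueCounter *vc, long v)
{
    unsigned h = hash_value(v);
    for (Entry *e = vc->buckets[h]; e != NULL; e = e->next)
        if (e->value == v) { e->count++; return; }

    Entry *e = malloc(sizeof(Entry));
    e->value = v;
    e->count = 1;
    e->next = vc->buckets[h];
    vc->buckets[h] = e;
    vc->ndistinct++;                   /* first occurrence of this value */
}

/* Called for each deleted row (or the old value of an UPDATE). */
static void counter_delete(ValueCounter *vc, long v)
{
    unsigned h = hash_value(v);
    Entry **prev = &vc->buckets[h];
    for (Entry *e = *prev; e != NULL; prev = &e->next, e = e->next)
    {
        if (e->value == v)
        {
            if (--e->count == 0)
            {
                *prev = e->next;
                free(e);
                vc->ndistinct--;       /* value no longer present in the table */
            }
            return;
        }
    }
}

int main(void)
{
    ValueCounter vc = {0};
    long sample[] = {1, 2, 2, 3, 3, 3};

    for (size_t i = 0; i < sizeof(sample) / sizeof(sample[0]); i++)
        counter_insert(&vc, sample[i]);
    counter_delete(&vc, 1);            /* last row with value 1 removed */

    printf("n_distinct = %ld\n", vc.ndistinct);   /* prints 2 */
    return 0;
}

The point is only that the exact n_distinct falls out for free once the
per-value counts are maintained; the cost is the extra write-path work
and the map's storage, which is why it would have to be opt-in.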
__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129