Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Nathan Boley" <npboley(at)gmail(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
Date:	2008-06-08 23:03:13
Message-ID:	18634.1212966193@sss.pgh.pa.us
Lists:	pgsql-hackers

"Nathan Boley" <npboley(at)gmail(dot)com> writes:
> ... There are two potential problems that I see with this approach:

> 1) It assumes the = is equivalent to <= and >= . This is certainly
> true for real numbers, but is it true for every equality relation that
> eqsel predicts for?

The cases that compute_scalar_stats is used in have that property, since
the < and = operators are taken from the same btree opclass.
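
To make that property concrete, here is a minimal Python sketch (not PostgreSQL code). It assumes a well-behaved total order, which is what a btree opclass is required to provide, and checks that equality coincides with "<= and >=". The helper name eq_from_order is hypothetical.

```python
from itertools import product


def eq_from_order(a, b):
    """Equality derived purely from the ordering operators.

    Under a consistent total order, a = b holds exactly when
    both a <= b and a >= b hold.
    """
    return a <= b and a >= b


def order_agrees_with_equality(values):
    """Check every pair: derived equality must match the native == operator."""
    return all(eq_from_order(a, b) == (a == b) for a, b in product(values, repeat=2))


# Integers and strings both have a consistent total order in Python.
ints_ok = order_agrees_with_equality([3, 1, 2, 2, 7])
strs_ok = order_agrees_with_equality(["b", "a", "a", "c"])
```

A datatype whose = came from a different source than its < (for example, an equality that ignores case paired with a case-sensitive ordering) would fail this check, which is the situation the question above is worried about.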

> Do people think that the improved estimates would be worth the
> additional overhead?

Your argument seems to consider only columns having a normal
distribution. How badly does it fall apart for non-normal
distributions? (For instance, Zipfian distributions seem to be pretty
common in database work, from what I've seen.)
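
One way to explore that question is a small simulation. The Python sketch below is an illustration under stated assumptions, not the proposed patch: it assumes the estimate for a value is its bucket's row fraction divided by the number of distinct values in that bucket, it builds a plain equi-depth histogram over the whole sample, and it ignores the most-common-values list that the real statistics would pull out first. All function names are hypothetical.

```python
import random
from collections import Counter


def zipf_sample(n_rows, n_values, s=1.0, seed=0):
    """Draw n_rows values from {1..n_values} with P(k) proportional to 1/k**s."""
    rng = random.Random(seed)
    values = list(range(1, n_values + 1))
    weights = [1.0 / (k ** s) for k in values]
    return rng.choices(values, weights=weights, k=n_rows)


def bucket_estimates(sample, n_buckets):
    """Estimated selectivity of "col = v" for every v in the sample.

    Each equi-depth bucket splits its share of the rows evenly among the
    distinct values it contains (the assumed form of the estimate). A value
    that spans several buckets collects a share from each of them.
    """
    data = sorted(sample)
    n = len(data)
    est = Counter()
    for i in range(n_buckets):
        bucket = data[i * n // n_buckets:(i + 1) * n // n_buckets]
        if not bucket:
            continue
        distinct = set(bucket)
        share = (len(bucket) / n) / len(distinct)
        for v in distinct:
            est[v] += share
    return est


def worst_ratio_error(sample, n_buckets):
    """Largest factor by which an estimate is off, in either direction."""
    n = len(sample)
    true = {v: c / n for v, c in Counter(sample).items()}
    est = bucket_estimates(sample, n_buckets)
    return max(max(est[v] / true[v], true[v] / est[v]) for v in true)


# Compare a skewed column against an evenly spread one.
skewed = zipf_sample(n_rows=10000, n_values=1000)
even = [1, 2, 3, 4] * 25
print("zipfian worst ratio error:", worst_ratio_error(skewed, 10))
print("uniform worst ratio error:", worst_ratio_error(even, 4))
```

Varying the exponent s and the bucket count gives a rough feel for how quickly the even-split assumption degrades as skew increases; a fair comparison would also remove the most common values before building the histogram.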

regards, tom lane

In response to

Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics at 2008-06-08 18:19:20 from Nathan Boley

Responses

Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics at 2008-06-09 15:11:09 from Jeff Davis
Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics at 2008-06-09 17:51:07 from Nathan Boley
