Re: improving GROUP BY estimation

From:	Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Mark Dilger <hornschnorter(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: improving GROUP BY estimation
Date:	2016-03-31 20:43:58
Message-ID:	CAEZATCWb4dThbZqEXEB06SU_dbrHe+s1WOgc0+40kzJAGRnD1A@mail.gmail.com
Lists:	pgsql-hackers

On 31 March 2016 at 20:18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Also, I wonder if it'd be a good idea to provide a guard against division
> by zero --- we know rel->tuples > 0 at this point, but I'm less sure that
> reldistinct can't be zero. In the same vein, I'm worried about the first
> argument of pow() being slightly negative due to roundoff error, leading
> to a NaN result.

Yeah, that makes sense. In fact, if we only apply the adjustment when
reldistinct > 0 and rel->rows < rel->tuples, and rewrite the first
argument to pow() as (rel->tuples - rel->rows) / rel->tuples, then it
is guaranteed to be non-negative. If rel->rows >= rel->tuples (not
sure if it can be greater), then we just want the original
reldistinct.

> Maybe we should also consider clamping the final reldistinct estimate to
> an integer with clamp_row_est(). The existing code doesn't do that but
> it seems like a good idea on general principles.

OK, that seems sensible.

Regards,
Dean

In response to

Re: improving GROUP BY estimation at 2016-03-31 19:18:23 from Tom Lane

Responses

Re: improving GROUP BY estimation at 2016-03-31 20:59:41 from Tom Lane
