From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Mark Dilger <hornschnorter(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: improving GROUP BY estimation |
Date: | 2016-03-31 20:43:58 |
Message-ID: | CAEZATCWb4dThbZqEXEB06SU_dbrHe+s1WOgc0+40kzJAGRnD1A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 31 March 2016 at 20:18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Also, I wonder if it'd be a good idea to provide a guard against division
> by zero --- we know rel->tuples > 0 at this point, but I'm less sure that
> reldistinct can't be zero. In the same vein, I'm worried about the first
> argument of pow() being slightly negative due to roundoff error, leading
> to a NaN result.
Yeah, that makes sense. In fact, if we only apply the adjustment when
reldistinct > 0 and rel->rows < rel->tuples, and rewrite the first
argument to pow() as (rel->tuples - rel->rows) / rel->tuples, then it
is guaranteed to be non-negative. If rel->rows >= rel->tuples (not
sure if it can be greater), then we just want the original
reldistinct.
> Maybe we should also consider clamping the final reldistinct estimate to
> an integer with clamp_row_est(). The existing code doesn't do that but
> it seems like a good idea on general principles.
OK, that seems sensible.
Regards,
Dean
From | Date | Subject | |
---|---|---|---|
Next Message | Dean Rasheed | 2016-03-31 20:46:13 | Re: improving GROUP BY estimation |
Previous Message | Tom Lane | 2016-03-31 20:40:53 | Re: improving GROUP BY estimation |