Re: More stable query plans via more predictable column statistics

From:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To:	"Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: More stable query plans via more predictable column statistics
Date:	2016-03-09 12:33:54
Message-ID:	1457526834.24545.53.camel@2ndquadrant.com
Lists:	pgsql-hackers

Hi,

On Wed, 2016-03-09 at 11:23 +0100, Shulgin, Oleksandr wrote:
> On Tue, Mar 8, 2016 at 8:16 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Shulgin, Oleksandr wrote:
> >
> > > Alright. I'm attaching the latest version of this patch split in two
> > > parts: the first one is NULLs-related bugfix and the second is the
> > > "improvement" part, which applies on top of the first one.
> >
> > I went over patch 0001 and it seems pretty reasonable. It's missing
> > some comment updates -- at least the large comments that talk about Duj1
> > should be modified to indicate why the code is now subtracting the null
> > count.
>
> Good point.
>
> > Also, I can't quite figure out why the "else" now in line 2131
> > is now "else if track_cnt != 0". What happens if track_cnt is zero?
> > The comment above the "if" block doesn't provide any guidance.
>
> It is there only to avoid potentially dividing zero by zero when
> calculating avgcount (which will not be used after that anyway). I
> agree it deserves a comment.

That definitely deserves a comment. It's not immediately clear why
(track_cnt != 0) would prevent division by zero in the code. The only
way such an error could happen is if ndistinct==0, because that's the
actual denominator. Which means this

    ndistinct = ndistinct * sample_cnt

would have to evaluate to 0. But ndistinct==0 can't happen as we're in
the (nonnull_cnt > 0) branch, and that guarantees (stadistinct != 0).

Thus the only possibility seems to be (nonnull_cnt==toowide_cnt). Why
not use this condition instead?
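
To make the argument concrete, here is a minimal sketch in C of a guard
written directly in terms of that condition. It is not the analyze.c code
and not part of the patch: the helper estimate_avgcount and the parameter
distinct_frac are hypothetical names, and the sketch assumes that
sample_cnt is nonnull_cnt minus toowide_cnt and that the distinct
estimate at this point is a nonzero fraction of the sample.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (illustration only, not from analyze.c).
 *
 * Stores the average number of sample occurrences per distinct value in
 * *avgcount and returns true.  Returns false, leaving *avgcount untouched,
 * when every non-null value was skipped as too wide, which is the one
 * case where the computation would be 0 / 0.
 */
static bool
estimate_avgcount(int nonnull_cnt, int toowide_cnt, double distinct_frac,
                  double *avgcount)
{
    int     sample_cnt;
    double  ndistinct;

    /* Assumed precondition: the distinct estimate is never zero here. */
    assert(distinct_frac > 0);

    /* The guard expressed as the condition that actually matters. */
    if (nonnull_cnt == toowide_cnt)
        return false;

    /* Assumption: sample_cnt counts the non-null values that were kept. */
    sample_cnt = nonnull_cnt - toowide_cnt;

    /* Scale the fraction to a count; nonzero because both factors are. */
    ndistinct = distinct_frac * sample_cnt;

    *avgcount = (double) sample_cnt / ndistinct;
    return true;
}
```

With 100 non-null values, none of them too wide, and a distinct fraction
of 0.25, the helper computes ndistinct as 25 and avgcount as 4. With
nonnull_cnt equal to toowide_cnt it returns false without dividing.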

FWIW while looking at the code I noticed that we skip wide varlena
values but not cstrings. Seems a bit suspicious.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Re: More stable query plans via more predictable column statistics at 2016-03-09 10:23:23 from Shulgin, Oleksandr

Responses

Re: More stable query plans via more predictable column statistics at 2016-03-09 15:02:08 from Alvaro Herrera
Re: More stable query plans via more predictable column statistics at 2016-03-09 16:22:11 from Shulgin, Oleksandr
