Re: BUG #16329: Valgrind detects an invalid read when building a gist index with buffering

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16329: Valgrind detects an invalid read when building a gist index with buffering
Date: 2020-10-22 05:50:48
Message-ID: 0931D3ED-77BA-407C-8C5B-2FBD624ED3A0@yandex-team.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

> 13 окт. 2020 г., в 03:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> написал(а):
>
> I pushed the bug fix, but not yet the test addition, because I'm not
> very happy about the latter:
>
> 1. It nearly triples the runtime of gist.sql, from ~650 ms to ~1700 ms
> on my machine. That's a pretty bad increase for something we're
> proposing to drop into the core regression tests. Given that this is
> hardly the only index build in that test, I wonder why it's so much
> (but I did not look for the reason).
>
> 2. The test exposed the gistRelocateBuildBuffersOnSplit bug only about
> one time in ten for me. I suppose this is due to the random split
> choices gistchoose makes for equally-good tuples, so it's not terribly
> surprising; but it is problematic for a test that we're hoping to use
> to provide reliable code coverage.
>
> I'm not really sure what we could do about #2. Perhaps, instead of
> relying on random(), we could make gistchoose() use our own PRNG and
> then invent a debugging function that forces the seed to a known value.
> (GEQO already does something similar, although I'd not go as far as
> exposing the seed as a GUC. Resetting it via some quick-hack C
> function in regress.c would be enough IMO.) Or perhaps gist.sql could
> be adjusted so that the test data is less full of equally-good tuples.

I think we should use not entropy-based tie breaker in GiST. We can extract some randomness from tuples using hash.
I'd be much happier if GiST behaviour was deterministic.

> This seems like a spectacularly bad idea from a testing standpoint,
> even if it's the right thing for most users. Basically, it is now
> impossible to test buffering builds at all, unless you find a gist
> opclass that lacks GIST_SORTSUPPORT_PROC. Although there are a
> few candidates to pick from, someone could at any time add such
> a support proc and silently break your testing plan, just as
> 16fa9b2b3 did by adding such a proc to point_ops.
>
> So I think 16fa9b2b3 has to be reconsidered: if we have a
> buffering=on index parameter, we must go with that independently
> of the availability of sort support procs. Unless I hear very
> loud squawks very quickly, I'll go make that happen.
FWIW I think forcing buffered build on buffering=on is a good idea. Not just from testing point of view, it simply does what user asked for.

Thanks!

Best regards, Andrey Borodin.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Geier 2020-10-22 06:30:52 Re: BUG #16673: Stack depth limit exceeded error while running sysbench TPC-C
Previous Message David Christensen 2020-10-21 19:12:29 Re: BUG #16676: SELECT ... FETCH FIRST ROW WITH TIES FOR UPDATE SKIP LOCKED locks rows it doesn't return