From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Artem Anisimov <artem(dot)anisimov(dot)255(at)gmail(dot)com> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Subject: | Re: BUG #17949: Adding an index introduces serialisation anomalies. |
Date: | 2023-07-16 22:04:29 |
Message-ID: | CA+hUKGKOqpuHx_tx7qpbTX4o49YnCFrnB2uE3B+PUy03bBTPBA@mail.gmail.com |
Lists: | pgsql-bugs |
On Sat, Jul 15, 2023 at 1:05 AM Artem Anisimov
<artem(dot)anisimov(dot)255(at)gmail(dot)com> wrote:
> thank you for the fixes. I've looked up the patches in pg's git repo,
> and they got me wondering: where is the repo with pg tests? I'd be
> really uneasy to make changes to concurrency-related code without a
> decent testsuite to verify them.

Generally, the tests for SSI are in:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=tree;f=src/test/isolation/specs
... and see also ../expected. Typically they are created as features
are developed, but we'll add new tests to cover complicated bugfixes
if we can see how to do it. There are also non-SSI related tests in
there because the "isolation" infrastructure turned out to be so
useful.

For the problems discovered in this thread, I couldn't see how to do
it. These required unlucky scheduling to go wrong -- whereas the
existing test infrastructure is based on deterministic behaviour with
wait points at the statement level. It has been suggested before that
we could perhaps have a way to insert test-harness-controlled
waitpoints. But even if we had such infrastructure, the relevant wait
points are actually gone after the fixes (i.e. the window where you have
to do something in another thread to cause problems has been closed so
there is no candidate wait point left). Such infrastructure might
have been useful for demonstrating the bugs deterministically while
the windows existed. One of the basic techniques we often use when
trying to understand what is going on in such cases is to insert
sleeps into interesting places to widen windows and make failures
"almost" deterministic, as I did for one of the cases here.
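
To illustrate the sleep-insertion technique in general terms, here is a minimal sketch in Python. It is not PostgreSQL code: the function name run_once, the shared counter and the widen_window parameter are all hypothetical, chosen only to show how a sleep placed between a read and the dependent write turns a rare lost update into one that happens on practically every run.

```python
import threading
import time


def run_once(widen_window: float) -> int:
    """Run two threads that each do an unlocked read-modify-write.

    widen_window is the (hypothetical) injected sleep, in seconds,
    placed between the read and the write to widen the race window.
    Returns the final counter value; 2 is correct, 1 is a lost update.
    """
    state = {"counter": 0}

    def worker() -> None:
        seen = state["counter"]        # read the shared value
        time.sleep(widen_window)       # injected sleep: widens the window
        state["counter"] = seen + 1    # write based on a possibly stale read

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]


if __name__ == "__main__":
    # With a wide window both threads read 0 before either writes.
    print("final counter with 0.2s sleep:", run_once(0.2))
```

Without the sleep the window is only a few instructions wide, so the failure depends on unlucky scheduling. With it, the second thread almost always reads before the first one writes.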
I suppose we could in theory have a suite of 'high load' tests of a
more statistical nature that could include things like the repro you
sent in. It would burn a whole bunch of CPU trying to break
complicated concurrency stuff in ways that have been known to be
broken in the past. I'm not sure it's worth it though. Sometimes
it's OK for tests to be temporarily useful, too...
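
As a rough sketch of what a test "of a more statistical nature" could look like, the Python below runs a concurrency scenario many times and reports how often it misbehaved. Everything here is an assumption made for illustration: racy_trial and anomaly_rate are hypothetical names, and the scenario is a toy shared counter rather than the repro from this thread or anything in the PostgreSQL tree.

```python
import threading
from typing import Callable


def racy_trial(use_lock: bool, iterations: int = 2000) -> bool:
    """Run one trial; return True if an anomaly (lost update) was seen."""
    state = {"counter": 0}
    lock = threading.Lock()

    def worker() -> None:
        for _ in range(iterations):
            if use_lock:
                with lock:                 # the "fixed" variant
                    state["counter"] += 1
            else:
                state["counter"] += 1      # unprotected read-modify-write

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"] != 2 * iterations


def anomaly_rate(trial: Callable[[], bool], runs: int) -> float:
    """Repeat a trial and return the fraction of runs showing an anomaly."""
    failures = sum(1 for _ in range(runs) if trial())
    return failures / runs


if __name__ == "__main__":
    print("locked:  ", anomaly_rate(lambda: racy_trial(True), runs=20))
    print("unlocked:", anomaly_rate(lambda: racy_trial(False), runs=20))
```

The locked variant should always report a rate of 0.0. The unlocked rate depends on the interpreter and on scheduling, and may well be 0.0 on a given machine, which is exactly why such tests cost a lot of CPU for an uncertain payoff.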