From: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data |
Date: | 2021-10-24 11:45:38 |
Message-ID: | 64835CF9-D9AA-49BF-A685-01C23B1023C1@yandex-team.ru |
Lists: | pgsql-bugs |
> On 24 Oct 2021, at 08:00, Noah Misch <noah(at)leadboat(dot)com> wrote:
>
> On Mon, Oct 18, 2021 at 08:02:12PM -0700, Noah Misch wrote:
>> On Mon, Oct 18, 2021 at 06:23:05PM +0500, Andrey Borodin wrote:
>>>> On 17 Oct 2021, at 20:12, Noah Misch <noah(at)leadboat(dot)com> wrote:
>>>> I think the attached version is ready for commit. Notable differences
>>>> vs. v14:
>
> Pushed.
Wow, that's great! Thank you!
> Buildfarm member conchuela (DragonFly BSD 6.0) has gotten multiple
> "IPC::Run: timeout on timer" in the new tests. No other animal has.
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=conchuela&dt=2021-10-24%2003%3A05%3A09
> is an example run. The pgbench queries finished quickly, but the
> $pgbench_h->finish() apparently timed out after 180s. I guess this would be
> consistent with pgbench blocking in write(), waiting for something to empty a
> pipe buffer so it can write more. I thought finish() will drain any incoming
> I/O, though. This phenomenon has been appearing regularly via
> src/test/recovery/t/017_shm.pl[1], so this thread doesn't have a duty to
> resolve it. A stack trace of the stuck pgbench should be informative, though.
Some thoughts:
0. I doubt that psql/pgbench is actually stuck in these failures.
1. All the similar failures observed so far seem to be related to the finish() sub of the IPC::Run harness.
2. finish() must pump any pending data from the process [0], but it can hang if the process is waiting for something.
3. There is a reported bug in finish() [1], but its description is slightly different.
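For reference, the "blocked in write()" scenario from the quoted text can be reproduced outside of IPC::Run. Below is a minimal Python sketch, not the Perl harness: the child command is a hypothetical stand-in for pgbench, and the function name and the 1 MiB size are assumptions chosen only to exceed a typical pipe buffer. It shows a parent that waits without reading (the child never exits) and then a drain that lets it finish.

```python
import subprocess
import sys

# Hypothetical stand-in for pgbench: writes 1 MiB to stdout, which is
# assumed to be larger than the OS pipe buffer.
CHILD = "import sys; sys.stdout.write('x' * (1 << 20)); sys.stdout.flush()"


def run_without_draining(timeout=2.0):
    """Wait for the child without reading its stdout, then drain the pipe.

    Returns (hung, bytes_read, exit_code).
    """
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE)
    try:
        # Nobody reads the pipe here, so the child blocks in write()
        # once the pipe buffer is full and never reaches exit.
        proc.wait(timeout=timeout)
        hung = False
    except subprocess.TimeoutExpired:
        hung = True
    # Draining the pipe unblocks the child and lets it exit normally.
    out, _ = proc.communicate()
    return hung, len(out), proc.returncode


if __name__ == "__main__":
    print(run_without_draining())
```

If finish() really pumps pending output as in [0], it corresponds to the communicate() step here, so a pure pipe-buffer stall should not survive it. That is part of why I doubt the client itself is stuck.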
>
> Compared to my last post, the push included two more test changes. I removed
> sleeps from a test. They could add significant time on a system with coarse
> sleep granularity. This did not change test sensitivity on my system.
> Second, I changed background_pgbench to include stderr lines in $stdout, as it
> had documented. This becomes important during the back-patch to v11, where
> server errors don't cause a nonzero pgbench exit status. background_psql
> still has the same bug, and I can fix it later. (The background_psql version
> of the bug is not affecting current usage.)
>
> FYI, the non-2PC test is less sensitive in older branches. It reproduces
> master's bug in 25-50% of runs, but it took about six minutes on v11 and v12.
It seems like loading the relation descriptor into the relcache has become more expensive?
>>>> One thing not done here is to change the tests to use CREATE INDEX
>>>> CONCURRENTLY instead of REINDEX CONCURRENTLY, so they're back-patchable to v11
>>>> and earlier. I may do that before pushing, or I may just omit the tests from
>>>> older branches.
>>>
>>> The tests refactor PostgresNode.pm and some other tests. Back-patching this would be quite invasive.
>>
>> That's fine with me. Back-patching a fix without its tests is riskier than
>> back-patching test infrastructure changes.
>
> Back-patching the tests did end up tricky, for other reasons. Before v12
> (d3c09b9), a TAP suite in a pgxs module wouldn't run during check-world.
> Before v11 (7f563c0), amcheck lacks the heapallindexed feature that the tests
> rely on. Hence, for v11, v10, and v9.6, I used a plpgsql implementation of
> the heapallindexed check, and I moved the tests to src/bin/pgbench.
Cool!
Thanks!
Best regards, Andrey Borodin.
[0] https://metacpan.org/dist/IPC-Run/source/lib/IPC/Run.pm#L3481
[1] https://github.com/toddr/IPC-Run/issues/57