From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | deadlock-hard flakiness |
Date: | 2023-02-08 01:10:21 |
Message-ID: | 20230208011021.winlfnypdbzpr3ic@awork3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On cfbot / CI, we've recently seen a lot of spurious test failures due to
src/test/isolation/specs/deadlock-hard.spec changing output. Always on
FreeBSD, when running the tests against a pre-existing instance.
I'm fairly sure I've seen this failure on the buildfarm as well, but I'm too
impatient to wait for the buildfarm database query (it really should be
updated to use lz4 TOAST compression).
Example failures:
1)
https://cirrus-ci.com/task/5307793230528512?logs=test_running#L211
https://api.cirrus-ci.com/v1/artifact/task/5307793230528512/testrun/build/testrun/isolation-running/isolation/regression.diffs
https://api.cirrus-ci.com/v1/artifact/task/5307793230528512/testrun/build/testrun/runningcheck.log
2)
https://cirrus-ci.com/task/6137098198056960?logs=test_running#L212
https://api.cirrus-ci.com/v1/artifact/task/6137098198056960/testrun/build/testrun/isolation-running/isolation/regression.diffs
https://api.cirrus-ci.com/v1/artifact/task/6137098198056960/testrun/build/testrun/runningcheck.log
So far the diff is always:
diff -U3 /tmp/cirrus-ci-build/src/test/isolation/expected/deadlock-hard.out /tmp/cirrus-ci-build/build/testrun/isolation-running/isolation/results/deadlock-hard.out
--- /tmp/cirrus-ci-build/src/test/isolation/expected/deadlock-hard.out 2023-02-07 05:32:34.536429000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/isolation-running/isolation/results/deadlock-hard.out 2023-02-07 05:40:33.833908000 +0000
@@ -25,10 +25,11 @@
 step s6a7: <... completed>
 step s6c: COMMIT;
 step s5a6: <... completed>
-step s5c: COMMIT;
+step s5c: COMMIT; <waiting ...>
 step s4a5: <... completed>
 step s4c: COMMIT;
 step s3a4: <... completed>
+step s5c: <... completed>
 step s3c: COMMIT;
 step s2a3: <... completed>
 step s2c: COMMIT;
Commit 741d7f1047f fixed a similar issue in deadlock-hard, but it looks like
we need something more. Then again, perhaps this isn't an output ordering
issue at all: how can s5c end up getting reported as waiting? I don't see
what s5c could be blocking on.
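
For triaging failures like this, a rough sketch along the following lines can
separate "a step was newly reported as waiting" from plain reordering of
completion lines. It is not part of the tree; the function names
(waiting_steps, unexpected_waits) are made up for illustration, and it only
assumes the "step NAME: ... <waiting ...>" line shape visible in the diff
above.

```python
import re

# Matches isolationtester step lines such as "step s5c: COMMIT; <waiting ...>".
STEP_RE = re.compile(r"^step (\w+): (.*)$")


def waiting_steps(output):
    """Return the names of steps that were reported as waiting."""
    waiting = set()
    for line in output.splitlines():
        m = STEP_RE.match(line.strip())
        if m and m.group(2).endswith("<waiting ...>"):
            waiting.add(m.group(1))
    return waiting


def unexpected_waits(expected, actual):
    """Steps that waited in the actual output but not in the expected one."""
    return waiting_steps(actual) - waiting_steps(expected)


# Excerpts taken from the hunk above (expected vs. actual side).
expected = """\
step s6c: COMMIT;
step s5a6: <... completed>
step s5c: COMMIT;
step s4a5: <... completed>
step s4c: COMMIT;
step s3a4: <... completed>
step s3c: COMMIT;
"""

actual = """\
step s6c: COMMIT;
step s5a6: <... completed>
step s5c: COMMIT; <waiting ...>
step s4a5: <... completed>
step s4c: COMMIT;
step s3a4: <... completed>
step s5c: <... completed>
step s3c: COMMIT;
"""

print(unexpected_waits(expected, actual))  # {'s5c'}
```

A non-empty result means the failure is more than a reshuffling of
"<... completed>" lines: some step was seen as blocked that the expected
output never shows blocking, which is the case for s5c here.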
Greetings,
Andres Freund