From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Subject: | Re: Recent 027_streaming_regress.pl hangs |
Date: | 2024-06-04 10:00:00 |
Message-ID: | f748ee55-9e73-3f5e-e879-8865c5e9933a@gmail.com |
Lists: | pgsql-hackers |
Hello Andres,
>> So it looks like the issue resolved, but there is another apparently
>> performance-related issue: deadlock-parallel test failures.
> I reduced test concurrency a bit. I hadn't quite realized how the buildfarm
> config and meson test concurrency interact. But there's still something off
> with the frequency of fsyncs during replay, but perhaps that doesn't qualify
> as a bug.
It looks like that set of animals is still suffering from extreme load.
Please take a look at today's failure:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-06-04%2002%3A44%3A19
1/1 postgresql:regress-running / regress-running/regress TIMEOUT 3000.06s killed by signal 15 SIGTERM
inst/logfile ends with:
2024-06-04 03:39:24.664 UTC [3905755][client backend][5/1787:16793] ERROR: column "c2" of relation "test_add_column"
already exists
2024-06-04 03:39:24.664 UTC [3905755][client backend][5/1787:16793] STATEMENT: ALTER TABLE test_add_column
ADD COLUMN c2 integer, -- fail because c2 already exists
ADD COLUMN c3 integer primary key;
2024-06-04 03:39:30.815 UTC [3905755][client backend][5/0:0] LOG: could not send data to client: Broken pipe
2024-06-04 03:39:30.816 UTC [3905755][client backend][5/0:0] FATAL: connection to client lost
"ALTER TABLE test_add_column" is from the alter_table test, which executed
in group 21 of 25.
Another similar failure:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skink&dt=2024-05-24%2002%3A22%3A26&stg=install-check-C
1/1 postgresql:regress-running / regress-running/regress TIMEOUT 3000.06s killed by signal 15 SIGTERM
inst/logfile ends with:
2024-05-24 03:18:51.469 UTC [998579][client backend][7/1792:16786] ERROR: could not change table "logged1" to unlogged
because it references logged table "logged2"
2024-05-24 03:18:51.469 UTC [998579][client backend][7/1792:16786] STATEMENT: ALTER TABLE logged1 SET UNLOGGED;
(This is the alter_table test again.)
I've analyzed the duration of the regress-running/regress test for the most
recent 167 runs on skink and found that the average duration is 1595 seconds,
but there were much longer test runs:
2979.39:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skink&dt=2024-05-01%2004%3A15%3A29&stg=install-check-C
2932.86:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skink&dt=2024-04-28%2018%3A57%3A37&stg=install-check-C
2881.78:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skink&dt=2024-05-15%2020%3A53%3A30&stg=install-check-C
So it seems that the default timeout is not large enough for these
conditions. (I've counted 10 such timeout failures out of 167 test runs.)
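A tally like the one above could be scripted roughly as follows. This is only a sketch, not the script used to produce the numbers in this message: it assumes the meson result lines have already been collected from the stage logs (fetching them from the buildfarm is not shown), and the helper names parse_result and summarize, as well as the OK line in the sample, are made up for illustration.

```python
import re
from statistics import mean

# Matches meson result lines such as:
#   1/1 postgresql:regress-running / regress-running/regress TIMEOUT 3000.06s killed by signal 15 SIGTERM
RESULT_RE = re.compile(
    r"regress-running/regress\s+(?P<status>[A-Z]+)\s+(?P<secs>\d+(?:\.\d+)?)s"
)


def parse_result(line):
    """Return (status, seconds) for a meson result line, or None if no match."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    return m.group("status"), float(m.group("secs"))


def summarize(lines):
    """Compute run count, mean and max duration, and number of timeouts."""
    results = [r for r in map(parse_result, lines) if r is not None]
    durations = [secs for _, secs in results]
    timeouts = sum(1 for status, _ in results if status == "TIMEOUT")
    return {
        "runs": len(results),
        "mean": mean(durations) if durations else None,
        "max": max(durations, default=None),
        "timeouts": timeouts,
    }


if __name__ == "__main__":
    sample = [
        # Taken from the log quoted in this message.
        "1/1 postgresql:regress-running / regress-running/regress TIMEOUT 3000.06s killed by signal 15 SIGTERM",
        # Hypothetical successful run, for illustration only.
        "1/1 postgresql:regress-running / regress-running/regress OK 1000.00s",
    ]
    print(summarize(sample))
```

Feeding it one result line per run would give the average, the longest duration, and the number of runs killed by the timeout in a single pass.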
Also, 027_stream_regress still fails for the same reason:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2024-05-22%2021%3A55%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-05-22%2021%3A54%3A50
(It's remarkable that these two animals failed at the same time.)
Best regards,
Alexander