From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Why is subscription/t/031_column_list.pl failing so much? |
Date: | 2024-02-03 07:00:00 |
Message-ID: | a1010953-7d65-225a-0afa-f11de5d5e011@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello Tom and Noah,
03.02.2024 04:24, Tom Lane wrote:
> I'm still wondering how come the failure seems to have suddenly gotten
> way more common. The only changes that are in vaguely-related places
> and fit the time frame are Amit's 732924043 and 776621a5e, but I sure
> don't see a connection.
I think the failure rate increased due to tamandua, calliphoridae,
flaviventris, and kestrel were switched from make to meson recently.
The last `make` builds for them:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-01-31%2016%3A51%3A31
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-31%2016%3A51%3A38
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-01-31%2016%3A52%3A37
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-01-31%2016%3A51%3A53
and since that switch the 031_column_list duration increased significantly,
e.g., on the last tamandua `make` run it executed for 7 seconds, but
successful `meson` runs give much longer duration:
280/283 postgresql:subscription / subscription/031_column_list OK 38.27s 36 subtests passed
280/283 postgresql:subscription / subscription/031_column_list OK 126.13s 36 subtests passed
280/283 postgresql:subscription / subscription/031_column_list OK 31.93s 36 subtests passed
279/283 postgresql:subscription / subscription/031_column_list OK 99.76s 36 subtests passed
So, looking at the tamandua's failure log, I see:
2024-02-02 19:41:19.750 UTC [1579219][postmaster][:0] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by
gcc-12.3.0, 64-bit
...
2024-02-02 19:42:19.973 UTC [1629333][client backend][4/213:0] LOG: statement: ALTER SUBSCRIPTION sub1 SET PUBLICATION pub7
2024-02-02 19:42:20.131 UTC [1625765][logical replication apply worker][3/122:0] LOG: logical replication worker for
subscription "sub1" will restart because of a parameter change
2024-02-02 19:42:20.137 UTC [1629333][client backend][:0] LOG: disconnection: session time: 0:00:00.212 user=bf
database=postgres host=[local]
2024-02-02 19:42:20.191 UTC [1629535][logical replication apply worker][3/124:0] LOG: logical replication apply worker
for subscription "sub1" has started
...
2024-02-02 19:42:20.445 UTC [1629535][logical replication apply worker][3/0:0] ERROR: could not receive data from WAL
stream: ERROR: publication "pub7" does not exist
CONTEXT: slot "sub1", output plugin "pgoutput", in the change callback, associated LSN 0/15C7698
(The interval between subscriber start and the error is ~ 4 * 15 seconds.)
Thus it still may be explained by bgwriter activity, though perhaps
autovacuum/checkpointer can add something as well.
Best regards,
Alexander
From | Date | Subject | |
---|---|---|---|
Next Message | Hannu Krosing | 2024-02-03 07:54:16 | Re: pgbench - adding pl/pgsql versions of tests |
Previous Message | Amit Kapila | 2024-02-03 06:47:08 | Re: Documentation: warn about two_phase when altering a subscription |