From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | dikkop failed the pg_combinebackupCheck/006_db_file_copy.pl test |
Date: | 2024-07-29 04:00:00 |
Message-ID: | 877b1f23-35d2-31b2-2fcd-d176fd3d05c4@gmail.com |
Lists: | pgsql-hackers |
Hello Tomas,
Please take a look at a recent failure on dikkop [1]. The
regress_log_006_db_file_copy file from that run shows:
[02:08:57.929](0.014s) # initializing database system by copying initdb template
...
[02:09:22.511](24.583s) ok 1 - full backup
...
[02:10:35.758](73.247s) not ok 2 - incremental backup
006_db_file_copy_primary.log contains:
2024-07-28 02:09:29.441 UTC [67785:12] 006_db_file_copy.pl LOG: received replication command: START_REPLICATION SLOT "pg_basebackup_67785" 0/4000000 TIMELINE 1
2024-07-28 02:09:29.441 UTC [67785:13] 006_db_file_copy.pl STATEMENT: START_REPLICATION SLOT "pg_basebackup_67785" 0/4000000 TIMELINE 1
2024-07-28 02:09:29.441 UTC [67785:14] 006_db_file_copy.pl LOG: acquired physical replication slot "pg_basebackup_67785"
2024-07-28 02:09:29.441 UTC [67785:15] 006_db_file_copy.pl STATEMENT: START_REPLICATION SLOT "pg_basebackup_67785" 0/4000000 TIMELINE 1
2024-07-28 02:10:29.487 UTC [67785:16] 006_db_file_copy.pl LOG: terminating walsender process due to replication timeout
2024-07-28 02:10:29.487 UTC [67785:17] 006_db_file_copy.pl STATEMENT: START_REPLICATION SLOT "pg_basebackup_67785" 0/4000000 TIMELINE 1
It looks like this incremental backup ran slower than usual: it took more
than 60 seconds and was apparently interrupted by wal_sender_timeout.
However, regress_log_006_db_file_copy from the 6 previous (successful)
test runs shows:
[14:22:16.841](43.215s) ok 2 - incremental backup
[02:14:42.888](34.595s) ok 2 - incremental backup
[17:51:16.152](43.708s) ok 2 - incremental backup
[04:07:16.757](31.087s) ok 2 - incremental backup
[12:15:01.256](49.432s) ok 2 - incremental backup
[01:06:02.482](52.364s) ok 2 - incremental backup
Thus, reaching 60s (e.g., due to some background activity) on this animal
seems quite plausible. So maybe it would make sense to increase
wal_sender_timeout for it, say, to 120s?
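To make the margin concrete, here is a minimal Python sketch (not part of
the test suite or the buildfarm tooling) that parses the "incremental
backup" lines quoted above and compares the step durations with the 60s
default of wal_sender_timeout and with the proposed 120s. The function
names are hypothetical, and the step duration is only a rough proxy for
how long the walsender went without a reply, so the numbers are indicative
at best.

```python
import re

DEFAULT_TIMEOUT_S = 60.0    # PostgreSQL default for wal_sender_timeout
PROPOSED_TIMEOUT_S = 120.0  # value suggested in this message

# "incremental backup" lines copied from the regress logs quoted above.
LOG_LINES = """\
[14:22:16.841](43.215s) ok 2 - incremental backup
[02:14:42.888](34.595s) ok 2 - incremental backup
[17:51:16.152](43.708s) ok 2 - incremental backup
[04:07:16.757](31.087s) ok 2 - incremental backup
[12:15:01.256](49.432s) ok 2 - incremental backup
[01:06:02.482](52.364s) ok 2 - incremental backup
[02:10:35.758](73.247s) not ok 2 - incremental backup
""".splitlines()

STEP_RE = re.compile(
    r"^\[[\d:.]+\]\((?P<secs>[\d.]+)s\) (?P<status>not ok|ok) 2 - incremental backup$"
)


def parse_steps(lines):
    """Return (passed, seconds) for every 'incremental backup' line."""
    steps = []
    for line in lines:
        m = STEP_RE.match(line.strip())
        if m:
            steps.append((m["status"] == "ok", float(m["secs"])))
    return steps


def headroom(lines, timeout_s):
    """Seconds between the slowest successful step and the given timeout."""
    slowest_ok = max(secs for passed, secs in parse_steps(lines) if passed)
    return round(timeout_s - slowest_ok, 3)


if __name__ == "__main__":
    for passed, secs in parse_steps(LOG_LINES):
        print(f"{'ok' if passed else 'FAILED':6} {secs:7.3f}s")
    print("headroom at  60s:", headroom(LOG_LINES, DEFAULT_TIMEOUT_S))
    print("headroom at 120s:", headroom(LOG_LINES, PROPOSED_TIMEOUT_S))
```

On these numbers the slowest successful run leaves about 7.6s of headroom
under the 60s default, which a modest slowdown can easily consume.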
[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop&dt=2024-07-27%2023%3A22%3A57
Best regards,
Alexander