From: | "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add two missing tests in 035_standby_logical_decoding.pl |
Date: | 2023-05-02 11:22:26 |
Message-ID: | 8a13f0d8-71f3-773b-7135-5974fd53724b@gmail.com |
Lists: | pgsql-hackers |
Hi,
On 5/2/23 8:28 AM, Amit Kapila wrote:
> On Fri, Apr 28, 2023 at 2:24 PM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>>
>> I can see V7 failing on "Cirrus CI / macOS - Ventura - Meson" only (other machines are not complaining).
>>
>> It does fail on "invalidated logical slots do not lead to retaining WAL", see https://cirrus-ci.com/task/4518083541336064
>>
>> I'm not sure why it is failing, any idea?
>>
>
> I think the reason for the failure is that on standby, the test is not
> able to remove the file corresponding to the invalid slot. You are
> using pg_switch_wal() to generate a switch record and I think you need
> one more WAL-generating statement after that to achieve your purpose
> which is that during checkpoint, the test removes the WAL file
> corresponding to an invalid slot. Just doing checkpoint on primary may
> not serve the need as that doesn't lead to any new insertion of WAL on
> standby. Is your v6 failing in the same environment?
Thanks for the feedback!
No, V6 was working fine.
> If not, then it
> is probably due to the reason that the test is doing insert after
> pg_switch_wal() in that version. Why did you change the order of
> insert in v7?
>
I thought doing the insert before the switch was OK, and as my local test
was running fine, I did not reconsider the ordering.
> BTW, you can confirm the failure by changing the DEBUG2 message in
> RemoveOldXlogFiles() to LOG. In the case where the test fails, it may
> not remove the WAL file corresponding to an invalid slot whereas it
> will remove the WAL file when the test succeeds.
Yeah, I added more debug information and what I can see is that the WAL file
we want to see removed is "000000010000000000000003" while the standby emits:
"
2023-05-02 10:03:28.351 UTC [16971][checkpointer] LOG: attempting to remove WAL segments older than log file 000000000000000000000002
2023-05-02 10:03:28.351 UTC [16971][checkpointer] LOG: recycled write-ahead log file "000000010000000000000002"
"
As per your suggestion, changing the insert ordering (as in the attached V8) makes the test pass in the failing environment too.
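To make the standby log output above concrete, here is a minimal Python sketch of why segment "000000010000000000000003" survives that checkpoint. It is modeled on my understanding of how XLogFileName() builds segment names and how RemoveOldXlogFiles() compares them while ignoring the timeline part; it is not the server code. The Python function names are hypothetical, and the 16 MB segment size is an assumption (the default wal_segment_size).

```python
# Assumption: default wal_segment_size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
# Number of segments per 4 GB "xlogid" (256 with 16 MB segments).
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE


def wal_file_name(tli: int, segno: int) -> str:
    """Hypothetical helper: timeline, then high and low parts of the segment number."""
    return "%08X%08X%08X" % (
        tli,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


def is_removal_candidate(file_name: str, cutoff_name: str) -> bool:
    """Hypothetical helper: compare names, skipping the 8-character timeline prefix."""
    return file_name[8:] <= cutoff_name[8:]


# Cutoff as printed in the standby log (timeline part is zero there).
cutoff = wal_file_name(0, 2)

for segno in (2, 3):
    name = wal_file_name(1, segno)
    verdict = "candidate" if is_removal_candidate(name, cutoff) else "kept"
    print(name, verdict)
```

With a cutoff at segment 2, only "000000010000000000000002" qualifies for removal or recycling. Segment 3 is kept until the cutoff advances, which is consistent with the need for more WAL to reach the standby after the switch.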
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v8-0001-Add-a-test-to-verify-that-invalidated-logical-slo.patch | text/plain | 2.5 KB |