From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: logical decoding and replication of sequences |
Date: | 2022-03-11 12:53:15 |
Message-ID: | 4c1e9b47-f00c-8a60-d10b-a42995ccc5e5@enterprisedb.com |
Lists: | pgsql-hackers |
On 3/11/22 12:34, Amit Kapila wrote:
> On Tue, Mar 8, 2022 at 11:59 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> On 3/7/22 22:25, Tomas Vondra wrote:
>>>>
>>>> Interesting. I can think of one reason that might cause this - we log
>>>> the first sequence increment after a checkpoint. So if a checkpoint
>>>> happens in an unfortunate place, there'll be an extra WAL record. On
>>>> slow / busy machines that's quite possible, I guess.
>>>>
>>>
>>> I've tweaked the checkpoint_interval to make checkpoints more aggressive
>>> (set it to 1s), and it seems my hunch was correct - it produces failures
>>> exactly like this one. The best fix probably is to just disable decoding
>>> of sequences in those tests that are not aimed at testing sequence decoding.
>>>
>>
>> I've pushed a fix for this, adding "include-sequences=0" to a couple
>> test_decoding tests, which were failing with concurrent checkpoints.
>>
>> Unfortunately, I realized we have a similar issue in the "sequences"
>> tests too :-( Imagine you do a series of sequence increments, e.g.
>>
>> SELECT nextval('s') FROM generate_series(1,100);
>>
>> If there's a concurrent checkpoint, this may add an extra WAL record,
>> affecting the decoded output (and also the data stored in the sequence
>> relation itself). Not sure what to do about this ...
>>
>
> I am also not sure what to do for it but maybe if in some way we can
> increase checkpoint timeout or other parameters for these tests then
> it would reduce the chances of such failures. The other idea could be
> to perform checkpoint before the start of tests to reduce the
> possibility of another checkpoint.
>
Yeah, I had the same ideas, but I'm not sure I like any of them. I doubt
we want to make checkpoints extremely rare, and even if we do that it'll
still fail on slow machines (e.g. with valgrind, clobber cache etc.).
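
To make the failure mode concrete, here is a rough toy model of the behavior discussed above: a sequence writes a WAL record when its pre-logged values run out, and again on the first increment after a checkpoint. It is a sketch, not the actual sequence code. The function name `simulate_nextval`, the pre-log batch size of 32, and the sequence settings (increment 1, cache 1) are assumptions made for illustration.

```python
# Toy model of sequence WAL logging (illustrative; not PostgreSQL source code).
SEQ_LOG_VALS = 32  # assumed number of values pre-logged by each WAL record


def simulate_nextval(n_calls, checkpoint_after=()):
    """Return the (call_number, logged_value) WAL records for n_calls nextval() calls.

    checkpoint_after lists the call numbers right after which a checkpoint
    completes (a hypothetical way to inject concurrent checkpoints).
    """
    checkpoints = set(checkpoint_after)
    wal = []
    value = 0
    log_cnt = 0                   # values still covered by the last WAL record
    checkpoint_since_log = False  # did a checkpoint happen since the last record?

    for call in range(1, n_calls + 1):
        value += 1
        if log_cnt == 0 or checkpoint_since_log:
            # Pre-log a batch of future values so later calls can skip WAL.
            wal.append((call, value + SEQ_LOG_VALS))
            log_cnt = SEQ_LOG_VALS
            checkpoint_since_log = False
        else:
            log_cnt -= 1
        if call in checkpoints:
            checkpoint_since_log = True
    return wal


print(simulate_nextval(50))                         # [(1, 33), (34, 66)]
print(simulate_nextval(50, checkpoint_after=[10]))  # [(1, 33), (11, 43), (44, 76)]
```

In this model a single checkpoint after the 10th increment both adds a record and shifts the values carried by the later ones, which is why the expected output of a decoding test stops being deterministic.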

regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company