From: | Vladimir Gordiychuk <folyga(at)gmail(dot)com> |
---|---|
To: | Dave Cramer <pg(at)fastcrypt(dot)com> |
Cc: | Stefan Smith <stefan(dot)smith(dot)work(at)gmail(dot)com>, List <pgsql-jdbc(at)postgresql(dot)org> |
Subject: | Re: Logical Replication: adjacent COMMIT messages with the wrong StartLSN |
Date: | 2017-04-17 19:41:19 |
Message-ID: | CAFgjRd2jc97xAP+_LXgG79XCQAQ+Z7QZ9ULu3GH6Ys2jC9A95Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-jdbc |
> Logical replication slot always streams COMMIT messages with message
position set to TXN's end LSN, which is the location where the COMMIT
message ends in the WAL. This is existing behaviour in v9.6.
No. It depends on callback that was execute for output plugin.
typedef struct OutputPluginCallbacks
{
LogicalDecodeStartupCB startup_cb;
LogicalDecodeBeginCB begin_cb;
LogicalDecodeChangeCB change_cb;
LogicalDecodeCommitCB commit_cb;
LogicalDecodeMessageCB message_cb;
LogicalDecodeFilterByOriginCB filter_by_origin_cb;
LogicalDecodeShutdownCB shutdown_cb;
} OutputPluginCallbacks;
If output plugin send message to logical replication stream inside
commit_cb will be use transaction end lsn[1].
If output plugin send message to logical replication stream inside
change_cb(insert/update/delete) will be use lsn of start particular
change[2].
Also you should be careful if add to buffer changes from change_cb
callback and then send multiple message on commit_cb, because each message
send from commit_cb will have lsn equal to transaction end that can lead to
lost data after restart if some part of buffer wan't send.
[1]
https://github.com/postgres/postgres/blob/master/src/backend/replication/logical/logical.c#L643
[2]
https://github.com/postgres/postgres/blob/master/src/backend/replication/logical/logical.c#L679
> In JDBC driver streaming replication code, StartLSN = streamed message
position. Therefore, for COMMIT messages, StartLSN = TXN end LSN.
Therefore, for each COMMIT message, if I call
stream.setAppliedLSN(stream.getLastReceiveLSN()),
then stream.setFlushedLSN(stream.getLastReceiveLSN()), then
stream.forceUpdateStatus() this will set confirmed_flush_lsn to end
location of TXN but no further. Therefore, on reconnection, we will receive
next unprocessed TXN, even if its COMMIT message follows immediately after
previous TXN COMMIT message in WAL (like in my original post example).
Yes. Because logical replication will find all not completed transaction on
specified lsn during startup and start stream them.
>When do you think 42.1.0 will be released?
2017-04-17 15:50 GMT+03:00 Dave Cramer <pg(at)fastcrypt(dot)com>:
>
>
>
> On 12 April 2017 at 14:53, Stefan Smith <stefan(dot)smith(dot)work(at)gmail(dot)com>
> wrote:
>
>> Thank you for fixing this bug so quickly. I would like to confirm my
>> understanding of the solution:
>>
>> - Logical replication slot always streams COMMIT messages with
>> message position set to TXN's end LSN, which is the location where the
>> COMMIT message ends in the WAL. This is existing behaviour in v9.6.
>> - In JDBC driver streaming replication code, StartLSN = streamed
>> message position. Therefore, for COMMIT messages, StartLSN = TXN end LSN.
>> Therefore, for each COMMIT message, if I call stream.setAppliedLSN(
>> stream.getLastReceiveLSN()), then stream.setFlushedLSN(stream.getLastReceiveLSN()),
>> then stream.forceUpdateStatus() this will set confirmed_flush_lsn to end
>> location of TXN but no further. Therefore, on reconnection, we will receive
>> next unprocessed TXN, even if its COMMIT message follows immediately after
>> previous TXN COMMIT message in WAL (like in my original post example).
>>
>> Is this correct?
>>
>> One thing I wanted to clarify: from https://github.com/postgres/po
>> stgres/blob/master/src/backend/replication/logical/logical.c, I see that
>> BEGIN and DML messages will be streamed with StartLSN = message start
>> position (not message end position). So meaning of
>> stream.getLastReceiveLSN() is different depending on whether last message
>> received by stream.read() was a BEGIN, DML or COMMIT message? Guess it
>> doesn't matter since we only call setAppliedLSN() and setFlushedLSN() for
>> COMMIT messages anyway.
>>
>> When do you think 42.1.0 will be released?
>>
>
>
> Soon!
>
> Dave Cramer
>
> davec(at)postgresintl(dot)com
> www.postgresintl.com
>
>
>>
>> Thanks again.
>>
>> Stefan
>>
>> On Wed, Apr 5, 2017 at 11:16 PM, Vladimir Gordiychuk <folyga(at)gmail(dot)com>
>> wrote:
>>
>>> Patch for fix this problem https://github.com/pgj
>>> dbc/pgjdbc/pull/801/commits/4b4a521178b5a3cab63bd586982f6191ce32c240
>>>
>>>
>>> 2017-04-05 23:58 GMT+03:00 Vladimir Gordiychuk <folyga(at)gmail(dot)com>:
>>>
>>>> The problem on pgjdbc side. The problem affect not only parallel
>>>> transaction but any transaction that use own implementation output plugin.
>>>> Logical replication for onCommit callback already send LSN end transaction,
>>>> but we still add to this LSN payload. Add payload not correct for startLSN
>>>> for calculate endLSN because logical replication can change size of message
>>>> from WAL and we will calculate not valid LSN(for example LSN for future
>>>> transaction).
>>>> I will prepare a patch for fix this problem today or tomorrow.
>>>>
>>>> 2017-04-04 22:24 GMT+03:00 Vladimir Gordiychuk <folyga(at)gmail(dot)com>:
>>>>
>>>>> Hi.
>>>>>
>>>>> Dave, thanks that invite me.
>>>>>
>>>>> Stafan, I reproduce this issue in PR https://github.com/pgjdbc/p
>>>>> gjdbc/pull/801
>>>>> Right know *lastReceiveLSN* calculates as *startLSN* *+ payloadSize*.
>>>>> We add this changes after this comment https://github.com/pgj
>>>>> dbc/pgjdbc/pull/550#issuecomment-254427102
>>>>> I think we should analyze it comment again. Because if use
>>>>> *lastReceiveLsn* as *startLSN* without add payloadSize your testcase
>>>>> works as you want.
>>>>>
>>>>>
>>>>> 2017-04-04 15:34 GMT+03:00 Dave Cramer <pg(at)fastcrypt(dot)com>:
>>>>>
>>>>>> + Vladimir as he is much more familiar with the replication protocol,
>>>>>> etc
>>>>>>
>>>>>>
>>>>>>
>>>>>> Dave Cramer
>>>>>>
>>>>>> davec(at)postgresintl(dot)com
>>>>>> www.postgresintl.com
>>>>>>
>>>>>> On 4 April 2017 at 08:11, Stefan Smith <stefan(dot)smith(dot)work(at)gmail(dot)com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have been evaluating the logical replication feature in the JDBC
>>>>>>> driver v42.0.0, against postgres v9.6. One scenario I'm testing is
>>>>>>> streaming a high volume of transactions, where each transaction contains
>>>>>>> multiple INSERTs.
>>>>>>>
>>>>>>> Sometimes, two transaction COMMIT messages follow each other, one
>>>>>>> after the other, in the DB transaction log, and so the transactions arrive
>>>>>>> one after the other in the logical replication stream. This is expected
>>>>>>> behaviour.
>>>>>>>
>>>>>>> I apply and flush the LastReceivedLSN and force an update to the
>>>>>>> server after every COMMIT message.
>>>>>>>
>>>>>>> To stress-test the recovery behaviour, I'm also disconnecting and
>>>>>>> reconnecting on every fourth INSERT received.
>>>>>>>
>>>>>>> In most cases I see the following expected behaviour:
>>>>>>>
>>>>>>> The WAL might have (with made-up LSNs for illustrative purposes):
>>>>>>>
>>>>>>> LSN=10 BEGIN (TXN 1)
>>>>>>> LSN=20 BEGIN (TXN 2)
>>>>>>> LSN=30 INSERT (TXN 1)
>>>>>>> LSN=40 INSERT (TXN 2)
>>>>>>> LSN=50 INSERT (TXN 1)
>>>>>>> LSN=60 INSERT (TXN 2)
>>>>>>> LSN=70 INSERT (TXN 1)
>>>>>>> LSN=80 INSERT (TXN 2)
>>>>>>> LSN=90 COMMIT (TXN 1)
>>>>>>> LSN=100 COMMIT (TXN 2)
>>>>>>>
>>>>>>> And so the stream receives:
>>>>>>>
>>>>>>> BEGIN (TXN 1) LastReceivedLSN=19
>>>>>>> INSERT (TXN 1) LastReceivedLSN=39
>>>>>>> INSERT (TXN 1) LastReceivedLSN=59
>>>>>>> INSERT (TXN 1) LastReceivedLSN=79
>>>>>>> COMMIT (TXN 1) LastReceivedLSN=99
>>>>>>> <JDBC APPLY & FLUSH 99, FORCE UPDATE>
>>>>>>> BEGIN (TXN 2) LastReceivedLSN=29
>>>>>>> INSERT (TXN 2) LastReceivedLSN=49
>>>>>>> <JDBC RECONNECT>
>>>>>>> BEGIN (TXN 2) LastReceivedLSN=29
>>>>>>> INSERT (TXN 2) LastReceivedLSN=49
>>>>>>> INSERT (TXN 2) LastReceivedLSN=69
>>>>>>> INSERT (TXN 2) LastReceivedLSN=89
>>>>>>> COMMIT (TXN 2) LastReceivedLSN=109
>>>>>>>
>>>>>>> The above behaviour makes sense since the replication slot's
>>>>>>> confirmed_flush_lsn=99 upon reconnect.
>>>>>>>
>>>>>>> My issue: occasionally after reconnecting, I observe that the
>>>>>>> INSERTs for TXN 2 are not resent and instead the stream moves on to TXN 3.
>>>>>>>
>>>>>>> With the same WAL as above, the stream looks like:
>>>>>>>
>>>>>>> BEGIN (TXN 1) LastReceivedLSN=19
>>>>>>> INSERT (TXN 1) LastReceivedLSN=39
>>>>>>> INSERT (TXN 1) LastReceivedLSN=59
>>>>>>> INSERT (TXN 1) LastReceivedLSN=79
>>>>>>> COMMIT (TXN 1) LastReceivedLSN=109 <-- This LSN is wrong!
>>>>>>> <JDBC APPLY & FLUSH 109, FORCE UPDATE>
>>>>>>> BEGIN (TXN 2) LastReceivedLSN=29
>>>>>>> INSERT (TXN 2) LastReceivedLSN=49
>>>>>>> <JDBC RECONNECT>
>>>>>>> BEGIN (TXN 3) ...
>>>>>>>
>>>>>>> Debugging the JDBC driver and comparing it with the WAL (via
>>>>>>> pg_xlogdump), it seems that on these occasions COMMIT (TXN 1) arrives with
>>>>>>> the correct payload but with StartLSN set to the StartLSN of COMMIT (TXN
>>>>>>> 2)! The JDBC driver computes COMMIT (TXN 1) LastReceivedLSN = COMMIT (TXN
>>>>>>> 2) StartLSN + length of COMMIT (TXN 1) payload. This causes us to
>>>>>>> unwittingly set confirmed_flush_lsn to the end WAL position of COMMIT (TXN
>>>>>>> 2) before disconnecting, meaning I don't decode TXN 2 after reconnection.
>>>>>>>
>>>>>>> Is this a known issue? Is it caused by the JDBC driver or something
>>>>>>> in the server?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Migowski | 2017-04-18 08:36:26 | Re: 42.0.0.jre7, driver logging takes 30-40% of my server's time |
Previous Message | Dave Cramer | 2017-04-17 12:50:02 | Re: Logical Replication: adjacent COMMIT messages with the wrong StartLSN |