Re: Logical Replication - "invalid ordering of speculative insertion changes"

From: "Joe Wildish" <joe(at)lateraljoin(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Logical Replication - "invalid ordering of speculative insertion changes"
Date: 2023-02-02 20:11:45
Message-ID: 8e285370-0c29-4a08-99f3-b8c0107eaf43@app.fastmail.com
Lists: pgsql-general

Just a bump on this --- perhaps the error is a bug with the DBMS?

From what I can see, "speculative insertion changes" in this context refers to INSERT..ON CONFLICT DML. I have some experience writing extensions and simple patches for the code base, but I have no developer-level knowledge of the transaction log. From digging around, it appears that the WAL has a specific record type that distinguishes INSERT..ON CONFLICT changes from plain INSERT changes, and that it is these changes that cannot be reordered when the messages are emitted to the output plugin for the logical replication slot (in this case the built-in pgoutput). Given that, I don't see how this can be user error. Does anyone know differently?
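
To illustrate the kind of ordering constraint described above, here is a toy model in Python. It is only a sketch under assumptions, not PostgreSQL's code: the change kinds, the Change class and the replay function are hypothetical names made up for this example. It assumes that a speculative insert is held back until a matching confirmation arrives, and that a confirmation with no pending speculative insert is what gets reported as invalid ordering.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional

# Hypothetical change kinds for this toy model (not PostgreSQL identifiers).
INSERT = "insert"
SPEC_INSERT = "spec_insert"    # row written speculatively by INSERT..ON CONFLICT
SPEC_CONFIRM = "spec_confirm"  # the speculative row turned out not to conflict


@dataclass
class Change:
    kind: str
    row: Optional[dict] = None


def replay(changes: Iterable[Change]) -> Iterator[Change]:
    """Yield the changes a downstream consumer would see, in order."""
    pending: Optional[Change] = None  # speculative insert awaiting confirmation
    for change in changes:
        if change.kind == SPEC_INSERT:
            # Hold the row back; a newer speculative insert replaces an older one.
            pending = change
        elif change.kind == SPEC_CONFIRM:
            if pending is None:
                # A confirmation with nothing to confirm: the stream is out of order.
                raise ValueError("invalid ordering of speculative insertion changes")
            # Confirmed: emit the held-back row as an ordinary insert.
            yield Change(INSERT, pending.row)
            pending = None
        else:
            # Assumption: any other change means a pending speculative insert
            # was abandoned, so it is dropped and never emitted.
            pending = None
            yield change
```

In this model a stream of [SPEC_INSERT, SPEC_CONFIRM] comes out as one plain insert, while a stream that begins with SPEC_CONFIRM raises the error. Nothing the client does with its DML changes which of those two streams the consumer is handed, which is why this looks to me like something other than user error.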

-Joe

On Tue, 31 Jan 2023, at 18:10, Joe Wildish wrote:
> Hello,
>
> We have a logical replication publisher (13.7) and subscriber (14.6)
> where we are seeing the following error on the subscriber. IP address
> and publication name changed, otherwise verbatim:
>
> 2023-01-31 15:24:49 UTC:x.x.x.x(56276):super(at)pubdb:[1040971]: WARNING:
> tables were not subscribed, you will have to run ALTER SUBSCRIPTION ...
> REFRESH PUBLICATION to subscribe the tables
> 2023-01-31 15:24:50 UTC::@:[1040975]: LOG: logical replication apply
> worker for subscription "pub" has started
> 2023-01-31 15:24:50 UTC::@:[1040975]: ERROR: could not receive data
> from WAL stream: ERROR: invalid ordering of speculative insertion
> changes
>
> This error occurs during the initial setup of the subscription. We
> hit REFRESH, and then immediately it goes into this error state. It
> then repeats as it is retrying from here onwards and keeps hitting the
> same error.
>
> My understanding is that the subscriber is performing some kind of
> reordering of the events contained within the WAL message. As it cannot
> then consume the message, it aborts, retries, and gets the same message
> and errors again. Looking in the source code it seems there is only
> one place where this error can be emitted --- reorderbuffer.c:2179.
> Moreover I can't tell if this is an error that I can be expected to
> recover from as a user.
>
> We see this error only sometimes. Other times, we REFRESH the
> subscription and it makes progress as one would expect.
>
> Can anyone advise on what we are doing wrong here?
>
> -Joe
