BUG #13451: Logical decoding / replication - WAL rows are streamed more than once

From: Hillel(dot)Eilat(at)attunity(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #13451: Logical decoding / replication - WAL rows are streamed more than once
Date: 2015-06-17 14:09:10
Message-ID: 20150617140910.2730.41532@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 13451
Logged by: Hillel Eilat
Email address: Hillel(dot)Eilat(at)attunity(dot)com
PostgreSQL version: 9.4.2
Operating system: Windows 7
Description:

Attunity R & D 17-Jun-2015

PostgreSQL Logical decoding / replication - WAL rows are streamed more than
once?

My software uses the "Logical Decoding" capability to harvest changes made
to a PostgreSQL database and forward them to an application processing
unit.
The application can provide a START_LSN, which is typically the last
processed / committed LSN recorded on a previous run.
If no START_LSN is given, the value returned by the IDENTIFY_SYSTEM call
is used instead.
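
For illustration only, the START_LSN selection described above could be
sketched as follows. This is not the actual application code: parse_lsn,
format_lsn and choose_start_lsn are hypothetical helper names. The only
PostgreSQL detail relied on is the textual LSN format, two hexadecimal
numbers separated by a slash (the high and low 32 bits of a 64-bit
position).

```python
from typing import Optional


def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/16B3748' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back to the 'high/low' hexadecimal form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def choose_start_lsn(last_committed: Optional[str],
                     identify_system_lsn: str) -> str:
    """Pick the START_LSN for a run (hypothetical helper).

    Prefer the LSN recorded by a previous run; fall back to the
    position reported by IDENTIFY_SYSTEM when nothing was recorded.
    """
    if last_committed is not None:
        return last_committed
    return identify_system_lsn
```

LSNs are compared as integers here because comparing the text form
directly would order "0/A0" after "0/100".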

I face problems in "resume" scenarios that start in the middle of a
transaction.

These scenarios consist of 2 consecutive runs:
A. First - start at the BEGIN LSN, stream the WAL up to an INSERT LSN of
choice, and stop.
B. Next - resume at the last streamed LSN obtained from A, and stream the
WAL up to the closing COMMIT LSN.

Problems:
1. Providing a START_LSN that points to a WAL row in the middle of a
transaction actually causes WAL streaming to start at the LSN of that
transaction's BEGIN row. Is this a bug, or is it intended behavior?

2. When starting in the middle of a large transaction (say, 500000 rows),
rows with LSNs prior to the specified START_LSN occasionally show up as
well. These unsolicited rows were already read on a previous run, so
target consistency is violated. This looks like a bug to me.

3. For smaller transactions (a simulated transaction of ~1000 rows), this
misbehavior is not observed. At smaller volumes, "resume" scenarios seem
to work fine.

4. Does this mean that the only "resume-able" LSNs are those of BEGIN
events?
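
To make the consistency concern in items 1 and 2 concrete, a consumer
could in principle guard against replayed rows by discarding any change
whose LSN is not greater than the last one it already processed. The
sketch below is only an illustration of that idea, not confirmed
PostgreSQL behavior or a recommended fix. The names lsn_to_int and
skip_already_processed are hypothetical, changes are assumed to arrive as
(lsn, payload) pairs, and the filter assumes a single transaction as in
the scenario above, since rows of interleaved concurrent transactions
need not arrive in increasing LSN order.

```python
from typing import Iterable, Iterator, Tuple

Change = Tuple[str, str]  # (textual LSN, decoded payload), an assumed shape


def lsn_to_int(text: str) -> int:
    """Convert a textual LSN such as '0/16B3748' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def skip_already_processed(changes: Iterable[Change],
                           last_processed: str) -> Iterator[Change]:
    """Yield only changes strictly after the last processed LSN."""
    threshold = lsn_to_int(last_processed)
    for lsn, payload in changes:
        if lsn_to_int(lsn) > threshold:
            yield lsn, payload


# Run B resumes at 0/110, but the stream restarts from BEGIN.
replayed = [
    ("0/100", "BEGIN"),
    ("0/110", "INSERT 1"),
    ("0/120", "INSERT 2"),
    ("0/130", "COMMIT"),
]
remaining = list(skip_already_processed(replayed, "0/110"))
```

With the sample data above, only the rows at 0/120 and 0/130 remain after
filtering.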

This report was obtained when working with version 9.4.2 on a Windows
platform:

select version();
"PostgreSQL 9.4.2, compiled by Visual C++ build 1800, 64-bit"

Did I miss something?

Kindest regards.

Hillel Eilat
Attunity R & D
