Re: [HACKERS] Duplicated row after promote in synchronous streaming replication

From: Dang Minh Huong <kakalot49(at)gmail(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Duplicated row after promote in synchronous streaming replication
Date: 2014-03-26 22:07:01
Message-ID: 70346ADC-37F7-4453-977C-8BF884FD22E5@gmail.com
Lists: pgsql-bugs pgsql-hackers

On 2014/03/27 0:18, Thom Brown <thom(at)linux(dot)com> wrote:

>> On 26 March 2014 15:08, Dang Minh Huong <kakalot49(at)gmail(dot)com> wrote:
>> Hi all,
>>
>> I'm using PostgreSQL 9.1.10 for my HA project and have found this problem.
>>
>> I repeated the following sequence several times in my primary/standby
>> synchronous replication environment:
>>
>> 1. Update rows in a table (which has a primary key column) on the
>> active DB
>>
>> 2. Stop active DB
>>
>> 3. Promote standby DB
>>
>> 4. Check the updated table on the promoted standby (new primary) and find
>> a duplicate of the updated row (the row count has increased).
>>
>> I think it is a replication bug, but I wonder whether it has been fixed yet.
>> Can somebody help me?
>>
>> I have not yet checked the PostgreSQL source, but here are my investigation results.
>>
>> The updates made before the promotion were HOT updates (the index file was not changed).
>>
>> After the promotion I updated the duplicated row again (it reported two rows
>> updated). Checking with pg_filedump, I found the duplicated row, and only
>> one copy is related to the primary key index.
>>
>> Compared with the old active DB, after the promotion the line pointer chain
>> of the updated (duplicated) row is broken into two chains: the new one is
>> related to the primary key index and the other is not. Something like
>> this:
>>
>> Old active DB:
>> ctid(0,3)->ctid(0,6)->ctid(0,7)
>>
>> New active DB (after promote and update):
>> ctid(0,3)->ctid(0,9)
>> ctid(0,7)->ctid(0,10)
>>
>> ctid(0,10) is not related to the primary key index.
>>
>> Is something wrong with the redo log on the standby DB? Or with line pointer
>> handling in the HOT update feature?
>
> It sounds like you're hitting a bug that was introduced in that
> exact minor version, and has since been fixed:
>
> http://www.postgresql.org/docs/9.1/static/release-9-1-11.html
>

Thanks for your prompt response. I will check and upgrade if needed.

> You should update to the latest minor version, then re-base your
> standbys from the primary.
>
> --
> Thom
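
For anyone checking a promoted standby for the same symptom, here is a minimal sketch of the duplicate check from step 4. It assumes the rows were fetched as (ctid, key) pairs, for example from a query like SELECT ctid, id FROM some_table run with index scans disabled, since a lookup through the primary key index may return only one of the copies. The table name, the key value 42, and the helper name find_duplicate_keys are hypothetical; only the ctid values (0,9) and (0,10) echo the report above.

```python
from collections import defaultdict


def find_duplicate_keys(rows):
    """Return {key: [ctid, ...]} for every key stored on more than one tuple.

    rows is an iterable of (ctid, key) pairs, where ctid is a
    (block, offset) tuple. This is an illustrative helper, not part of
    PostgreSQL or any client library.
    """
    by_key = defaultdict(list)
    for ctid, key in rows:
        by_key[key].append(ctid)
    # A primary key value should map to exactly one visible tuple.
    return {key: ctids for key, ctids in by_key.items() if len(ctids) > 1}


# Hypothetical sample: key 42 is visible at two different ctids.
rows = [((0, 1), 100), ((0, 9), 42), ((0, 10), 42)]
duplicates = find_duplicate_keys(rows)
print(duplicates)
```

An empty result means every key value maps to a single visible tuple; any entry in the result is a primary key value that needs attention after upgrading and re-basing the standby.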
