From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1 |
Date: | 2013-11-22 12:22:56 |
Message-ID: | 528F4CA0.30408@vmware.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 19.11.2013 16:20, Andres Freund wrote:
> On 2013-11-18 23:15:59 +0100, Andres Freund wrote:
>> Afaics it's likely a combination/interaction of bugs and fixes between:
>> * the initial HS code
>> * 5a031a5556ff83b8a9646892715d7fef415b83c3
>> * f44eedc3f0f347a856eea8590730769125964597
>
> Yes, the combination of those is guilty.
>
> Man, this is (to a good part my) bad.
>
>> But that'd mean nobody noticed it during 9.3's beta...
>
> It's fairly hard to reproduce artificially since a) there have to be
> enough transactions starting and committing from the start of the
> checkpoint the standby is starting from to the point it does
> LogStandbySnapshot() to cross a 32768 boundary b) hint bits often save
> the game by not accessing clog at all anymore and thus not noticing the
> corruption.
> I've reproduced the issue by having an INSERT ONLY table that's never
> read from. It's helpful to disable autovacuum.
For the archive, here's what I used to reproduce this. It creates master
and a standby, and also uses an INSERT only table. To make it trigger
more easily, it helps to insert sleeps in CreateCheckpoint(), around the
LogStandbySnapshot() call.
- Heikki
Attachment | Content-Type | Size |
---|---|---|
test-hot-standby-bug.sh | application/x-sh | 1.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Fabrízio de Royes Mello | 2013-11-22 12:25:49 | Re: [PATCH] Store Extension Options |
Previous Message | Marko Kreen | 2013-11-22 11:27:40 | Re: PL/Python: domain over array support |