From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1 |
Date: | 2013-11-20 23:41:41 |
Message-ID: | 20131120234141.GI18801@awork2.anarazel.de |
Lists: | pgsql-hackers |
On 2013-11-20 10:48:41 -0800, Josh Berkus wrote:
> > Presumably a replica created while all traffic was halted on the master
> > would be clean, correct? This bug can only be triggered if there's
> > heavy write load on the master, right?
Kinda. It's unfortunately necessary to understand how HS works to some
degree:
Every time a server is (re-)started with a recovery.conf present and
hot_standby=on (be it streaming, archive based replication or PITR) the
Hot Standby code is used.
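
A minimal sketch of that condition, assuming a 9.x-style data directory where both postgresql.conf and recovery.conf live in the data directory. The helper names `read_setting` and `hot_standby_code_used` are hypothetical, and the parsing is deliberately simplified: it ignores include files and command-line overrides, which a real server would honour.

```python
import os
import re

# Boolean spellings treated as "on" in this sketch (simplified).
_TRUE_VALUES = {"on", "true", "yes", "1"}


def read_setting(conf_text, name):
    """Return the last value assigned to `name` in postgresql.conf-style text, or None."""
    value = None
    for line in conf_text.splitlines():
        # Drop trailing comments; this naive split would mishandle '#' inside quotes.
        line = line.split("#", 1)[0].strip()
        m = re.match(r"^([A-Za-z_][A-Za-z0-9_.]*)\s*=?\s*(.*)$", line)
        if m and m.group(1).lower() == name:
            # Later assignments override earlier ones.
            value = m.group(2).strip().strip("'").lower()
    return value


def hot_standby_code_used(data_dir):
    """True if a startup from `data_dir` would go through the Hot Standby code:
    recovery.conf is present and hot_standby is switched on."""
    has_recovery_conf = os.path.exists(os.path.join(data_dir, "recovery.conf"))
    conf_path = os.path.join(data_dir, "postgresql.conf")
    if not os.path.exists(conf_path):
        return False
    with open(conf_path) as f:
        hot_standby = read_setting(f.read(), "hot_standby")
    return has_recovery_conf and hot_standby in _TRUE_VALUES
```

Note that the check does not care whether the recovery.conf describes streaming, archive based replication or PITR; only its presence matters here.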
(Crash|Replication)-Recovery starts by reading the last checkpoint (from
pg_control or, if present, backup_label) and then replays WAL from the
'redo' point included in the checkpoint. The bug then occurs the
first (or, in some cases, the second) time it replays an 'xl_running_xacts'
record. That's used to reconstruct information needed to allow queries.
Every time the server in HS mode allows connections ("consistent recovery state
reached at ..." and "database system is ready to accept read only
connections" in the log), the bug can be triggered. If there weren't too
many transactions at that point, the problem won't occur until the
standby is restarted.
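
To see how often a standby has gone through that point, one rough approach is to scan its server log for the two messages quoted above. The sketch below assumes the log is available as plain text lines; the function name `find_hs_activations` and the sample log prefix in the usage note are hypothetical.

```python
# Message fragments as quoted above.
CONSISTENT_MSG = "consistent recovery state reached at"
READY_MSG = "database system is ready to accept read only connections"


def find_hs_activations(log_lines):
    """Return (line_number, message_kind) for every log line that marks the
    standby reaching consistency or starting to accept read-only connections."""
    events = []
    for number, line in enumerate(log_lines, start=1):
        if CONSISTENT_MSG in line:
            events.append((number, "consistent"))
        elif READY_MSG in line:
            events.append((number, "ready"))
    return events
```

For example, `find_hs_activations(open("postgresql.log"))` lists each point at which connections were allowed; each "ready" entry corresponds to one startup in which the bug could have been triggered, not to proof that it was.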
> If someone is doing PITR based on a snapshot taken with pg_basebackup,
> that will only trip this corruption bug if the user has hot_standby=on
> in their config *while restoring*? Or is it critical if they have
> hot_standby=on while backing up?
hot_standby=on only has an effect while starting up with a recovery.conf
present. So, if you have an old base backup around and all WAL files,
you can start from that.
Does that answer your questions?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services