Quick Links

Re: Potential data loss of 2PC files

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc:	PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Potential data loss of 2PC files
Date:	2016-12-30 13:45:34
Message-ID:	CAB7nPqSV2w9DJ+PHH+vAvkF_mJ2nPbd=_fc5pQp3wOm4owgBNA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Dec 30, 2016 at 5:20 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> As per the prologue of the function, it doesn't expect any 2PC files
> to be written out in the function i.e. between two checkpoints. Most
> of those are created and deleted between two checkpoints. Same would
> be true for recovery as well. Thus in most of the cases we shouldn't
> need to flush the two phase directory in this function whether during
> normal operation or during the recovery. So, we should avoid flushing
> repeatedly when it's not needed. I agree that serialized_xacts > 0 is
> not the right condition during recovery on standby to flush the two
> phase directory.

This is assuming that 2PC transactions are not long-lived, which is
likely true for anything doing sharding, like XC, XL or Citus (?). So
yes that's true to expect that.

> During crash recovery, 2PC files are present on the disk, which means
> the two phase directory has correct record of it. This record can not
> change. So, we shouldn't need to flush it again. If that's true
> serialized_xacts will be 0 during recovery thus serialized_xacts > 0
> condition will still hold.
>
> On a standby however we will have to flush the two phase directory as
> part of checkpoint if there were any files left behind in that
> directory. We need a different condition there.

Well, flushing the meta-data of pg_twophase is really going to be far
cheaper than the many pages done until CheckpointTwoPhase is reached.
There should really be a check on serialized_xacts for the
non-recovery code path, but considering how cheap that's going to be
compared to the rest of the restart point stuff it is not worth the
complexity of adding a counter, for example in shared memory with
XLogCtl (the counter gets reinitialized at each checkpoint,
incremented when replaying a 2PC prepare, decremented with a 2PC
commit). So to reduce the backpatch impact I would just do the fsync
if (serialized_xact > 0 || RecoveryInProgress()) and call it a day.
--
Michael

In response to

Re: Potential data loss of 2PC files at 2016-12-30 08:20:53 from Ashutosh Bapat

Responses

Re: Potential data loss of 2PC files at 2016-12-30 13:59:07 from Ashutosh Bapat

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Fabien COELHO	2016-12-30 13:45:47	Re: proposal: session server side variables
Previous Message	Petr Jelinek	2016-12-30 13:12:25	Re: multivariate statistics (v19)