Re: Transactions involving multiple postgres foreign servers

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transactions involving multiple postgres foreign servers
Date: 2015-01-08 18:00:56
Message-ID: 1355046515.3912543.1420740056866.JavaMail.yahoo@jws10049.mail.ne1.yahoo.com
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jan 8, 2015 at 10:19 AM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> Andres is talking in my other ear suggesting that we ought to
>>> reuse the 2PC infrastructure to do all this.
>>
>> If you mean that the primary transaction and all FDWs in the
>> transaction must use 2PC, that is what I was saying, although
>> apparently not clearly enough. All nodes *including the local one*
>> must be prepared and committed with data about the nodes saved
>> safely off somewhere that it can be read in the event of a failure
>> of any of the nodes *including the local one*. Without that, I see
>> this whole approach as a train wreck just waiting to happen.
>
> Clearly, all the nodes other than the local one need to use 2PC. I am
> unconvinced that the local node must write a 2PC state file only to
> turn around and remove it again almost immediately thereafter.

The key point is that the distributed transaction data must be
flagged as needing to commit rather than roll back between the
prepare phase and the final commit. If you try to avoid the
PREPARE, flagging, COMMIT PREPARED sequence by building the
flagging of the distributed transaction metadata into the COMMIT
process, you still have the problem of what to do on crash
recovery. You really need to use 2PC to keep that clean, I think.
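The PREPARE / flag / COMMIT PREPARED sequence described here can be sketched in a few lines. This is an illustrative model only, not PostgreSQL internals: the `Participant` class and the JSON decision log are invented for the example, standing in for remote servers reached via FDW and for whatever durable store the real implementation would use.

```python
import json
import os

class Participant:
    """Stand-in for one node in the distributed transaction (local or FDW)."""
    def __init__(self, name):
        self.name = name
        self.state = "active"
    def prepare(self):
        self.state = "prepared"        # durable on the real server
    def commit_prepared(self):
        assert self.state == "prepared"
        self.state = "committed"

def commit_distributed(gid, participants, log_path):
    # Phase 1: every node, *including the local one*, must PREPARE.
    for p in participants:
        p.prepare()
    # Flag the commit decision durably *before* any node commits.
    # After a crash, recovery reads this record and knows the outcome
    # is commit, not rollback -- this is the step that cannot be
    # folded invisibly into a plain COMMIT.
    with open(log_path, "w") as f:
        json.dump({"gid": gid, "decision": "commit",
                   "nodes": [p.name for p in participants]}, f)
        f.flush()
        os.fsync(f.fileno())
    # Phase 2: COMMIT PREPARED everywhere; safe to re-drive on retry.
    for p in participants:
        p.commit_prepared()
    os.remove(log_path)                # decision fully carried out
```

The ordering is the whole point: the decision record hits stable storage after every branch is prepared and before any branch commits, so a crash at any instant leaves recovery with an unambiguous answer.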

>> I'm not really clear on the mechanism that is being proposed for
>> doing this, but one way would be to have the PREPARE of the local
>> transaction be requested explicitly and to have that cause all FDWs
>> participating in the transaction to also be prepared. (That might
>> be what Andres meant; I don't know.)
>
> We want this to be client-transparent, so that the client just says
> COMMIT and everything Just Works.

What about the case where one or more nodes don't support 2PC?
Do we silently make the choice, without the client really knowing?
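If the commit decision is flagged durably between PREPARE and COMMIT PREPARED, the crash-recovery side becomes mechanical: re-drive the commit on any still-prepared branch when a decision record exists, roll the branch back when none does. A minimal sketch under the same invented conventions (a JSON decision log and simple per-node state strings, neither of which is the real on-disk format):

```python
import json
import os

def resolve_in_doubt(log_path, node_states):
    """Resolve prepared-but-undecided branches after a restart.

    node_states maps node name -> "prepared" | "committed" | "aborted".
    """
    if os.path.exists(log_path):
        # A durable commit decision survived the crash: finish phase 2
        # on every branch still in doubt. Re-driving is idempotent.
        with open(log_path) as f:
            record = json.load(f)
        assert record["decision"] == "commit"
        for name in record["nodes"]:
            if node_states.get(name) == "prepared":
                node_states[name] = "committed"
        os.remove(log_path)
    else:
        # No decision on disk: phase 2 never began, so every surviving
        # prepared branch must be rolled back.
        for name, state in node_states.items():
            if state == "prepared":
                node_states[name] = "aborted"
```

This is also why the client-transparent COMMIT can work at all: the client never sees the intermediate states, but the coordinator's recovery code can always reconstruct the outcome from the decision record alone.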

>> That doesn't strike me as the
>> only possible mechanism to drive this, but it might well be the
>> simplest and cleanest. The trickiest bit might be to find a good
>> way to persist the distributed transaction information in a way
>> that survives the failure of the main transaction -- or even the
>> abrupt loss of the machine it's running on.
>
> I'd be willing to punt on surviving a loss of the entire machine. But
> I'd like to be able to survive an abrupt reboot.

As long as people are aware that there is an urgent need to find
and fix all data stores to which clusters on the failed machine
were connected via FDW when there is a hard machine failure, I
guess it is OK. In essence we just document it and declare it to
be somebody else's problem. In general I would expect a
distributed transaction manager to behave well in the face of any
single-machine failure, but if there is one aspect of a
full-featured distributed transaction manager we could give up, I
guess that would be it.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
