From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | chris(dot)travers(at)adjust(dot)com |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Two proposed modifications to the PostgreSQL FDW |
Date: | 2018-08-21 06:38:21 |
Message-ID: | CAD21AoBdBg6PABcLc1T1sdt_Y0Aj=COx-DScw_VXf75JpqCstw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Aug 21, 2018 at 1:47 AM Chris Travers <chris(dot)travers(at)adjust(dot)com> wrote:
>
>
>
> On Mon, Aug 20, 2018 at 4:41 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>> Hi,
>>
>> On 2018-08-20 16:28:01 +0200, Chris Travers wrote:
>> > 2. TWOPHASECOMMIT=[off|on] option
>>
>> > The second major issue that I see with PostgreSQL's foreign database
>> > wrappers is the fact that there is no two phase commit which means that a
>> > single transaction writing to a group of tables has no expectation that all
>> > backends will commit or rollback together. With this patch an option would
>> > be applied to foreign tables such that they could be set to use two phase
>> > commit When this is done, the first write to each backend would register a
>> > connection with a global transaction handler and a pre-commit and commit
>> > hooks would be set up to properly process these.
>> >
>> > On recommit a per-global-transaction file would be opened in the data
>> > directory and prepare statements logged to the file. On error, we simply
>> > roll back our local transaction.
>> >
>> > On commit hook , we go through and start to commit the remote global
>> > transactions. At this point we make a best effort but track whether or not
>> > we were successfully on all. If successful on all, we delete the file. If
>> > unsuccessful we fire a background worker which re-reads the file and is
>> > responsible for cleanup. If global transactions persist, a SQL
>> > administration function will be made available to restart the cleanup
>> > process. On rollback, we do like commit but we roll back all transactions
>> > in the set. The file has enough information to determine whether we should
>> > be committing or rolling back on cleanup.
>> >
>> > I would like to push these both for Pg 12. Is there any feedback on the
>> > concepts and the problems first
>>
Thank you for the proposal. I agree that it's a major problem that
postgres_fdw (or PostgreSQL core API) doesn't support two-phase
commit.
>> There's been *substantial* work on this. You should at least read the
>> discussion & coordinate with the relevant developers.
>
>
> I suppose I should forward this to them directly also.
>
> Yeah. Also the transaction manager code for this I wrote while helping with a proof of concept for this copy-to-remote extension.
>
> There are a few big differences in implementation with the patches you mention and the disagreement was part of why I thought about going this direction.
>
> First, discussion of differences in implementation:
>
> 1. I treat the local and remote transactions symmetrically and I make no assumptions about what might happen between prepare and an attempted local commit.
> prepare goes into the precommit hook
> commit goes into the commit hook and we never raise errors if it fails (because you cannot rollback at that point). Instead a warning is raised and cleanup commences.
> rollback goes into the rollback hook and we never raise errors if it fails (because you are already rolling back).
>
> 2. By treating this as a property of a table rather than a property of a foreign data wrapper or a server, we can better prevent prepared transactions where they have not been enabled.
> This also ensures that we know whether we are guaranteeing two phase commit or not by looking at the table.
>
> 3. By making this opt-in it avoids a lot of problems with regards to incorrect configuration etc since if the DBA says "use two phase commit" and failed to enable prepared transactions on the other side...
>
> On to failure modes:
> 1. Its possible that under high load too many foreign transactions are prepared and things start rolling back instead of committing. Oh well....
> 2. In the event that a foreign server goes away between prepare and commit, we continue to retry via the background worker. The background worker is very pessimistic and checks every remote system for the named transaction.
If some participant servers fail during COMMIT PREPARED, will the
client get a "committed"? or an "aborted"? If the client gets
"aborted", that's not correct because the local changes are already
committed at that point. On the other hand, if the client get
"committed" it might break the current user semantics because the
subsequent reads may not be able to see the own committed writes. Also
since we don't want to wait for COMMIT PREPARED to complete we need to
consider that users could cancel the query anytime. To not break the
current semantics we cannot raise error during 2nd phase of two-phase
commit but it's not realistic because even the palloc() can raise an
error.
The design the patch chose is making backends do only PREPARE and wait
for the background worker to complete COMMIT PREPARED. In this design
the clients get a "committed" only either when successful in commit on
all participants or when they cancel the query explicitly. In other
words, the client will wait for completion of 2nd phase of two-phase
commit forever unless it cancels.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Haribabu Kommi | 2018-08-21 06:55:47 | Re: Pluggable Storage - Andres's take |
Previous Message | Michael Paquier | 2018-08-21 06:22:50 | Re: NLS handling fixes. |