Re: [HACKERS] Transactions involving multiple postgres foreign servers

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Transactions involving multiple postgres foreign servers
Date: 2017-11-28 03:31:09
Message-ID: CAFjFpRdyHM77eJBioqOAL6JOS91hNbCaibToJT5qMCYRpxc6iA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 28, 2017 at 3:04 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
>> Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>
>>> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat
>>> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>> > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> >>
>>> >> Because I don't want to break the current user semantics. That is,
>>> >> currently it's guaranteed that subsequent reads can see the
>>> >> committed result of previous writes even if the previous transactions
>>> >> were distributed transactions. And it's ensured by the writer side. If we
>>> >> can make the reader side ensure it, the backend process doesn't need to
>>> >> wait for the resolver process.
>>> >>
>>> >> The waiting backend processes are released by the resolver process after
>>> >> the resolver process has tried to resolve foreign transactions. Even if
>>> >> the resolver process failed either to connect to the foreign server or to
>>> >> resolve the foreign transaction, the backend process will be released and
>>> >> the foreign transactions are left as dangling transactions in that
>>> >> case, which are processed later. Also, if the resolver process takes a long
>>> >> time to resolve foreign transactions for whatever reason, the user can
>>> >> cancel it with Ctrl-C at any time.
>>> >>
>>> >
>>> > So, there's no guarantee that the next command issued from the
>>> > connection *will* see the committed data, since the foreign
>>> > transaction might not have committed because of a network glitch
>>> > (say). If we go this route of making backends wait for the resolver to
>>> > resolve the foreign transaction, we will have to add complexity to make
>>> > sure that the waiting backends are woken up in problematic events like
>>> > crash of the resolver process OR if the resolver process hangs in a
>>> > connection to a foreign server etc. I am not sure that the complexity
>>> > is worth the half-guarantee.
>>> >
>>>
>>> Hmm, maybe I was wrong. I now think that the waiting backends can be
>>> woken up only in the following cases:
>>> - The resolver process succeeded in resolving all foreign transactions.
>>> - The user cancelled (e.g. Ctrl-C).
>>> - The resolver process failed to resolve a foreign transaction because
>>> there is no such prepared transaction on the foreign server.
>>>
>>> In other cases the resolver process should not release the waiters.
>>
>> I'm not sure I see consensus here. What Ashutosh says seems to be: "Special
>> effort is needed to ensure that the backend does not keep waiting if the
>> resolver can't finish its work in the foreseeable future. But this effort is
>> not worthwhile because by waking the backend up you might prevent the next
>> transaction from seeing the changes the previous one tried to make."
>>
>> On the other hand, your last comments indicate that you try to be even more
>> stringent in letting the backend wait. However even this stringent approach
>> does not guarantee that the next transaction will see the data changes made by
>> the previous one.
>>
>
> What I'd like to guarantee is that a subsequent read can see the
> committed result of previous writes if the transaction involving
> multiple foreign servers is committed without cancellation by the user. In
> other words, the backend should not be woken up and the resolver
> should continue to retry resolution at certain intervals even if the
> resolver fails to connect to the foreign server or fails to resolve the
> transaction. This is similar to what synchronous replication guarantees
> today. Keeping these semantics is very important for users. Note that
> reading a consistent result from concurrent reads is a separate problem.

The question I have is how we would deal with a foreign server that is
unavailable for a long time because of a crash, a long network outage,
etc. For example, the foreign server crashed or got disconnected after
PREPARE but before COMMIT/ROLLBACK was issued. The backend would remain
blocked for a long time without the user having any idea of what's
going on. Maybe we should add a timeout.
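
To make the timeout idea concrete, here is a minimal sketch in Python
(not PostgreSQL code). It only illustrates a backend that waits for the
resolver for a bounded time and then gives up, leaving the prepared
foreign transaction in doubt. ForeignXactWaiter, mark_resolved,
wait_for_resolution, and commit_distributed are hypothetical names
invented for this illustration; nothing like them exists in the patch.

```python
import threading


class ForeignXactWaiter:
    """Hypothetical stand-in for a backend waiting on the resolver."""

    def __init__(self):
        self._resolved = threading.Event()

    def mark_resolved(self):
        # Assumed to be called by the resolver once the foreign
        # transaction has been committed or rolled back.
        self._resolved.set()

    def wait_for_resolution(self, timeout_s):
        # Returns True if resolved in time, False if the timeout elapsed.
        return self._resolved.wait(timeout=timeout_s)


def commit_distributed(waiter, timeout_s):
    """Wait for resolution, but never longer than timeout_s seconds."""
    if waiter.wait_for_resolution(timeout_s):
        return "committed"
    # Timed out: the foreign transaction stays prepared (in doubt) and
    # is left for the resolver to retry later.
    return "in-doubt"
```

With something like this the user gets control back after the timeout
together with a clear status, instead of hanging indefinitely.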

>
> The read result including foreign servers can be inconsistent if
> such a transaction is cancelled or the coordinator server crashes during
> two-phase commit processing. That is, if there is an in-doubt transaction
> the read result can be inconsistent, even for subsequent reads. But I
> think this behaviour can be accepted by users. For the resolution of
> in-doubt transactions, the resolver process will try to resolve such
> transactions after the coordinator server has recovered. On the other
> hand, to let subsequent reads get a consistent result in such a
> situation, we can, for example, disallow backends from sending SQL
> to a foreign server if a foreign transaction on that server
> remains unresolved.

+1 for the last sentence. If we do that, we don't need the backend to
be blocked by the resolver, since a subsequent read accessing that
foreign server would get an error rather than inconsistent data.
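
A rough sketch of that check, in Python rather than PostgreSQL code:
before sending a query to a foreign server, look up whether any prepared
foreign transaction on that server is still unresolved, and raise an
error if so. ForeignXactRegistry, UnresolvedForeignXactError, and the
method names are hypothetical and exist only for this illustration.

```python
class UnresolvedForeignXactError(Exception):
    """Raised when a server still has an unresolved foreign transaction."""


class ForeignXactRegistry:
    """Hypothetical bookkeeping of unresolved foreign transactions."""

    def __init__(self):
        # server name -> set of prepared transaction identifiers
        self._unresolved = {}

    def prepared(self, server, xact_id):
        self._unresolved.setdefault(server, set()).add(xact_id)

    def resolved(self, server, xact_id):
        pending = self._unresolved.get(server, set())
        pending.discard(xact_id)
        if not pending:
            self._unresolved.pop(server, None)

    def check_can_query(self, server):
        # Refuse access instead of risking an inconsistent read.
        pending = self._unresolved.get(server)
        if pending:
            raise UnresolvedForeignXactError(
                f"server {server!r} has unresolved foreign transactions: "
                f"{sorted(pending)}"
            )
```

The committing backend returns immediately; only later statements that
touch the affected server fail until the resolver has cleaned up.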

>
> For concurrent reads, reading an inconsistent result can
> happen even without an in-doubt transaction because we can read data on a
> foreign server between PREPARE and COMMIT PREPARED while other foreign
> servers have already committed. I think we should deal with this problem
> with another feature on the reader side, for example, atomic visibility.
> If we have an atomic visibility feature, we can also solve the above problem.
>
+1.
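
For anyone following along, the window can be shown with a toy model in
Python (not PostgreSQL code). Two in-memory "servers" take part in a
transfer, and a reader runs after the first COMMIT PREPARED but before
the second. ForeignServer and transfer_with_observer are hypothetical
names made up for this sketch.

```python
class ForeignServer:
    """Toy model: one committed value plus at most one prepared change."""

    def __init__(self, balance):
        self.committed = balance
        self.prepared = None

    def prepare(self, new_balance):
        # Stands in for PREPARE TRANSACTION: not yet visible to readers.
        self.prepared = new_balance

    def commit_prepared(self):
        # Stands in for COMMIT PREPARED: the change becomes visible.
        self.committed = self.prepared
        self.prepared = None

    def read(self):
        return self.committed


def transfer_with_observer(a, b, amount, observe):
    """Move amount from a to b; call observe() between the two commits."""
    a.prepare(a.committed - amount)
    b.prepare(b.committed + amount)
    a.commit_prepared()
    seen = observe()  # concurrent reader: a has committed, b has not
    b.commit_prepared()
    return seen
```

Starting from 100 on each server and moving 30, a reader in the window
sees a total of 170 rather than 200, although no transaction is in doubt.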

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
