Re: Transactions involving multiple postgres foreign servers

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Vinayak Pokale <pokale_vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transactions involving multiple postgres foreign servers
Date: 2017-09-28 03:15:20
Message-ID: CAD21AoAPi3XsJCZW3Fb8BTG2p7dgCoQH0k9SBeFF8uz3R9j4Zg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 27, 2017 at 4:05 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> On Wed, Sep 27, 2017 at 12:11 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Tue, Sep 26, 2017 at 9:50 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Tue, Sep 26, 2017 at 5:06 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>>> Based on the review comment from Robert, I'm planning to do the big
>>>> change to the architecture of this patch so that a backend process
>>>> work together with a dedicated background worker that is responsible
>>>> for resolving the foreign transactions. For the usage of this feature,
>>>> it will be almost the same as what this patch has been doing except
>>>> for adding a new GUC paramter that controls the number of resovler
>>>> process launch. That is, we can have multiple resolver process to keep
>>>> latency down.
>>>
>>> Multiple resolver processes is useful but gets a bit complicated. For
>>> example, if process 1 has a connection open to foreign server A and
>>> process 2 does not, and a request arrives that needs to be handled on
>>> foreign server A, what happens? If process 1 is already busy doing
>>> something else, probably we want process 2 to try to open a new
>>> connection to foreign server A and handle the request. But if process
>>> 1 and 2 are both idle, ideally we'd like 1 to get that request rather
>>> than 2. That seems a bit difficult to get working though. Maybe we
>>> should just ignore such considerations in the first version.
>>
>> I understood. I keep it simple in the first version.
>
> While a resolver process is useful for resolving transaction later, it
> seems performance effective to try to resolve the prepared foreign
> transaction, in post-commit phase, in the same backend which prepared
> those for two reasons 1. the backend already has a connection to that
> foreign server 2. it has just run some commands to completion on that
> foreign server, so it's highly likely that a COMMIT PREPARED would
> succeed too. If we let a resolver process do that, we will spend time
> in 1. signalling resolver process 2. setting up a connection to the
> foreign server and 3. by the time resolver process tries to resolve
> the prepared transaction the foreign server may become unavailable,
> thus delaying the resolution.

I think that making a resolver process have connection caches to each
foreign server for a while can reduce the overhead of connection to
foreign servers. These connections will be invalidated by DDLs. Also,
most of the time we spend to commit a distributed transaction is the
interaction between the coordinator and foreign servers using
two-phase commit protocal. So I guess the time in signalling to a
resolver process would not be a big overhead.

> Said that, I agree that post-commit phase doesn't have a transaction
> of itself, and thus any catalog lookup, error reporting is not
> possible. We will need some different approach here, which may not be
> straight forward. So, we may need to delay this optimization for v2. I
> think we have discussed this before, but I don't find a mail off-hand.
>
>>
>>>> * Resovler processes
>>>> 1. Fetch PGPROC entry from the shmem queue and get its XID (say, XID-a).
>>>> 2. Get the fdw_xact_state entry from shmem hash by XID-a.
>>>> 3. Iterate fdw_xact entries using the index, and resolve the foreign
>>>> transactions.
>>>> 3-a. If even one foreign transaction failed to resolve, raise an error.
>>>> 4. Change the waiting backend state to FDWXACT_COMPLETED and release it.
>>>
>>> Comments:
>>>
>>> - Note that any error we raise here won't reach the user; this is a
>>> background process. We don't want to get into a loop where we just
>>> error out repeatedly forever -- at least not if there's any other
>>> reasonable choice.
>>
>> Thank you for the comments.
>>
>> Agreed.
>
> We should probably log an error message in the server log, so that
> DBAs are aware of such a failure. Is that something you are
> considering to do?

Yes, a resolver process logs an error message in that case.

>
>>
>>> - I suggest that we ought to track the status for each XID separately
>>> on each server rather than just track the XID status overall. That
>>> way, if transaction resolution fails on one server, we don't keep
>>> trying to reconnect to the others.
>>
>> Agreed. In the current patch we manage fdw_xact entries that track the
>> status for each XID separately on each server. I'm going to use the
>> same mechanism. The resolver process get an target XID from shmem
>> queue and get the all fdw_xact entries associated with the XID from
>> the fdw_xact array in shmem. But since the scanning the whole fdw_xact
>> entries could be slow because the number of entry of fdw_xact array
>> could be a large number (e.g, max_connections * # of foreign servers),
>> I'm considering to have a linked list of the all fdw_xact entries
>> associated with same XID, and to have a shmem hash pointing to the
>> first fdw_xact entry of the linked lists for each XID. That way, we
>> can find the target fdw_xact entries from the array in O(1).
>>
>
> If we want to do something like this, would it be useful to use a data
> structure similar to what is used for maintaining subtrasactions? Just
> a thought.

Thank you for the advise, I'll consider that. But what I want to do is
just grouping the fdw_xact entries by XID and fetching the group of
fdw_xact in O(1) so we might not need to have the group as using a
stack like that is used for maintaining subtransactions.

>
>>> - If we go to resolve a remote transaction and find that no such
>>> remote transaction exists, what should we do? I'm inclined to think
>>> that we should regard that as if we had succeeded in resolving the
>>> transaction. Certainly, if we've retried the server repeatedly, it
>>> might be that we previously succeeded in resolving the transaction but
>>> then the network connection was broken before we got the success
>>> message back from the remote server. But even if that's not the
>>> scenario, I think we should assume that the DBA or some other system
>>> resolved it and therefore we don't need to do anything further. If we
>>> assume anything else, then we just go into an infinite error loop,
>>> which isn't useful behavior. We could log a message, though (for
>>> example, LOG: unable to resolve foreign transaction ... because it
>>> does not exist).
>>
>> Agreed.
>>
>
> Yes. I think the current patch takes care of this, except probably the
> error message.
>

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-09-28 04:01:53 Re: SendRowDescriptionMessage() is slow for queries with a lot of columns
Previous Message Tom Lane 2017-09-28 02:16:19 Early-adopter report for macOS 10.13