RE: Transactions involving multiple postgres foreign servers, take 2

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Fujii Masao' <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Muhammad Usama <m(dot)usama(at)gmail(dot)com>, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, amul sul <sulamul(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ildar Musin <ildar(at)adjust(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Chris Travers <chris(dot)travers(at)adjust(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
Subject: RE: Transactions involving multiple postgres foreign servers, take 2
Date: 2020-09-11 08:15:08
Message-ID: TYAPR01MB2990DAB5AA66CD5FEC24D2DAFE240@TYAPR01MB2990.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
> Originally start(), commit() and rollback() are supported as FDW interfaces.
> As far as I and Sawada-san discussed this upthread, to support MySQL,
> another type of start() would be necessary to issue "XA START id" command.
> end() might be also necessary to issue "XA END id", but that command can be
> issued via prepare() together with "XA PREPARE id".

Yeah, I think we can call xa_end and xa_prepare in the FDW's prepare function.

The issue is when to call xa_start, which requires XID as an argument. We don't want to call it in transactions that access only one node...?

> With his patch, prepare() is supported. What other interfaces need to be
> supported per XA/JTA?
>
> I'm not familiar with XA/JTA and XA transaction interfaces on other major
> DBMS. So I'd like to know what other interfaces are necessary additionally?

I think xa_start, xa_end, xa_prepare, xa_commit, xa_rollback, and xa_recover are sufficient. The XA specification is here:

https://pubs.opengroup.org/onlinepubs/009680699/toc.pdf

You can see the function reference in Chapter 5, and the concept in Chapter 3. Chapter 6 was probably showing the state transition (function call sequence.)

> IMO Sawada-san's version of 2PC is less performant, but it's because his
> patch provides more functionality. For example, with his patch, WAL is written
> to automatically complete the unresolve foreign transactions in the case of
> failure. OTOH, Alexey patch introduces no new WAL for 2PC.
> Of course, generating more WAL would cause more overhead.
> But if we need automatic resolution feature, it's inevitable to introduce new
> WAL whichever the patch we choose.

Please do not get me wrong. I know Sawada-san is trying to ensure durability. I just wanted to know what each patch does in how much cost in terms of disk and network I/Os, and if one patch can take something from another for less cost. I'm simply guessing (without having read the code yet) that each transaction basically does:

- two round trips (prepare, commit) to each remote node
- two WAL writes (prepare, commit) on the local node and each remote node
- one write for two-phase state file on each remote node
- one write to record participants on the local node

It felt hard to think about the algorithm efficiency from the source code. As you may have seen, the DBMS textbook and/or papers describe disk and network I/Os to evaluate algorithms. I thought such information would be useful before going deeper into the source code. Maybe such things can be written in the following Sawada-san's wiki or README in the end.

Atomic Commit of Distributed Transactions
https://wiki.postgresql.org/wiki/Atomic_Commit_of_Distributed_Transactions

Regards
Takayuki Tsunakawa

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-09-11 08:36:19 Re: Implement UNLOGGED clause for COPY FROM
Previous Message Fujii Masao 2020-09-11 08:13:37 Re: New statistics for tuning WAL buffer size