Re: Transactions involving multiple postgres foreign servers, take 2

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "m(dot)usama(at)gmail(dot)com" <m(dot)usama(at)gmail(dot)com>, "ikedamsh(at)oss(dot)nttdata(dot)com" <ikedamsh(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "sulamul(at)gmail(dot)com" <sulamul(at)gmail(dot)com>, "alvherre(at)2ndquadrant(dot)com" <alvherre(at)2ndquadrant(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "ildar(at)adjust(dot)com" <ildar(at)adjust(dot)com>, "horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "chris(dot)travers(at)adjust(dot)com" <chris(dot)travers(at)adjust(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "ishii(at)sraoss(dot)co(dot)jp" <ishii(at)sraoss(dot)co(dot)jp>
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Date: 2021-06-24 23:40:48
Message-ID: CAD21AoBtHLD73HG2mCTkdBOfhrUS8myf+XeU5HL8G+mKEow6OA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 24, 2021 at 10:11 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sat, Jun 12, 2021 at 1:25 AM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> >
> >
> >
> > (5)
> > +2. Pre-Commit phase (1st phase of two-phase commit)
> > +we record the corresponding WAL indicating that the foreign server is involved
> > +with the current transaction before doing PREPARE all foreign transactions.
> > +Thus, in case we lose connectivity to the foreign server or crash ourselves,
> > +we will remember that we might have prepared transaction on the foreign
> > +server, and try to resolve it when connectivity is restored or after crash
> > +recovery.
> >
> > So currently FdwXactInsertEntry() calls XLogInsert() and XLogFlush() for
> > XLOG_FDWXACT_INSERT WAL record. Additionally we should also wait there
> > for WAL record to be replicated to the standby if sync replication is enabled?
> > Otherwise, when the failover happens, new primary (past-standby)
> > might not have enough XLOG_FDWXACT_INSERT WAL records and
> > might fail to find some in-doubt foreign transactions.
>
> But even if we wait for the record to be replicated, this problem
> isn't completely resolved, right?

Ah, I misunderstood the order of writing WAL records and preparing
foreign transactions. You're right. Combined with your suggestion below,
perhaps we need to write all WAL records, call XLogFlush() once, wait for
those records to be replicated, and only then prepare all foreign
transactions. Even if the server crashes while preparing a foreign
transaction and a failover happens, the new primary has all the foreign
transaction entries. Some of them might not actually be prepared on
the foreign servers, but that should not be a problem.
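
To make the ordering concrete, here is a minimal sketch as a toy Python model, not PostgreSQL code. Every name in it (ToyCoordinator, insert_entry, wait_for_standby, prepare_all, and the server names) is hypothetical and only stands in for the steps discussed above: insert all entries, flush once, wait for replication, then prepare. The crash_after parameter simulates a crash partway through the prepare loop.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCoordinator:
    """Toy stand-in for the coordinator; none of these are PostgreSQL APIs."""
    wal: list = field(default_factory=list)       # entries inserted locally
    flushed_upto: int = 0                         # entries durable on the coordinator
    replicated_upto: int = 0                      # entries confirmed by the standby
    flush_calls: int = 0                          # how many flushes were issued
    prepared: list = field(default_factory=list)  # servers actually prepared

    def insert_entry(self, server):
        # Stands in for inserting one foreign transaction entry record.
        self.wal.append(server)

    def flush(self):
        # One flush makes every entry inserted so far durable.
        self.flush_calls += 1
        self.flushed_upto = len(self.wal)

    def wait_for_standby(self):
        # Stands in for waiting on synchronous replication.
        self.replicated_upto = self.flushed_upto

    def standby_entries(self):
        # What a promoted standby would know about after failover.
        return self.wal[:self.replicated_upto]

    def prepare_on(self, server):
        # The entry must already be on the standby before we prepare remotely.
        assert server in self.standby_entries()
        self.prepared.append(server)


def prepare_all(coord, servers, crash_after=None):
    """Insert all entries, flush once, wait for replication, then prepare."""
    for server in servers:
        coord.insert_entry(server)
    coord.flush()
    coord.wait_for_standby()
    for n, server in enumerate(servers):
        if crash_after is not None and n == crash_after:
            return False  # simulated crash in the middle of preparing
        coord.prepare_on(server)
    return True


coord = ToyCoordinator()
completed = prepare_all(coord, ["fs1", "fs2", "fs3"], crash_after=1)
print(completed, coord.flush_calls, coord.prepared, coord.standby_entries())
# prints: False 1 ['fs1'] ['fs1', 'fs2', 'fs3']
```

In this model the standby knows about fs2 and fs3 even though they were never prepared, which is the "some of them might not actually be prepared" case, while no server can be prepared without its entry having been replicated first.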

> > (6)
> > XLogFlush() is called for each foreign transaction. So if there are many
> > foreign transactions, XLogFlush() is called too frequently. Which might
> > cause unnecessary performance overhead? Instead, for example,
> > we should call XLogFlush() only once in FdwXactPrepareForeignTransactions()
> > after inserting all WAL records for all foreign transactions?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
