RE: Transactions involving multiple postgres foreign servers, take 2

From: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
To: 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com>, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "m(dot)usama(at)gmail(dot)com" <m(dot)usama(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "sulamul(at)gmail(dot)com" <sulamul(at)gmail(dot)com>, "alvherre(at)2ndquadrant(dot)com" <alvherre(at)2ndquadrant(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "ildar(at)adjust(dot)com" <ildar(at)adjust(dot)com>, "horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "chris(dot)travers(at)adjust(dot)com" <chris(dot)travers(at)adjust(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "ishii(at)sraoss(dot)co(dot)jp" <ishii(at)sraoss(dot)co(dot)jp>
Subject: RE: Transactions involving multiple postgres foreign servers, take 2
Date: 2021-06-24 12:46:47
Message-ID: TYCPR01MB64009E283C2590640412FD45EF079@TYCPR01MB6400.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Sawada-san,

I also tried to play a bit with the latest patches similar to Ikeda-san,
and with foreign 2PC parameter enabled/required.

> > >> b. about performance bottleneck (just share my simple benchmark
> > >> results)
> > >>
> > >> The resolver process can be performance bottleneck easily although
> > >> I think some users want this feature even if the performance is not so
> good.
> > >>
> > >> I tested with very simple workload in my laptop.
> > >>
> > >> The test condition is
> > >> * two remote foreign partitions and one transaction inserts an
> > >> entry in each partitions.
> > >> * local connection only. If NW latency became higher, the
> > >> performance became worse.
> > >> * pgbench with 8 clients.
> > >>
> > >> The test results is the following. The performance of 2PC is only
> > >> 10% performance of the one of without 2PC.
> > >>
> > >> * with foreign_twophase_commit = requried
> > >> -> If load with more than 10TPS, the number of unresolved foreign
> > >> -> transactions
> > >> is increasing and stop with the warning "Increase
> > >> max_prepared_foreign_transactions".
> > >
> > > What was the value of max_prepared_foreign_transactions?
> >
> > Now, I tested with 200.
> >
> > If each resolution is finished very soon, I thought it's enough
> > because 8clients x 2partitions = 16, though... But, it's difficult how
> > to know the stable values.
>
> During resolving one distributed transaction, the resolver needs both one
> round trip and fsync-ing WAL record for each foreign transaction.
> Since the client doesn’t wait for the distributed transaction to be resolved,
> the resolver process can be easily bottle-neck given there are 8 clients.
>
> If foreign transaction resolution was resolved synchronously, 16 would
> suffice.

I tested the V36 patches on my 16-core machines.
I setup two foreign servers (F1, F2) .
F1 has addressbook table.
F2 has pgbench tables (scale factor = 1).
There is also 1 coordinator (coor) server where I created user mapping to access the foreign servers.
I executed the benchmark measurement on coordinator.
My custom scripts are setup in a way that queries from coordinator
would have to access the two foreign servers.

Coordinator:
max_prepared_foreign_transactions = 200
max_foreign_transaction_resolvers = 1
foreign_twophase_commit = required

Other external servers 1 & 2 (F1 & F2):
max_prepared_transactions = 100

[select.sql]
\set int random(1, 100000)
BEGIN;
SELECT ad.name, ad.age, ac.abalance
FROM addressbook ad, pgbench_accounts ac
WHERE ad.id = :int AND ad.id = ac.aid;
COMMIT;

I then executed:
pgbench -r -c 2 -j 2 -T 60 -f select.sql coor

While there were no problems with 1-2 clients, I started having problems
when running the benchmark with more than 3 clients.

pgbench -r -c 4 -j 4 -T 60 -f select.sql coor

I got the following error on coordinator:

[95396] ERROR: could not prepare transaction on server F2 with ID fx_151455979_1216200_16422
[95396] STATEMENT: COMMIT;
WARNING: there is no transaction in progress
pgbench: error: client 1 script 0 aborted in command 3 query 0: ERROR: could not prepare transaction on server F2 with ID fx_151455979_1216200_16422

Here's the log on foreign server 2 <F2> matching the above error:
<F2> LOG: statement: PREPARE TRANSACTION 'fx_151455979_1216200_16422'
<F2> ERROR: maximum number of prepared transactions reached
<F2> HINT: Increase max_prepared_transactions (currently 100).
<F2> STATEMENT: PREPARE TRANSACTION 'fx_151455979_1216200_16422'

So I increased the max_prepared_transactions of <F1> and <F2> from 100 to 200.
Then I got the error:

[146926] ERROR: maximum number of foreign transactions reached
[146926] HINT: Increase max_prepared_foreign_transactions: "200".

So I increased the max_prepared_foreign_transactions to "300",
and got the same error of need to increase the max_prepared_transactions of foreign servers.

I just can't find the right tuning values for this.
It seems that we always run out of memory in FdwXactState insert_fdwxact
with multiple concurrent connections during PREPARE TRANSACTION.
This one I only encountered for SELECT benchmark.
Although I've got no problems with multiple connections for my custom scripts for
UPDATE and INSERT benchmarks when I tested up to 30 clients.

Would the following possibly solve this bottleneck problem?

> > > To speed up the foreign transaction resolution, some ideas have been
> > > discussed. As another idea, how about launching resolvers for each
> > > foreign server? That way, we resolve foreign transactions on each
> > > foreign server in parallel. If foreign transactions are concentrated
> > > on the particular server, we can have multiple resolvers for the one
> > > foreign server. It doesn’t change the fact that all foreign
> > > transaction resolutions are processed by resolver processes.
> >
> > Awesome! There seems to be another pros that even if a foreign server
> > is temporarily busy or stopped due to fail over, other foreign
> > server's transactions can be resolved.
>
> Yes. We also might need to be careful about the order of foreign transaction
> resolution. I think we need to resolve foreign transactions in arrival order at
> least within a foreign server.

Regards,
Kirk Jamison

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-06-24 13:08:22 Re: Weird use of parentheses in the manual
Previous Message Simon Riggs 2021-06-24 12:39:46 Re: Using indexUnchanged with nbtree