Quick Links

Re: Transactions involving multiple postgres foreign servers, take 2

From:	Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To:	"tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Muhammad Usama <m(dot)usama(at)gmail(dot)com>, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, amul sul <sulamul(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ildar Musin <ildar(at)adjust(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Chris Travers <chris(dot)travers(at)adjust(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
Subject:	Re: Transactions involving multiple postgres foreign servers, take 2
Date:	2020-10-09 11:01:23
Message-ID:	CA+fd4k5aFADyeNe4_Rk-W+Y2a1J7LWikj5zzUk7+5XYepmYO5g@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, 9 Oct 2020 at 11:33, tsunakawa(dot)takay(at)fujitsu(dot)com
<tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:
>
> From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
> > What about temporary network failures? I think there are users who
> > don't want to give up resolving foreign transactions failed due to a
> > temporary network failure. Or even they might want to wait for
> > transaction completion until they send a cancel request. If we want to
> > call the commit routine only once and therefore want FDW to retry
> > connecting the foreign server within the call, it means we require all
> > FDW implementors to write a retry loop code that is interruptible and
> > ensures not to raise an error, which increases difficulty.
> >
> > Yes, but if we don’t retry to resolve foreign transactions at all on
> > an unreliable network environment, the user might end up requiring
> > every transaction to check the status of foreign transactions of the
> > previous distributed transaction before starts. If we allow to do
> > retry, I guess we ease that somewhat.
>
> OK. As I said, I'm not against trying to cope with temporary network failure. I just don't think it's mandatory. If the network failure is really temporary and thus recovers soon, then the resolver will be able to commit the transaction soon, too.

Well, I agree that it's not mandatory. I think it's better if the user
can choose.

I also doubt how useful the per-foreign-server timeout setting you
mentioned before. For example, suppose the transaction involves with
three foreign servers that have different timeout setting, what if the
backend failed to commit on the first one of the server due to
timeout? Does it attempt to commit on the other two servers? Or does
it give up and return the control to the client? In the former case,
what if the backend failed again on one of the other two servers due
to timeout? The backend might end up waiting for all timeouts and in
practice the user is not aware of how many servers are involved with
the transaction, for example in a sharding. So It seems to be hard to
predict the total timeout. In the latter case, the backend might
succeed to commit on the other two nodes. Also, the timeout setting of
the first foreign server virtually is used as the whole foreign
transaction resolution timeout. However, the user cannot control the
order of resolution. So again it seems to be hard for the user to
predict the timeout. So If we have a timeout mechanism, I think it's
better if the user can control the timeout for each transaction.
Probably the same is true for the retry.

>
> Then, we can have a commit retry timeout or retry count like the following WebLogic manual says. (I couldn't quickly find the English manual, so below is in Japanese. I quoted some text that got through machine translation, which appears a bit strange.)
>
> https://docs.oracle.com/cd/E92951_01/wls/WLJTA/trxcon.htm
> --------------------------------------------------
> Abandon timeout
> Specifies the maximum time (in seconds) that the transaction manager attempts to complete the second phase of a two-phase commit transaction.
>
> In the second phase of a two-phase commit transaction, the transaction manager attempts to complete the transaction until all resource managers indicate that the transaction is complete. After the abort transaction timer expires, no attempt is made to resolve the transaction. If the transaction enters a ready state before it is destroyed, the transaction manager rolls back the transaction and releases the held lock on behalf of the destroyed transaction.
> --------------------------------------------------

Yeah per-transaction timeout for 2nd phase of 2PC seems a good idea.

>
>
>
> > Also, what if the user sets the statement timeout to 60 sec and they
> > want to cancel the waits after 5 sec by pressing ctl-C? You mentioned
> > that client libraries of other DBMSs don't have asynchronous execution
> > functionality. If the SQL execution function is not interruptible, the
> > user will end up waiting for 60 sec, which seems not good.
>
> FDW functions can be uninterruptible in general, aren't they? We experienced that odbc_fdw didn't allow cancellation of SQL execution.

For example in postgres_fdw, it executes a SQL in asynchronous manner
using by PQsendQuery(), PQconsumeInput() and PQgetResult() and so on
(see do_sql_command() and pgfdw_get_result()). Therefore it the user
pressed ctl-C, the remote query would be canceled and raise an ERROR.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

RE: Transactions involving multiple postgres foreign servers, take 2 at 2020-10-09 02:33:37 from tsunakawa.takay@fujitsu.com

Responses

RE: Transactions involving multiple postgres foreign servers, take 2 at 2020-10-12 02:07:54 from tsunakawa.takay@fujitsu.com

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	John Naylor	2020-10-09 11:08:26	Re: [PATCH] ecpg: fix progname memory leak
Previous Message	Greg Nancarrow	2020-10-09 10:57:38	Re: Parallel INSERT (INTO ... SELECT ...)