From: | Önder Kalacı <onderkalaci(at)gmail(dot)com> |
---|---|
To: | "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com" <Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com>, "zyu(at)yugabyte(dot)com" <zyu(at)yugabyte(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Subject: | Re: [Proposal] Add foreign-server health checks infrastructure |
Date: | 2022-10-19 15:18:14 |
Message-ID: | CACawEhUzpqYJ8mQmSjYgX0ePtPpvb2u9Onjf6pCjUGkoZ=-xSg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
> > As far as I can think of, it should probably be a single background task
> > checking whether the server is down. If so, sending an invalidation
> > message to all the backends such that related backends could act on the
> > invalidation and throw an error. This is to cover the use-case you
> > described on [1].
>
> Indeed your approach covers the use case I said, but I'm not sure whether
> it is really good.
> In your approach, once the background worker process will manage all
> foreign servers.
> It may be OK if there are a few servers, but if there are hundreds of
> servers, the time interval during checks will be longer.
>
I expect users will typically have many more backends than foreign
servers. We could have a threshold for spinning up a new bg worker (e.g.,
one additional bg worker for every 10 servers). Still, I think that'd be an
optimization that is probably not necessary for the majority of users?
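The threshold idea above amounts to a ceiling division. A minimal sketch in C, assuming a hypothetical helper name and a caller-supplied "servers per worker" setting (neither exists in PostgreSQL or in the proposed patch):

```c
/*
 * Hypothetical helper, for illustration only: how many health-check
 * workers are needed if each worker handles at most
 * servers_per_worker foreign servers.
 */
static int
health_check_workers_needed(int nservers, int servers_per_worker)
{
    /* Nothing to check, or an invalid threshold: no workers. */
    if (nservers <= 0 || servers_per_worker <= 0)
        return 0;

    /* Ceiling division without floating point. */
    return (nservers + servers_per_worker - 1) / servers_per_worker;
}
```

With a threshold of 10, a setup with 1 to 10 servers needs one worker, 11 servers need two, and 250 servers need 25.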
> Currently, each FDW can decide whether we do health checks or not per the
> backend.
> For example, we can skip health checks if the foreign server is not used
> now.
> The background worker cannot control such a way.
> Based on the above, I do not agree that we introduce a new background
> worker and make it to do a health check.
>
Again, my definition of "health check" is probably different. I'd expect
the health check to happen continuously, ideally keeping track of how many
consecutive times it succeeded and/or the last time it failed/succeeded,
etc.
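As a rough illustration of the bookkeeping described above, here is a minimal sketch in C. The struct and function names are hypothetical and are not taken from PostgreSQL or from the patch under discussion; the failure counter is an extra assumption added for symmetry.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-server health record (illustration only). */
typedef struct ServerHealth
{
    int     consecutive_successes;  /* successful checks in a row */
    int     consecutive_failures;   /* failed checks in a row */
    time_t  last_success;           /* 0 means "never" */
    time_t  last_failure;           /* 0 means "never" */
} ServerHealth;

/* Record the outcome of one health check performed at time "now". */
static void
server_health_record(ServerHealth *h, bool ok, time_t now)
{
    if (ok)
    {
        h->consecutive_successes++;
        h->consecutive_failures = 0;
        h->last_success = now;
    }
    else
    {
        h->consecutive_failures++;
        h->consecutive_successes = 0;
        h->last_failure = now;
    }
}
```

A continuously running checker would call `server_health_record()` after every probe, and other code could then decide what counts as "down", for example a few consecutive failures rather than a single one.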
A transaction failing with a bad error message (or holding some resources
locally until the transaction is committed) doesn't sound like an essential
problem to me. Is there a specific workload you are referring to that would
benefit from rolling back a transaction earlier when a remote server dies?
Maybe there is, but it is not clear to me and I haven't seen it discussed
on the thread (sorry if I missed it).
I'm trying to understand whether we are trying to solve a problem that does
not really exist. I'm bringing this up because I often deal with
architectures where there is a local node and remote transactions on
different Postgres servers, and I have not encountered many (or any)
patterns that would benefit much from this change. In fact, I think that,
on the contrary, this might add significant overhead for OLTP-type systems
with high query throughput.
> Moreover, methods to connect to foreign servers and check health are
> different per FDW.
> In terms of mysql_fdw [1], we must do mysql_init() and
> mysql_real_connect().
> About file_fdw, we do not have to connect, but developers may want to
> calculate checksum and compare.
> Therefore, we must provide callback functions anyway.
>
>
I think providing callback functions is useful in any case. Each FDW (or,
more generally, each extension) should be able to provide its own "health
check" function.
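To make the callback idea concrete, here is a minimal sketch in C of a per-FDW health-check registry. Everything in it is an assumption for illustration: the type, the registry, the function names and the two demo callbacks are hypothetical and are not part of PostgreSQL's FDW API or of the proposed patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: returns true if the server looks healthy. */
typedef bool (*fdw_health_check_fn) (const char *server_name);

typedef struct HealthCheckEntry
{
    const char *fdw_name;
    fdw_health_check_fn check;
} HealthCheckEntry;

#define MAX_HEALTH_CHECKS 16

static HealthCheckEntry health_checks[MAX_HEALTH_CHECKS];
static int  n_health_checks = 0;

/* Register one callback per FDW; returns false if it cannot be stored. */
static bool
register_health_check(const char *fdw_name, fdw_health_check_fn check)
{
    if (check == NULL || n_health_checks >= MAX_HEALTH_CHECKS)
        return false;
    health_checks[n_health_checks].fdw_name = fdw_name;
    health_checks[n_health_checks].check = check;
    n_health_checks++;
    return true;
}

/* Run the callback registered for fdw_name against server_name. */
static bool
check_server_health(const char *fdw_name, const char *server_name)
{
    for (int i = 0; i < n_health_checks; i++)
    {
        if (strcmp(health_checks[i].fdw_name, fdw_name) == 0)
            return health_checks[i].check(server_name);
    }
    /* No callback registered: nothing to check, so report healthy. */
    return true;
}

/* Stand-in callbacks; a real FDW would probe its connection here. */
static bool
demo_always_up(const char *server_name)
{
    (void) server_name;
    return true;
}

static bool
demo_always_down(const char *server_name)
{
    (void) server_name;
    return false;
}
```

In this shape the caller (a backend or a background worker) does not need to know how each FDW connects; it only looks up the callback by FDW name and acts on the boolean result.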
Thanks,
Onder KALACI