From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | Amit kapila <amit(dot)kapila(at)huawei(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: BUG #7534: walreceiver takes long time to detect n/w breakdown |
Date: | 2012-10-01 16:57:34 |
Message-ID: | CAHGQGwEd34=Z7=t9q8Xf11pmQS5a216ug7NW4V6qpuawG1crOA@mail.gmail.com |
Lists: | pgsql-bugs pgsql-hackers |
On Mon, Oct 1, 2012 at 7:38 PM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> Hmm, I think we need to step back a bit. I've never liked the way
> replication_timeout works, where it's the user's responsibility to set
> wal_receiver_status_interval < replication_timeout. It's not very
> user-friendly. I'd rather not copy that same design to this walreceiver
> timeout. If there's two different timeouts like that, it's even worse,
> because it's easy to confuse the two.
Agreed.
I'd like to specify the replication timeout the way we specify TCP keepalives, i.e.,
what about introducing something like the following parameters?
walsender_keepalives_idle
walsender_keepalives_interval
walsender_keepalives_count
walreceiver_keepalives_idle
walreceiver_keepalives_interval
walreceiver_keepalives_count
I believe many users are basically familiar with TCP keepalives and how to
specify them. So I think that this approach would be intuitive to users. Also
this approach includes your proposal. If you specify
walsender_keepalives_idle = walsender_timeout / 2
walsender_keepalives_interval = -1 (disable; Ping is never sent
again if there is no reply after first Ping is sent)
walsender_keepalives_count = 1
the replication timeout works as you proposed. But of course the downside
of this approach is that the number of parameters for the replication timeout
increases from two (replication_timeout and wal_receiver_status_interval) to six,
and those parameters are confusingly similar to the existing tcp_keepalives
parameters, which might confuse users further. One idea to solve this problem is
to use the existing tcp_keepalives parameter values for the replication timeout.
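
As a rough illustration of the keepalive-style semantics described above, the following Python sketch computes when Pings would be sent for a given idle/interval/count triple. The function names and the 60-second timeout are hypothetical and used only for illustration; the idle + interval * count formula is borrowed from how TCP keepalives behave and is an assumption here, since this message does not define how long the sender waits after the final Ping.

```python
from typing import List


def ping_schedule(idle: float, interval: float, count: int) -> List[float]:
    """Times (seconds since last activity) at which a Ping would be sent.

    Hypothetical model of the proposed *_keepalives_idle/_interval/_count
    parameters; interval = -1 means the Ping is never re-sent.
    """
    if count < 1:
        return []
    if interval < 0:
        # Re-sending disabled: only the first Ping goes out.
        return [idle]
    return [idle + i * interval for i in range(count)]


def tcp_like_detection_time(idle: float, interval: float, count: int) -> float:
    """Worst-case time to declare the peer dead, by analogy with TCP keepalives.

    Assumption: each Ping is given `interval` seconds to be answered, so this
    is only meaningful when interval > 0.
    """
    if interval <= 0:
        raise ValueError("interval must be positive for the TCP-like formula")
    return idle + interval * count


# Settings from the message, with an illustrative walsender_timeout of 60 s:
# idle = walsender_timeout / 2, interval = -1, count = 1
print(ping_schedule(60 / 2, -1, 1))          # [30.0]

# A TCP-like configuration: Pings at 10 s, 15 s and 20 s of silence.
print(ping_schedule(10, 5, 3))               # [10, 15, 20]
print(tcp_like_detection_time(10, 5, 3))     # 25
```

With interval = -1 and count = 1, a single Ping is sent halfway through the timeout, which corresponds to the behavior quoted as "your proposal" above.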
Regards,
--
Fujii Masao