From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Review for GetWALAvailability() |
Date: | 2020-06-17 02:40:56 |
Message-ID: | 20200617024056.GA21726@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2020-Jun-17, Fujii Masao wrote:
> On 2020/06/17 3:50, Alvaro Herrera wrote:
> So InvalidateObsoleteReplicationSlots() can terminate normal backends.
> But do we want to do this? If we want, we should add the note about this
> case into the docs? Otherwise the users would be surprised at termination
> of backends by max_slot_wal_keep_size. I guess that it's basically rarely
> happen, though.
Well, if we could distinguish a walsender from a non-walsender process,
then maybe it would make sense to leave backends alive. But do we want
that? I admit I don't know what would be the reason to have a
non-walsender process with an active slot, so I don't have a good
opinion on what to do in this case.
> > > + /*
> > > + * Signal to terminate the process using the replication slot.
> > > + *
> > > + * Try to signal every 100ms until it succeeds.
> > > + */
> > > + if (!killed && kill(active_pid, SIGTERM) == 0)
> > > + killed = true;
> > > + ConditionVariableTimedSleep(&slot->active_cv, 100,
> > > + WAIT_EVENT_REPLICATION_SLOT_DROP);
> > > + } while (ReplicationSlotIsActive(slot, NULL));
> >
> > Note that here you're signalling only once and then sleeping many times
> > in increments of 100ms -- you're not signalling every 100ms as the
> > comment claims -- unless the signal fails, but you don't really expect
> > that. On the contrary, I'd claim that the logic is reversed: if the
> > signal fails, *then* you should stop signalling.
>
> You mean; in this code path, signaling fails only when the target process
> disappears just before signaling. So if it fails, slot->active_pid is
> expected to become 0 even without signaling more. Right?
I guess kill() can also fail if the PID now belongs to a process owned
by a different user. I think we've disregarded very quick reuse of
PIDs, so we needn't concern ourselves with it.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro Horiguchi | 2020-06-17 03:10:31 | Re: Review for GetWALAvailability() |
Previous Message | Fujii Masao | 2020-06-17 02:30:41 | Re: Review for GetWALAvailability() |