From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, michael(at)paquier(dot)xyz, thomas(dot)munro(at)gmail(dot)com, tomas(dot)vondra(at)2ndquadrant(dot)com, a(dot)zakirov(at)postgrespro(dot)ru, ah(at)cybertec(dot)at, magnus(at)hagander(dot)net, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: shared-memory based stats collector |
Date: | 2020-03-09 18:47:54 |
Message-ID: | 20200309184754.yvrgzqpzs3iynszq@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2020-03-09 15:37:05 -0300, Alvaro Herrera wrote:
> Tom Lane wrote:
>
> In patch 0003,
>
> >      /*
> > -     * Was it the archiver? If so, just try to start a new one; no need
> > -     * to force reset of the rest of the system. (If fail, we'll try
> > -     * again in future cycles of the main loop.). Unless we were waiting
> > -     * for it to shut down; don't restart it in that case, and
> > -     * PostmasterStateMachine() will advance to the next shutdown step.
> > +     * Was it the archiver? Normal exit can be ignored; we'll start a new
> > +     * one at the next iteration of the postmaster's main loop, if
> > +     * necessary. Any other exit condition is treated as a crash.
> >       */
> >      if (pid == PgArchPID)
> >      {
> >          PgArchPID = 0;
> >          if (!EXIT_STATUS_0(exitstatus))
> > -            LogChildExit(LOG, _("archiver process"),
> > -                         pid, exitstatus);
> > -        if (PgArchStartupAllowed())
> > -            PgArchPID = pgarch_start();
> > +            HandleChildCrash(pid, exitstatus,
> > +                             _("archiver process"));
> >          continue;
> >      }
>
> I'm worried that we're causing all processes to terminate when an
> archiver dies in some ugly way; but in the current coding, it's pretty
> harmless and we'd just start a new one. I think this needs to be
> reconsidered. As far as I know, pgarchiver remains unconnected to
> shared memory so a crash-restart cycle is not necessary. We should
> continue to just log the error message and move on.

Why is it worth having the archiver be "robust" in that way? Random
implementation details led to it not being connected to shared memory,
which is what allows restarting it after any exit code; beyond that, I
don't see a need. It doesn't have exit paths that could validly produce
another exit code, as far as I can see.

Greetings,
Andres Freund
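
To make the two policies debated above concrete, here is a minimal standalone sketch. It is not postmaster code: `ReapAction`, `exited_cleanly`, `reap_archiver_lenient`, and `reap_archiver_strict` are hypothetical names introduced for illustration, and the exit status is simplified to "zero means a clean exit".

```c
#include <stdbool.h>

/* Hypothetical result of reaping a dead archiver; not a PostgreSQL type. */
typedef enum ReapAction
{
    REAP_START_NEW_ARCHIVER,    /* only the archiver is replaced */
    REAP_CRASH_RESTART          /* all children are terminated and restarted */
} ReapAction;

/* Simplified stand-in (assumption) for the EXIT_STATUS_0() check in the diff. */
static bool
exited_cleanly(int exitstatus)
{
    return exitstatus == 0;
}

/*
 * Policy before the patch: an abnormal exit is only logged, and a
 * replacement archiver is started either way.
 */
static ReapAction
reap_archiver_lenient(int exitstatus)
{
    (void) exitstatus;          /* the status would affect logging only */
    return REAP_START_NEW_ARCHIVER;
}

/*
 * Policy in the quoted patch: only a clean exit is benign; any other
 * exit status is treated as a crash.
 */
static ReapAction
reap_archiver_strict(int exitstatus)
{
    return exited_cleanly(exitstatus) ? REAP_START_NEW_ARCHIVER
                                      : REAP_CRASH_RESTART;
}
```

In this sketch both policies agree for status 0. They differ only for a nonzero status, where the lenient variant still returns `REAP_START_NEW_ARCHIVER` and the strict variant returns `REAP_CRASH_RESTART`.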