From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, tgl(at)sss(dot)pgh(dot)pa(dot)us, michael(at)paquier(dot)xyz, thomas(dot)munro(at)gmail(dot)com, tomas(dot)vondra(at)2ndquadrant(dot)com, a(dot)zakirov(at)postgrespro(dot)ru, ah(at)cybertec(dot)at, magnus(at)hagander(dot)net, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: shared-memory based stats collector |
Date: | 2020-03-10 21:32:42 |
Message-ID: | 20200310213242.bvkuykpswgqgjcpq@alap3.anarazel.de |
Lists: | pgsql-hackers |
On 2020-03-10 09:48:07 -0300, Alvaro Herrera wrote:
> On 2020-Mar-10, Kyotaro Horiguchi wrote:
>
> > At Mon, 9 Mar 2020 20:34:20 -0700, Andres Freund <andres(at)anarazel(dot)de> wrote in
> > > On 2020-03-10 12:27:25 +0900, Kyotaro Horiguchi wrote:
> > > > That's true, but I have the same concern with Tom. The archive became
> > > > too-tightly linked with other processes than actual relation.
> > >
> > > What's the problem here? We have a number of helper processes
> > > (checkpointer, bgwriter) that are attached to shared memory, and it's
> > > not a problem.
> >
> > That theoretically raises the chance of server-crash by a small amount
> > of probability. But, yes, it's absurd to premise that archiver process
> > crashes.
>
> The case I'm worried about is a misconfigured archive_command that
> causes the archiver to misbehave (exit with a code other than 0); if
> that already doesn't happen, or we can make it not happen, then I'm okay
> with the changes to archiver.
Well, an exit(1) is also fine, afaict. No?
The archive command can just trigger either a FATAL or a LOG:
    rc = system(xlogarchcmd);
    if (rc != 0)
    {
        /*
         * If either the shell itself, or a called command, died on a signal,
         * abort the archiver.  We do this because system() ignores SIGINT and
         * SIGQUIT while waiting; so a signal is very likely something that
         * should have interrupted us too.  Also die if the shell got a hard
         * "command not found" type of error.  If we overreact it's no big
         * deal, the postmaster will just start the archiver again.
         */
        int         lev = wait_result_is_any_signal(rc, true) ? FATAL : LOG;

        if (WIFEXITED(rc))
        {
            ereport(lev,
                    (errmsg("archive command failed with exit code %d",
                            WEXITSTATUS(rc)),
                     errdetail("The failed archive command was: %s",
                               xlogarchcmd)));
        }
        else if (WIFSIGNALED(rc))
        {
#if defined(WIN32)
            ereport(lev,
                    (errmsg("archive command was terminated by exception 0x%X",
                            WTERMSIG(rc)),
                     errhint("See C include file \"ntstatus.h\" for a description of the hexadecimal value."),
                     errdetail("The failed archive command was: %s",
                               xlogarchcmd)));
#else
            ereport(lev,
                    (errmsg("archive command was terminated by signal %d: %s",
                            WTERMSIG(rc), pg_strsignal(WTERMSIG(rc))),
                     errdetail("The failed archive command was: %s",
                               xlogarchcmd)));
#endif
        }
        else
        {
            ereport(lev,
                    (errmsg("archive command exited with unrecognized status %d",
                            rc),
                     errdetail("The failed archive command was: %s",
                               xlogarchcmd)));
        }

        snprintf(activitymsg, sizeof(activitymsg), "failed on %s", xlog);
        set_ps_display(activitymsg, false);

        return false;
    }
I.e. there are only normal ways to shut down the archiver due to a failing
archive command.
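
To make the FATAL-vs-LOG split concrete, here is a minimal standalone sketch of the same decision outside the backend. The names archive_outcome, looks_like_signal and classify_archive_command are hypothetical and exist only for this illustration; looks_like_signal is a simplified stand-in for what wait_result_is_any_signal(rc, true) is assumed to check (death by signal, or a shell exit status above 125), not a copy of the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical outcome type, standing in for the ereport() level above. */
typedef enum
{
    ARCHIVE_OK,                 /* command exited with status 0 */
    ARCHIVE_LOG,                /* ordinary failure: log it, archiver keeps running */
    ARCHIVE_FATAL               /* signal-like failure: archiver exits and is restarted */
} archive_outcome;

/*
 * Simplified stand-in (an assumption) for wait_result_is_any_signal(rc, true).
 * POSIX shells report "cannot execute" as 126, "command not found" as 127,
 * and a child killed by signal n as 128 + n.
 */
static bool
looks_like_signal(int rc)
{
    if (WIFSIGNALED(rc))
        return true;
    if (WIFEXITED(rc) && WEXITSTATUS(rc) > 125)
        return true;
    return false;
}

/* Run cmd through the shell and classify the result like the excerpt does. */
static archive_outcome
classify_archive_command(const char *cmd)
{
    int         rc;

    assert(cmd != NULL);        /* system(NULL) only probes for a shell */

    rc = system(cmd);
    if (rc == 0)
        return ARCHIVE_OK;

    return looks_like_signal(rc) ? ARCHIVE_FATAL : ARCHIVE_LOG;
}
```

Under these assumptions, an archive_command that simply does exit 1 lands in the LOG branch and the archiver keeps running, while a missing binary (shell status 127) or a signal lands in the FATAL branch, which is still an orderly exit that the postmaster follows with a restart.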
Greetings,
Andres Freund