Re: improving wraparound behavior

From: Andres Freund <andres(at)anarazel(dot)de>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: improving wraparound behavior
Date: 2019-05-04 02:47:42
Message-ID: 20190504024742.y2cvkf6qohazlxk2@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-05-03 22:41:11 -0400, Stephen Frost wrote:
> I suppose it is a pretty big change in the base autovacuum launcher to
> be something that's run per database instead and then deal with the
> coordination between the two... but I can't help but feel like it
> wouldn't be that much *work*. I'm not against doing something smaller
> but was something smaller actually proposed for this specific issue..?

I think it'd be a fairly significant amount of work. And I think we
should redo it from scratch if we go there, because what we have isn't
worth using as a basis.

> > I'm thinking that we'd do something roughly like (in actual code) for
> > GetNewTransactionId():
> >
> > TransactionId dat_limit = ShmemVariableCache->oldestXid;
> > TransactionId slot_limit = Min(replication_slot_xmin, replication_slot_catalog_xmin);
> > TransactionId walsender_limit;
> > TransactionId prepared_xact_limit;
> > TransactionId backend_limit;
> >
> > ComputeOldestXminFromProcarray(&walsender_limit, &prepared_xact_limit, &backend_limit);
> >
> > if (IsOldest(dat_limit))
> > ereport(elevel,
> > errmsg("close to xid wraparound, held back by database %s"),
> > errdetail("current xid %u, horizon for database %u, shutting down at %u"),
> > errhint("..."));
> > else if (IsOldest(slot_limit))
> > ereport(elevel, errmsg("close to xid wraparound, held back by replication slot %s"),
> > ...);
> >
> > where IsOldest wouldn't actually compare plainly numerically, but would
> > actually prefer showing the slot, backend, walsender, prepared_xact, as
> > long as they are pretty close to the dat_limit - as in those cases
> > vacuuming wouldn't actually solve the issue, unless the other problems
> > are addressed first (as autovacuum won't compute a cutoff horizon that's
> > newer than any of those).
>
> Where the errhint() above includes a recommendation to run the SRF
> described below, I take it?

Not necessarily. I feel conciseness is important too, and this would be
the most important thing to tackle.
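
To make the IsOldest() preference quoted above a bit more concrete, here
is a minimal standalone sketch in C. It is not PostgreSQL code: XidLimit,
LimitSource, choose_blocker() and the "slack" parameter are hypothetical
names made up for illustration, and xid 0 is assumed to mean "no such
horizon". The comparison uses the usual modulo-2^32 trick, so it stays
correct across the wraparound point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical labels for whatever is holding the horizon back. */
typedef enum
{
    LIMIT_DATABASE,
    LIMIT_SLOT,
    LIMIT_WALSENDER,
    LIMIT_PREPARED_XACT,
    LIMIT_BACKEND
} LimitSource;

typedef struct
{
    LimitSource   source;
    TransactionId xid;      /* 0 = no horizon from this source (assumption) */
} XidLimit;

/* Wraparound-aware "a is older than b" (special xids are ignored here). */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Report a non-database source if its horizon is older than, or within
 * "slack" xids of, the database limit; vacuuming alone would not help in
 * that case. Otherwise report the database itself.
 */
static LimitSource
choose_blocker(TransactionId dat_limit, const XidLimit *others, size_t n,
               uint32_t slack)
{
    const XidLimit *oldest = NULL;

    assert(slack < INT32_MAX);

    /* Find the oldest valid non-database horizon. */
    for (size_t i = 0; i < n; i++)
    {
        if (others[i].xid == 0)
            continue;
        if (oldest == NULL || xid_precedes(others[i].xid, oldest->xid))
            oldest = &others[i];
    }

    if (oldest != NULL &&
        (int32_t) (oldest->xid - dat_limit) <= (int32_t) slack)
        return oldest->source;

    return LIMIT_DATABASE;
}
```

With a slack of 10000, a slot horizon at 1500 and a database limit at 1000
would be reported as held back by the slot, while a slot horizon millions
of xids ahead of the database limit would leave the database as the thing
to report. The right value for the slack is an open question; the number
here is only a placeholder.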

> Also, should this really be an 'else if', or should it be just a set of
> 'if()'s, thereby giving users more info right up-front?

Possibly? But it'd also make it even harder to read the log, and for the
system to keep up with logging, because we already log *so* much when
close to wraparound.

If we didn't order it, it'd be hard for users to figure out which to
address first. If we ordered it, people would have to look further up in
the log to figure out which is the most urgent one (unless we reverse
the order, which is odd too).

Greetings,

Andres Freund
