Re: Add last commit LSN to pg_last_committed_xact()

From: James Coleman <jtc331(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add last commit LSN to pg_last_committed_xact()
Date: 2022-01-18 14:47:44
Message-ID: CAAaqYe8Mz_jQDVPftw8dk6Nwes7AAn9gJ2J4iuQ-O3E2nwJUag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 18, 2022 at 9:25 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Mon, Jan 17, 2022 at 8:39 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > I wondered about that, but commit_ts already does more than commit
> > timestamps by recording the xid of the last commit.
>
> Well, if you're maintaining an SLRU, you do kind of need to know where
> the leading and lagging ends are.

As far as I can tell the data in commitTsShared is used purely as an
optimization for the path looking up the timestamp for an arbitrary
xid when that xid happens to be the most recent one so that we don't
have to look up in the SLRU for that specific case. Maybe I'm missing
something else you're seeing?

> > For that matter, keeping a cache of last commit metadata in shared
> > memory is arguably not obviously implied by "track_commit_timestamps",
> > which leads to the below...
>
> I suppose that's true in the strictest sense, but tracking information
> does seem to imply having a way to look it up.

Looking up for an arbitrary commit, sure, (that's how I understand the
commit timestamps feature anyway) but it seems to me that the "most
recent' is distinct. Reading the code it seems the only usage (besides
the boolean activation status also stored there) is in
TransactionIdGetCommitTsData, and the only consumers of that in core
appear to be the SQL callable functions to get the latest commit info.
It is in commit_ts.h though, so I'm guessing someone is using this
externally (and maybe that's why the feature has the shape it does).

> > I'm curious, though: I realize it's in the hot path, and I realize
> > that there's an accretive cost to even small features, but given we're
> > already paying the lock cost and updating memory in what is presumably
> > the same cache line, would you expect this cost to be clearly
> > measurable?
>
> If you'd asked me ten years ago, I would have said "no, can't matter,"
> but Andres has subsequently demonstrated that a lot of things that I
> thought were well-optimized were actually able to be optimized a lot
> better than I thought possible, and some of them were in this area.
> Still, I think it's unlikely that your patch would have a measurable
> effect for the reasons that you state. Wouldn't hurt to test, though.

If we get past your other main concern I'd be happy to spin something
up to prove that out.

> As far as performance goes, I'm more concerned about Alvaro's patch.
> My concern with this one is more around whether it's too much of a
> kludge.

As far as the kludginess factor: do you think additional GUCs would
help clarify that? And/or are the earlier comments on the right path?

Thanks,
James Coleman

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2022-01-18 14:53:05 Re: generic plans and "initial" pruning
Previous Message Jeevan Ladhe 2022-01-18 14:42:44 Re: refactoring basebackup.c