Re: Extending SMgrRelation lifetimes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Extending SMgrRelation lifetimes
Date: 2023-09-18 15:19:36
Message-ID: CA+TgmobtcRkhDAx8UpSJDU6UtaTfTqxQ05WNrPpvrM6iTJYb8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I think that if you believe 0001 to be correct you should go ahead and
commit it sooner rather than later. If you're wrong and something
weird starts happening we'll then have a chance to notice that before
too much other stuff gets changed on top of this and confuses the
matter.

On Wed, Aug 23, 2023 at 12:55 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Right, let's one find one good place. I think smgropen() would be best.

I think it would be a good idea to give this comment a bit more oomph.
In lieu of this:

+ * This does not attempt to actually open the underlying files. The returned
+ * object remains valid at least until AtEOXact_SMgr() is called, or until
+ * smgrdestroy() is called in non-transactional backends.

I would leave the existing "This does not attempt to actually open the
underlying files." comment as a separate comment, and add something
like this as a new paragraph:

In versions of PostgreSQL prior to 17, this function returned an
object with no defined lifetime. Now, however, the object remains
valid for the lifetime of the transaction, up to the point where
AtEOXact_SMgr() is called, making it much easier for callers to know
for how long they can hold on to a pointer to the returned object. If
this function is called outside of a transaction, the object remains
valid until smgrdestroy() or smgrdestroyall() is called. Background
processes that use smgr but not transactions typically do this once
per checkpoint cycle.

Apart from that, the main thing that is bothering me is that the
justification for redefining smgrclose() to do what smgrrelease() now
does isn't really spelled out anywhere. You mentioned some reasons and
Heikki mentioned the benefit to extensions, but I feel like somebody
should be able to understand the reasoning clearly from reading the
commit message or the comments in the patch, rather than having to
visit the mailing list discussion, and I'm not sure we're there yet. I
feel like I understood why we were doing this and was convinced it was
a good idea at some point, but now the reasoning has gone out of my
head and I can't recover it. If somebody does smgropen() .. stuff ...
smgrclose() as in heapam_relation_copy_data() or index_copy_data(),
this change has the effect of making the SmgrRelation remain valid
until eoxact whereas before it would have been destroyed instantly. Is
that what we want? Presumably yes, or this patch wouldn't be shaped
like this, but I don't know why we want that...

Another thing that seems a bit puzzling is how this is intended to, or
does, interact with the ownership mechanism. Intuitively, I feel like
a Relation owns an SMgrRelation *because* the SMgrRelation has no
defined lifetime. But I think that's not quite right. I guess the
ownership mechanism doesn't guarantee anything about the lifetime of
the object, just that the pointer in the Relation won't hang around
any longer than the object to which it's pointing. So does that mean
that we're free to redefine the object lifetime to be pretty much
anything we want and that mechanism doesn't really care? Huh,
interesting.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-09-18 15:20:03 Re: Add 'worker_type' to pg_stat_subscription
Previous Message Matthias van de Meent 2023-09-18 14:50:13 Re: GenBKI emits useless open;close for catalogs without rows