Re: Eliminate redundant tuple visibility check in vacuum

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, David Geier <geidav(dot)pg(at)gmail(dot)com>
Subject: Re: Eliminate redundant tuple visibility check in vacuum
Date: 2023-09-07 19:29:48
Message-ID: CA+TgmoZWRSHgz+a9Dz2W==xRK9unxA9wcBEF0PWRw-cJXfkrgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 7, 2023 at 3:10 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> I can fix it by changing the type of PruneResult->off_loc to be an
> OffsetNumber pointer. This does mean that PruneResult will be
> initialized partially by heap_page_prune() callers. I wonder if you
> think that undermines the case for making a new struct.

I think that it undermines the case for including that particular
argument in the struct. It's not really a Prune*Result* if the caller
initializes it in part. It seems fairly reasonable to still have a
PruneResult struct for the other stuff, though, at least to me. How do
you see it?

(One could also argue that this is a somewhat more byzantine way of
doing error reporting than would be desirable, but fixing that problem
doesn't seem too straightforward so perhaps it's prudent to leave it
well enough alone.)

> I still want to eliminate the NULL check of off_loc in
> heap_page_prune() by making it a required parameter. Even though
> on-access pruning does not have an error callback mechanism which uses
> the offset, it seems better to have a pointless local variable in
> heap_page_prune_opt() than to do a NULL check for every tuple pruned.

It doesn't seem important to me unless it improves performance. If
it's just stylistic, I don't object, but I also don't really see a
reason to care.

> > I haven't run the regression suite with 0001 applied. I'm guessing
> > that you did, and that they passed, which if true means that we don't
> > have a test for this, which is a shame, although writing such a test
> > might be a bit tricky. If there is a test case for this and you didn't
> > run it, woops.
>
> There is no test coverage for the vacuum error callback context
> currently (tests passed for me). I looked into how we might add such a
> test. First, I investigated what kind of errors might occur during
> heap_prune_satisfies_vacuum(). Some of the multixact functions called
> by HeapTupleSatisfiesVacuumHorizon() could error out -- for example
> GetMultiXactIdMembers(). It seems difficult to trigger the errors in
> GetMultiXactIdMembers(), as we would have to cause wraparound. It
> would be even more difficult to ensure that we hit those specific
> errors from a call stack containing heap_prune_satisfies_vacuum(). As
> such, I'm not sure I can think of a way to protect future developers
> from repeating my mistake--apart from improving the comment like you
> mentioned.

004_verify_heapam.pl has some tests that intentionally corrupt pages
and then use pg_amcheck to detect the corruption. Such an approach
could probably also be used here. But it's a pain to get such tests
right, because any change to the page format due to endianness,
different block size, or whatever can make carefully-written tests go
boom.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2023-09-07 19:33:57 Re: Document that server will start even if it's unable to open some TCP/IP ports
Previous Message Robert Haas 2023-09-07 19:14:50 Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()