From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Emit fewer vacuum records by reaping removable tuples during pruning |
Date: | 2024-01-09 16:35:34 |
Message-ID: | CAAKRu_YK8MwE+u8R=TRfCWv5un483BopLuNOEQcBuoXacjCPoQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jan 8, 2024 at 3:51 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Jan 5, 2024 at 3:34 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
>
> This part of 0002 makes me very, very uncomfortable:
>
> + /*
> + * Update all line pointers per the record, and repair
> fragmentation.
> + * We always pass no_indexes as true, because we don't
> know whether or
> + * not this option was used when pruning. This reduces
> the validation
> + * done on replay in an assert build.
> + */
> + heap_page_prune_execute(buffer, true,
>
> redirected, nredirected,
> nowdead, ndead,
>
> nowunused, nunused);
>
> The problem that I have with this is that we're essentially saying
> that it's ok to lie to heap_page_prune_execute because we know how
> it's going to use the information, and therefore we know that the lie
> is harmless. But that's not how things are supposed to work. We should
> either find a way to tell the truth, or change the name of the
> parameter so that it's not a lie, or change the function so that it
> doesn't need this parameter in the first place, or something. I can
> occasionally stomach this sort of lying as a last resort when there's
> no realistic way of adapting the code being called, but that's surely
> not the case here -- this is a newborn parameter, and it shouldn't be
> a lie on day 1. Just imagine if some future developer thought that the
> no_indexes parameter meant that the relation actually didn't have
> indexes (the nerve of them!).
I agree that this is an issue.
The easiest solution would be to change the name of the parameter to
heap_page_prune_execute()'s from "no_indexes" to something like
"validate_unused", since it is only used in assert builds for
validation.
However, though I wish a name change was the right way to solve this
problem, my gut feeling is that it is not. It seems like we should
rely only on the WAL record itself in recovery. Right now the
parameter is used exclusively for validation, so it isn't so bad. But
what if someone uses this parameter in the future in heap_xlog_prune()
to decide how to modify the page?
It seems like the right solution would be to add a flag into the prune
record indicating what to pass to heap_page_prune_execute(). In the
future, I'd like to add flags for updating the VM to each of the prune
and vacuum records (eliminating the separate VM update record). Thus,
a new flags member of the prune record could have future use. However,
this would add a uint8 to the record. I can go and look for some
padding if you think this is the right direction?
- Melanie
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2024-01-09 16:35:46 | Re: abi-compliance-checker |
Previous Message | Peter Eisentraut | 2024-01-09 16:29:22 | Re: pg_dump: Remove obsolete trigger support |