Re: Disabling Heap-Only Tuples

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disabling Heap-Only Tuples
Date: 2023-09-19 10:26:36
Message-ID: 202309191026.khrawpnb54yg@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2023-Sep-18, Robert Haas wrote:

> On Tue, Sep 5, 2023 at 11:15 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> > I don't think that is a good comparison. While most people probably
> > never need to touch "local_update_limit", "work_mem" is something everybody
> > has to consider.
> >
> > And it is not so hard to tune: the setting would be the desired table
> > size, and you could use pgstattuple to find a good value.
>
> What I suspect would happen, though, is that you'd end up tuning the
> value over and over. You'd set it to some value and after some number
> of vacuums maybe you'd realize that you could save even more disk
> space if you reduced it a bit further or maybe your data set would
> grow a bit and you'd have to increase it a little (or a lot). And if
> you didn't keep adjusting it then maybe something quite bad would
> happen to your database.

As I understand it, the setting being proposed is useful as an emergency
for removing excessive bloat -- a substitute for VACUUM FULL when you
don't want to lock the table for long. Trying to use it as a permanent
gadget is going to be misguided. So my first thought is that we should
tell people to use it that way: if you're not in the irrecoverable-space
situation, just do not use this. Then we don't have to worry about
people misusing it the way you imagine.

Second, I think we should make it auto-reset. That is, have the user
set some value; later, when some condition triggers (say, the table size
is 1.2x the limit value you configured), then the local_update_limit is
automatically removed from the table options. From that point onwards,
the table is operated normally.

This removes the other concern that makes the system behaves
suboptimally because some DBA in the past decade left this set for no
good reason: if you run into an emergency, then you activate the
emergency escape hatch, and it will close on its own as soon as the
emergency is over.

This also dissuades people from using it for these other things you
describe. It just won't work.

The point here is that third-party tools such as pg_repack or pg_squeeze
exist, which work in a way we don't like, yet we offer no alternative.
This proposal is a mechanism that essentially replaces those tools with
a simple in-core feature, without having to include the tool itself in
core.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Thou shalt check the array bounds of all strings (indeed, all arrays), for
surely where thou typest "foo" someone someday shall type
"supercalifragilisticexpialidocious" (5th Commandment for C programmers)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2023-09-19 10:36:11 Re: pg16: XX000: could not find pathkey item to sort
Previous Message Aleksander Alekseev 2023-09-19 10:23:54 Re: Bug fix for psql's meta-command \ev