Re: Lowering the ever-growing heap->pd_lower

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lowering the ever-growing heap->pd_lower
Date: 2021-03-09 21:54:17
Message-ID: CAH2-WznB5ULqV36gHYD8EOrdBa6QZcDe8cenYCi0FiewqyN7fQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 9, 2021 at 1:36 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> writes:
> > The only two existing mechanisms that I could find (in the access/heap
> > directory) that possibly could fail on shrunken line pointer arrays;
> > being xlog recovery (I do not have enough knowledge on recovery to
> > determine if that may touch pages that have shrunken line pointer
> > arrays, or if those situations won't exist due to never using dirtied
> > pages in recovery) and backwards table scans on non-MVCC snapshots
> > (which would be fixed in the attached patch).
>
> I think you're not visualizing the problem properly.
>
> The case I was concerned about back when is that there are various bits of
> code that may visit a page with a predetermined TID in mind to look at.
> An index lookup is an obvious example, and another one is chasing an
> update chain's t_ctid link. You might argue that if the tuple was dead
> enough to be removed, there should be no such in-flight references to
> worry about, but I think such an assumption is unsafe. There is not, for
> example, any interlock that ensures that a process that has obtained a TID
> from an index will have completed its heap visit before a VACUUM that
> subsequently removed that index entry also removes the heap tuple.
>
> So, to accept a patch that shortens the line pointer array, what we need
> to do is verify that every such code path checks for an out-of-range
> offset before trying to fetch the target line pointer. I believed
> back in 2007 that there were, or once had been, code paths that omitted
> such a range check, assuming that they could trust the TID they had
> gotten from $wherever to point at an extant line pointer array entry.
> Maybe things are all good now, but I think you should run around and
> examine every place that checks for tuple deadness to see if the offset
> it used is known to be within the current page bounds.

It occurs to me that we should also mark the hole in the middle of the
page (which includes the would-be LP_UNUSED line pointers at the end
of the original line pointer array space) as undefined to Valgrind
within PageRepairFragmentation(). This is not to be confused with
marking them inaccessible to Valgrind, which just poisons the bytes.
Rather, it represents that the bytes in question are considered safe
to copy around but not safe to rely on being any particular value. So
Valgrind will complain if the bytes in question influence control
flow, directly or indirectly.

Obviously the code should also be audited. Even then, there may still
be bugs. I think that we need to bite the bullet here -- line pointer
bloat is a significant problem.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-03-09 22:10:26 Re: Lowering the ever-growing heap->pd_lower
Previous Message Mark Dilger 2021-03-09 21:47:18 Re: Lowering the ever-growing heap->pd_lower