Re: AIO writes vs hint bits vs checksums

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Noah Misch <noah(at)leadboat(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: AIO writes vs hint bits vs checksums
Date: 2024-10-30 18:11:52
Message-ID: yvcwpovzlfmmrnwlnlwb3g26fcogluolgjrqarotwxxxi4z2o4@5vx7p3lczkpf
Lists: pgsql-hackers

Hi,

On 2024-10-30 10:47:35 -0700, Jeff Davis wrote:
> On Tue, 2024-09-24 at 11:55 -0400, Andres Freund wrote:
> > What I suspect we might want instead is something in between a share and an
> > exclusive lock, which is taken while setting a hint bit and which conflicts
> > with having an IO in progress.
>
> I am starting to wonder if shared content locks are really the right
> concept at all. It makes sense for simple mutexes, but we are already
> more complex than that, and you are suggesting adding on to that
> complexity.

What I am proposing isn't making the content lock more complicated, it's
orthogonal to the content lock.
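
To make the "in between share and exclusive" idea concrete, here is a minimal
sketch of a per-buffer state word that sits next to the content lock rather
than inside it. It is not PostgreSQL code: the SketchBuffer type, the bit
layout and all sketch_* function names are assumptions made up for
illustration, and it uses plain C11 atomics instead of PostgreSQL's own atomics
layer. Hint-bit setters register themselves in a counter, a writer sets an
"IO in progress" bit, and each side refuses to start while the other is active.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-buffer state word (not PostgreSQL's BufferDesc).
 * Bit 31 marks a write IO in progress; the low 31 bits count backends
 * that are currently setting hint bits.
 */
#define SKETCH_IO_IN_PROGRESS   (UINT32_C(1) << 31)
#define SKETCH_HINT_SETTER_MASK (SKETCH_IO_IN_PROGRESS - 1)

typedef struct SketchBuffer
{
    _Atomic uint32_t state;
} SketchBuffer;

/* Ask for permission to set hint bits; refused while IO is in progress. */
static bool
sketch_hint_begin(SketchBuffer *buf)
{
    uint32_t old = atomic_load(&buf->state);

    for (;;)
    {
        if (old & SKETCH_IO_IN_PROGRESS)
            return false;       /* caller simply skips the hint bit */

        /* On failure the CAS reloads 'old' with the current value. */
        if (atomic_compare_exchange_weak(&buf->state, &old, old + 1))
            return true;
    }
}

/* Done setting hint bits; drop our registration. */
static void
sketch_hint_end(SketchBuffer *buf)
{
    uint32_t old = atomic_fetch_sub(&buf->state, 1);

    /* There must have been at least one registered hint setter. */
    assert((old & SKETCH_HINT_SETTER_MASK) > 0);
    (void) old;
}

/*
 * Try to start a write. Refused while any backend is setting hint bits
 * or another IO is already in progress. A real implementation would
 * presumably wait here instead of failing.
 */
static bool
sketch_io_begin(SketchBuffer *buf)
{
    uint32_t expected = 0;

    return atomic_compare_exchange_strong(&buf->state, &expected,
                                          SKETCH_IO_IN_PROGRESS);
}

/* The write has finished; hint bits may be set again. */
static void
sketch_io_end(SketchBuffer *buf)
{
    atomic_fetch_and(&buf->state, ~SKETCH_IO_IN_PROGRESS);
}
```

In this sketch a backend holding only a share content lock would call
sketch_hint_begin() before touching a hint bit and skip the update if it
returns false, which is cheap because hint bits are only an optimization.
Any number of hint setters can coexist, so the scheme is weaker than an
exclusive lock, but unlike a plain share lock it conflicts with IO.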

> Which I agree is a good idea, I'm just wondering if we could go even
> further.
>
> The README states that a shared lock is necessary for visibility
> checking, but can we just be more careful with the ordering and
> atomicity of visibility changes in the page?
>
> * carefully order reads and writes of xmin/xmax/hints (would
> that create too many read barriers in the tqual.c code?)
> * write line pointer after tuple is written

It's possible, but it'd be a lot of work. And you wouldn't need to just do
this for heap, but all the indexes too, to make progress on the
don't-set-hint-bits-during-io front. So I don't think it makes sense to tie
these things together.

I do think that it's an argument for not importing all the complexity into
lwlock.c though.
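
For reference, the "write line pointer after tuple is written" item from the
quoted list can be sketched with release/acquire ordering as below. This is
only an illustration under stated assumptions: SketchPage, its line pointer
encoding and the sketch_* functions are hypothetical and far simpler than the
real page layout, and C11 atomics stand in for PostgreSQL's barrier
primitives such as pg_write_barrier() and pg_read_barrier(). It covers just
the insertion of a new tuple. It says nothing about xmin/xmax/hint ordering,
pruning, or index pages, which is where most of the work mentioned above
would be.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192
#define SKETCH_MAX_ITEMS 16

/* Hypothetical page layout, much simpler than a real heap page. */
typedef struct SketchPage
{
    /* 0 means "unused"; otherwise (offset << 16) | length. */
    _Atomic uint32_t line_pointers[SKETCH_MAX_ITEMS];
    char        data[SKETCH_PAGE_SIZE];
} SketchPage;

/*
 * Writer: copy the tuple bytes first, then publish the line pointer with
 * release semantics so the bytes are visible before the pointer is.
 * Assumes writers are serialized among themselves, e.g. by an exclusive
 * content lock as suggested in the quoted text.
 */
static void
sketch_publish_tuple(SketchPage *page, int slot, uint16_t offset,
                     const void *tuple, uint16_t len)
{
    assert(slot >= 0 && slot < SKETCH_MAX_ITEMS);
    assert(len > 0 && (size_t) offset + len <= SKETCH_PAGE_SIZE);

    memcpy(page->data + offset, tuple, len);
    atomic_store_explicit(&page->line_pointers[slot],
                          ((uint32_t) offset << 16) | len,
                          memory_order_release);
}

/*
 * Reader: load the line pointer with acquire semantics. If it is set,
 * the tuple bytes written before the release store are visible.
 * 'out' must have room for at least SKETCH_PAGE_SIZE bytes.
 */
static bool
sketch_read_tuple(SketchPage *page, int slot, void *out, uint16_t *len)
{
    uint32_t    lp;

    assert(slot >= 0 && slot < SKETCH_MAX_ITEMS);

    lp = atomic_load_explicit(&page->line_pointers[slot],
                              memory_order_acquire);
    if (lp == 0)
        return false;           /* slot not published yet */

    *len = (uint16_t) (lp & 0xFFFF);
    memcpy(out, page->data + (lp >> 16), *len);
    return true;
}
```

A reader that sees a non-zero line pointer is guaranteed to see the complete
tuple bytes, without taking a share lock for that particular step.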

> We would still have pins and cleanup locks to prevent data removal.

As-is, cleanup locks only work in coordination with content locks. While
cleanup is ongoing we need to prevent anybody from starting to look at the
page, and that's not easy without acquiring something like a shared lock.

> We'd have the logic you suggest that would prevent modification during
> IO. And there would still need to be exclusive content locks so that
> two inserts don't try to allocate the same line pointer, or lock the
> same tuple.
>
> If PD_ALL_VISIBLE is set it's even simpler.
>
> Am I missing some major hazards?

I don't think anything fundamental, but it's a decidedly nontrivial amount of
work.

Greetings,

Andres Freund
