Re: WAL Insertion Lock Improvements

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WAL Insertion Lock Improvements
Date: 2023-07-21 05:59:17
Message-ID: ZLoetVXkyvS5lryz@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 20, 2023 at 02:38:29PM +0530, Bharath Rupireddy wrote:
> On Fri, Jul 14, 2023 at 4:17 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>> I think this commit does too many things at once.
>
> I've split the patch into three - 1) Make insertingAt 64-bit atomic.
> 2) Have better commenting on why there's no memory barrier or spinlock
> in and around LWLockWaitForVar call sites. 3) Have a quick exit for
> LWLockUpdateVar.

FWIW, I was kind of already OK with 0001, as it shows most of the
gains observed while 0003 had a limited impact:
https://www.postgresql.org/message-id/CALj2ACWgeOPEKVY-TEPvno%3DVatyzrb-vEEP8hN7QqrQ%3DyPRupA%40mail.gmail.com

It is kind of a no-brainer to replace the spinlocks with atomic reads
and writes there.

>> I don't think:
>> * NB: LWLockConflictsWithVar (which is called from
>> * LWLockWaitForVar) relies on the spinlock used above in this
>> * function and doesn't use a memory barrier.
>>
>> helps to understand why any of this is safe to a meaningful degree.
>>
>> The existing comments aren't obviously aren't sufficient to explain this, but
>> the reason it's, I think, safe today, is that we are only waiting for
>> insertions that started before WaitXLogInsertionsToFinish() was called. The
>> lack of memory barriers in the loop means that we might see locks as "unused"
>> that have *since* become used, which is fine, because they only can be for
>> later insertions that we wouldn't want to wait on anyway.
>
> Right.

FWIW, I always have a hard time coming back to this code and see it
rely on undocumented assumptions with code in lwlock.c while we need
to keep an eye in xlog.c (we take a spinlock there while the variable
wait logic relies on it for ordering @-@). So the proposal of getting
more documentation in place via 0002 goes in the right direction.

>> This comment seems off to me. Using atomics doesn't guarantee not needing
>> locking. It just guarantees that we are reading a non-torn value.
>
> Modified the comment.

- /*
- * Read value using the lwlock's wait list lock, as we can't generally
- * rely on atomic 64 bit reads/stores. TODO: On platforms with a way to
- * do atomic 64 bit reads/writes the spinlock should be optimized away.
- */
- LWLockWaitListLock(lock);
- value = *valptr;
- LWLockWaitListUnlock(lock);
+ /* Reading atomically avoids getting a torn value */
+ value = pg_atomic_read_u64(valptr);

Should this specify that this is specifically important for platforms
where reading a uint64 could lead to a torn value read, if you apply
this term in this context? Sounding picky, I would make that a bit
longer, say something like that:
"Reading this value atomically is safe even on platforms where uint64
cannot be read without observing a torn value."

Only xlogprefetcher.c uses the term "torn" for a value by the way, but
for a write.

>>> @@ -1605,9 +1602,14 @@ LWLockConflictsWithVar(LWLock *lock,
>>> *
>>> * Note: this function ignores shared lock holders; if the lock is held
>>> * in shared mode, returns 'true'.
>>> + *
>>> + * Be careful that LWLockConflictsWithVar() does not include a memory barrier,
>>> + * hence the caller of this function may want to rely on an explicit barrier or
>>> + * a spinlock to avoid memory ordering issues.
>>> */
>>
>> s/careful/aware/?
>>
>> s/spinlock/implied barrier via spinlock or lwlock/?
>
> Done.

Okay to mention a LWLock here, even if the sole caller of this routine
relies on a spinlock.

0001 looks OK-ish seen from here. Thoughts?
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2023-07-21 06:05:55 Re: Synchronizing slots from primary to standby
Previous Message Bharath Rupireddy 2023-07-21 05:54:08 Re: Support worker_spi to execute the function dynamically.