From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())
Date: 2023-01-24 13:30:00
Message-ID: CALj2ACUc=Hcd3X+mBNAF2o9Y3r=cxVoz4JBayBTz=jw74KopHQ@mail.gmail.com
Lists: pgsql-hackers
On Tue, Dec 6, 2022 at 12:00 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi
Thanks for reviewing.
> FWIW, I don't see an advantage in 0003. If it allows us to make something else
> simpler / faster, cool, but on its own it doesn't seem worthwhile.
I've discarded this change.
> On 2022-12-02 16:31:58 -0800, Nathan Bossart wrote:
> > On Fri, Dec 02, 2022 at 04:32:38PM +0530, Bharath Rupireddy wrote:
> > > On Fri, Dec 2, 2022 at 6:10 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >> I'm not sure this is quite right - don't we need a memory barrier. But I don't
> > >> see a reason to not just leave this code as-is. I think this should be
> > >> optimized entirely in lwlock.c
> > >
> > > Actually, we don't need that at all as LWLockWaitForVar() will return
> > > immediately if the lock is free. So, I removed it.
> >
> > I briefly looked at the latest patch set, and I'm curious how this change
> > avoids introducing memory ordering bugs. Perhaps I am missing something
> > obvious.
>
> I'm a bit confused too - the comment above talks about LWLockWaitForVar(), but
> the patches seem to optimize LWLockUpdateVar().
>
> I think it'd be safe to optimize LWLockConflictsWithVar(), due to some
> pre-existing, quite crufty, code. LWLockConflictsWithVar() says:
>
> * Test first to see if it the slot is free right now.
> *
> * XXX: the caller uses a spinlock before this, so we don't need a memory
> * barrier here as far as the current usage is concerned. But that might
> * not be safe in general.
>
> which happens to be true in the single, indirect, caller:
>
> /* Read the current insert position */
> SpinLockAcquire(&Insert->insertpos_lck);
> bytepos = Insert->CurrBytePos;
> SpinLockRelease(&Insert->insertpos_lck);
> reservedUpto = XLogBytePosToEndRecPtr(bytepos);
>
> I think at the very least we ought to have a comment in
> WaitXLogInsertionsToFinish() highlighting this.
Done.
> It's not at all clear to me that the proposed fast-path for LWLockUpdateVar()
> is safe. I think at the very least we could end up missing waiters that we
> should have woken up.
>
> I think it ought to be safe to do something like
>
> pg_atomic_exchange_u64()..
> if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
> return;
>
> because the pg_atomic_exchange_u64() will provide the necessary memory
> barrier.
Done.
I'm attaching the v3 patch with the above review comments addressed.
Hopefully, no memory ordering issues now. FWIW, I've added it to CF
https://commitfest.postgresql.org/42/4141/.
Test results with the v3 patch and the insert workload are the same as
in the earlier run - TPS keeps scaling beyond 512 clients as expected,
reaching about 2X HEAD at 2048 and 4096 clients.
HEAD:
clients   tps
1         1380.411086
2         1358.378988
4         2701.974332
8         5925.380744
16        10956.501237
32        20877.513953
64        40838.046774
128       70251.744161
256       108114.321299
512       120478.988268
768       99140.425209
1024      93645.984364
2048      70111.159909
4096      55541.804826
v3 PATCHED:
clients   tps
1         1493.800209
2         1569.414953
4         3154.186605
8         5965.578904
16        11912.587645
32        22720.964908
64        42001.094528
128       78361.158983
256       110457.926232
512       148941.378393
768       167256.590308
1024      155510.675372
2048      147499.376882
4096      119375.457779
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment: v3-0001-Optimize-WAL-insertion-lock-acquisition-and-relea.patch (application/octet-stream, 9.5 KB)