Re: MAX_BACKENDS size (comment accuracy)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jacob Brazeal <jacob(dot)brazeal(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MAX_BACKENDS size (comment accuracy)
Date: 2025-01-26 18:10:35
Message-ID: wptizm4qt6yikgm2pt52xzyv6ycmqiutloyvypvmagn7xvqkce@d4xuv3mylpg4
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2025-01-25 16:06:29 -0800, Jacob Brazeal wrote:
> > In lwlocks.c, we have the following comment, related to LWLock state:
> >
> >
> > */* Must be greater than MAX_BACKENDS - which is 2^23-1, so we're fine.
> > */#define LW_SHARED_MASK ((uint32) ((1 << 24)-1))*
> >
> > However, MAX_BACKENDS is set to 2^18-1. Here is the comment in
> > postmaster.h:

> > */* * Note: MAX_BACKENDS is limited to 2^18-1 because that's the width
> > reserved * for buffer references in buf_internals.h. This limitation could
> > be lifted * by using a 64bit state; but it's unlikely to be worthwhile as
> > 2^18-1 * backends exceed currently realistic configurations. Even if that
> > limitation * were removed, we still could not a) exceed 2^23-1 because
> > inval.c stores * the ProcNumber as a 3-byte signed integer, b) INT_MAX/4
> > because some places * compute 4*MaxBackends without any overflow check.
> > This is rechecked in the * relevant GUC check hooks and in
> > RegisterBackgroundWorker(). */#define MAX_BACKENDS 0x3FFFF*
> >
> > 2^23-1 is noted as an additional upper limit, but I wonder if it'd be
> > correct to update the comment in lwlocks.c to

> Thinking a bit further about this, the purpose of the LW_SHARED_MASK
> section of the state is to count the number of lock-sharers. Thus, we only
> care about the actual number of backends (up to 2^18-1) here and not the
> size of the ProcNumber data type. So I do think the comment should read
> 2^18-1 and not 2^23-1. Here is a patch to that effect.

At the time that comment was written MAX_BACKENDS was 2^23-1. Alexander and I
lowered MAX_BACKENDS in 48354581a49c to 2^18-1 and apparently didn't think
about the comment in lwlock.h.

commit 48354581a49c30f5757c203415aa8412d85b0f70
Author: Andres Freund <andres(at)anarazel(dot)de>
Date: 2016-04-10 20:12:32 -0700

Allow Pin/UnpinBuffer to operate in a lockfree manner.
...

To allow atomic operations to be used, cram BufferDesc's flags,
usage_count, buf_hdr_lock, refcount into a single 32bit atomic variable;
that allows to manipulate them together using 32bit compare-and-swap
operations. This requires reducing MAX_BACKENDS to 2^18-1 (which could
be lifted by using a 64bit field, but it's not a realistic configuration
atm).

> > */* Must be greater than MAX_BACKENDS - which is 2^18-1, so we're fine. */*
> >
> > I'm not sure if this could lead to us actually saving some bits in the
> > lwlock state, or if we could do anything useful with them anyway.

It's not quite enough bits to be useful I think. If we could free 16 bits (or
we defined LWLock.tranche to be narrower), we could store the tranche as part
of the atomic, and save 4 bytes (2 bytes of alignment padding, 2 bytes for the
tranche). But unless we reduce the size of the tranche field a decent bit
there's not enough space.

I'd really like to reduce the size of struct LWLock, but I think that'll need
a bit more changes. I think we should use a single 64bit atomic and store the
list of waiters inside the atomic.

flags: 3
exclusively locked: 1 bit
share locks: 18 bits
head of wait list: 18 bits (proc number offset)
tail of wait list: 18 bits (proc number offset)
= 58

That doesn't leave enough space for the tranche unfortunately. But if we
reduced MAX_BACKENDS to 2^16-1 and reduced the size of the tranche to to 12
bits, it'd work!

I continue to believe that MAX_BACKENDS of 2^16-1 would be sufficient - we're
far from that being a realistic limit. Halfing the size of LWLock and laying
the ground work for making the wait-list lock-free imo would be very well
worth the reduction in an unrealistic limit...

Anyway, that's really just a periodic rant / project suggestion...

> diff --git a/src/backend/storage/lmgr/lwlock.c
> b/src/backend/storage/lmgr/lwlock.c index 2f558ffea1..d3a2619072 100644 ---
> a/src/backend/storage/lmgr/lwlock.c +++ b/src/backend/storage/lmgr/lwlock.c
> @@ -99,7 +99,7 @@ #define LW_VAL_SHARED 1
>
> #define LW_LOCK_MASK ((uint32) ((1 << 25)-1))
> -/* Must be greater than MAX_BACKENDS - which is 2^23-1, so we're fine. */
> +/* Must be greater than MAX_BACKENDS - which is 2^18-1, so we're fine. */
> #define LW_SHARED_MASK ((uint32) ((1 << 24)-1))
>
> StaticAssertDecl(LW_VAL_EXCLUSIVE > (uint32) MAX_BACKENDS,

I'm inclined to think that we should adjust the various constants at the same
time as the comment? It's imo somewhat weird to have bits LW_SHARED_MASK that
can never be set...

I wonder if we should instead define LW_VAL_EXCLUSIVE and LW_SHARED_MASK using
MAX_BACKENDS and have a static assertion ensuring it doesn't overlap with flag
bits? That way we don't have two sets of defines to keep in sync.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-01-26 18:15:11 Re: MAX_BACKENDS size (comment accuracy)
Previous Message Yura Sokolov 2025-01-26 18:09:32 Re: New process of getting changes into the commitfest app