Re: DSA_ALLOC_NO_OOM doesn't work

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DSA_ALLOC_NO_OOM doesn't work
Date: 2024-02-13 14:23:20
Message-ID: 6030bdec-0de1-4da2-b0b3-335007eae97f@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

(moving to pgsql-hackers)

On 29/01/2024 14:06, Heikki Linnakangas wrote:
> If you call dsa_allocate_extended(DSA_ALLOC_NO_OOM), it will still
> ereport an error if you run out of space (originally reported at [0]).
>
> Attached patch adds code to test_dsa.c to demonstrate that:
>
> postgres=# select test_dsa_basic();
> ERROR: could not resize shared memory segment "/PostgreSQL.1312700148"
> to 1075843072 bytes: No space left on device
>
> [0] https://github.com/pgvector/pgvector/issues/434#issuecomment-1912744489

I wrote the attached patch to address this, in a fairly straightforward
or naive way. The problem was that even though dsa_allocate() had code
to check the return value of dsm_create(), and return NULL instead of
ereport(ERROR) if the DSA_ALLOC_NO_OOM was set, dsm_create() does not in
fact return NULL on OOM. To fix, I added a DSM_CREATE_NO_OOM option to
dsm_create(), and I also had to punch that through to dsm_impl_op().

This is a little unpolished, but if we want to backpatch something
narrow now, this would probably be the right approach.

However, I must say that the dsm_impl_op() interface is absolutely
insane. It's like someone looked at ioctl() and thought, "hey that's a
great idea!". It mixes all different operations like creating or
destroying a DSM segment together into one function call, and the return
value is just a boolean, even though the function could fail for many
different reasons, and the callers do in fact care about the reason. In
a more natural interface, the different operations would have very
different function signatures.

I think we must refactor that. It might be best to leave this
DSA_ALLOC_NO_OOM bug unfixed in backpatches, and fix it on top of the
refactorings on master only. Later, we can backpatch the refactorings
too if we're comfortable with it; extensions shouldn't be using the
dsm_impl_op() interface directly.

(I skimmed through the thread where the DSM code was added, but didn't
see any mention of why dsm_impl_op() signature is like that:
https://www.postgresql.org/message-id/CA%2BTgmoaDqDUgt%3D4Zs_QPOnBt%3DEstEaVNP%2B5t%2Bm%3DFPNWshiPR3A%40mail.gmail.com)

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
0001-Add-test.patch text/x-patch 1.0 KB
0002-Fix-DSA_ALLOC_NO_OOM.patch text/x-patch 19.2 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2024-02-13 15:34:32 Re: BUG #18340: BitString may break nodetoString() conversion for a raw parse tree
Previous Message PG Bug reporting form 2024-02-13 09:00:00 BUG #18340: BitString may break nodetoString() conversion for a raw parse tree

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2024-02-13 14:34:46 Re: When extended query protocol ends?
Previous Message Ashutosh Bapat 2024-02-13 14:02:28 Re: Printing backtrace of postgres processes