Re: Crash during backend start when low on memory

From: Mats Kindahl <mats(at)timescale(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, daniel(at)yesql(dot)se, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Crash during backend start when low on memory
Date: 2023-01-16 20:52:41
Message-ID: CA+14426M6j-YrkNLu9Hsjerbh3nXaz1gQ-Di+L+EaDoZtkdMVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Fri, Jan 13, 2023 at 4:01 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > On 2023-Jan-13, Mats Kindahl wrote:
> >> I have an improved patch based on raising an error when malloc fails.
>
> > Hmm, but I'm not sure I like what you did to BackendStartup:
>
> Yeah, the BackendStartup change is 100% wrong; it is replacing
> perfectly good code that recovers correctly with bad code that
> will take down the postmaster (not a backend child!) on OOM.
>

AFAICT, the error is caught by the caller (using PG_TRY), executes some
cleanup code, and then continues executing, so it shouldn't take down the
postmaster.

>
> By and large I don't like anything about this patch. You could
> get the same results (i.e. elog(ERROR) on OOM) by replacing the
> malloc calls with pallocs.
>

Well... as Alvaro pointed out, just replacing it with palloc will blow out
part of the structure being allocated for the port information when the
backend starts running.

However, continuing on an approach using memory contexts instead of
directly allocating memory, something like this could be done:

1. There is already a memory context PostmasterContext for the
postmaster startup code, so only allocate stuff here that is not needed
after a backend has started.
2. Add a new memory context for "backend startup" (say,
BackendStartupContext) and use that for allocating stuff that should exist
after the backend starts. For example, the port information. (It might be
possible to move more information here, such as the HBA structures, which
would make it clearer that this is startup memory for the backend. Not sure
if there are some potential issues with this.)
3. Create the BackendStartupContext when starting the backend.
4. Replace previous malloc and friends with palloc in the
PostmasterContext (where suitable).
5. Replace allocation of port structure using malloc with allocation in
BackendStartupContext
6. When the backend starts, it would toss the memory context from the
postmaster, but that would leave the port structure intact. The backend
already tosses the PostmasterContext, but not until after it has done
authentication.
7. When the postmaster has spawned the backend, it should toss the
BackendStartupContext since it is not relevant any more.
8. Since creating the BackendStartupContext can fail with an error, we
need to wrap this in a PG_TRY block. Otherwise the postmaster will exit on
error.

This provides a few advantages compared to allocating memory using malloc
and friends:

1. We allocate memory for the postmaster context in one place, when the
actual startup is running. If we're low on memory at this point, we can
abort early.
2. We allocate memory for the backend startup in one place, when
starting the backend. If we're low on memory at this point, we can abort
early.
3. Code is significantly easier to read and will not require a lot of
checks during startup.

I have a tentative patch, but it contains some problems so I will solve
these first.

Best wishes,
Mats Kindahl

> regards, tom lane
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2023-01-16 20:58:43 Re: Crash during backend start when low on memory
Previous Message Pavel Stehule 2023-01-16 20:34:51 Re: BUG #17698: On SIGTERM, psql terminates, but leaves the statement running