Re: PG in container w/ pid namespace is init, process exits cause restart

From: ilmari(at)ilmari(dot)org (Dagfinn Ilmari Mannsåker )
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PG in container w/ pid namespace is init, process exits cause restart
Date: 2021-05-03 21:13:29
Message-ID: 87eeeno70m.fsf@wibble.ilmari.org
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
>> On 2021-May-03, Andres Freund wrote:
>>> The issue turns out to be that postgres was in a container, with pid
>>> namespaces enabled. Because postgres was run directly in the container,
>>> without a parent process inside, it thus becomes pid 1. Which mostly
>>> works without a problem. Until, as the case here with the archive
>>> command, a sub-sub process exits while it still has a child. Then that
>>> child gets re-parented to postmaster (as init).
>
>> Hah .. interesting. I think we should definitely make this work, since
>> containerized stuff is going to become more and more prevalent.
>
> How would we make it "work"? The postmaster can't possibly be expected
> to know the right thing to do with unexpected children.
>
>> I guess we can do that in older releases, but do we really need it? As
>> I understand, the only thing we need to do is verify that the dying PID
>> is a backend PID, and not cause a crash cycle if it isn't.

> Maybe we should put in a startup-time check, analogous to the
> can't-run-as-root test, that the postmaster mustn't be PID 1.

Given that a number of minimal `init`s already exist specifically for
the case of running a single application in a container, I don't think
Postgres should reinvent that wheel. A quick eyeball of the output of
`apt search container init` on a Debian Bullseye system reveals at
least four:

- https://github.com/Yelp/dumb-init
- https://github.com/krallin/tini
- https://github.com/fpco/pid1
- https://github.com/openSUSE/catatonit

The first one also explains why there's more to being PID 1 than just
handling reparented children.
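
To illustrate, here is a rough Python sketch of the two basic jobs such a
wrapper takes on: forwarding signals to its single child, and reaping
whatever `wait()` hands back, including reparented orphans. It is not taken
from any of the projects above. The name `run_as_init` and the list of
forwarded signals are assumptions made for this example, and real inits
handle more than this (process groups, TTYs, and so on).

```python
import os
import signal


def run_as_init(argv):
    """Run argv as a child, forward signals to it, and reap all children.

    Returns a shell-style exit code for the main child. Hypothetical
    helper for illustration only, not code from any real init.
    """
    child = os.fork()
    if child == 0:
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # exec failed, mimic the shell's "not found" code

    def forward(signum, _frame):
        # Pass the signal on to the main child instead of acting on it here.
        try:
            os.kill(child, signum)
        except ProcessLookupError:
            pass  # child is already gone

    # Assumed set of signals worth forwarding; a real init forwards more.
    forwarded = (signal.SIGTERM, signal.SIGINT, signal.SIGHUP, signal.SIGQUIT)
    previous = {sig: signal.signal(sig, forward) for sig in forwarded}
    try:
        while True:
            # As PID 1 this also returns orphans reparented to us; reaping
            # them here is what keeps zombies from piling up.
            pid, status = os.wait()
            if pid != child:
                continue  # an adopted orphan, nothing more to do
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return 128 + os.WTERMSIG(status)
    finally:
        for sig, handler in previous.items():
            if handler is not None:
                signal.signal(sig, handler)
```

Run as PID 1 with something like `run_as_init(["postgres", "-D", datadir])`
(where `datadir` is a placeholder), the server would keep an ordinary PID and
would never see the unexpected children itself.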

- ilmari
--
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
the consequences of." -- Skud's Meta-Law
