RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

From: "Sisson, David" <David(dot)Sisson(at)dell(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christophe Pettus <xof(at)thebuild(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, "Howell, Stephen" <Stephen(dot)Howell(at)dell(dot)com>, "Sisson, David" <David(dot)Sisson(at)dell(dot)com>
Subject: RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes
Date: 2023-01-23 21:41:15
Message-ID: LV2PR19MB576597174A60B2687D79534B8EC89@LV2PR19MB5765.namprd19.prod.outlook.com
Lists: pgsql-bugs

The controllers generally pull in the latest PostgreSQL release.
Getting an updated PostgreSQL version is therefore easy.

Unfortunately, getting a bug fix is a lot harder.
One controller has carried this defect for over a year with no end in sight.

Found this:
https://github.com/opencontainers/runtime-spec/issues/1050

Looks like a PR exists for it but the solution is invalid.
https://github.com/kailun-qin/runtime-spec/commit/a6505339204535150260d8e4f0bc112628f1fa87

More info:
https://www.postgresql.org/message-id/flat/20200218093240.jd3lgoxmisyl2tt5%40localhost#61c2c7fc3d3dd80512c9130b6967be16

It would be nice if "try" worked as expected.
I totally understand that this is not a PostgreSQL issue, but any assistance would be much appreciated.

Thanks,
David Angel

Internal Use - Confidential

-----Original Message-----
From: Andres Freund <andres(at)anarazel(dot)de>
Sent: Monday, January 23, 2023 3:10 PM
To: Sisson, David
Cc: Tom Lane; Christophe Pettus; Tomas Vondra; pgsql-bugs(at)lists(dot)postgresql(dot)org; Howell, Stephen
Subject: Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Hi,

On 2023-01-23 20:35:17 +0000, Sisson, David wrote:
> A quick and dirty solution could be to alter initdb to catch the exception and retry using a copy of the sample with "huge_pages=false".
> Would that be acceptable?

This is a kubernetes or postgres-operator bug (setting up the wrong cgroup limit, which the docs explicitly warn against doing). I don't think we want to accumulate workarounds like that in postgres.

> Passing in a config setting into initdb would still require a rebuild of all controllers.
> That could take months to years at best.

Huh. I don't know anything about the controller, but that seems problematic independent of this specific issue. And you'd still need to deploy a new version of postgres to get such changes...

> Internal Use - Confidential

Hardly.

Greetings,

Andres Freund
