Re: Potential ABI breakage in upcoming minor releases

From: Mats Kindahl <mats(at)timescale(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christoph Berg <myon(at)debian(dot)org>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Potential ABI breakage in upcoming minor releases
Date: 2024-11-15 13:21:40
Message-ID: CA+144253W6LPBZyOK9qHfghaTbpkK9h-ASwdCij0cNiP5K=Q3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 14, 2024 at 11:41 PM Noah Misch <noah(at)leadboat(dot)com> wrote:

> On Thu, Nov 14, 2024 at 09:33:32PM +0100, Alvaro Herrera wrote:
> > On 2024-Nov-14, Tom Lane wrote:
> > > Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > > > But timescale did crash:
> > >
> > > So what is timescale doing differently?
>
> > src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo *
> > src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel)
> > src/ts_catalog/catalog.c-{
> > src/ts_catalog/catalog.c- ResultRelInfo *resultRelInfo;
> > src/ts_catalog/catalog.c-
> > src/ts_catalog/catalog.c: resultRelInfo = makeNode(ResultRelInfo);
> > src/ts_catalog/catalog.c- resultRelInfo->ri_RangeTableIndex = 0; /*
> dummy */
> > src/ts_catalog/catalog.c- resultRelInfo->ri_RelationDesc = heapRel;
> > src/ts_catalog/catalog.c- resultRelInfo->ri_TrigDesc = NULL; /* we
> don't fire triggers */
> > src/ts_catalog/catalog.c-
> > src/ts_catalog/catalog.c- ExecOpenIndices(resultRelInfo, false);
>
> This is a copy of PostgreSQL's CatalogOpenIndexes(). Assuming timescaledb
> uses it like PostgreSQL uses CatalogOpenIndexes(), it's fine.
> Specifically,
> CatalogOpenIndexes() is fine since PostgreSQL does not pass the
> ResultRelInfo
> to ModifyTable.
>

This seems to be old code, so I have replaced it with CatalogOpenIndexes
(and friends) and it seems to work fine.

>
> > Oh, hmm, there's this bit which uses struct assignment into a stack
> > element:
>
> Yes, stack allocation of a ResultRelInfo sounds relevant to the crash. It
> qualifies as a "very unusual code pattern" in the postgr.es/c/e54a42a
> sense.
>

This is a lesson for us to make sure that we have better checks for these
kinds of situations. We should be able to deal with the extension by
allocating, e.g., twice the needed memory for the struct (and make sure to
zero it out). Then there is no risk of stepping on other object's data.

>
>
> I'm hearing the only confirmed impact on non-assert builds is the need to
> recompile timescaledb. (It's unknown whether recompiling will suffice for
> timescaledb. For assert builds, six PGXN extensions need recompilation.)

I re-compiled and ran with the same version on the server and extension and
that seems to work fine. Nothing that stands out. (We have a few other
things that behave strange, so might have to take that back.)

> I don't see us issuing another set of back branch releases for the purpose
> of
> making a v16.4-built timescaledb avoid a rebuild. The target audience
> would
> be someone who can get a new PostgreSQL build but can't get a new
> timescaledb
> build. That feels like a small audience. What's missing in that analysis?
>
> Going forward, for struct field additions, I plan to make my own patches
> more
> ABI-breakage-averse than the postgr.es/c/e54a42a standard.
>
>
>

--
Best wishes,
Mats Kindahl, Timescale

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2024-11-15 13:27:43 Re: POC, WIP: OR-clause support for indexes
Previous Message Mats Kindahl 2024-11-15 13:13:35 Re: Potential ABI breakage in upcoming minor releases