printf %s with NULL pointer (was Re: BUG #17098: Assert failed on composing an error message when adding a type to an extension being dropped)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: printf %s with NULL pointer (was Re: BUG #17098: Assert failed on composing an error message when adding a type to an extension being dropped)
Date: 2021-07-12 17:20:56
Message-ID: 2984422.1626110456@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

[ moved from -bugs list for more visibility ]

I wrote:
> However, that root issue is converted from a relatively minor bug into
> a server crash because snprintf.c treats a NULL pointer passed to %s
> as a crash-worthy error. I have advocated for that behavior in the
> past, but I'm starting to wonder if it wouldn't be wiser to change
> over to the glibc-ish behavior of printing "(null)" or the like.
> It seems like we've long since found all the interesting bugs that
> the assert-or-crash behavior could reveal, and now we're down to
> weird corner cases where its main effect is to weaken our robustness.

I did a little more thinking about this. I believe the strongest
argument for having snprintf.c crash on NULL is that it keeps us
from relying on having more-forgiving behavior in case we're using
platform-supplied *printf functions (cf commit 0c62356cc). However,
that is only relevant for code that's meant to go into pre-v12 branches,
since we stopped using libc's versions of these functions in v12.

So one plausible way to approach this is to say that we should wait
until v11 is EOL and then change it.

However, that feels overly conservative to me. I doubt that anyone
is *intentionally* relying on *printf not crashing on a NULL pointer.
For example, in the case that started this thread:

if (OidIsValid(oldExtension))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("%s is already a member of extension \"%s\"",
getObjectDescription(&object, false),
get_extension_name(oldExtension))));

the problem is failure to consider the possibility that
get_extension_name could return NULL due to a just-committed
concurrent DROP EXTENSION. I'm afraid there are a lot of
corner cases like that still lurking.

So my feeling about this is that switching snprintf.c's behavior
would produce some net gain in robustness for v12 and up, while
not making things any worse for the older branches. I still hold
to the opinion that we've already flushed out all the cases of
passing NULL that we're likely to find via ordinary testing.

Thoughts?

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Pawel Kudzia 2021-07-12 17:24:29 Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Previous Message Tom Lane 2021-07-12 15:57:18 Re: BUG #17100: undefined reference to `pg_qsort and pq_xxx

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-07-12 18:49:12 Re: "debug_invalidate_system_caches_always" is too long
Previous Message gkokolatos 2021-07-12 16:46:29 Re: Introduce pg_receivewal gzip compression tests