Re: Out-of-memory error reports in libpq

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Out-of-memory error reports in libpq
Date: 2021-07-27 23:29:28
Message-ID: 04C69A6E-0619-43B5-AD1E-0D2334E9D904@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/27/21, 3:41 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> The first half of that just saves a few hundred bytes of repetitive
> coding. However, I think that the addition of recovery logic is
> important for robustness, because as things stand libpq may be
> worse off than before for OOM handling. Before ffa2e4670, almost
> all of these call sites did printfPQExpBuffer(..., "out of memory").
> That would automatically clear the message buffer to empty, and
> thereby be sure to report the out-of-memory failure if at all
> possible. Now we might fail to report the thing that the user
> really needs to know to make sense of what happened.

IIUC, before ffa2e4670, callers mainly used printfPQExpBuffer(), which
always cleared the buffer before attempting to append the OOM message.
With ffa2e4670 applied, callers always attempt to append the OOM
message without resetting the buffer first. With this new change,
callers will attempt to append the OOM message without resetting the
buffer first, but if that fails, we fall back to the original behavior
before ffa2e4670.

+ if (PQExpBufferBroken(errorMessage))
+ {
+ resetPQExpBuffer(errorMessage);
+ appendPQExpBufferStr(errorMessage, msg);
+ }

I see that appendPQExpBufferStr() checks whether the buffer is broken
by way of enlargePQExpBuffer(), so the fallback steps roughly match
the calls to printfPQExpBuffer() before ffa2e4670.

- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("out of memory allocating GSSAPI buffer (%d)\n"),
- payloadlen);
+ pqReportOOM(conn);

I see that some context is lost in a few places (e.g., the one above
points to a GSSAPI buffer). Perhaps this extra context could be
useful to identify problematic areas, but it might be unlikely to help
much in these parts of libpq. In any case, the vast majority of
existing callers don't provide any extra context.

Overall, the patch looks good to me.

> Therefore, I feel like this was an oversight in ffa2e4670,
> and we ought to back-patch the attached into v14.

Back-patching to v14 seems reasonable to me.

Nathan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-07-27 23:39:02 Replace l337sp34k in comments.
Previous Message John W Higgins 2021-07-27 23:08:22 Re: Have I found an interval arithmetic bug?