undersized unions

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: undersized unions
Date: 2023-02-04 13:07:08
Message-ID: 20230204130708.pta7pjc4dvu225ey@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

gcc warns about code like this:

typedef union foo
{
int i;
long long l;
} foo;

foo * assign(int i) {
foo *p = (foo *) __builtin_malloc(sizeof(int));
p->i = i;

return p;
}

<source>: In function 'assign':
<source>:9:6: warning: array subscript 'foo[0]' is partly outside array bounds of 'unsigned char[4]' [-Warray-bounds=]
9 | p->i = i;
| ^~
<source>:8:22: note: object of size 4 allocated by '__builtin_malloc'
8 | foo *p = (foo *) __builtin_malloc(sizeof(int));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compiler returned: 0

I can't really tell if gcc is right or wrong wrong to warn about
this. On the one hand it's a union, and we only access the element that
is actually backed by memory, on the other hand, the standard does say
that the size of a union is the largest element, so we are pointing to
something undersized.

We actually have a fair amount of code like that, but currently are
escaping most of the warnings, because gcc doesn't know that palloc() is
an allocator. With more optimizations (particularly with LTO), we end up
with more of such warnings. I'd like to annotate palloc so gcc
understands the size, as that does help to catch bugs when confusing the
type. It also helps static analyzers.

An example of such code in postgres:

../../home/andres/src/postgresql/src/backend/utils/adt/numeric.c: In function 'make_result_opt_error':
../../home/andres/src/postgresql/src/backend/utils/adt/numeric.c:7628:23: warning: array subscript 'struct NumericData[0]' is partly outside array bounds of 'unsigned char[6]' [-Warray-bounds=]
7628 | result->choice.n_header = sign;
| ^~
../../home/andres/src/postgresql/src/backend/utils/adt/numeric.c:7625:36: note: object of size 6 allocated by 'palloc'
7625 | result = (Numeric) palloc(NUMERIC_HDRSZ_SHORT);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~

Given that Numeric is defined as:

struct NumericData
{
int32 vl_len_; /* varlena header (do not touch directly!) */
union NumericChoice choice; /* choice of format */
};

and
#define NUMERIC_HDRSZ_SHORT (VARHDRSZ + sizeof(uint16))

Here I can blame gcc even less - result is indeed not a valid pointer to
struct NumericData, because sizeof(NumericData) is 8, not 6. I suspect
it's actually undefined behaviour to ever dereference a Numeric pointer,
when the pointer points to something smaller than sizeof(NumericData).

Greetings,

Andres Freund

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-02-04 13:31:23 Re: [PATCH] Compression dictionaries for JSONB
Previous Message Andres Freund 2023-02-04 13:01:55 Re: Exit walsender before confirming remote flush in logical replication