[PATCH] Fix crash in int8_avg_combine().

From: Hadi Moshayedi <hadi(at)moshayedi(dot)net>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Andres Freund <andres(at)citusdata(dot)com>, Ozgun Erdogan <ozgun(at)citusdata(dot)com>, Sumedh Pathak <sumedh(at)citusdata(dot)com>
Subject: [PATCH] Fix crash in int8_avg_combine().
Date: 2017-11-26 03:43:49
Message-ID: CAK=1=WpZLa3Z9txmqzX8ujzXnUHE7_-jWxJBr7d7be+-9Afw6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

While doing some tests on REL_10_STABLE, I was getting run-time exceptions
at int8_avg_combine() at the following line:

state1->sumX = state2->sumX;

After some debugging, I noticed that palloc()’s alignment is 8-bytes, while
this statement (which moves a __int128 from one memory location to another
memory location) expects 16-byte memory alignments. So when either state1
or state2 is not 16-byte aligned, this crashes.

When I disassemble the code, the above statement is translated to a pair of
movdqa and movaps assignments when compiled with -O2:

movdqa c(%rdx), %xmm0
movaps %xmm0, c(%rcx)

Looking at “Intel 64 and IA-32 Architectures Software Developer’s Manual,
Volume 2”, both of these instructions expect 16-byte aligned memory
locations, or a general-protection exception will be generated.

I wasn’t getting the exception always. For example, the same query that
crashed in REL_10_STABLE didn’t crash in master. But I think that was just
lucky allocation, and we will eventually see cases of this crash in all
versions that use __int128 assignment in int8_avg_combine().

I’ve attached a patch which sets the MAXIMUM_ALIGNOF correctly when
__int128’s are available, so fixes the crash. This patch is on top of
REL_10_STABLE, which is the branch I was getting the crash at. I can create
patches for other branches if we think this is the proper change.

The sequence for which I got the crash in REL_10_STABLE was the following
sequence of commands after a fresh initdb:

CREATE TABLE t(a BIGINT);
INSERT INTO t SELECT * FROM generate_series(1, 10000000);
SELECT sum(a) FROM t;

I’m not sure if this will crash for everyone, since you can be lucky and
have both states assigned 16-byte aligned locations. This was the case for
me in master branch. "SELECT version()" is "PostgreSQL 10.0 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.2.1 20170915 (Red Hat
7.2.1-2), 64-bit", if that is related.

I notice that there’s a comment in configure.in that says the penalty of
using 16-bit alignment might be too much for disk and memory space. If this
is the case, then we need to modify numeric.c to make sure that it never
use any instructions with 16-byte alignment requirements. I can do that if
that is the consensus.

Or maybe we can create an allocation function for 16-byte aligned
allocations.

Any thoughts?

-- Hadi

Attachment Content-Type Size
fix_alignment.REL10_0_STABLE.patch text/x-patch 9.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-11-26 03:47:08 Re: [PATCH] Fix crash in int8_avg_combine().
Previous Message Alvaro Herrera 2017-11-26 03:32:08 Re: [HACKERS] PATCH: multivariate histograms and MCV lists