Re: Bus error in pg_logical_slot_get_changes (9.4.7, sparc)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Christoph Berg <myon(at)debian(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Bus error in pg_logical_slot_get_changes (9.4.7, sparc)
Date: 2016-04-14 19:14:37
Message-ID: 15040.1460661277@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Christoph Berg <myon(at)debian(dot)org> writes:
> Re: Tom Lane 2016-04-14 <423(dot)1460653903(at)sss(dot)pgh(dot)pa(dot)us>
>> Proposed patch attached, but I'm still wondering why the alignment-picky
>> buildfarm members aren't all failing on this. It seems to strongly
>> suggest that the regression tests' coverage is lacking here.

> This patch works both on sparc and amd64. Thanks!
> (Tested on 9.4.7 only.)

I've been looking around and testing some more. I noticed that the only
caller of ReorderBufferRestoreChange always passes a maxaligned buffer,
so most of the changes I suggested are unnecessary. AFAICT, it's only
the second fetch of t_len that is at any risk. Even there, at least in
HEAD, I cannot construct a test case in which the first tuple's t_len is
not a multiple of 4. It's hard to tell what is causing that, because
surely heap_form_tuple guarantees no such thing. I have a suspicion that
something in the impenetrable thicket that passes for prefix/suffix
optimization in log_heap_update is forcing the WAL-logged tuple length to
be rounded off to sizeof(int) (*not* MAXALIGN).

It might be that 9.4 acts differently here, but the lack of crashes in our
buildfarm suggests otherwise. The best theory I can come up with is that
somehow, your Sparc compiler is deciding that "data" is 8-aligned at this
point and generating a code sequence that depends on that, even though
it's only being asked to fetch a uint32. If the machine were a 64-bit
machine, the compiler could legitimately make such a deduction because of
the presence of a pointer field in HeapTupleData; but I dunno how it gets
to that conclusion if the architecture is 32-bit. Anyway, there are few
enough architectures where it could make sense to use an 8-aligned
instruction to fetch a 4-byte value that it's not so surprising we've not
seen it crash elsewhere.

So I now think my previous patch is overkill, and we should instead use
something like the attached.

regards, tom lane

Attachment Content-Type Size
alignment-fix-2.patch text/x-diff 1.9 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2016-04-14 19:18:36 Re: Bus error in pg_logical_slot_get_changes (9.4.7, sparc)
Previous Message Andres Freund 2016-04-14 18:55:52 Re: Bus error in pg_logical_slot_get_changes (9.4.7, sparc)