Re: Atomics for heap_parallelscan_nextpage()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Atomics for heap_parallelscan_nextpage()
Date: 2017-08-16 16:55:15
Message-ID: 20170816165515.ri7a6goblyadnklm@alap3.anarazel.de
Lists: pgsql-hackers

On 2017-08-16 11:16:58 -0400, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> > A couple of 32-bit x86 buildfarm members don't seem to be happy with
> > this. I'll investigate, but if anyone has a clue, I'm all ears...
>
> dromedary's issue seems to be alignment:
>
> TRAP: UnalignedPointer("(((uintptr_t) ((uintptr_t)(ptr)) + ((8) - 1)) & ~((uintptr_t) ((8) - 1))) != (uintptr_t)(ptr)", File: "../../../../src/include/port/atomics.h", Line: 452)
> 2017-08-16 11:11:38.558 EDT [75693:3] LOG: server process (PID 76277) was terminated by signal 6: Abort trap
> 2017-08-16 11:11:38.558 EDT [75693:4] DETAIL: Failed process was running: select count(*) from a_star;
>
> Not sure if this is your bug or if it's exposing a pre-existing
> deficiency in the atomics code, viz, failure to ensure that
> pg_atomic_uint64 is actually a 64-bit-aligned type. Andres?

I suspect it's the former. My guess is that the shared memory that holds
the "parallel desc" isn't properly aligned:

void
ExecSeqScanInitializeDSM(SeqScanState *node,
                         ParallelContext *pcxt)
{
...
    pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
    heap_parallelscan_initialize(pscan,
                                 node->ss.ss_currentRelation,
                                 estate->es_snapshot);

/*
...
 * We allocate backwards from the end of the segment, so that the TOC entries
 * can grow forward from the start of the segment.
 */
extern void *
shm_toc_allocate(shm_toc *toc, Size nbytes)
{
    volatile shm_toc *vtoc = toc;
    Size        total_bytes;
    Size        allocated_bytes;
    Size        nentry;
    Size        toc_bytes;

    /* Make sure request is well-aligned. */
    nbytes = BUFFERALIGN(nbytes);
...
    return ((char *) toc) + (total_bytes - allocated_bytes - nbytes);
}

/*
 * Initialize a region of shared memory with a table of contents.
 */
shm_toc *
shm_toc_create(uint64 magic, void *address, Size nbytes)
{
    shm_toc    *toc = (shm_toc *) address;

    Assert(nbytes > offsetof(shm_toc, toc_entry));
    toc->toc_magic = magic;
    SpinLockInit(&toc->toc_mutex);
    toc->toc_total_bytes = nbytes;
    toc->toc_allocated_bytes = 0;
    toc->toc_nentry = 0;

    return toc;
}

Afaict shm_toc_create/shm_toc_allocate don't actually guarantee that the
end of the toc's memory is suitably aligned: the request size is rounded
up, but the segment size it is subtracted from is not. But I haven't had
any coffee yet, so ...
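
To make the suspicion concrete, here is a minimal standalone sketch of the
arithmetic, not the actual shm_toc.c code. All sketch_* names are
hypothetical, 32 is only an assumed value for the buffer alignment, and the
segment base address is assumed to be aligned already, so only offsets are
modelled. The "fixed" variant shows one possible remedy (rounding the
segment size down first); it is not necessarily what should be committed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the alignment macros; 32 is an assumed value. */
#define SKETCH_ALIGNOF_BUFFER 32
#define SKETCH_ALIGN_UP(a, len) \
    (((uintptr_t) (len) + ((a) - 1)) & ~((uintptr_t) ((a) - 1)))

/*
 * Offset from the segment start of a chunk carved from the end, mirroring
 * "total_bytes - allocated_bytes - nbytes" with only the request rounded up.
 */
static size_t
sketch_alloc_offset(size_t total_bytes, size_t allocated_bytes, size_t nbytes)
{
    nbytes = SKETCH_ALIGN_UP(SKETCH_ALIGNOF_BUFFER, nbytes);
    return total_bytes - allocated_bytes - nbytes;
}

/* Variant that first rounds the segment size down to the alignment. */
static size_t
sketch_alloc_offset_fixed(size_t total_bytes, size_t allocated_bytes,
                          size_t nbytes)
{
    total_bytes &= ~((size_t) SKETCH_ALIGNOF_BUFFER - 1);
    return sketch_alloc_offset(total_bytes, allocated_bytes, nbytes);
}

/* True if offset is a multiple of alignment (which must be a power of 2). */
static int
sketch_is_aligned(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset & (alignment - 1)) == 0;
}
```

With a segment of 1004 bytes and a 40-byte request, the request rounds up
to 64 and the first variant yields offset 940, which is not a multiple of
8. The second variant trims the segment to 992 bytes and yields 928, which
is.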

- Andres
