What is happening on buildfarm member crake?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: What is happening on buildfarm member crake?
Date: 2014-01-19 20:55:30
Message-ID: 8802.1390164930@sss.pgh.pa.us
Lists: pgsql-hackers

Since contrib/test_shm_mq went into the tree, crake has been crashing
intermittently (maybe one time in three) at the contribcheck step.
While there's not conclusive proof that test_shm_mq is at fault, the
circumstantial evidence is pretty strong. For instance, in the
postmaster log for the last failure,
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2014-01-19%2013%3A26%3A08
we see

LOG: registering background worker "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: worker process: test_shm_mq (PID 12585) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"
LOG: registering background worker "test_shm_mq"
LOG: registering background worker "test_shm_mq"
LOG: registering background worker "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: worker process: test_shm_mq (PID 12588) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"
LOG: worker process: test_shm_mq (PID 12586) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"
LOG: worker process: test_shm_mq (PID 12587) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"
LOG: server process (PID 12584) was terminated by signal 11: Segmentation fault
LOG: terminating any other active server processes

Comparing the PIDs makes it seem pretty likely that what's crashing is the
backend running the test_shm_mq test. Since the connected psql doesn't
notice any problem, most likely the crash occurs during backend shutdown
after psql has disconnected (which would also explain why no "current
query" gets shown in the crash report).
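The PID comparison above can be sketched as a small script. This is only an illustration, not part of the buildfarm tooling: the regular expressions and the summarize() helper are hypothetical names written for this example, and the "adjacent PID" check assumes the kernel hands out PIDs roughly sequentially, which is typical but not guaranteed.

```python
import re

# Postmaster log excerpt quoted in the message above.
LOG = """\
LOG: registering background worker "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: worker process: test_shm_mq (PID 12585) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"
LOG: registering background worker "test_shm_mq"
LOG: registering background worker "test_shm_mq"
LOG: registering background worker "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: worker process: test_shm_mq (PID 12588) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"
LOG: worker process: test_shm_mq (PID 12586) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"
LOG: worker process: test_shm_mq (PID 12587) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"
LOG: server process (PID 12584) was terminated by signal 11: Segmentation fault
LOG: terminating any other active server processes
"""

# Hypothetical patterns matching the two kinds of lines we care about.
WORKER_EXIT = re.compile(r"worker process: (\S+) \(PID (\d+)\) exited with exit code \d+")
SERVER_CRASH = re.compile(r"server process \(PID (\d+)\) was terminated by signal \d+")


def summarize(log_text):
    """Return ({worker_pid: worker_name}, [crashed_server_pids])."""
    workers = {}
    crashed = []
    for line in log_text.splitlines():
        m = WORKER_EXIT.search(line)
        if m:
            workers[int(m.group(2))] = m.group(1)
            continue
        m = SERVER_CRASH.search(line)
        if m:
            crashed.append(int(m.group(1)))
    return workers, crashed


workers, crashed = summarize(LOG)
for pid in crashed:
    # The crashed process is not one of the background workers ...
    is_worker = pid in workers
    # ... but its PID sits directly before the first worker's PID, which is
    # what one would expect if it is the backend that launched them.
    precedes_worker = (pid + 1) in workers
    print(pid, "worker" if is_worker else "not a worker",
          "adjacent to worker PIDs" if precedes_worker else "not adjacent")
```

The crashed PID 12584 is not among the worker PIDs 12585 through 12588 but immediately precedes them, which is consistent with it being the regular backend that ran the test and started the workers.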

crake is running Fedora 16 (F16) x86_64, which I don't have installed here. So I
tried to reproduce this on RHEL6 and F19 x86_64 machines, with build
options matching what crake says it's using, with absolutely no success.
I don't think any other buildfarm critters are showing this either, which
makes it look like it's possibly specific to the particular compiler
version crake has got. Or maybe there's some other factor needed to
trigger it. Anyway, I wonder whether Andrew could get a stack trace from
the core dump next time it happens.

regards, tom lane
