From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | src/test/subscription/t/002_types.pl hanging on particular environment |
Date: | 2017-09-18 09:57:04 |
Message-ID: | CAEepm=2bP3TBMFBArP6o20AZaRduWjMnjCjt22hSdnA-EvrtCw@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
The subscription test 002_types.pl sometimes hangs for a while and
then times out when run on a Travis CI build VM running Ubuntu Trusty
if --enable-coverage is used. I guess it might be a timing/race
problem because I can't think of any mechanism specific to coverage
instrumentation except that it slows stuff down, but I don't know.
The only other thing I can think of is that perhaps the instrumented
code is writing something to stdout or stderr and that's finding its
way into some protocol stream and confusing things, but I can't
reproduce this on any of my usual development machines. I haven't
tried that exact operating system. Maybe it's a bug in the toolchain,
but that's an Ubuntu LTS release so I assume other people use it
(though I don't see it in the buildfarm).
Example:
t/001_rep_changes.pl .. ok
t/002_types.pl ........ #
# Looks like your test exited with 29 before it could output anything.
t/002_types.pl ........ Dubious, test returned 29 (wstat 7424, 0x1d00)
Failed 3/3 subtests
t/003_constraints.pl .. ok
t/004_sync.pl ......... ok
t/005_encoding.pl ..... ok
t/007_ddl.pl .......... ok
Test Summary Report
-------------------
t/002_types.pl (Wstat: 7424 Tests: 0 Failed: 0)
Non-zero exit status: 29
Parse errors: Bad plan. You planned 3 tests but ran 0.
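For readers unfamiliar with the "wstat" figure in the report above, here is a minimal sketch of how it relates to the exit status, assuming the traditional Unix wait-status layout (exit code in the high byte, terminating signal in the low 7 bits). The helper name decode_wait_status is hypothetical and only for illustration:

```python
def decode_wait_status(wstat: int) -> dict:
    """Split a 16-bit wait status into its conventional fields.

    Assumes the traditional Unix layout; this is an illustration,
    not part of the test harness.
    """
    return {
        "exit_code": (wstat >> 8) & 0xFF,   # high byte: exit status
        "signal": wstat & 0x7F,             # low 7 bits: terminating signal, 0 if none
        "core_dumped": bool(wstat & 0x80),  # bit 7: core-dump flag
    }


status = decode_wait_status(7424)  # 7424 == 0x1d00, as shown in the report
print(status)  # {'exit_code': 29, 'signal': 0, 'core_dumped': False}
```

Under that assumption the test script exited on its own with status 29 rather than being killed by a signal.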
Before I figured out that --enable-coverage was relevant, I wondered if the
recent commit 8edacab209957520423770851351ab4013cb0167 which landed
around the time I spotted this might have something to do with it, but
I tried reverting the code change in there and it didn't help. Here's
a build log:
https://travis-ci.org/macdice/postgres/jobs/276752803
As you can see the script used was:
./configure --enable-debug --enable-cassert --enable-tap-tests \
    --enable-coverage && make -j4 all contrib && \
    make -C src/test/subscription check
In this build you can see the output of the following at the end,
which might provide clues to the initiated. You might need to click a
small triangle to unfold the commands' output.
cat ./src/test/subscription/tmp_check/log/002_types_publisher.log
cat ./src/test/subscription/tmp_check/log/002_types_subscriber.log
cat ./src/test/subscription/tmp_check/log/regress_log_002_types
There are messages like this:
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database
and repeat your command.
As far as I know these are misleading: it's really just an immediate
shutdown. There is no core file, so I don't believe anything actually
crashed.
Here's a version that's the same except it doesn't have
--enable-coverage. It passes:
https://travis-ci.org/macdice/postgres/jobs/276771654
Any ideas?
--
Thomas Munro
http://www.enterprisedb.com