From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: src/test/subscription/t/002_types.pl hanging on particular environment |
Date: | 2017-09-18 10:18:10 |
Message-ID: | 20170918101810.l5irpe2bdb6xkksi@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2017-09-18 21:57:04 +1200, Thomas Munro wrote:
> The subscription tests 002_types.pl sometimes hangs for a while and
> then times out when run on a Travis CI build VM running Ubuntu Trusty
> if --enable-coverage is used.
Yea, I saw that too.
> I guess it might be a timing/race
> problem because I can't think of any mechanism specific to coverage
> instrumentation except that it slows stuff down, but I don't know.
> The only other thing I can think of is that perhaps the instrumented
> code is writing something to stdout or stderr and that's finding its
> way into some protocol stream and confusing things, but I can't
> reproduce this on any of my usual development machines. I haven't
> tried that exact operating system. Maybe it's a bug in the toolchain,
> but that's an Ubuntu LTS release so I assume other people use it
> (though I don't see it in the buildfarm).
I've run this locally through a number of iterations with coverage
enabled. I think I reproduced it once, but unfortunately I carried on
without investigating because I was working on something else at that moment.
It might be worthwhile to play around with replacing the "or die" in

    my $synced_query =
      "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('s', 'r');";
    $node_subscriber->poll_query_until('postgres', $synced_query)
      or die "Timed out while waiting for subscriber to synchronize data";

with something like

      or diag($node_subscriber->safe_psql('postgres', 'SELECT * FROM pg_subscription_rel'));

just to know where to go from here.
> WARNING: terminating connection because of crash of another server process
> DETAIL: The postmaster has commanded this server process to roll
> back the current transaction and exit, because another server process
> exited abnormally and possibly corrupted shared memory.
> HINT: In a moment you should be able to reconnect to the database
> and repeat your command.
>
> As far as I know these are misleading, it's really just an immediate
> shutdown. There is no core file, so I don't believe anything actually
> crashed.
I was about to complain about these, for entirely unrelated reasons. I
think it's a bad idea to emit these warnings, and there are a couple of
complaints on the lists too. It's not entirely trivial to fix though :(
Greetings,
Andres Freund