Re: [COMMITTERS] pgsql: Replication lag tracking for walsenders

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Replication lag tracking for walsenders
Date: 2017-04-22 16:27:35
Message-ID: 9852.1492878455@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

I wrote:
> Taking a quick census of other buildfarm machines that are known to be
> running the recovery test, it appears that most (not all) are seeing
> one or both traps. But the test is reporting success anyway, everywhere
> except on Noah's 32-bit AIX critters.

Or, to be a bit more scientific, let's dig into the buildfarm database.
A couple more critters have started running the recovery test since
yesterday; these are the latest reports we have:

pgbfprod=> select sysname, max(snapshot) as newest, count(*) from build_status_log where log_stage = 'recovery-check.log' group by 1 order by 2;
    sysname    |       newest        | count
---------------+---------------------+-------
 hamster       | 2016-09-24 16:00:07 |   182
 jacana        | 2017-04-20 21:00:20 |     3
 skink         | 2017-04-22 05:00:01 |     2
 sungazer      | 2017-04-22 06:07:17 |     7
 tern          | 2017-04-22 06:38:09 |     8
 hornet        | 2017-04-22 06:41:12 |     7
 mandrill      | 2017-04-22 08:44:09 |     8
 nightjar      | 2017-04-22 13:54:24 |    55
 longfin       | 2017-04-22 14:29:17 |    13
 calliphoridae | 2017-04-22 14:30:01 |     4
 piculet       | 2017-04-22 14:30:01 |     3
 culicidae     | 2017-04-22 14:30:01 |     5
 francolin     | 2017-04-22 14:30:01 |     3
 prion         | 2017-04-22 14:33:05 |    12
 crake         | 2017-04-22 14:37:21 |    86
(15 rows)

Grepping those specific reports for "TRAP" yields

    sysname    | l
---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 jacana        | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 sungazer      | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 sungazer      | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 tern          | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 tern          | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 hornet        | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 hornet        | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 mandrill      | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 mandrill      | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 nightjar      | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "/pgbuild/root/HEAD/pgsql.build/../pgsql/src/backend/replication/walsender.c", Line: 3331)
 nightjar      | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/pgbuild/root/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 longfin       | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 longfin       | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 calliphoridae | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/andres/build/buildfarm-calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 piculet       | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/andres/build/buildfarm-piculet/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 culicidae     | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 francolin     | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/andres/build/buildfarm-francolin/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 prion         | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/ec2-user/bf/root/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
(18 rows)
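
Some animals report the source file as a bare name and others as a full build path, so the same assertion shows up in several spellings. A minimal Python sketch of normalizing the TRAP lines so they compare equal is below; the regular expression and the helper name parse_trap are illustrative assumptions, not part of the buildfarm tooling.

```python
import re
from pathlib import PurePosixPath

# Pattern for the assertion-failure lines in the listing above
# (an assumption based on those lines, not an official log grammar).
TRAP_RE = re.compile(
    r'TRAP: FailedAssertion\("(?P<expr>.*)", '
    r'File: "(?P<file>[^"]+)", Line: (?P<line>\d+)\)'
)


def parse_trap(log_line):
    """Return (file basename, line number, assertion text), or None."""
    m = TRAP_RE.search(log_line)
    if m is None:
        return None
    # Keep only the basename so a bare "walsender.c" and a full build
    # path ending in walsender.c are treated as the same location.
    basename = PurePosixPath(m.group("file").replace("\\", "/")).name
    return basename, int(m.group("line")), m.group("expr")


samples = [
    'TRAP: FailedAssertion("!(lsn >= prev.lsn)", '
    'File: "walsender.c", Line: 3331)',
    'TRAP: FailedAssertion("!(lsn >= prev.lsn)", '
    'File: "/pgbuild/root/HEAD/pgsql.build/../pgsql/src/backend/'
    'replication/walsender.c", Line: 3331)',
]

for s in samples:
    print(parse_trap(s))
```

Both sample lines reduce to the same tuple, which makes it possible to group the reports by assertion rather than by build directory.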

So 6 of 15 critters are getting the walsender.c assertion,
and those six plus six more are seeing the subtrans.c one,
and three are seeing neither one. There's probably a pattern
to that, but I don't know what it is.

(Actually, it looks like hamster stopped running this test
a long time ago, so whatever is in its last report is probably
not very relevant. So more like 12 of 14 critters are seeing
one or both traps.)
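
The counts above can be re-derived from the two listings with plain set arithmetic. The sketch below hard-codes the animal names exactly as they appear in the query results; the variable names are illustrative only.

```python
# Animals from the first query, in the order listed.
animals = [
    "hamster", "jacana", "skink", "sungazer", "tern", "hornet",
    "mandrill", "nightjar", "longfin", "calliphoridae", "piculet",
    "culicidae", "francolin", "prion", "crake",
]

# Animals whose latest report contains the walsender.c assertion.
walsender = {"sungazer", "tern", "hornet", "mandrill", "nightjar", "longfin"}

# Animals whose latest report contains the subtrans.c assertion:
# every walsender.c animal plus six more.
subtrans = walsender | {
    "jacana", "calliphoridae", "piculet", "culicidae", "francolin", "prion",
}

either = walsender | subtrans
neither = [a for a in animals if a not in either]

# hamster's last report dates from 2016, so leave it out of the denominator.
active = [a for a in animals if a != "hamster"]

print(len(walsender), "of", len(animals), "hit the walsender.c assertion")
print(len(subtrans), "of", len(animals), "hit the subtrans.c assertion")
print("neither:", neither)
print(len(either), "of", len(active), "active animals hit one or both")
```

Every animal that hits the walsender.c assertion also hits the subtrans.c one, so the "one or both" set is the same as the subtrans.c set.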

regards, tom lane
