Re: Recent pg_rewind test failures in buildfarm

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: Recent pg_rewind test failures in buildfarm
Date: 2025-04-15 10:04:54
Message-ID: oqs752g5rptmz2fiotfmcw7oofpyna7urxouvtm7f2szyelmkr@xu4fpbau3dwm
Lists: pgsql-hackers

Hi,

On 2025-04-14 22:58:28 -0400, Tom Lane wrote:
> In the last day or so, both skink and mamba have hit this
> in the pg_rewind test suite [1][2]:
>
> #3 0x01f03f7c in ExceptionalCondition (conditionName=conditionName@entry=0x2119c4c "pending_since == 0", fileName=fileName@entry=0x2119454 "pgstat.c", lineNumber=lineNumber@entry=734) at assert.c:66
> #4 0x01d7994c in pgstat_report_stat (force=force@entry=true) at pgstat.c:734
> #5 0x01d7a248 in pgstat_shutdown_hook (code=<optimized out>, arg=<optimized out>) at pgstat.c:615
> #6 0x01d1e7c0 in shmem_exit (code=code@entry=0) at ipc.c:243
> #7 0x01d1e92c in proc_exit_prepare (code=code@entry=0) at ipc.c:198
> #8 0x01d1e9fc in proc_exit (code=0) at ipc.c:111
> #9 0x01cd5bd0 in ProcessRepliesIfAny () at walsender.c:2294
> #10 0x01cd8084 in WalSndLoop (send_data=send_data@entry=0x1cd7628 <XLogSendPhysical>) at walsender.c:2790
> #11 0x01cd8f40 in StartReplication (cmd=0xfd048210) at walsender.c:960
> #12 exec_replication_command (cmd_string=cmd_string@entry=0xfd53c098 "START_REPLICATION 0/3000000 TIMELINE 2") at walsender.c:2135
> #13 0x01d5bd84 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4767
>
> That assert appears to be several years old, and the
> 008_min_recovery_point.pl test script that's triggering it hasn't
> changed very recently either, so I'm baffled where to start digging.
> It has the odor of a timing problem, so maybe we just started hitting
> this by chance. Still ... anybody have an idea?

See also https://postgr.es/m/dwrkeszz6czvtkxzr5mqlciy652zau5qqnm3cp5f3p2po74ppk%40omg4g3cc6dgq

I am fairly certain it's the fault of 039549d70f6.

Greetings,

Andres Freund
