From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED |
Date: | 2023-02-18 00:27:04 |
Message-ID: | CA+hUKGJUH_UN2G1EHpmvKBaJMbyuhrrxORw8yzmO4BHwUdqEMw@mail.gmail.com |
Lists: | pgsql-hackers |
I still have no theory for how this condition was reached despite a
lot of time thinking about it and searching for more clues. As far as
I can tell, the recent improvements to postmaster's signal and event
handling shouldn't be related: the state management and logic were
unchanged.

While failing to understand this, I worked[1] on a CI log indexing tool
with public reports that highlight this sort of thing[2], so I'll be
watching out for more evidence. Unfortunately I have no data from
before 1 Feb (cfbot previously wasn't interested in the past at all;
I'd need to get my hands on the commit IDs for earlier testing but I
can't figure out how to get those out of Cirrus or Github -- anyone
know how?). FWIW I have a thing I call bfbot for slurping up similar
data from the build farm. It's not pretty enough for public
consumption, but I do know that this assertion hasn't failed there,
except the cases I mentioned earlier, and a load of failures on
lorikeet, which was completely b0rked until recently.
[1] https://xkcd.com/974/
[2] http://cfbot.cputube.org/highlights/assertion-90.html