Re: Several buildfarm animals fail tests because of shared memory error

From: Robins Tharakan <tharakan(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: Several buildfarm animals fail tests because of shared memory error
Date: 2024-12-22 07:27:56
Message-ID: CAEP4nAyrnviyeRPb2OLB1o_F8pMw=E4OVhkjYeSA0Rdek+AYHg@mail.gmail.com
Lists: pgsql-hackers

Hi Alexander,

Thanks for collating this list.
I'll try to add as much as I know, in hopes that it helps.

On Sun, 22 Dec 2024 at 16:30, Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:

> I'd like to bring your attention to multiple buildfarm failures, which
> occurred this month, on master only, caused by "could not open shared
> memory segment ...: No such file or directory" errors.

- I am unsure how batta is set up, but until late last week none of my
instances had REMOVEIPC set correctly. I am sorry, I didn't know about this
until Thomas pointed it out to me in another thread. If that's a key reason
here, things should probably settle down by this time next week. I've begun
setting it correctly (2 done, with a few more to go), although since some of
the machines are at work, I'll try to get to those this coming week.
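
For other animal owners who want to check their own setup, here is a minimal
sketch. It assumes that REMOVEIPC above refers to the systemd-logind setting
`RemoveIPC` in the `[Login]` section of /etc/systemd/logind.conf, which the
PostgreSQL documentation recommends setting to `no`. The function name is
hypothetical, and the sketch deliberately ignores drop-in files under
logind.conf.d, so a False result only means the main file does not disable it
explicitly.

```python
import configparser
from pathlib import Path

# Values systemd parses as boolean false.
FALSE_VALUES = {"no", "false", "0", "off"}


def remove_ipc_disabled(conf_text: str) -> bool:
    """Return True only if [Login] explicitly sets RemoveIPC to a false value.

    Commented-out lines (e.g. "#RemoveIPC=yes") are ignored, so a missing
    setting returns False and the distribution's built-in default applies.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key case; systemd keys are case-sensitive
    parser.read_string(conf_text)
    value = parser.get("Login", "RemoveIPC", fallback=None)
    return value is not None and value.strip().lower() in FALSE_VALUES


if __name__ == "__main__":
    # Assumed default location of the main logind configuration file.
    conf = Path("/etc/systemd/logind.conf")
    if conf.exists():
        print("RemoveIPC explicitly disabled:", remove_ipc_disabled(conf.read_text()))
    else:
        print("No logind.conf found at", conf)
```

A change to this setting only takes effect once systemd-logind is restarted
or the machine is rebooted.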

> But still why master only?
>

+1. It is interesting that master is affected more often, though. This may
be statistical, since master gets more commits and thus more test runs.
Unsure.

Also:
- I recently (~2 days back) switched parula to gcc-experimental nightly,
after which I see 4 of the recent errors, although the most recent test is
green.
- The only info about leafhopper that may be relevant is that it's one of
the newest machines (Graviton4), so it comes with recent hardware / kernel /
stock gcc 11.4.1.

> Unfortunately I'm unable to reproduce such failures locally, so I'm sorry
> for such raw information, but I see no way to investigate this further
> without assistance. Perhaps owners of these animals could shed some light
> on this...
>

Since the instances are created with work accounts, it isn't trivial to
share access, but I could report back with any outputs / captures if that
helps here.

Lastly, alligator has been on gcc nightly for a few months and is on
x86_64, so if alligator is still stuttering by this time next week, I'm
pretty sure there's more than just aarch64, gcc, or IPC config to blame here.

-
robins
