From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: backends stuck in "startup" |
Date: | 2017-11-26 00:53:28 |
Message-ID: | 20171126005328.GE5668@telsasoft.com |
Lists: | pgsql-general |
On Sat, Nov 25, 2017 at 05:45:59PM -0500, Tom Lane wrote:
> Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> > We never had any issue during the ~2 years running PG96 on this VM, until
> > upgrading Monday to PG10.1, and we've now hit it 5+ times.
>
> > BTW this is a VM run on a hypervisor managed by our customer:
> > DMI: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 06/22/2012
>
> > Linux TS-DB 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>
> Actually ... I was focusing on the wrong part of that. It's not
> your hypervisor, it's your kernel. Running four-year-old kernels
> is seldom a great idea, and in this case, the one you're using
> contains the well-reported missed-futex-wakeups bug:
>
> https://bugs.centos.org/view.php?id=8371
>
> While rebuilding PG so it doesn't use POSIX semaphores will dodge
> that bug, I think a kernel update would be a far better idea.
> There are lots of other known bugs in that version.
>
> Relevant to our discussion, the fix involves inserting a memory
> barrier into the kernel's futex call handling:
Ouch! Thanks for the heads up, and sorry for the noise.
I'm still trying to coax 3 customers off centos5.x, so the 2 customers left
running centos6.5 weren't on any of my mental lists.
Justin