Re: DSM robustness failure (was Re: Peripatus/failures)

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Larry Rosenman <ler(at)lerctr(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: DSM robustness failure (was Re: Peripatus/failures)
Date: 2018-10-18 01:00:11
Message-ID: CAEepm=0tnAEV8TvQxKVzs6sYpan7jU=V0_FK-Osz8E8tMJb5Jg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 18, 2018 at 11:08 AM Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> > On Thu, Oct 18, 2018 at 9:00 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup
> > > are broken here; the former because it makes no attempt to unmap
> > > the old control segment (which it oughta be able to do no matter how badly
> > > broken the contents are), and the latter because it should not let
> > > garbage old state prevent it from establishing a valid new segment.
> >
> > Looking.
>
> (CCing Amit Kapila)
>
> To reproduce this, I attached lldb to a backend and did "mem write
> &dsm_control->magic 42", and then delivered SIGKILL to the backend.
> Here's one way to fix it. I think we have no choice but to leak the
> referenced segments, but we can free the control segment. See
> comments in the attached patch for rationale.

I realised that the nearly identical code in dsm_postmaster_shutdown()
might as well destroy a corrupted control segment too. New version
attached.
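
For anyone skimming the thread, here is a rough standalone sketch of the
idea. It is not the attached patch and does not use the real dsm.c
functions: every name in it is hypothetical, the magic number is arbitrary,
and free() merely stands in for unmapping and destroying the control
segment. The point is only that a failed sanity check forces us to leak the
referenced segments (we can't trust the list), while the control segment
itself can be released either way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from PostgreSQL. */
#define SKETCH_CONTROL_MAGIC 0x5ca1ab1eu   /* arbitrary value for this sketch */
#define SKETCH_MAX_ITEMS     8

typedef struct
{
    uint32_t magic;                        /* sanity marker */
    uint32_t nitems;                       /* number of slots in use */
    bool     item_live[SKETCH_MAX_ITEMS];  /* true if slot references a segment */
} sketch_control;

typedef struct
{
    int  segments_destroyed;   /* referenced segments we cleaned up */
    bool referenced_leaked;    /* true if we had to give up on them */
    bool control_released;     /* true once the control area is gone */
} sketch_result;

/* The contents are trustworthy only if the header looks sane. */
bool
sketch_control_is_sane(const sketch_control *ctl)
{
    return ctl->magic == SKETCH_CONTROL_MAGIC &&
           ctl->nitems <= SKETCH_MAX_ITEMS;
}

/*
 * Clean up after a crash.  If the control area is corrupted we cannot
 * walk its item list, so the referenced segments are leaked; the control
 * area itself is released regardless of what it contains.
 */
sketch_result
sketch_cleanup(sketch_control **ctlp)
{
    sketch_result   result = {0, false, false};
    sketch_control *ctl = *ctlp;

    assert(ctl != NULL);

    if (sketch_control_is_sane(ctl))
    {
        for (uint32_t i = 0; i < ctl->nitems; i++)
        {
            if (ctl->item_live[i])
            {
                ctl->item_live[i] = false;   /* pretend to destroy it */
                result.segments_destroyed++;
            }
        }
    }
    else
        result.referenced_leaked = true;

    /* Stand-in for unmapping and destroying the control segment. */
    free(ctl);
    *ctlp = NULL;
    result.control_released = true;

    return result;
}
```

Setting magic to 42 in this sketch mimics the lldb experiment quoted above:
the referenced segments are reported as leaked, but the control area is
still released.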

--
Thomas Munro
http://www.enterprisedb.com

Attachment: 0001-Fix-thinko-in-handling-of-corrupted-DSM-control-area.patch (application/octet-stream, 2.7 KB)