Re: Windows buildfarm members vs. new async-notify isolation test

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Dilger <hornschnorter(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Windows buildfarm members vs. new async-notify isolation test
Date: 2019-12-05 09:37:08
Message-ID: CAA4eK1+HOFzs=-MHDz6ef936KuuHJBxZbHEi7xt9TBqgHNO9GQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 4, 2019 at 9:51 PM Andrew Dunstan
<andrew(dot)dunstan(at)2ndquadrant(dot)com> wrote:
>
> On Wed, Dec 4, 2019 at 12:12 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> > > On Tue, Dec 3, 2019 at 10:10 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >> Hmm ... just looking at the code again, could it be that there's
> > >> no well-placed CHECK_FOR_INTERRUPTS? Andrew, could you see if
> > >> injecting one in what 790026972 added to postgres.c helps?
> >
> > > I also tried to analyze this failure and it seems this is a good bet,
> > > but I am also wondering why we have never seen such a timing issue in
> > > other somewhat similar tests. For ex., one with comment (#
> > > Cross-backend notification delivery.). Do they have a better way of
> > > ensuring that the notification will be received or is it purely
> > > coincidental that they haven't seen such a symptom?
> >
> > TBH, my bet is that this *won't* fix it, but it seemed like an easy
> > thing to test. For this to fix it, you'd have to suppose that we
> > never do a CHECK_FOR_INTERRUPTS during a COMMIT command, which is
> > improbable at best.
> >
>
>
> You win your bet. Tried this on frogmouth and it still failed.
>

IIUC, this means that the commit (step l2commit) is finishing before the
notify signal has reached that session. If so, can we at least confirm
that by adding something like select pg_sleep(1) to that step? So,
l2commit would become: step "l2commit" { SELECT pg_sleep(1); COMMIT; }.
If required, we can also try increasing the sleep time to confirm the
behavior.
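
Written out as it might appear in the isolation spec, the diagnostic
step would look roughly like the following. This is only a sketch: it
assumes the existing step body is a bare COMMIT, and the 1-second sleep
is an arbitrary starting value rather than a tuned one.

```
# Diagnostic variant of the step: the sleep is meant to give the notify
# signal time to reach this session before the transaction commits.
step "l2commit"	{ SELECT pg_sleep(1); COMMIT; }
```

If the failure disappears with the sleep in place, that would point to
the commit racing ahead of signal delivery; if it persists, a larger
argument to pg_sleep() can be tried before ruling that theory out.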

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
