Re: logical decoding : exceeded maxAllocatedDescs for .spill files

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Date: 2020-01-09 02:12:29
Message-ID: CAA4eK1J2PA5HXRo7CkALjh=Dew2eaMVbpyXMV3OAAZHkU-7mCQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jan 5, 2020 at 10:29 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sun, Jan 5, 2020 at 12:21 AM Noah Misch <noah(at)leadboat(dot)com> wrote:
> >
> > On Fri, Jan 03, 2020 at 02:20:09PM +0530, Amit Khandekar wrote:
> > > On Fri, 3 Jan 2020 at 10:19, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > On Fri, Jan 3, 2020 at 8:29 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >> I see one failure in REL_10_STABLE [1] which seems to be due to this commit:
> > > >
> > > > I tried this test on my CentOS and Power8 machines more than 50 times, but couldn't reproduce it. So, adding Noah to see if he can try this test [1] on his machine (tern) and get a stack trace or some other information.
> > > >
> > > > [1] - make -C src/test/recovery/ check PROVE_TESTS=t/006_logical_decoding.pl
> > >
> > > I also tested multiple times using PG 10 branch; also tried to inject
> > > an error so that PG_CATCH related code also gets covered, but
> > > unfortunately didn't get the crash on my machine. I guess, we will
> > > have to somehow get the stacktrace.
> >
> > I have buildfarm member tern running this test in a loop. In the 290
> > iterations so far, it hasn't failed. I'll leave it running for another week
> > or so.
> >
>
> Okay, thanks! FYI, your other machine 'mandril' also exhibits the
> exact same behavior, also on v10. Both machines (tern and mandril)
> seem to have the same specs, which is probably why they fail in the
> same way. What bothers me is that the fix and the test are the same
> for v11, and the test passes for v11 on both machines. Does this
> indicate some random behavior, or maybe some other bug in v10 that
> this test has uncovered?
>

Another thing to note is that the failure reproduces on buildfarm
member 'tern' (for v10), whereas when you ran the test independently,
it did not reproduce even after that many runs. What could be the
difference that is causing this?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
