RE: Forget close an open relation in ReorderBufferProcessTXN()

From: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'Amit Langote' <amitlangote09(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Japin Li <japinli(at)hotmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Forget close an open relation in ReorderBufferProcessTXN()
Date: 2021-05-22 02:00:52
Message-ID: OSBPR01MB4888D8BD45BE7EEA5BEE936FED289@OSBPR01MB4888.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Friday, May 21, 2021 9:45 PM I wrote:
> On Friday, May 21, 2021 4:43 PM Amit Langote <amitlangote09(at)gmail(dot)com>
> wrote:
> > On Fri, May 21, 2021 at 3:55 PM osumi(dot)takamichi(at)fujitsu(dot)com
> > <osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
> > > But, I've detected segmentation faults caused by the patch, which
> > > can happen during 100_bugs.pl in src/test/subscription.
> >
> > Hmm, maybe get_rel_sync_entry() should explicitly set map to NULL when
> > first initializing an entry. It's possible that without doing so, the
> > map remains set to a garbage value, which causes the invalidation
> > callback that runs into such a partially initialized entry to segfault
> > upon trying to dereference that garbage pointer.
> Just in case, I prepared a fresh PG build and added an elog call to
> get_rel_sync_entry() that prints the initial value of the map pointer.
> When I ran 100_bugs.pl, I got the garbage value below.
>
> * The change I made:
> @@ -1011,6 +1011,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
> entry->pubactions.pubinsert =
> entry->pubactions.pubupdate =
> entry->pubactions.pubdelete =
> entry->pubactions.pubtruncate = false;
> entry->publish_as_relid = InvalidOid;
> + elog(LOG, "**> the pointer's default value : %p",
> + entry->map);
> }
>
(snip)
>
> So, your solution is right, I think.
That check was a bit indirect.
To be more confident, I've examined the core file from the v3 failure and
printed the entry. Sorry for the roundabout way of verifying the solution.

$1 = {relid = 16388, schema_sent = false, streamed_txns = 0x0, replicate_valid = false, pubactions = {pubinsert = false, pubupdate = false, pubdelete = false, pubtruncate = false}, publish_as_relid = 16388,
map = 0x7f7f7f7f7f7f7f7f}

Yes, the process tried to free a garbage pointer: map was never initialized.
Now we can be confident that setting map to NULL addresses the problem. That's it!
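
To make the failure mode concrete, here is a minimal standalone sketch. It is not the pgoutput.c code: DemoSyncEntry and the demo_* functions are hypothetical stand-ins, and plain malloc/free replace the real allocator and the real map-freeing routine. The 0x7f fill only imitates the byte pattern seen in the core file, which looks like what assert-enabled builds write over freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the cache entry; only the fields needed here. */
typedef struct DemoSyncEntry
{
    unsigned int relid;
    bool         replicate_valid;
    void        *map;           /* stand-in for the tuple conversion map */
} DemoSyncEntry;

/*
 * Hand out entry memory filled with 0x7f bytes, imitating a hash table
 * that returns recycled, uninitialized memory for a new entry.
 */
DemoSyncEntry *
demo_alloc_entry(void)
{
    DemoSyncEntry *entry = malloc(sizeof(DemoSyncEntry));

    if (entry != NULL)
        memset(entry, 0x7f, sizeof(DemoSyncEntry));
    return entry;
}

/* First-time initialization of an entry, with the proposed fix applied. */
void
demo_init_entry(DemoSyncEntry *entry, unsigned int relid)
{
    entry->relid = relid;
    entry->replicate_valid = false;
    entry->map = NULL;          /* the fix: never leave map as garbage */
}

/*
 * Stand-in for the invalidation callback. It frees map only when it is
 * non-NULL, so a garbage map would be passed straight to free().
 * Returns true if a map was actually freed.
 */
bool
demo_invalidate_entry(DemoSyncEntry *entry)
{
    bool freed = false;

    assert(entry != NULL);
    entry->replicate_valid = false;
    if (entry->map != NULL)
    {
        free(entry->map);
        entry->map = NULL;
        freed = true;
    }
    return freed;
}
```

Calling demo_invalidate_entry() on an entry straight from demo_alloc_entry(), without demo_init_entry() in between, would hand the 0x7f pattern to free(), which is the kind of crash seen with v3. With the explicit NULL assignment, the callback simply skips the free.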

Best Regards,
Takamichi Osumi
