| From: | "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com> |
|---|---|
| To: | 'Amit Langote' <amitlangote09(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Japin Li <japinli(at)hotmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | RE: Forget close an open relation in ReorderBufferProcessTXN() |
| Date: | 2021-05-22 02:00:52 |
| Message-ID: | OSBPR01MB4888D8BD45BE7EEA5BEE936FED289@OSBPR01MB4888.jpnprd01.prod.outlook.com |
| Lists: | pgsql-hackers |
On Friday, May 21, 2021 9:45 PM I wrote:
> On Friday, May 21, 2021 4:43 PM Amit Langote <amitlangote09(at)gmail(dot)com>
> wrote:
> > On Fri, May 21, 2021 at 3:55 PM osumi(dot)takamichi(at)fujitsu(dot)com
> > <osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
> > > But, I've detected segmentation faults caused by the patch, which
> > > can happen during 100_bugs.pl in src/test/subscription.
> >
> > Hmm, maybe get_rel_sync_entry() should explicitly set map to NULL when
> > first initializing an entry. It's possible that without doing so, the
> > map remains set to a garbage value, which causes the invalidation
> > callback that runs into such a partially initialized entry to segfault
> > upon trying to dereference that garbage pointer.
> Just in case, I built a fresh PG and added an elog() call to
> get_rel_sync_entry() so that it prints the initial value of the map pointer.
> When I then ran 100_bugs.pl, I got the garbage value below.
>
> * The change I did:
> @@ -1011,6 +1011,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
> entry->pubactions.pubinsert =
> entry->pubactions.pubupdate =
> entry->pubactions.pubdelete =
> entry->pubactions.pubtruncate = false;
> entry->publish_as_relid = InvalidOid;
> + elog(LOG, "**> the pointer's default value : %p",
> + entry->map);
> }
>
(snip)
>
> So, your solution is right, I think.
That check was a bit indirect.
I've now examined the core file from the v3 failure and printed the entry
to be more confident. Sorry for using an inappropriate method to verify the solution.
$1 = {relid = 16388, schema_sent = false, streamed_txns = 0x0, replicate_valid = false, pubactions = {pubinsert = false, pubupdate = false, pubdelete = false, pubtruncate = false}, publish_as_relid = 16388,
map = 0x7f7f7f7f7f7f7f7f}
Yes, the process tried to free a garbage pointer.
Now we can be confident that we have addressed the problem. That's it!
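
For reference, the failure mode can be sketched outside PostgreSQL roughly as follows. This is only an illustration under assumptions: SketchEntry, alloc_clobbered_entry(), init_entry() and invalidate_entry() are hypothetical names, not the actual pgoutput code, and the 0x7F fill merely imitates the pattern seen in the core file above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the cache entry; only the fields needed here. */
typedef struct SketchEntry
{
    unsigned int relid;
    bool         replicate_valid;
    void        *map;       /* owned by the entry, released on invalidation */
} SketchEntry;

/*
 * Hand back an entry whose bytes are all 0x7F. This only imitates the
 * garbage seen in the core file (assumption: the new entry is not zeroed).
 */
static SketchEntry *
alloc_clobbered_entry(void)
{
    SketchEntry *entry = malloc(sizeof(SketchEntry));

    assert(entry != NULL);
    memset(entry, 0x7F, sizeof(SketchEntry));
    return entry;
}

/* First-time initialization: every field the callback reads must be set. */
static void
init_entry(SketchEntry *entry, unsigned int relid)
{
    entry->relid = relid;
    entry->replicate_valid = false;
    entry->map = NULL;      /* the fix: never leave the pointer as garbage */
}

/*
 * Invalidation callback sketch: release the map if one exists.
 * Returns true when something was actually freed.
 */
static bool
invalidate_entry(SketchEntry *entry)
{
    bool freed = false;

    if (entry->map != NULL)
    {
        free(entry->map);   /* would crash if map were still garbage */
        entry->map = NULL;
        freed = true;
    }
    entry->replicate_valid = false;
    return freed;
}
```

If invalidate_entry() ran on an entry straight from alloc_clobbered_entry(), the NULL check would pass and free() would be handed the garbage pointer, which matches what the core file shows. With init_entry() called first, the callback has nothing to free and returns safely.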
Best Regards,
Takamichi Osumi