Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: robertmhaas(at)gmail(dot)com, pryzby(at)telsasoft(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)gmail(dot)com
Subject: Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
Date: 2022-06-01 02:42:01
Message-ID: 20220601.114201.1536897005260401301.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 31 May 2022 16:10:05 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in
tgl> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
tgl> > Yeah, so when I created this stuff in the first place, I figured that
tgl> > it wasn't a problem if we reserved relptr == 0 to mean a NULL pointer,
tgl> > because you would never have a relative pointer pointing to the
tgl> > beginning of a DSM, because it would probably always start with a
tgl> > dsm_toc. But when Thomas made it so that DSM allocations could happen
tgl> > in the main shared memory segment, that ceased to be true. This
tgl> > example happened not to break because we never use relptr_access() on
tgl> > fpm->self. We do use fpm_segment_base(), but that accidentally fails
tgl> > to break, because instead of using relptr_access() it drills right
tgl> > through the abstraction and doesn't have any kind of special case for
tgl> > 0.
tgl>
tgl> Seems like that in itself is a lousy idea. Either the code should
tgl> respect the abstraction, or it shouldn't be declaring the variable
tgl> as a relptr in the first place.
tgl>
tgl> > So we can fix this by:
tgl> > 1. Using a relative pointer value other than 0 to represent a null
tgl> > pointer. Andres suggested (Size) -1.
tgl> > 2. Not storing the free page manager for the DSM in the main shared
tgl> > memory segment at byte offset 0.
tgl> > 3. Dropping the assertion while loudly singing "la la la la la la".
tgl>
tgl> I'm definitely down on #3, because that just leaves the ambiguity
tgl> in place to bite somewhere else in future. #1 would work as long
tgl> as nobody expects memset-to-zero to produce null relptrs, but that
tgl> doesn't seem very nice either.
tgl>
tgl> On the whole, wasting MAXALIGN worth of memory seems like the least bad
tgl> alternative, but I wonder if we ought to do it right here as opposed
tgl> to somewhere in the DSM code proper. Why is this DSM space not like
tgl> other DSM spaces in starting with a TOC?
tgl>
tgl> regards, tom lane
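
To make the ambiguity discussed above concrete, here is a minimal sketch of the two encodings in question: offset 0 reserved as null (the current behavior described by Robert) and (Size) -1 reserved as null (option 1). The function and macro names are hypothetical and chosen for illustration only; they are not the macros in src/include/utils/relptr.h, and size_t stands in for Size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: these names are illustrative and are NOT the
 * macros defined in src/include/utils/relptr.h.
 */

/* Encoding A: offset 0 doubles as the null pointer. */
static size_t
store_zero_is_null(const char *base, const char *val)
{
    return (val == NULL) ? 0 : (size_t) (val - base);
}

static const char *
access_zero_is_null(const char *base, size_t off)
{
    /* An object located exactly at base is indistinguishable from NULL. */
    return (off == 0) ? NULL : base + off;
}

/* Encoding B (option 1): (size_t) -1 is the null pointer. */
#define SKETCH_NULL_OFFSET ((size_t) -1)

static size_t
store_sentinel_is_null(const char *base, const char *val)
{
    return (val == NULL) ? SKETCH_NULL_OFFSET : (size_t) (val - base);
}

static const char *
access_sentinel_is_null(const char *base, size_t off)
{
    /* Offset 0 is now a valid pointer to base itself. */
    return (off == SKETCH_NULL_OFFSET) ? NULL : base + off;
}
```

Under encoding A, storing a pointer equal to the base yields 0 and reads back as NULL, which is the case the assertion is guarding against. Under encoding B the same pointer round-trips, but an offset cleared with memset-to-zero now means "points at base" rather than null, which is the drawback Tom raises for #1. Option 2 sidesteps both by never placing an object at offset 0.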
