Re: Is this a problem in GenericXLogFinish()?

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is this a problem in GenericXLogFinish()?
Date: 2023-11-01 07:24:07
Message-ID: ZUH9F6i-TtlJ6gw7@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 30, 2023 at 03:54:39PM +0530, Amit Kapila wrote:
> On Sat, Oct 28, 2023 at 4:30 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> On Sat, Oct 28, 2023 at 03:45:13PM +0530, Amit Kapila wrote:
>> > Yes, we need it to exclude any concurrent in-progress scans that could
>> > return incorrect tuples during bucket squeeze operation.
>>
>> Thanks. So I assume that we should just set REGBUF_NO_CHANGE when the
>> primary and write buffers are the same and there are no tuples to
>> move. Say with something like the attached?
>
> What if the primary and write buffer are not the same but ntups is
> zero? Won't we hit the assertion again in that case?

The code tells that it should be able to handle such a case, so this
would mean to set REGBUF_NO_CHANGE only when we have no tuples to
move:
- XLogRegisterBuffer(1, wbuf, REGBUF_STANDARD);
+ /*
+ * A cleanup lock is still required on the write buffer even
+ * if there are no tuples to move, so it needs to be registered
+ * in this case.
+ */
+ wbuf_flags = REGBUF_STANDARD;
+ if (xlrec.ntups == 0)
+ wbuf_flags |= REGBUF_NO_CHANGE;
+ XLogRegisterBuffer(1, wbuf, wbuf_flags);

Anyway, there is still a hole here: the regression tests have zero
coverage for the case where there are no tuples to move if
!is_prim_bucket_same_wrt. There are only two queries that stress the
squeeze path with no tuples, both use is_prim_bucket_same_wrt:
INSERT INTO hash_split_heap SELECT a/2 FROM generate_series(1, 25000) a;
VACUUM hash_split_heap;

Perhaps this should have more coverage to make sure that all these
scenarios are covered at replay (likely with 027_stream_regress.pl)?
The case posted by Alexander at [1] falls under the same category
(is_prim_bucket_same_wrt with no tuples).

[1]: https://www.postgresql.org/message-id/f045c8f7-ee24-ead6-3679-c04a43d21351@gmail.com
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2023-11-01 08:23:08 Wrong function name in pgstatfuncs.c
Previous Message Japin Li 2023-11-01 07:19:39 Tab completion regression test failed on illumos