Re: Commit Timestamp and LSN Inversion issue

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jan Wieck <jan(at)wi3ck(dot)info>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, tomas(at)vondra(dot)me
Subject: Re: Commit Timestamp and LSN Inversion issue
Date: 2024-11-07 18:19:42
Message-ID: gkbrjxwyjbqckjbrkep3uy2mvrcq2mrfzprfmegbeiqxidtmbu@3cgs5iocjykk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-11-05 08:58:36 -0500, Jan Wieck wrote:
> The attached solution is minimally invasive because it doesn't move the
> timestamp generation (clock_gettime() call) into the critical section of
> ReserveXLogInsertLocation() that is protected by a spinlock. Instead it
> keeps track of the last commit-ts written to WAL in shared memory and simply
> bumps that by one microsecond if the next one is below or equal. There is
> one extra condition in that code section plus a function call by pointer for
> every WAL record. In the unlikely case of encountering a stalled or
> backwards running commit-ts, the expensive part of recalculating the CRC of
> the altered commit WAL-record is done later, after releasing the spinlock. I
> have not been able to measure any performance impact on a machine with 2x
> Xeon-Silver (32 HT cores).

I think it's *completely* unacceptable to call a hook inside
ReserveXLogInsertLocation, with the spinlock held, no less. That's the most
contended section of code in postgres.

Even leaving that aside, something with a spinlock held is never a suitable
place to call a hook. As the header comments say:

* Keep in mind the coding rule that spinlocks must not be held for more
* than a few instructions. In particular, we assume it is not possible
* for a CHECK_FOR_INTERRUPTS() to occur while holding a spinlock, and so
* it is not necessary to do HOLD/RESUME_INTERRUPTS() in these macros.

I'm also quite certain that we don't want to just allow to redirect XLogInsert
to some random function. WAL logging is very crucial code and allowing to
randomly redirect it to something we don't know about will make it harder to
write correct WAL logging code and makes already quite complicated code even
more complicated.

> This will work fine until we have systems that can sustain a commit rate of
> one million transactions per second or higher for more than a few
> milliseconds.

It's sustainable for many minutes today.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2024-11-07 18:38:13 Re: Enable data checksums by default
Previous Message Andres Freund 2024-11-07 18:12:50 Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible