Re: pg_logical_emit_message() misses a XLogFlush()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_logical_emit_message() misses a XLogFlush()
Date: 2023-08-16 04:13:36
Message-ID: 20230816041336.4tsksa2qcudfgcc6@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-08-16 03:20:53 +0200, Tomas Vondra wrote:
> On 8/16/23 02:33, Andres Freund wrote:
> > Hi,
> >
> > On 2023-08-16 06:58:56 +0900, Michael Paquier wrote:
> >> On Tue, Aug 15, 2023 at 11:37:32AM +0200, Tomas Vondra wrote:
> >>> Shouldn't the flush be done only for non-transactional messages? The
> >>> transactional case will be flushed by regular commit flush.
> >>
> >> Indeed, that would be better. I am sending an updated patch.
> >>
> >> I'd like to backpatch that, would there be any objections to that?
> >
> > Yes, I object. This would completely cripple the performance of some uses of
> > logical messages - a slowdown of several orders of magnitude. It's not clear
> > to me that flushing would be the right behaviour if it weren't released, but
> > it certainly doesn't seem right to make such a change in a minor release.
> >
>
> So are you objecting to adding the flush in general, or just to the
> backpatching part?

Both, I think. I don't object to adding a way to trigger flushing, but I think
it needs to be optional.

> IMHO we either guarantee durability of non-transactional messages, in which
> case this would be a clear bug - and I'd say a fairly serious one. I'm
> curious what the workload that'd see order of magnitude of slowdown does
> with logical messages I've used it, but even if such workload exists, would
> it really be enough to fix any other durability bug?

Not sure what you mean with the last sentence?

I've e.g. used non-transactional messages for:

- A non-transactional queuing system, where sometimes one would dump a portion
  of tables into messages, with something like
    SELECT pg_logical_emit_message(false, 'app:<task>', to_json(r)::text) FROM r;
  Obviously flushing after every row would be bad.

  This is useful when you need to coordinate with other systems in a
  non-transactional way. E.g. letting other parts of the system know that
  files on disk (or in S3 or ...) were created/deleted, since a database
  rollback wouldn't unlink/revive the files.

- Audit logging, when you want to log in a way that isn't undone by rolling
  back the transaction. Flushing on every pg_logical_emit_message() would
  increase the WAL flush rate many times, because instead of once per
  transaction, you'd now flush once per modified row. It'd basically make it
  impractical to use for such things.

- Optimistic locking. Emitting things that need to be locked on logical
  replicas, to be able to commit on the primary. A pre-commit hook would wait
  for the WAL to be replayed sufficiently - but only once per transaction, not
  once per object.
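
To make the audit-logging case concrete, here is a minimal sketch. It assumes
wal_level = logical, the test_decoding plugin from contrib, and a slot name
(audit_demo) and message contents that are made up for illustration. The
non-transactional message should be decoded even though the surrounding
transaction rolls back, while the transactional one should not. Because
nothing flushes the message synchronously, it may only become visible once
the WAL has been flushed by something else (the WAL writer, or a later
commit).

```sql
-- Hypothetical slot name; requires wal_level = logical.
SELECT pg_create_logical_replication_slot('audit_demo', 'test_decoding');

BEGIN;
-- Non-transactional: written to WAL right away, not tied to the commit.
SELECT pg_logical_emit_message(false, 'audit', 'attempted update of row 42');
-- Transactional: only decoded if the transaction commits.
SELECT pg_logical_emit_message(true, 'audit', 'note tied to this transaction');
ROLLBACK;

-- Only the non-transactional message is expected here, and only once the
-- WAL containing it has been flushed.
SELECT data FROM pg_logical_slot_get_changes('audit_demo', NULL, NULL);

SELECT pg_drop_replication_slot('audit_demo');
```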

> Or perhaps we don't want to guarantee durability for such messages, in
> which case we don't need to fix it at all (even in master).

Well, I can see adding an option to flush, or perhaps a separate function to
flush, to master.

> The docs are not very clear on what to expect, unfortunately. It says
> that non-transactional messages are "written immediately" which I could
> interpret in either way.

Yea, the docs certainly should be improved, regardless of what we end up with.

Greetings,

Andres Freund
