Control flow in logical replication walsender

From: Christophe Pettus <xof(at)thebuild(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Control flow in logical replication walsender
Date: 2024-04-30 17:57:28
Message-ID: E172FD10-8309-49AB-BEB6-7E1539C4E32D@thebuild.com
Lists: pgsql-hackers


Hi,

I wanted to check my understanding of how control flows in a walsender doing logical replication. My understanding is that the (single) thread in each walsender process, in the simplest case, loops on:

1. Pull a record out of the WAL.
2. Pass it to the reorder buffer code, which,
3. Sorts it out into the appropriate in-memory structure for that transaction (spilling to disk as required), and then continues with #1, or,
4. If it's a commit record, it iteratively passes the transaction data one change at a time to,
5. The logical decoding plugin, which returns the output format of that plugin, and then,
6. The walsender sends the output from the plugin to the client. It cycles on passing the data to the plugin and sending it to the client until it runs out of changes in that transaction, and then resumes reading the WAL in #1.
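
To make the model above concrete, here is a toy sketch of the loop as described in steps 1-6. It is Python rather than PostgreSQL's C, every name in it (walsender_loop, plugin, send, the record tuples) is hypothetical, and it leaves out spilling to disk entirely. It only illustrates the control flow being asked about and does not confirm how the real walsender behaves.

```python
from collections import defaultdict


def walsender_loop(wal_records, plugin, send):
    """Toy model of steps 1-6 (hypothetical names, not PostgreSQL code)."""
    reorder_buffer = defaultdict(list)  # xid -> changes, in WAL order
    for xid, kind, payload in wal_records:        # step 1: next WAL record
        if kind == "change":                      # steps 2-3: buffer per transaction
            reorder_buffer[xid].append(payload)   # (spill-to-disk omitted here)
        elif kind == "commit":                    # step 4: replay the whole transaction
            for change in reorder_buffer.pop(xid, []):
                send(plugin(xid, change))         # steps 5-6: format, then send
        # Only after the replay loop finishes do we return to step 1.


events = []  # interleaving of WAL reads and client sends


def wal():
    # Two interleaved transactions; xid 2 commits first.
    records = [(1, "change", "a"), (2, "change", "b"), (1, "change", "c"),
               (2, "commit", None), (1, "commit", None)]
    for rec in records:
        events.append("read")
        yield rec


def send(msg):
    events.append("send " + msg)


walsender_loop(wal(), lambda xid, change: f"{xid}:{change}", send)
print(events)
```

In this model no "read" event can appear between the sends belonging to one transaction, because the single thread stays in the inner loop until the transaction's changes are exhausted.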

In particular, I wanted to confirm that while it is pulling the reordered transaction and sending it to the plugin (and thence to the client), that particular walsender is *not* reading new WAL records or putting them in the reorder buffer.

The specific issue I'm trying to track down is an enormous pileup of spill files. This is on an unsupported version of PostgreSQL (v11), so an upgrade may fix it, but at the moment I'm trying to find a cause and a mitigation.
