From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, fabriziomello(at)gmail(dot)com, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Minimal logical decoding on standbys |
Date: | 2023-04-07 18:12:26 |
Message-ID: | 20230407181226.6oyt4jcazy6eh7rx@awork3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2023-04-07 08:47:57 -0700, Andres Freund wrote:
> Integrated all of these.
Here's my current version. Changes:
- Integrated Bertrand's changes
- polished commit messages of 0001-0003
- edited code comments for 0003, including
InvalidateObsoleteReplicationSlots()'s header
- added a bump of SLOT_VERSION to 0001
- moved addition of pg_log_standby_snapshot() to 0007
- added a catversion bump for pg_log_standby_snapshot()
- moved all the bits dealing with procsignals from 0003 to 0004, now the split
makes sense IMO
- combined a few more successive ->safe_psql() calls
I see occasional failures in the tests, particularly in the new test using
pg_authid, but not solely. cfbot also seems to have seen these:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3740
I made a bogus attempt at a workaround for the pg_authid case last night. But
that didn't actually fix anything, it just changed the timing.
I think the issue is that VACUUM does not force WAL to be flushed at the end
(since it does not assign an xid). wait_for_replay_catchup() uses
$node->lsn('flush'), which, due to VACUUM not flushing, can be an LSN from
before VACUUM completed.
The problem can be made more likely by adding pg_usleep(1000000); before
walwriter.c's call to XLogBackgroundFlush().
We probably should introduce some infrastructure in Cluster.pm for this, but
for now I just added a 'flush_wal' table that we insert into after a
VACUUM. That guarantees a WAL flush.
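
To illustrate the race described above, here is a toy model in Python. It is not PostgreSQL code: the ToyWal class, its methods, and the byte counts are made up for illustration, and it assumes only that work which assigned an xid forces a flush when it finishes. It shows how a catchup target taken from the flush position can lie before the end of VACUUM's WAL, and how a following insert moves it past that point.

```python
class ToyWal:
    """Toy model of a WAL stream; LSNs are plain integers (hypothetical)."""

    def __init__(self):
        self.insert_lsn = 0  # end of WAL generated so far
        self.flush_lsn = 0   # end of WAL known to be flushed

    def append(self, nbytes, assigns_xid):
        """Append nbytes of WAL and return the new insert LSN.

        Model assumption: only work that assigned an xid forces a
        flush when it finishes; other WAL waits for a later flush.
        """
        self.insert_lsn += nbytes
        if assigns_xid:
            self.flush_lsn = self.insert_lsn
        return self.insert_lsn


def catchup_target(wal):
    """The LSN a test waits for when it uses the flush position."""
    return wal.flush_lsn


wal = ToyWal()
wal.append(100, assigns_xid=True)               # ordinary committed write
vacuum_end = wal.append(50, assigns_xid=False)  # VACUUM: WAL written, not flushed

# Target taken right after VACUUM: it predates VACUUM's records.
stale_target = catchup_target(wal)
print("target covers VACUUM:", stale_target >= vacuum_end)

# Workaround modeled on the 'flush_wal' insert: a write that assigns an xid.
wal.append(10, assigns_xid=True)
fixed_target = catchup_target(wal)
print("target covers VACUUM:", fixed_target >= vacuum_end)
```

In this model, a standby that has replayed up to stale_target has not necessarily seen VACUUM's records, so a test that checks for their effects right after the wait can fail intermittently.
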
I think some of the patches might list more reviewers than really applicable,
and might also be missing some. I'd appreciate it if you could go over that...
Greetings,
Andres Freund
Attachment | Content-Type | Size |
---|---|---|
va67-0001-Replace-a-replication-slot-s-invalidated_at-LSN.patch | text/x-diff | 6.6 KB |
va67-0002-Prevent-use-of-invalidated-logical-slot-in-Crea.patch | text/x-diff | 4.1 KB |
va67-0003-Support-invalidating-replication-slots-due-to-h.patch | text/x-diff | 12.5 KB |
va67-0004-Handle-logical-slot-conflicts-on-standby.patch | text/x-diff | 12.2 KB |
va67-0005-Arrange-for-a-new-pg_stat_database_conflicts-an.patch | text/x-diff | 10.2 KB |
va67-0006-For-cascading-replication-wake-physical-and-log.patch | text/x-diff | 9.6 KB |
va67-0007-Allow-logical-decoding-on-standby.patch | text/x-diff | 16.4 KB |
va67-0008-New-TAP-test-for-logical-decoding-on-standby.patch | text/x-diff | 30.4 KB |
va67-0008-TAP-test-for-logical-decoding-on-standby.patch | text/x-diff | 30.3 KB |
va67-0009-Doc-changes-describing-details-about-logical-de.patch | text/x-diff | 2.6 KB |