From: | Greg Sabino Mullane <htamfids(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | logical replication walsender loop preventing a clean shutdown |
Date: | 2024-09-16 18:27:42 |
Message-ID: | CAKAnmm+STYvW_5aRx2C0QWgbNpd_zEjruc6MytePnRuK8oKtTA@mail.gmail.com |
Lists: | pgsql-bugs |
When doing logical replication, a large transaction can prevent the
postgres process from shutting down until the WAL has all been processed
and the client reports back. This is obviously less than ideal, as it means
a pg_ctl stop -m fast can take minutes or hours to complete. I would expect
the behavior to be that all backends are signalled so they can leave
cleanly.
I found this thread that reports something very similar (but without the
infinite looping):
Subject: walsender bug: stuck during shutdown
https://www.postgresql.org/message-id/flat/20201123205253.GA10075%40alvherre.pgsql
I have cc'd Alvaro in case he has any progress on this, or ideas. I tried
applying the patch from that thread, but the behavior remained unchanged.
Wanted to raise this in -bugs for added visibility, and also see if anyone
had thoughts before I dig deeper.
My test case (tested with latest, as of commit
b8ea0f675f35c3f0c2cf62175517ba0dacad4abd)
* Spin up a cluster, port 5555, using wal_level logical
* pg_recvlogical --create-slot -d postgres -p 5555 --slot=foo
* pg_recvlogical --start -d postgres -p 5555 --slot=foo --file /tmp/tmp
* If all is well, ctrl-z, bg 1, watch -n 3 tail /tmp/tmp
Other session:
* psql -p 5555 postgres
* create table t (id int generated always as identity, foo text);
* insert into t(foo) select 'abcdefghijklmnopqrstuvwxyz' from
generate_series(1,10_000_000);
Once the commit finishes, and as soon as pg_recvlogical starts processing
it:
* time pg_ctl stop -m fast -w -t 10000
I found 10 million rows a nice test size on my system: shutdown takes an
additional 50 seconds or so, as it waits for pg_recvlogical to respond.
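The steps above can be collected into a single script. This is only a rough sketch, not the exact procedure used for the report. The scratch data directory, the server log path, the use of initdb with default settings, the --no-loop flag (so pg_recvlogical exits once the server is gone instead of retrying), and the grep-based wait for decoded INSERT rows (which relies on the default test_decoding plugin output) are all assumptions added for illustration. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Assumed, not from the report: PGDATA path,
# log file, initdb defaults, --no-loop, and the grep-based polling loop.
set -eu

PGDATA="${PGDATA:-/tmp/walsender_repro}"   # hypothetical scratch data directory
PORT=5555
OUT=/tmp/tmp
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print the commands

# Print a command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

repro() {
    # Spin up a cluster on port 5555 with wal_level=logical.
    run initdb -D "$PGDATA"
    run pg_ctl -D "$PGDATA" -l "$PGDATA/server.log" \
        -o "-p $PORT -c wal_level=logical" -w start

    # Create the slot and start streaming to a file in the background.
    run rm -f "$OUT"
    run pg_recvlogical --create-slot -d postgres -p "$PORT" --slot=foo
    run pg_recvlogical --start --no-loop -d postgres -p "$PORT" \
        --slot=foo --file "$OUT" &

    # Generate one large transaction.
    run psql -p "$PORT" postgres -c \
        "create table t (id int generated always as identity, foo text)"
    run psql -p "$PORT" postgres -c \
        "insert into t(foo) select 'abcdefghijklmnopqrstuvwxyz' from generate_series(1,10_000_000)"

    # Wait until decoded INSERT rows start arriving in the output file.
    if [ "$DRY_RUN" != 1 ]; then
        until grep -q 'INSERT' "$OUT" 2>/dev/null; do sleep 1; done
    fi

    # Time the fast shutdown.
    start=$(date +%s)
    run pg_ctl -D "$PGDATA" stop -m fast -w -t 10000
    echo "shutdown took $(( $(date +%s) - start )) seconds"
    wait   # reap the background pg_recvlogical
}

repro
```

Varying the generate_series upper bound should change how long the shutdown is delayed, which makes it easier to confirm that the wait scales with the size of the transaction being decoded.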
Cheers,
Greg