From: | Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com> |
Subject: | Re: Minimal logical decoding on standbys |
Date: | 2019-10-03 06:35:15 |
Message-ID: | CAJ3gD9fE=0w50sRagcs+jrktBXuJAWGZQdSTMa57CCY+Dh-xbg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, 30 Sep 2019 at 23:38, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Mon, Sep 30, 2019 at 7:35 AM Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> > Alright. Attached is the updated patch that splits the file into two
> > files, one that does only xmin related testing, and the other test
> > file that tests conflict recovery scenarios, and also one scenario
> > where drop-database drops the slots on the database on standby.
> > Removed get_slot_xmins() and get_node_from_slotname().
> > Renamed 'replica' to 'standby'.
> > Used node->backup() function instead of pg_basebackup command.
> > Renamed $master_slot to $master_slotname, similarly for $standby_slot.
>
> In general, I think this code is getting a lot clearer and easier to
> understand in these last few revisions.
>
> Why does create_logical_slot_on_standby include sleep(1)? Does the
> test fail if you take that out?
It has not failed for me, but I think sometimes it may happen that the
system command 'pg_recvlogical' is so slow to start that before it
tries to even create the slot, the subsequent checkpoint command
concurrently runs, causing a "running transactions" record to arrive
on standby *before* even pg_recvlogical decides the starting point
from which to receive records. So effectively pg_recvlogical can miss
this record.
> If so, it's probably going to fail on
> the buildfarm even with that included, because some of the buildfarm
> machines are really slow (e.g. because they use CLOBBER_CACHE_ALWAYS,
> or because they're running on a shared system with low hardware
> specifications and an ancient disk).
Yeah right, then it makes sense to explicitly wait for the slot to
calculate the restart_lsn, and only then run the checkpoint command.
Did that now.
>
> Similarly for the sleep(1) just after you VACUUM FREEZE all the databases.
I checked that VACUUM command returns only after updating the
pg_database.datfrozenxid. So now I think it's safe to immediately run
the checkpoint command after vacuum. So removed the sleep() now.
Attached is the updated patch series.
>
> I'm not sure wait the point of the wait_for_xmins() stuff is in
> 019_standby_logical_decoding_conflicts.pl. Isn't that just duplicating
> stuff we've already tested in 018?
Actually, in 019, the function call is more to wait for
hot_standby_feedback to take effect.
--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
Attachment | Content-Type | Size |
---|---|---|
logicaldecodng_standby_v4.tar.gz | application/gzip | 19.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2019-10-03 06:37:39 | Re: Include RELKIND_TOASTVALUE in get_relkind_objtype |
Previous Message | Smith, Peter | 2019-10-03 06:16:44 | RE: Proposal: Make use of C99 designated initialisers for nulls/values arrays |