A few nuances about specifying the timeline with START_REPLICATION

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: A few nuances about specifying the timeline with START_REPLICATION
Date: 2021-06-18 17:27:57
Message-ID: 64fdb1edcc0b033c872d2e781cf48e6c4fcc0595.camel@j-davis.com
Lists: pgsql-hackers

A few questions about this comment in walsender.c, originating in
commit abfd192b1b5b:

/*
 * Found the requested timeline in the history. Check that
 * requested startpoint is on that timeline in our history.
 *
 * This is quite loose on purpose. We only check that we didn't
 * fork off the requested timeline before the switchpoint. We
 * don't check that we switched *to* it before the requested
 * starting point. This is because the client can legitimately
 * request to start replication from the beginning of the WAL
 * segment that contains switchpoint, but on the new timeline, so
 * that it doesn't end up with a partial segment. If you ask for
 * too old a starting point, you'll get an error later when we
 * fail to find the requested WAL segment in pg_wal.
 *
 * XXX: we could be more strict here and only allow a startpoint
 * that's older than the switchpoint, if it's still in the same
 * WAL segment.
 */
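
For readers without the source at hand, here is a minimal sketch of the loose check the comment describes, assuming it amounts to comparing the point where the requested timeline was forked off against the requested start position. The type WalPos, the constant INVALID_WAL_POS, and the function startpoint_allowed are hypothetical names for illustration and are not taken from walsender.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position; not the server's own type. */
typedef uint64_t WalPos;

/* Assumption: 0 marks "this timeline has not been forked off yet". */
#define INVALID_WAL_POS ((WalPos) 0)

/*
 * Loose check: refuse only when the requested timeline was forked off
 * before the requested start position. Nothing here verifies that the
 * server switched *to* the timeline before the start position, so a start
 * position that predates the timeline is accepted at this stage.
 */
bool
startpoint_allowed(WalPos switchpoint, WalPos startpoint)
{
    if (switchpoint != INVALID_WAL_POS && switchpoint < startpoint)
        return false;
    return true;
}
```

Under these assumptions, a start position that is too old passes this check and can only fail later, when the WAL segment it needs cannot be found.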

1. I think there's a typo: it should be "fork off the requested
timeline before the startpoint", right?

2. It seems to imply that requesting an old start point is wrong, but I
don't see why. As long as the WAL is there (or at least the slot
boundaries), what's the problem? Can we just change the comment
to say that it's fine to start on an ancestor of the requested timeline
(and maybe update the docs, too)?

3. I noticed when looking at this that the terminology in the docs is a
bit inconsistent between START_REPLICATION and
recovery_target_timeline.
   a. In recovery_target_timeline:
      i. a numeric value means "stop when this timeline forks"
      ii. "latest" means "follow forks along the newest timeline"
      iii. "current" is an alias for the backup's numerical timeline
   b. In the START_REPLICATION docs:
      i. "current" means "follow forks along the newest timeline"
      ii. a numeric value that is equal to the current timeline is the
          same as "current"
      iii. a numeric value that is less than the current timeline means
          "stop when this timeline forks"
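
The START_REPLICATION cases in 3b can be written out as a small decision function. This is only a sketch of the behavior as summarized above, not server code: the enum, the function name, and the use of 0 to stand for "no timeline given" (the "current" case) are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical labels for the behaviors listed in 3b. */
typedef enum
{
    FOLLOW_NEWEST,   /* follow forks along the newest timeline */
    STOP_AT_FORK,    /* stop when the requested timeline forks */
    NOT_DESCRIBED    /* case not covered by the summary in 3b */
} StreamBehavior;

/*
 * requested_tli == 0 is an assumed sentinel meaning that no timeline was
 * given, which the summary calls "current".
 */
StreamBehavior
start_replication_behavior(uint32_t requested_tli, uint32_t current_tli)
{
    if (requested_tli == 0 || requested_tli == current_tli)
        return FOLLOW_NEWEST;   /* 3b.i and 3b.ii */
    if (requested_tli < current_tli)
        return STOP_AT_FORK;    /* 3b.iii */
    return NOT_DESCRIBED;       /* a timeline newer than the current one */
}
```

Written this way, the asymmetry is easy to see: the same numeric timeline yields STOP_AT_FORK or FOLLOW_NEWEST depending only on whether it happens to be the server's current timeline.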

On point #3, it looks like START_REPLICATION could be improved:

* Should we change the docs to say "latest" rather than "current"?
* Should we change the behavior so that specifying the current
timeline as a number still means a historic timeline (e.g. it will stop
replicating there and emit a tuple)?
* Should we add some keywords like "latest" or "current" to the
START_REPLICATION command?

Regards,
Jeff Davis
