Re: Draft release notes for next week's back-branch releases

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Draft release notes for next week's back-branch releases
Date: 2017-05-06 17:15:25
Message-ID: 18391.1494090925@sss.pgh.pa.us
Lists: pgsql-hackers

Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> writes:
> On 06/05/17 18:16, Tom Lane wrote:
>> Hmm, I'm hoping for something more user-oriented. Is the corruption
>> time-limited? What's an "exported snapshot" anyway, is it the same
>> thing as pg_export_snapshot(), and if so what's that got to do with
>> logical replication?

> Okay to explain what's happening. When you create logical replication
> slot via walsender, it exports snapshot (like the one exported by
> pg_export_snapshot() indeed) which corresponds to exact point in time
> where the decoding will start for the slot. You can import this snapshot
> to another backend and use it to copy any existing data before starting
> the replication. The bugs cause that these snapshots would be corrupted
> and you'd have inconsistent data (some tuples missing). One of them
> would export snapshot that's only valid for system catalogs but not user
> tables (the ondisk snapshot, this needs at least one preexisting slot).
> The other would not guarantee that tuples needed by the snapshot weren't
> removed by vacuum or HOT pruning (the window being only the time it took
> to generate the snapshot).

OK, that's better. But I'm still not really seeing a reason to treat
these as two separate items for release-note purposes: I don't think users
will care. Now that I've read section 31.4, I'd suggest that we phrase
the release notes in the terms it uses. How about saying something like
"The initial snapshot created for a logical replication slot was
incorrect. This could allow the apply process to copy incomplete or
inconsistent data. This was more likely to happen if the source server
was busy at the time of slot creation, or if two slots were created
concurrently" ?

(Or, wait a minute. That documentation only applies to v10, but we
need to be writing this relnote for 9.6 users. What terminology should
we be using anyway?)

Also, do we need to recommend that people not trust any logical replicas
at this point, but recreate them after installing the update?

regards, tom lane
