From: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Logical decoding slots can go backwards when used from SQL, docs are wrong |
Date: | 2016-03-14 07:08:01 |
Message-ID: | CAMsr+YHHPL=qRUeti+Yu0ax6FF4xKRyMVZ-o+QOR=uoPKSDamg@mail.gmail.com |
Lists: | pgsql-hackers |
On 11 March 2016 at 20:15, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Craig Ringer wrote:
> > Hi all
> >
> > I think I found a couple of logical decoding issues while writing tests
> > for failover slots.
> >
> > Despite the docs' claim that a logical slot will replay data "exactly
> > once", a slot's confirmed_lsn can go backwards and the SQL functions can
> > replay the same data more than once. We don't mark a slot as dirty if only
> > its confirmed_lsn is advanced, so it isn't flushed to disk. For failover
> > slots this means it also doesn't get replicated via WAL. After a master
> > crash, or for failover slots after a promote event, the confirmed_lsn
> > will go backwards. Users of the SQL interface must keep track of the
> > safely locally flushed slot position themselves and throw the repeated
> > data away. Unlike with the walsender protocol it has no way to ask the
> > server to skip that data.
> >
> > Worse, because we don't dirty the slot, even a *clean shutdown* causes
> > slot confirmed_lsn to go backwards. That's a bug IMO. We should force a
> > flush of all slots at the shutdown checkpoint, whether dirty or not, to
> > address it.
>
> Why don't we mark the slot dirty when confirmed_lsn advances? If we fix
> that, doesn't it fix the other problems too?
>
Yes, it does.
That'll cause slots to be written out at checkpoints when they otherwise
wouldn't have to be, but I'd rather do a little more work in this case.
Compared to the disk activity from WAL decoding etc., the effect should
be undetectable anyway.
Andres? Any objection to dirtying a slot when its confirmed_lsn advances,
so we write it out at the next checkpoint?
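
To make the failure mode concrete, here is a toy model in Python. It is not
PostgreSQL code: ToySlot, dirty_on_confirm and lsn_after_restart are
hypothetical names, and LSNs are plain integers. It only assumes what is
described above, namely that a checkpoint writes out dirty slots only and
that a restart reloads the slot from its on-disk copy.

```python
from dataclasses import dataclass


@dataclass
class ToySlot:
    """Toy stand-in for a slot: in-memory state plus its last on-disk copy."""
    dirty_on_confirm: bool   # hypothetical switch: the proposed behaviour
    confirmed_lsn: int = 0   # in-memory position
    on_disk_lsn: int = 0     # position last written to disk
    dirty: bool = False

    def confirm(self, lsn: int) -> None:
        # The client confirms it has consumed everything up to lsn.
        if lsn > self.confirmed_lsn:
            self.confirmed_lsn = lsn
            if self.dirty_on_confirm:
                self.dirty = True

    def checkpoint(self) -> None:
        # Only dirty slots are written out at a checkpoint.
        if self.dirty:
            self.on_disk_lsn = self.confirmed_lsn
            self.dirty = False

    def restart(self) -> None:
        # After a restart the in-memory state is rebuilt from disk.
        self.confirmed_lsn = self.on_disk_lsn
        self.dirty = False


def lsn_after_restart(dirty_on_confirm: bool) -> int:
    """Confirm up to LSN 100, checkpoint, restart, and report the position."""
    slot = ToySlot(dirty_on_confirm=dirty_on_confirm)
    slot.confirm(100)
    slot.checkpoint()
    slot.restart()
    return slot.confirmed_lsn
```

With dirty_on_confirm=False the confirmed position is lost across the
restart, so the same data would be replayed. With dirty_on_confirm=True the
checkpoint persists it, at the cost of one extra slot write.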
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services